# Bioinformatics Questions and Answers – Motif and Domain Databases Using Regular Expressions

This set of Bioinformatics Interview Questions and Answers focuses on “Motif and Domain Databases Using Regular Expressions”.

1. When scanning for similarities in motifs, how do regular expression techniques work?
a) They represent a sequence family by a string of characters, which is then compared against sequences
b) An algorithm similar to dynamic programming is used
c) Dot matrix analysis is used in this type of sequence analysis
d) Matrix analysis methods are used in this type

Explanation: In regular expression techniques, pattern matching has a true or false outcome. In other words, if the pattern described by the regular expression is found in a string of letters, the answer is true; otherwise it is false.
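The true/false outcome described above can be sketched with Python's built-in `re` module. The motif `[GA].[ST]` and the two sequences are made up for illustration and do not come from any database.

```python
import re

# Hypothetical motif: G or A, then any residue, then S or T.
MOTIF = re.compile(r"[GA].[ST]")

def has_motif(sequence: str) -> bool:
    """Return True if the motif occurs anywhere in the sequence, else False."""
    return MOTIF.search(sequence) is not None

print(has_motif("MKGLSAA"))  # True: the substring "GLS" fits the pattern
print(has_motif("MKLLLAA"))  # False: no substring fits the pattern
```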

2. Which of the following best defines regular expressions?
a) They are made up of terms, operators and modifiers
b) They describe a string or set of strings to find matching patterns
c) They are strictly restricted to alignment and corresponding score
d) They consist of a set of rules for the connotations of various amino acid residues

Explanation: Regular expressions are a powerful notational algebra that describes a string or set of strings in order to find matching patterns. Pattern matching has a true or false outcome. It is true that they are made up of terms, operators and modifiers, but these are components used in the matching process rather than a definition.

3. In regular expressions, which of the following pattern symbols is wrongly matched with its meaning?
a) [ ] – Or
b) { } – Not
c) ( ) – Repeats
d) Z – Any

Explanation: Each symbol has its own meaning in the regular expression system. For example, [GA] means ‘G or A’, {V,P} means ‘not V or P’, and x(4) means xxxx. Likewise, it is X, not Z, that denotes any character.
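As a rough illustration, the symbols above can be mapped onto Python `re` syntax. The mapping table and the helper `element_matches` are assumptions made for this sketch, not part of any motif database tool.

```python
import re

# PROSITE-style element -> assumed Python regex equivalent.
PYTHON_EQUIVALENT = {
    "[GA]": r"[GA]",    # G or A
    "{V,P}": r"[^VP]",  # any residue except V or P
    "x(4)": r".{4}",    # any four residues
    "x": r".",          # any single residue
}

def element_matches(element: str, residues: str) -> bool:
    """Check whether `residues` is matched in full by the given element."""
    return re.fullmatch(PYTHON_EQUIVALENT[element], residues) is not None

print(element_matches("[GA]", "A"))     # True
print(element_matches("{V,P}", "P"))    # False
print(element_matches("x(4)", "MKLV"))  # True
print(element_matches("x", "W"))        # True
```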

4. In the terminology of regular expressions, which of the following is false about terms and operators?
a) Terms are strings or substrings
b) Operators combine terms and expressions
c) Operators do not have precedence
d) Operators have precedence like arithmetic operators

Explanation: For consistent, efficient and error-free functioning of the matching process, operators have precedence, which sets the priority of the operations to be carried out during matching.
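Operator precedence can be seen in Python's own regular expression syntax, used here only as an example of one regex dialect. Repetition binds more tightly than concatenation, and concatenation binds more tightly than alternation; parentheses override both. The helper `fully_matches` is a name introduced for this sketch.

```python
import re

def fully_matches(regex: str, text: str) -> bool:
    """Return True if the whole text is matched by the regex."""
    return re.fullmatch(regex, text) is not None

# Alternation has the lowest precedence: "GA|ST" reads as (GA) or (ST).
print(fully_matches(r"GA|ST", "GA"))     # True
print(fully_matches(r"GA|ST", "GAT"))    # False

# Grouping changes the reading to: G, then A or S, then T.
print(fully_matches(r"G(A|S)T", "GAT"))  # True

# Repetition binds tighter than concatenation: "GA+" repeats only the A.
print(fully_matches(r"GA+", "GAAA"))     # True
print(fully_matches(r"GA+", "GAGA"))     # False
print(fully_matches(r"(GA)+", "GAGA"))   # True
```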

5. In regular expressions, which of the following pattern symbols is wrongly matched with its meaning?
a) ‘-’ – separator
b) < – N-terminal
c) > – C-terminal
d) ‘>>’ – end

Explanation: Each symbol has its own meaning in the regular expression system. For example, x(2,3) means x-x or x-x-x. Similarly, it is a period (‘.’), not ‘>>’, that means end.
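Putting the symbols from this quiz together, a pattern in this notation can be translated into a Python regular expression. This is a minimal sketch: `prosite_to_regex` is a hypothetical helper, it only handles the symbols listed in this quiz, and it assumes lowercase `x` is the wildcard.

```python
import re

def _negate(match):
    """Turn the contents of {...} into a negated character class."""
    return "[^" + match.group(1).replace(",", "") + "]"

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".")                 # a trailing '.' ends the pattern
    regex = regex.replace("-", "")              # '-' only separates positions
    regex = regex.replace("<", "^")             # '<' anchors to the N-terminus
    regex = regex.replace(">", "$")             # '>' anchors to the C-terminus
    regex = re.sub(r"\{([^}]*)\}", _negate, regex)         # {..} means 'not'
    regex = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", regex)  # (n) or (n,m) repeats
    return regex.replace("x", ".")              # x means any residue

regex = prosite_to_regex("<M-x(2,3)-[GA]-{V,P}.")
print(regex)                                  # ^M.{2,3}[GA][^VP]
print(re.search(regex, "MKLGAL") is not None)   # True
print(re.search(regex, "MKLAP") is not None)    # False: P is excluded at the last position
print(re.search(regex, "AMKLGAL") is not None)  # False: M is not at the N-terminus
```

The negation step runs before the repeat step on purpose, so the braces produced for repeats are not mistaken for a ‘not’ group.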

6. Which databases does Emotif use for its sequence alignments?
a) BLOCKS and PRINTS databases
b) PROSITE
c) BLOCKS
d) PRINTS

Explanation: Emotif is a motif database that uses multiple sequence alignments from both the BLOCKS and PRINTS databases, giving it an alignment collection much larger than that of PROSITE. It identifies patterns by allowing fuzzy matching of regular expressions. Therefore, it produces fewer false negatives than PROSITE.

7. When analysing motif sequences, what is the major disadvantage of PROSITE?
a) The database constructs profiles to complement some of the sequence patterns
b) The functional information of these patterns is primarily based on published literature
c) Some of the sequence patterns are too short to be specific
d) Lack of specificity about probability and variation and relation between them

Explanation: The major pitfall of the PROSITE patterns is that some of the sequence patterns are too short to be specific. The rest of the options are advantages. The problem with these short sequence patterns is that a resulting match is very likely to be the result of random events. Overall, PROSITE has an error rate greater than 20%. Thus, both a match and a non-match in PROSITE should be treated with caution.
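The effect of pattern length on random matches can be estimated with simple arithmetic. The sketch below uses the short N-glycosylation site pattern N-{P}-[ST]-{P} as an example and assumes, unrealistically, that all 20 amino acids are equally frequent and independent at each position. The protein length of 400 and the function name `expected_random_hits` are also assumptions for illustration.

```python
def expected_random_hits(allowed_per_position, protein_length):
    """Expected number of chance matches in a random sequence.

    allowed_per_position: how many of the 20 residues are accepted
    at each position of the pattern.
    """
    p_match = 1.0
    for allowed in allowed_per_position:
        p_match *= allowed / 20          # uniform residue frequencies assumed
    windows = protein_length - len(allowed_per_position) + 1
    return p_match * windows

# N-{P}-[ST]-{P}: 1 residue, 19 residues, 2 residues, 19 residues allowed.
hits = expected_random_hits([1, 19, 2, 19], protein_length=400)
print(round(hits, 2))  # 1.79
```

Under these assumptions, a random 400-residue sequence is expected to contain nearly two matches by chance alone, which is why a hit for such a short pattern is weak evidence on its own.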

8. Which of the following is not a characteristic of fuzzy or approximate matches in regular expressions?
a) This method is able to include more variant forms of a motif with a conserved function
b) The rule of matching is based on observations, not actual assumptions
c) With the more relaxed matching, there is an increase in the noise level and false positives
d) The rule of matching is based on assumptions, not actual observations

Explanation: In fuzzy or approximate matching, the rule of matching is based on assumptions, not actual observations. This provides more permissive matching by allowing residues with similar biochemical properties to match. For example, if an original alignment contains only phenylalanine at a particular position, fuzzy matching allows other aromatic residues (including the unobserved tyrosine and tryptophan) in a sequence to match the expression.
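The phenylalanine example can be sketched by widening each observed residue to a group of similar residues. The grouping table `SIMILAR`, the helper `fuzzy_regex`, and the motif `GFD` are all assumptions for illustration; they are not the groupings used by Emotif or any other database.

```python
import re

# Assumed groups of residues with similar biochemical properties.
SIMILAR = {
    "F": "FYW", "Y": "FYW", "W": "FYW",  # aromatic
    "D": "DE", "E": "DE",                # acidic
    "K": "KRH", "R": "KRH", "H": "KRH",  # basic
}

def fuzzy_regex(exact_motif: str) -> str:
    """Replace each residue by a character class covering its whole group."""
    return "".join(
        f"[{SIMILAR[residue]}]" if residue in SIMILAR else residue
        for residue in exact_motif
    )

exact = "GFD"               # only F and D were observed in the alignment
fuzzy = fuzzy_regex(exact)
print(fuzzy)                                     # G[FYW][DE]
print(re.search(exact, "MAGYDL") is not None)    # False: exact match misses the variant
print(re.search(fuzzy, "MAGYDL") is not None)    # True: fuzzy match accepts Y for F
```

The same widening that recovers the tyrosine variant also accepts sequences that were never observed in the alignment, which is the source of the extra noise and false positives.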

9. Which of the following is not a characteristic of exact matches in regular expressions?
a) There must be a strict match of sequence patterns
b) Any variations in the query sequence from the predefined patterns are not allowed
c) It provides more permissive matching by allowing more flexible matching of residues of similar biochemical properties
d) Searching a motif database using this approach results in either a match or non-match

Explanation: In this type of matching, there has to be a strict match of sequence patterns. This way of searching has a good chance of missing truly relevant motifs that have slight variations, thus generating false-negative results. As new sequences of a motif accumulate, a rigid regular expression tends to become obsolete if it is not updated regularly to reflect the changes.

10. What does the representation R.L.[EQD] mean?
a) An arginine - Amino acid - Leucine - Amino acid - Either glutamic acid, glutamine or aspartic acid
b) An arginine - Leucine - Either glutamic acid, glutamine or aspartic acid
c) An arginine - Leucine - Amino acid - Either glutamic acid, glutamine or aspartic acid
d) An arginine - Leucine - Glutamic acid and glutamine and aspartic acid

Explanation: This is an example of the PEXEL motif. Here, ‘.’ represents any amino acid, i.e. the ‘Amino acid’ mentioned in the answer, and [ ] means ‘or’, i.e. any one of the listed residues may be present at that position.
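Because R.L.[EQD] is already valid Python `re` syntax, it can be tried directly. The sequences below are invented for illustration, and `find_pexel` is a hypothetical helper name.

```python
import re

# R, any residue, L, any residue, then one of E, Q or D.
PEXEL = re.compile(r"R.L.[EQD]")

def find_pexel(sequence: str):
    """Return (start index, matched text) for each occurrence of the motif."""
    return [(m.start(), m.group()) for m in PEXEL.finditer(sequence)]

print(find_pexel("MKRSLAEGG"))  # [(2, 'RSLAE')]
print(find_pexel("MKRLAEGG"))   # []: no residue between R and L
```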

Sanfoundry Global Education & Learning Series – Bioinformatics.
