Bioinformatics Questions and Answers – Position-Specific Scoring Matrices


This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Position-Specific Scoring Matrices”.

1. Analysis of MSAs (Multiple Sequence Alignments) for conserved blocks of sequence leads to production of the position-specific scoring matrix.
a) True
b) False

Answer: a
Explanation: The analysis of MSAs (Multiple Sequence Alignment) for conserved blocks of sequence leads to production of the position-specific scoring matrix or PSSM. The PSSM may be used to search a sequence to obtain the most probable location or locations of the motif represented by the PSSM. Alternatively, the PSSM may be used to search an entire database to identify additional sequences that also have the same motif.
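Searching a sequence with a PSSM can be sketched as sliding a window of the motif width along the sequence and summing the column scores for each window. The snippet below is only an illustration: the three-column matrix, its scores, the DNA alphabet, and the function names `scan` and `best_hit` are all assumptions made up for this example, not values from any real motif.

```python
# Hypothetical 3-column log-odds PSSM over the DNA alphabet (made-up scores).
PSSM = [
    {"A": 1.2, "C": -0.8, "G": -0.8, "T": -1.5},
    {"A": -1.0, "C": 1.4, "G": -0.6, "T": -0.9},
    {"A": -0.7, "C": -1.1, "G": 1.3, "T": -0.5},
]


def scan(sequence, pssm):
    """Score every window of len(pssm) residues; return (score, start) pairs."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the score of each residue in its own column of the matrix.
        score = sum(column[residue] for column, residue in zip(pssm, window))
        hits.append((score, start))
    return hits


def best_hit(sequence, pssm):
    """Return the (score, start) pair of the highest-scoring window."""
    return max(scan(sequence, pssm))


if __name__ == "__main__":
    score, start = best_hit("TTACGTT", PSSM)
    print(f"best window starts at {start} with score {score:.1f}")
```

Searching a database for further sequences with the motif amounts to running the same scan over every sequence and keeping the windows whose scores pass a chosen threshold.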

2. The quality and quantity of information provided by the PSSM vary for ________ in the motif.
a) each row
b) each column
c) rows and columns
d) neither the rows nor the columns

Answer: b
Explanation: The quality and quantity of information provided by the PSSM vary for each column in the motif, and this variation profoundly influences the matches found with sequences. This situation can be accurately described by information theory, and the results can be displayed by a colored graph called a sequence logo.
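One common way to quantify the information in a column is the maximum possible entropy minus the observed Shannon entropy, measured in bits. The sketch below assumes a uniform background over 20 amino acids and applies no small-sample correction; the function name `column_information` and the example frequencies are hypothetical.

```python
import math


def column_information(freqs, alphabet_size=20):
    """Information content (bits) of one motif column.

    freqs maps each residue to its observed frequency in the column.
    Assumes a uniform background and no small-sample correction.
    """
    # Shannon entropy of the column; zero frequencies contribute nothing.
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(alphabet_size) - entropy


if __name__ == "__main__":
    conserved = {"I": 1.0}             # only isoleucine observed
    mixed = {"I": 0.5, "L": 0.5}       # two residues, equally frequent
    print(round(column_information(conserved), 3))
    print(round(column_information(mixed), 3))
```

A fully conserved column carries the most information and a column with all residues equally frequent carries none, which is what the letter heights in a sequence logo reflect.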

3. Two considerations arise in trying to tune the PSSM so that it adequately represents the training sequences. Which of the following is not their description?
a) If a given column in 20 sequences has only isoleucine, it is not very likely that a different amino acid will be found in other sequences with that motif because the residue is probably important for function
b) If a given column in 20 sequences has only isoleucine, it is very likely that a different amino acid will be found in other sequences with that motif because the residue is probably important for function
c) If the number of sequences with the found motif is large and reasonably diverse, the sequences represent a good statistical sampling of all sequences that are ever likely to be found with that same motif
d) Another column in the motif from the 20 sequences may have several amino acids, and some amino acids may not be represented at all

Answer: b
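Explanation: A column that contains only isoleucine in 20 sequences suggests that the residue is important for function, so a different amino acid is unlikely at that position in other sequences with the motif, which makes statement (b) the incorrect one. The PSSM is constructed by a simple logarithmic transformation of a matrix giving the frequency of each amino acid in the motif. In a column that already contains several amino acids, even more variation may be expected at that position in other sequences, although the more abundant amino acids already found in that column would probably be favored.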
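The logarithmic transformation mentioned above is often written as a log-odds score: the logarithm of the column frequency divided by a background frequency. The sketch below assumes base-2 logarithms and a toy four-letter alphabet with a uniform background; both choices, and the name `log_odds_column`, are illustrative assumptions rather than the method of any particular program.

```python
import math


def log_odds_column(freqs, background):
    """Convert one column of frequencies into log-odds scores (bits).

    freqs: observed frequency of each residue in the column.
    background: expected frequency of each residue by chance.
    """
    scores = {}
    for residue, bg in background.items():
        f = freqs.get(residue, 0.0)
        # A residue never observed in the column has no finite score.
        scores[residue] = math.log2(f / bg) if f > 0 else float("-inf")
    return scores


if __name__ == "__main__":
    # Toy alphabet of four residues with a uniform background (assumption).
    background = {"I": 0.25, "L": 0.25, "V": 0.25, "M": 0.25}
    column = {"I": 0.5, "L": 0.25, "V": 0.25}
    for residue, score in log_odds_column(column, background).items():
        print(residue, score)
```

Residues seen more often than expected get positive scores, residues seen at the background rate score zero, and residues that never appear in the training column get negative infinity, which would rule out any sequence carrying them.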

4. If a good sampling of sequences is _______, the number of sequences is _________, and the motif structure is ________, it should, in principle, be possible to obtain frequencies highly representative of the same motif in other sequences also.
a) available, sufficiently large, not too complex
b) unavailable, sufficiently large, not too complex
c) unavailable, sufficiently small, not too complex
d) available, sufficiently large, too complex

Answer: a
Explanation: The more abundant amino acids already found in that column would probably be favored. Thus, if a good sampling of sequences is available, the number of sequences is sufficiently large, and the motif structure is not too complex, it should, in principle, be possible to obtain frequencies highly representative of the same motif in other sequences also (Henikoff and Henikoff 1996).

5. If the data set is _______ then unless the motif has __________ amino acids in each column, the column frequencies in the motif may not be highly representative of all other occurrences of the motif.
a) small, distinct
b) small, almost identical
c) large, almost identical
d) large, distinct

Answer: b
Explanation: The number of sequences for producing the motif may be small, highly diverse, or complex, giving rise to a second level of consideration. If the data set is small, then unless the motif has almost identical amino acids in each column, the column frequencies in the motif may not be highly representative of all other occurrences of the motif. In such cases, it is desirable to improve the estimates of the amino acid frequencies by adding extra amino acid counts, called pseudocounts, to obtain a more reasonable distribution of amino acid frequencies in the column.
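A simple way to add pseudocounts is to spread a chosen total number of extra counts across the residues in proportion to their background frequencies. The sketch below uses that scheme with a toy four-letter alphabet; the function name `column_frequencies`, the parameter `total_pseudocounts`, and the example column are assumptions for illustration, and other pseudocount schemes exist.

```python
from collections import Counter


def column_frequencies(column, background, total_pseudocounts=1.0):
    """Estimate residue frequencies for one motif column.

    column: string of residues observed at this position, one per sequence.
    background: expected frequency of each residue by chance.
    total_pseudocounts: extra counts shared out according to background.
    """
    counts = Counter(column)
    n = len(column)
    return {
        residue: (counts.get(residue, 0) + total_pseudocounts * bg)
        / (n + total_pseudocounts)
        for residue, bg in background.items()
    }


if __name__ == "__main__":
    # Toy alphabet of four residues with a uniform background (assumption).
    background = {"I": 0.25, "L": 0.25, "V": 0.25, "M": 0.25}
    print(column_frequencies("IIIL", background, total_pseudocounts=1.0))
    # With far more pseudocounts than real counts, the estimate drifts
    # toward the background and the observed data is nearly lost.
    print(column_frequencies("IIIL", background, total_pseudocounts=100.0))
```

Residues absent from the small sample receive a small nonzero frequency, so they no longer produce an undefined logarithm when the matrix is converted to scores.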

6. Even if many pseudocounts are added in comparison to real sequence counts, they will not influence the amino acid frequencies.
a) True
b) False

Answer: b
Explanation: Knowing how many counts to add is a difficult but fortunately solvable problem. On the one hand, if too many pseudocounts are added in comparison to real sequence counts, the pseudocounts will become the dominant influence in the amino acid frequencies and searches using the motif will not work. On the other hand, if there are relatively few real counts, many amino acid variations may not be present because of the small sample of sequences.

7. Which of the following is not a feature of sequence alignment editors and formatters?
a) provision for displaying the sequence on a color monitor with residue colors to aid in a clear visual representation of the alignment
b) recognition of the multiple sequence format that was output by the MSA (Multiple Sequence Alignment) program
c) maintenance of the alignment in a suitable format when the editing is completed
d) disallowing shading conserved residues in the alignment

Answer: d
Explanation: Shading conserved residues in the alignment is a type of editing commonly performed on MSAs (Multiple Sequence Alignments), so editors and formatters allow it rather than disallow it. Another feature is provision of a suitable windows interface that allows use of the mouse to add, delete, or move sequence, followed by an updated display of the alignment.

8. GDE (Genetic Data Environment) provides a general interface on UNIX machines for sequence analysis, sequence alignment editing, and display.
a) True
b) False

Answer: a
Explanation: It is available from several anonymous FTP sites. This interface requires communication with a host UNIX machine running the Genetics Computer Group software. Interface with MS-DOS or Macintosh is possible if the computer is equipped with the appropriate X-Windows client software.

9. MACAW is a local multiple sequence alignment program only.
a) True
b) False

Answer: b
Explanation: MACAW is both a local multiple sequence alignment program and a sequence editing tool. Given a set of sequences, the program finds ungapped blocks in the sequences and gives their statistical significance. Later versions of the program find blocks by one of three user-chosen methods.

10. Two commonly encountered multiple sequence alignment formats are the Genetics Computer Group’s MSF format and the CLUSTALW ALN format.
a) True
b) False

Answer: a
Explanation: Because these formats follow a precise outline, one may be readily converted to another by computer programs. READSEQ by D. G. Gilbert at Indiana University at Bloomington is one such program.

Sanfoundry Global Education & Learning Series – Bioinformatics.
