This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Bayesian Statistics”.
1. Who first applied Bayesian methods to estimating the evolutionary distance between two DNA sequences, and when?
a) Smith-Waterman, 1981
b) Agarwal and States, 1996
c) Smith-Waterman, 1996
d) Agarwal and States, 1981
Explanation: Agarwal and States (1996) applied Bayesian methods to provide the best estimate of the evolutionary distance between two DNA sequences, for example, sequences of the same length that have a certain level of mismatches.
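The idea of a "best estimate" of distance from a mismatch count can be illustrated with a small sketch. It is not the procedure of Agarwal and States; it assumes a Jukes-Cantor substitution model, a uniform prior over a grid of candidate distances, and hypothetical counts (20 mismatches in 100 aligned sites), all chosen only for illustration.

```python
import math

def mismatch_prob(d):
    """Jukes-Cantor probability that a site differs after d substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def distance_posterior(n_sites, n_mismatches, distances):
    """Posterior over candidate distances, assuming a uniform prior over the grid."""
    likelihoods = []
    for d in distances:
        p = mismatch_prob(d)
        # Binomial likelihood of the observed mismatch count at this distance.
        likelihoods.append(
            math.comb(n_sites, n_mismatches)
            * p ** n_mismatches
            * (1.0 - p) ** (n_sites - n_mismatches)
        )
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]

# Hypothetical data: two ungapped sequences of length 100 with 20 mismatches.
distances = [0.05 * i for i in range(1, 41)]  # candidate distances 0.05 .. 2.00
posterior = distance_posterior(100, 20, distances)
best = distances[posterior.index(max(posterior))]
print(f"most probable distance on the grid: {best:.2f}")
```

The result is a whole distribution over distances rather than a single number, so the spread of `posterior` also indicates how uncertain the estimate is.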
2. With the application of Bayesian methods, the most probable repeat length and evolutionary time since the repeat was formed may be derived.
Explanation: Sequences of this type originated from gene duplication events in the yeast and Caenorhabditis elegans genomes. When there are multiple mismatches between such repeated sequences, it is difficult to determine the most likely length of the repeats. Bayesian methods can be applied in this situation.
3. If the purpose is to calculate the probability of one event AND a second event, the odds scores for the events are _________
c) multiplied and added
Explanation: An example is the calculation of the odds of an alignment of two sequences from the alignment scores for each of the matched pairs of bases or amino acids in the alignment. The odds scores for the pairs are multiplied. Usually, the log odds score for the first pair is added to that for the second, etc., until the scores for every pair have been added.
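The equivalence between multiplying odds scores and adding log odds scores can be checked with a few lines of Python. The three odds values are hypothetical and stand in for the scores of three matched pairs in an alignment.

```python
import math

# Hypothetical odds scores for three aligned pairs (not taken from a real matrix).
pair_odds = [4.0, 0.5, 2.0]

# AND: the odds of the whole alignment is the product of the pair odds.
combined_odds = math.prod(pair_odds)

# Equivalent calculation in log odds units (bits): the scores are added.
log_odds = [math.log2(o) for o in pair_odds]
combined_log_odds = sum(log_odds)

# Converting the summed log odds back to odds recovers the product.
assert math.isclose(combined_odds, 2.0 ** combined_log_odds)
print(combined_odds, combined_log_odds)
```

Working in log odds keeps the numbers small for long alignments, where a direct product of many odds scores could overflow or underflow.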
4. Another type of probability analysis is to calculate the odds score for one event OR a second event, or for any one of a series of events. In this case, the odds scores are _______
c) added and multiplied
Explanation: An example is the calculation of the odds score for a given sequence alignment using a series of alternative PAM scoring matrices. The alignment scores are calculated in log odds units and then converted into odds scores.
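A minimal sketch of this OR calculation follows. The log odds scores are hypothetical, the matrix names are only labels, and the alternatives are assumed to be mutually exclusive and equally likely beforehand, so the converted odds scores can simply be summed.

```python
# Hypothetical log odds scores (in bits) for one alignment under three PAM matrices.
log_odds_bits = {"PAM40": 3.0, "PAM120": 5.0, "PAM250": 4.0}

# Log odds must be converted to odds before combining alternatives.
odds = {name: 2.0 ** bits for name, bits in log_odds_bits.items()}

# OR: the odds of the alignment under any one of the matrices is the sum.
total_odds = sum(odds.values())

# With equal prior weight on each matrix, each share of the total is the
# posterior probability of that matrix given the alignment.
posterior = {name: value / total_odds for name, value in odds.items()}
most_probable = max(posterior, key=posterior.get)
print(total_odds, most_probable)
```

Adding the log odds scores directly would be a mistake here, because that corresponds to multiplying the odds, which is the AND rule rather than the OR rule.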
5. In Bayesian methods, a difficulty with making such estimates is that the estimate depends on the following assumption.
Assumption: the mutation rate in sequences has been constant over time, and the rate of mutation of all nucleotides is the same.
Explanation: The assumption above (the molecular clock hypothesis) is made to simplify the estimate. Where it does not hold, the problem may be addressed by scoring different portions of a sequence with different scoring matrices and then using the Bayesian methods above to calculate the best evolutionary distance.
6. Another difficulty in Bayesian methods is deciding on the length of the sequence that was duplicated.
Explanation: In genomes, the presence of repeats may be revealed by long regions of matched sequence positions dispersed among regions of sequence positions that do not match. However, as the frequency of mismatches is increased, it becomes difficult to determine the extent of the repeated region.
7. A length and distance that gives the highest overall probability may then be determined. Such alignments are initially found using ________
a) a particular scoring matrix only
b) an alignment algorithm only
c) an alignment algorithm and a particular scoring matrix
d) dot method
Explanation: Analysis of the yeast and C. elegans genomes for such repeats has underscored the importance of using a range of DNA scoring matrices such as PAM1 to PAM120 if most repeats are to be found. The application of the above Bayesian analysis allows a determination of the probability distributions as a function of both length of the repeated region and evolutionary distance.
8. Which of the following features of Bayesian methods is a disadvantage?
a) A length and distance that gives the highest overall probability may be determined
b) They are used to calculate evolutionary distance
c) Computationally Bayesian methods are better
d) A specific mutational model is required
Explanation: One disadvantage of the Bayesian approach is that a specific mutational model is required, whereas other methods, such as the maximum likelihood approach, can be used to estimate the best mutational model as well as the distance. Computationally, however, the Bayesian method is much more practical.
9. Zhu et al. (1998) devised a computer program called the Bayes block aligner, which in effect slides ____ sequences along each other to find the ______ ungapped regions or blocks.
a) two, least scoring
b) two, highest scoring
c) multiple, highest scoring
d) multiple, least scoring
Explanation: These blocks are then joined in various combinations to produce alignments. There is no need for gap penalties because only the aligned sequence positions in blocks are scored. Instead of using a given substitution matrix and gap scoring system to find the highest scoring alignment, a Bayesian statistical approach is used.
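The sliding step can be pictured with a short sketch. It is not the Bayes block aligner: it only finds the single highest scoring ungapped block between two sequences, and the function name and the simple match/mismatch scores are assumptions made for illustration.

```python
def best_ungapped_block(seq_a, seq_b, match=1, mismatch=-1):
    """Return (score, start_a, start_b, length) of the highest scoring ungapped block."""
    best = (0, 0, 0, 0)
    # Each offset is one way of sliding seq_b along seq_a (one diagonal).
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        i = max(offset, 0)    # current position in seq_a
        j = max(-offset, 0)   # current position in seq_b
        running = 0
        start_i, start_j = i, j
        while i < len(seq_a) and j < len(seq_b):
            # Restart the block when the running score has dropped to zero or below.
            if running <= 0:
                running = 0
                start_i, start_j = i, j
            running += match if seq_a[i] == seq_b[j] else mismatch
            if running > best[0]:
                best = (running, start_i, start_j, i - start_i + 1)
            i += 1
            j += 1
    return best

# Hypothetical sequences that share one ungapped region.
score, start_a, start_b, length = best_ungapped_block("TTGACGTACC", "AGACGTAGG")
print(score, "TTGACGTACC"[start_a:start_a + length])
```

No gap penalty appears anywhere in the function, because only positions inside a block are scored. A Bayesian treatment would go further and weigh many such blocks, block counts, and scoring matrices by probability instead of keeping only the top-scoring one.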
10. Unlike the commonly used methods for aligning a pair of sequences, the Bayesian method _______ using a particular scoring matrix or designated gap penalties.
a) does not depend on
b) depends on
c) is based on
Explanation: Because the method does not rely on a single scoring matrix or fixed gap penalties, there is no need to choose a particular scoring system or gap penalty. Instead, a number of different scoring matrices and a range of block numbers up to some reasonable maximum are examined, and the most probable alignments are determined. The Bayesian method provides a distribution of alignments weighted according to probability and can also provide an estimate of the evolutionary distance between the sequences that is independent of scoring matrix and gaps.