This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Gene Prediction in Prokaryotes”.
1. Which of the following is a wrong statement?
a) Prokaryotes include bacteria and Archaea
b) Prokaryotes have relatively large genomes
c) Prokaryotes have relatively small genomes
d) In prokaryotes, the gene density in the genome is high, with more than 90% of the genome sequence containing coding sequence
Explanation: Prokaryotes have relatively small genomes, with sizes ranging from 0.5 to 10 Mbp (1 Mbp = 10^6 bp). Each prokaryotic gene is composed of a single contiguous stretch of ORF coding for a single protein or RNA, with no interruptions within the gene.
2. In bacteria, the majority of genes have a start codon ATG (or AUG in mRNA; because prediction is done at the DNA level, T is used in place of U), which codes for methionine.
Explanation: Occasionally, GTG and TTG are used as alternative start codons. But methionine is still the actual amino acid inserted at the first position.
3. The presence of these codons at the beginning of the frame _____ give a clear indication of the translation initiation site.
b) does not necessarily
c) does not
Explanation: Because there may be multiple ATG, GTG, or TTG codons in a frame, the presence of these codons at the beginning of the frame does not necessarily give a clear indication of the translation initiation site. Instead, other features associated with translation are used to help identify the initiation codon.
4. The Shine-Dalgarno sequence is a stretch of purine-rich sequence complementary to 16S rRNA in the ribosome.
Explanation: It is located immediately downstream of the transcription initiation site and slightly upstream of the translation start codon. In many bacteria, it has a consensus motif of AGGAGGT. Identification of the ribosome binding site can help locate the start codon.
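The idea of using the ribosome binding site to pick out a start codon can be sketched in Python as below. This is only an illustration: the function name, the relaxed motif list derived from the AGGAGGT consensus, and the 20-base upstream window are assumptions, not values given in the explanation.

```python
import re

# Assumption: relaxed fragments of the AGGAGGT consensus; real sites vary by species.
SD_LIKE = re.compile(r"AGGAGG|GGAGG|AGGAG|GAGG|AGGA")
START_CODONS = {"ATG", "GTG", "TTG"}


def starts_with_rbs(dna, window=20):
    """Return (start_index, motif, motif_index) for every candidate start
    codon that has an SD-like motif within `window` bases upstream."""
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 2):
        if dna[i:i + 3] not in START_CODONS:
            continue
        upstream_begin = max(0, i - window)
        match = SD_LIKE.search(dna[upstream_begin:i])
        if match:
            hits.append((i, match.group(), upstream_begin + match.start()))
    return hits


print(starts_with_rbs("TTTTAGGAGGTTTTTTATGAAACCC"))
```

A candidate start codon without such a motif upstream is not reported, which is how the motif helps to choose among several ATG, GTG, or TTG codons in the same frame.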
5. There are ____ possible stop codons, identification of which is straightforward.
Explanation: At the end of the protein coding region is a stop codon that causes translation to stop. There are three possible stop codons, identification of which is straightforward. Many prokaryotic genes are transcribed together as one operon.
6. Which of the following is a wrong statement regarding the conventional determination of open reading frames?
a) Without the use of specialized programs, prokaryotic gene identification can rely on manual determination of ORFs and major signals related to prokaryotic genes
b) Prokaryotic DNA is first subject to conceptual translation in all six possible frames, two frames forward and four frames reverse
c) A stop codon occurs in about every twenty codons by chance in a noncoding region
d) Prokaryotic DNA is first subject to conceptual translation in all six possible frames, three frames forward and three frames reverse
Explanation: Prokaryotic DNA is first subject to conceptual translation in all six possible frames, three frames forward and three frames reverse. Because a stop codon occurs in about every twenty codons by chance in a noncoding region, a frame longer than thirty codons without interruption by stop codons is suggestive of a gene coding region, although the threshold for an ORF is normally set even higher at fifty or sixty codons.
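The six-frame scan described above can be sketched as follows. If the four bases are assumed to be equally frequent, 3 of the 64 codons are stops, so one stop is expected about every 64/3 ≈ 21 codons, which matches the "every twenty codons" figure. The function names and the returned tuple layout are illustrative choices; the default threshold of 50 codons is taken from the explanation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse strand read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def stop_free_runs(dna, min_codons=50):
    """Scan three forward and three reverse frames and return
    (strand, frame, start, end, n_codons) for each stretch without a stop
    codon that is at least `min_codons` long. Coordinates refer to the
    strand that was scanned."""
    dna = dna.upper()
    runs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            run_start = frame
            # Walk codon by codon; a stop codon or the end of the
            # sequence (an incomplete codon) closes the current run.
            for pos in range(frame, len(seq) + 1, 3):
                codon = seq[pos:pos + 3]
                if len(codon) < 3 or codon in STOP_CODONS:
                    n_codons = (pos - run_start) // 3
                    if n_codons >= min_codons:
                        runs.append((strand, frame, run_start, pos, n_codons))
                    run_start = pos + 3
    return runs
```

This sketch only reports stop-free stretches; confirming a start codon and a ribosome binding site within each stretch is a separate step.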
7. The putative ORF can be translated into a protein sequence, which is then used to search against a protein database.
Explanation: The putative frame is further manually confirmed by the presence of other signals such as a start codon and a Shine-Dalgarno sequence. Detection of homologs from this search is probably the strongest indicator of a protein-coding frame.
8. Which of the following is a wrong statement regarding TESTCODE method?
a) This is based on the nucleotide composition of the third position of a codon
b) In practice, because genes can be in any of the six frames, the statistical patterns are computed for all possible frames
c) It is implemented in the commercial GCG package
d) It exploits the fact that the third codon nucleotides in a coding region fail to repeat themselves
Explanation: In a coding sequence, it has been observed that this position has a preference to use G or C over A or T. By plotting the GC composition at this position, regions with values significantly above the random level can be identified, which are indicative of the presence of ORFs. This method exploits the fact that the third codon nucleotides in a coding region tend to repeat themselves.
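The GC-at-third-position statistic mentioned in the explanation can be computed with a short Python sketch. It illustrates the general idea only and is not the TESTCODE implementation in GCG; the function names, window size, and step are assumptions.

```python
def gc3(dna, frame=0):
    """Fraction of G or C at the third position of each complete codon
    read in the given frame (0, 1, or 2)."""
    dna = dna.upper()
    thirds = [dna[i + 2] for i in range(frame, len(dna) - 2, 3)]
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def gc3_profile(dna, window=90, step=30):
    """Slide a window along the sequence and return a list of
    (window_start, [gc3 in frame 0, frame 1, frame 2])."""
    profile = []
    for start in range(0, max(1, len(dna) - window + 1), step):
        chunk = dna[start:start + window]
        profile.append((start, [gc3(chunk, f) for f in range(3)]))
    return profile
```

Plotting the profile for the forward strand and for its reverse complement covers all six frames; windows where one frame stands well above the background level are candidate coding regions.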
9. The conventional methods for determining open reading frames identify only typical genes and tend to miss atypical genes in which the rule of codon bias is not strictly followed.
Explanation: These statistical methods, which are based on empirical rules, examine the statistics of a single nucleotide (either G or C). To improve the prediction accuracies, the new generation of prediction algorithms uses more sophisticated statistical models.
10. Which of the following is a wrong statement regarding Gene Prediction Using Markov Models and Hidden Markov Models?
a) Markov models and HMMs can be very helpful in providing finer statistical description of a gene
b) A Markov model describes the probability of the distribution of nucleotides in a DNA sequence
c) In a Markov model the conditional probability of a particular sequence position depends on k alternate positions
d) A zero-order Markov model assumes each base occurs independently with a given probability
Explanation: In a Markov model, the conditional probability of a particular sequence position depends on the k previous positions, where k is the order of the Markov model. A zero-order Markov model assumes each base occurs independently with a given probability, which is often the case for noncoding sequences. A first-order Markov model assumes that the occurrence of a base depends on the base preceding it. A second-order model looks at the preceding two bases to determine which base follows, which is more characteristic of codons in a coding sequence.
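The meaning of the order k can be made concrete with a small Python sketch that estimates the conditional probability of each base given the k bases before it. The function name and the nested-dictionary return format are illustrative choices.

```python
from collections import Counter, defaultdict


def markov_probabilities(dna, k):
    """Estimate P(base | previous k bases) from one sequence.
    With k = 0 the context is the empty string, so the result is just
    the overall base frequencies (the zero-order model)."""
    dna = dna.upper()
    counts = defaultdict(Counter)
    for i in range(k, len(dna)):
        counts[dna[i - k:i]][dna[i]] += 1
    return {
        context: {base: n / sum(nexts.values()) for base, n in nexts.items()}
        for context, nexts in counts.items()
    }


print(markov_probabilities("AACA", 0))    # zero order: base frequencies
print(markov_probabilities("ACACAC", 1))  # first order: depends on one base
```

A second-order model is obtained with k = 2, in which case each context is a dinucleotide and the predicted base completes a trinucleotide.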
11. The use of Markov models in gene finding exploits the fact that oligonucleotide distributions in the coding regions are different from those for the noncoding regions.
Explanation: These can be represented with various orders of Markov models. Since a fixed-order Markov chain describes the probability of a particular nucleotide that depends on previous k nucleotides, the longer the oligomer unit, the more non-randomness can be described for the coding region. Therefore, the higher the order of a Markov model, the more accurately it can predict a gene.
12. Because a protein-encoding gene is composed of nucleotides in triplets as codons, more effective Markov models are built in sets of three nucleotides, describing nonrandom distributions of trimers or hexamers, and so on.
Explanation: The parameters of a Markov model have to be trained using a set of sequences with known gene locations. Once the parameters of the model are established, it can be used to compute the nonrandom distributions of trimers or hexamers in a new sequence to find regions that are compatible with the statistical profiles in the learning set.
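Training and then scoring a new sequence can be sketched in Python as below. This is a simplified fixed-order model compared against a noncoding model by a log-odds score; it does not model codon positions separately as real gene finders do. The function names, the pseudocount, and the toy training sequences are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

BASES = "ACGT"


def train(sequences, k, pseudocount=1.0):
    """Count (k-base context -> next base) over the training sequences and
    return a function prob(context, base). The pseudocount keeps unseen
    contexts from getting zero probability."""
    counts = defaultdict(Counter)
    for seq in sequences:
        seq = seq.upper()
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1

    def prob(context, base):
        nexts = counts.get(context, Counter())
        total = sum(nexts.values()) + pseudocount * len(BASES)
        return (nexts[base] + pseudocount) / total

    return prob


def log_odds(seq, coding, noncoding, k):
    """Sum of log(P_coding / P_noncoding) over the sequence; a positive
    score means the sequence fits the coding model better."""
    seq = seq.upper()
    return sum(
        math.log(coding(seq[i - k:i], seq[i]) / noncoding(seq[i - k:i], seq[i]))
        for i in range(k, len(seq))
    )


# Toy training data, hypothetical and far too small for real use.
coding_model = train(["ATGGCGGCGGCGGCG"], k=2)
noncoding_model = train(["ATATATATATATTTAA"], k=2)
print(log_odds("GCGGCGGCG", coding_model, noncoding_model, k=2))
```

With k = 2 each scored unit is a trimer and with k = 5 a hexamer, which corresponds to the trimer and hexamer distributions mentioned above.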
Sanfoundry Global Education & Learning Series – Bioinformatics.