This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Prediction Algorithms – 1”.
1. The ab initio type of algorithm predicts prokaryotic and eukaryotic promoters and regulatory elements based on characteristic sequence patterns for promoters and regulatory elements.
Explanation: Some ab initio programs are signal based, relying on characteristic promoter sequences such as the TATA box. Other programs rely on content information such as hexamer frequencies.
2. The advantage of the ab initio method is that the sequence can be applied as such without having to obtain experimental information.
Explanation: The limitation is the need for training, which makes the prediction programs species specific. In addition, this type of method has difficulty discovering new, unknown motifs.
3. Which of the following is incorrect regarding the ab initio approaches?
a) The conventional approach to detecting a promoter or regulatory site is through matching a consensus sequence pattern represented by regular expressions
b) The conventional approach to detecting a promoter or regulatory site is through matching a position-specific scoring matrix constructed from well-characterized binding sites
c) The consensus sequences or the matrices are relatively short, covering 6 to 10 bases
d) The consensus sequences or the matrices are relatively large, covering 700 to 1000 bases
Explanation: To determine whether a query sequence matches a weight matrix, the sequence is scanned through the matrix. Scores of matches and mismatches at all matrix positions are summed up to give a log odds score, which is then evaluated for statistical significance. This simple approach, however, often has difficulty differentiating true promoters from random sequence matches and generates high rates of false positives as a result.
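The scanning step described above can be sketched as follows. This is a minimal illustration, not the code of any particular program: the toy binding sites, the uniform background of 0.25 per base, the pseudocount, and the function names `build_log_odds` and `scan` are all assumptions made for the example. It only computes the summed log odds score per window and does not evaluate statistical significance.

```python
import math

def build_log_odds(sites, background=0.25, pseudocount=1.0):
    """Build a log-odds matrix from aligned, equal-length binding sites."""
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases do not give log(0).
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {b: math.log2((counts[b] / total) / background) for b in "ACGT"}
        )
    return matrix

def scan(sequence, matrix):
    """Slide the matrix along the sequence and return (start, score) pairs."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # The score of a window is the sum of the per-position log-odds values.
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits

# Toy, made-up binding sites resembling a -10 box (assumption for illustration).
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
matrix = build_log_odds(sites)
hits = scan("GGCTATAATGCC", matrix)
best = max(hits, key=lambda h: h[1])
```

In this toy run the best-scoring window is the one that starts at the embedded TATAAT. With matrices this short, random sequence can also reach high scores, which is the false-positive problem the explanation mentions.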
4. To improve the specificity of prediction, some algorithms selectively ______ coding regions and focus on the upstream regions ______, which are most likely to contain promoters. In that sense, promoter prediction and gene prediction are coupled.
a) include, (0.5 to 2.0 kb) only
b) include, (0.5 to 2.0 Mb) only
c) exclude, (0.5 to 2.0 Mb) only
d) exclude, (0.5 to 2.0 kb) only
Explanation: To better discriminate true motifs from background noise, a new generation of algorithms has been developed that take into account the higher order correlation of multiple subtle features by using discriminant functions, neural networks, or hidden Markov models (HMMs) that are capable of incorporating more neighboring sequence information.
5. Operon prediction is less important in prokaryotic promoter prediction.
Explanation: One of the unique aspects of prokaryotic promoter prediction is the determination of operon structures, because genes within an operon share a common promoter located upstream of the first gene of the operon. Hence, operon prediction is key to prokaryotic promoter prediction.
6. Once an operon structure is known, ______ for the presence of a promoter and regulatory elements, _____ in the operon do not possess such DNA elements.
a) only the first gene is predicted, whereas other genes
b) only the first hundred genes are predicted, whereas next few genes
c) only first two genes are predicted, whereas next few genes
d) only first ten genes are predicted, whereas next few genes
Explanation: Only the first gene is predicted for the presence of a promoter and regulatory elements, whereas other genes in the operon do not possess such DNA elements. There are a number of methods available for prokaryotic operon prediction. The most accurate is a set of simple rules developed.
7. Which of the following is correct regarding the method for prokaryotic operon prediction?
a) It relies on two kinds of information: gene orientation and intergenic distances of a pair of genes of interest and conserved linkage of the genes based on comparative genomic analysis
b) It relies only on the gene orientation and intergenic distances of a pair of genes of interest
c) It relies only on the conserved linkage of the genes based on comparative genomic analysis
d) The prediction cannot be done manually using the rules
Explanation: A scoring scheme is used to assign operons different levels of confidence. This method is claimed to produce accurate identification of an operon structure, which in turn facilitates promoter prediction. The prediction can be done manually using the rules. For historical reasons, the few dedicated programs for prokaryotic promoter prediction do not apply these rules. The most frequently used program is BPROM.
8. Which of the following is incorrect regarding BPROM?
a) It is a web-based program for prediction of bacterial promoters
b) It is a web-based program only for prediction of eukaryotic promoters
c) It uses a linear discriminant function
d) The linear discriminant function is combined with signal and content information
Explanation: The linear discriminant function is combined with signal and content information such as the consensus promoter sequence and the oligonucleotide composition of the promoter sites. The program first predicts bacterial operon structures in a given sequence, using an intergenic distance of 100 bp as the basis for deciding whether genes belong to the same operon.
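The distance-based grouping step can be sketched as below. This is not BPROM's actual code. The gene tuples, the function name `group_operons`, the same-strand requirement, and the choice to treat a gap of exactly 100 bp as still inside the operon are assumptions made for illustration.

```python
def group_operons(genes, max_gap=100):
    """Group genes into putative operons.

    genes: list of (name, start, end, strand) tuples sorted by start,
    with 1-based inclusive coordinates (an assumption of this sketch).
    Adjacent genes are placed in the same operon when they lie on the same
    strand and the intergenic distance is at most max_gap bases.
    """
    operons = []
    for gene in genes:
        name, start, end, strand = gene
        if operons:
            prev = operons[-1][-1]
            same_strand = prev[3] == strand
            gap = start - prev[2] - 1  # bases strictly between the two genes
            if same_strand and gap <= max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return operons

# Hypothetical gene coordinates, invented for the example.
genes = [
    ("geneA", 100, 900, "+"),
    ("geneB", 950, 1800, "+"),
    ("geneC", 2500, 3300, "+"),
    ("geneD", 3350, 4000, "-"),
]
operons = group_operons(genes)
names = [[g[0] for g in op] for op in operons]
```

Here geneA and geneB are grouped together, geneC starts a new operon because the gap is too large, and geneD is kept separate because it lies on the opposite strand.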
9. In BPROM, once the operons are assigned, the program is able to predict putative promoter sequences.
Explanation: Most bacterial promoters are located within 200 bp of the protein coding region. Hence, the program is most effective when about 200 bp of sequence upstream of the first gene of an operon is supplied as input, which increases specificity.
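Preparing such an input can be sketched as a small helper that cuts out the region upstream of a gene. The function name `upstream_region`, the 0-based end-exclusive coordinates, and the toy genome string are assumptions of this example and are not part of BPROM.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def upstream_region(genome, start, end, strand, length=200):
    """Return up to `length` bases upstream of a gene.

    start and end are 0-based, end-exclusive positions on the forward strand.
    For a gene on the minus strand, the upstream region lies after `end` on
    the forward strand and is returned as its reverse complement.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    region = genome[end:end + length]
    return region.translate(COMPLEMENT)[::-1]

# Toy genome, invented for the example.
genome = "A" * 10 + "CCCC" + "GGG" + "T" * 10
plus_upstream = upstream_region(genome, 10, 14, "+", length=5)
minus_upstream = upstream_region(genome, 10, 14, "-", length=5)
```

The region is truncated rather than padded when the gene sits closer than `length` bases to either end of the sequence.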
10. Which of the following is incorrect regarding FindTerm?
a) It is a program for searching bacterial ρ-independent termination signals located at the end of operons
b) It is a program for searching bacterial ρ-dependent termination signals located within the operons
c) The predictions are made based on matching of known profiles of the termination signals combined with energy calculations
d) It is available from the same site as FGENES and BPROM
Explanation: The predictions are made based on matching of known profiles of the termination signals combined with energy calculations for the derived RNA secondary structures for the putative hairpin-loop structure. The sequence region that scores best in features and energy terms is chosen as the prediction. The information can sometimes be useful in defining an operon.
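The hairpin-loop idea behind ρ-independent terminators can be sketched in a very simplified form. The code below is not FindTerm's method: it uses no profiles and no energy calculation. It only looks for a perfectly complementary stem followed by a run of T residues. The stem length, loop range, T-run length, and function names are assumptions chosen for the example.

```python
COMP = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMP[b] for b in reversed(seq))

def find_hairpin_terminators(seq, stem=5, min_loop=3, max_loop=8, t_run=4):
    """Return (start, loop_length) for each perfect stem followed by a T run.

    A hit means: seq[start:start+stem] pairs perfectly with a second arm
    located `loop_length` bases downstream, and that second arm is directly
    followed by `t_run` consecutive T bases.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop - t_run + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            tail = seq[j + stem:j + stem + t_run]
            if len(right) < stem or len(tail) < t_run:
                break  # ran past the end of the sequence
            if right == reverse_complement(left) and tail == "T" * t_run:
                hits.append((i, loop))
    return hits

# Toy sequence: stem GCCGC, loop TTAA, stem GCGGC, then a T run.
toy = "CC" + "GCCGC" + "TTAA" + "GCGGC" + "TTTTTT" + "AA"
terminators = find_hairpin_terminators(toy)
```

A real predictor would tolerate mismatches in the stem and rank candidates by the folding energy of the derived RNA structure, as the explanation describes.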
11. Which of the following is incorrect regarding promoter prediction for eukaryotes?
a) The consensus patterns are only derived from bioinformatics studies
b) The experimentally determined DNA binding sites are compiled into profiles and stored in a database for scanning an unknown sequence to find similar conserved patterns
c) The consensus patterns are derived from experimentally determined DNA binding sites
d) The ab initio method for predicting eukaryotic promoters and regulatory elements relies on searching the input sequences for matching of consensus patterns of known promoters and regulatory elements
Explanation: This approach tends to generate a very high rate of false positives owing to nonspecific matches with the short sequence patterns. Furthermore, because of the high variability of transcription factor binding sites, simple sequence matching often misses true promoter sites, creating false negatives.
12. To increase the specificity of prediction, a unique feature of eukaryotic promoters is employed, which is the presence of CpG islands.
Explanation: It is known that many vertebrate genes are characterized by a high density of CG dinucleotides near the promoter region overlapping the transcription start site. By identifying the CpG islands, promoters can be traced to the region immediately upstream of the islands.
By combining CpG islands and other promoter signals, the accuracy of prediction can be improved. Several programs have been developed based on the combined features to predict the transcription start sites in particular.
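A CpG island check can be sketched with two simple statistics: GC content and the ratio of observed to expected CG dinucleotides. The thresholds below (at least 200 bp, GC content above 50%, observed/expected ratio above 0.6) are commonly cited criteria and are not taken from the text above. The function names `cpg_stats` and `looks_like_cpg_island` are hypothetical.

```python
def cpg_stats(seq):
    """Return (gc_content, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return gc_content, ratio

def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed length, GC content, and obs/exp thresholds."""
    if len(seq) < min_length:
        return False
    gc_content, ratio = cpg_stats(seq)
    return gc_content > min_gc and ratio > min_ratio

# Toy sequences, invented for the example.
rich = "CG" * 100
poor = "AT" * 100
rich_stats = cpg_stats("CGCGCGCG")
```

In practice the test is applied in a sliding window along the genomic sequence, and windows that pass are merged into candidate islands.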