Bioinformatics Questions and Answers – Prediction Algorithms – 2

This set of Bioinformatics Assessment Questions and Answers focuses on “Prediction Algorithms – 2”.

1. Eukaryotic transcription initiation is only weakly dependent on transcription factors.
a) True
b) False

Answer: b
Explanation: Eukaryotic transcription initiation requires the cooperation of a large number of transcription factors. This cooperativity means that promoter regions tend to contain a high density of protein-binding sites, so finding a cluster of transcription factor binding sites raises the confidence in each individual binding site prediction.

2. CpGProD is a web-based program that predicts promoters containing a high density of CpG islands _______
a) in archaeal genomic sequences
b) in mammalian genomic sequences
c) in eukaryotic and bacterial genomic sequences
d) only in bacterial genomic sequences

Answer: b
Explanation: CpGProD targets mammalian genomic sequences. It calculates moving averages of GC% and of the CpG ratio (observed/expected) over a window of a certain size (usually 200 bp). When both values are above a certain threshold, the region is identified as a CpG island.
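
The sliding-window calculation described above can be sketched in Python. This is a minimal illustration, not CpGProD's actual code: the function name cpg_windows, the step size, and the thresholds (GC fraction of at least 0.5, observed/expected ratio of at least 0.6, which are commonly quoted CpG island criteria) are assumptions made for the example.

```python
def cpg_windows(seq, window=200, step=1, min_gc=0.5, min_ratio=0.6):
    """Return (start, gc_fraction, obs_exp_ratio) for every window that
    passes both thresholds. Thresholds are illustrative assumptions."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c = w.count("C")
        g = w.count("G")
        cpg = w.count("CG")  # observed CpG dinucleotides in the window
        gc = (c + g) / window
        # observed/expected CpG = (#CpG * window length) / (#C * #G)
        ratio = (cpg * window) / (c * g) if c and g else 0.0
        if gc >= min_gc and ratio >= min_ratio:
            hits.append((start, gc, ratio))
    return hits
```

Overlapping windows that pass both thresholds would normally be merged into a single island; that step is left out to keep the sketch short.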

3. Which of the following is incorrect regarding Eponine?
a) It is a web-based program that predicts transcription start sites
b) It is a web-based program that specifically predicts transposons and retrotransposons
c) The regulatory sites include the TATA box, the CCAAT box, and CpG islands
d) It is based on a series of pre-constructed PSSMs of several regulatory sites

Answer: b
Explanation: Eponine predicts transcription start sites, not transposons or retrotransposons. A query sequence from a mammalian source is scanned with the PSSMs. Sequence stretches that score highly against all the PSSMs, and that also match the expected spacing between the elements, are declared transcription start sites. A Bayesian method is also used in decision making.
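
PSSM scanning in general can be sketched as follows. This is a generic log-odds illustration, not Eponine's implementation: the helper names build_pssm and scan, the pseudocount, and the uniform background frequency are assumptions, and the spacing check and Bayesian step mentioned above are not modelled.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per column) from aligned,
    equal-length binding sites. Uniform background is an assumption."""
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm

def scan(seq, pssm):
    """Score every window of seq against the PSSM.
    Returns a list of (position, score); seq must contain only A, C, G, T."""
    width = len(pssm)
    return [
        (i, sum(col[seq[i + j]] for j, col in enumerate(pssm)))
        for i in range(len(seq) - width + 1)
    ]
```

For example, build a matrix from a few TATA-like sites and take the best-scoring window with max(scan(query, pssm), key=lambda hit: hit[1]).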

4. Which of the following is incorrect regarding Cluster-Buster?
a) It is an HMM-based web-based program
b) A query sequence is scanned with a window size of 1 kb for putative regulatory motifs using motif HMMs
c) It works by detecting a region with a high concentration of unknown transcription factor binding sites and regulatory motifs
d) It is designed to find clusters of regulatory binding sites

Answer: c
Explanation: It works by detecting a region with a high concentration of known transcription factor binding sites and regulatory motifs. If multiple motifs are detected within a window, a positive score is assigned to each motif found. The total score of the window is the sum of the motif scores minus a gap penalty that is proportional to the distances between motifs. If the score of a region is above a certain threshold, the region is predicted to contain a regulatory cluster.
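
The window scoring rule described above can be sketched as follows. This simplified illustration follows only the description in the explanation and leaves out the HMM machinery the program is based on; the names cluster_score and is_cluster, the gap penalty value, and the threshold are hypothetical.

```python
def cluster_score(hits, gap_penalty=0.01):
    """hits: list of (position, motif_score) found within one window.
    Score = sum of motif scores minus a penalty proportional to the
    distances between neighbouring motifs (penalty value is assumed)."""
    hits = sorted(hits)
    total = sum(score for _, score in hits)
    gaps = sum(b[0] - a[0] for a, b in zip(hits, hits[1:]))
    return total - gap_penalty * gaps

def is_cluster(hits, threshold=10.0, gap_penalty=0.01):
    """Predict a regulatory cluster when the window score exceeds
    a threshold (threshold value is assumed)."""
    return cluster_score(hits, gap_penalty) > threshold
```

With three motifs scoring 5, 4 and 6 at positions 100, 150 and 400, the motif scores sum to 15 and the gaps sum to 300 bp, so the window scores 15 - 0.01 * 300 = 12.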

5. Which of the following is incorrect regarding FirstEF?
a) It is a program that predicts promoters for bacterial DNA
b) It is a web-based program that predicts promoters for human DNA
c) It stands for First Exon Finder
d) It integrates gene prediction with promoter prediction

Answer: a
Explanation: FirstEF predicts promoters for human DNA, not bacterial DNA. It uses quadratic discriminant functions to calculate the probabilities of the first exon of a gene and its boundary sites. A segment of DNA (15 kb) upstream of the first exon is subsequently extracted for promoter prediction on the basis of scores for CpG islands.

6. McPromoter, a web-based program, uses a neural network to make promoter predictions.
a) True
b) False

Answer: a
Explanation: It has a unique promoter model containing six scoring segments. The program scans a window of 300 bases for the likelihoods of being in each of the coding, noncoding, and promoter regions.

7. Which of the following is not an input to the McPromoter neural network?
a) DNA bendability
b) Signals such as the TATA box
c) Signals such as the initiator box
d) Signals such as CpAA islands

Answer: d
Explanation: The input for the neural network includes parameters for sequence physical properties, such as DNA bendability, plus signals such as the TATA box, the initiator box, and CpG islands. Option d is wrong because the signal used is CpG islands, not "CpAA islands". The hidden layer combines all the features to derive an overall likelihood for a site being a promoter. Another unique feature is that McPromoter does not require any particular pattern to be present; what matters is the combination of all features. For instance, even if the TATA box score is very low, a promoter prediction can still be made if the other features score highly. The program is currently trained for Drosophila and human sequences.

8. TSSW is a web program that distinguishes promoter sequences from non-promoter sequences based on a combination of unique content information, such as hexamer/trimer frequencies, and signal information, such as the TATA box in the promoter region.
a) True
b) False

Answer: a
Explanation: TSSW combines unique content information, such as hexamer/trimer frequencies, with signal information, such as the TATA box in the promoter region. The values are fed to a linear discriminant function to separate true motifs from background noise.
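
The two ingredients named above, k-mer content features and a linear discriminant, can be sketched as follows. This is a generic illustration and not TSSW's model: the function names are hypothetical, and any feature names and weights passed in are made-up placeholders rather than trained parameters.

```python
def hexamer_frequencies(seq, k=6):
    """Relative frequency of every overlapping k-mer in seq (k=6: hexamers)."""
    seq = seq.upper()
    n = len(seq) - k + 1
    if n <= 0:
        return {}
    counts = {}
    for i in range(n):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return {kmer: count / n for kmer, count in counts.items()}

def linear_discriminant(features, weights, bias=0.0):
    """Weighted sum of feature values plus a bias. In this sketch a positive
    result is read as 'promoter-like'; the weights are placeholders."""
    return sum(weights[name] * value for name, value in features.items()) + bias
```

A call such as linear_discriminant({"content": 0.8, "tata": 0.5}, {"content": 2.0, "tata": 1.0}, bias=-1.5) returns a positive value, which this sketch would classify as promoter-like.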

9. Which of the following is incorrect regarding CONPRO?
a) It is a web-based program that uses a consensus method
b) It is used to identify promoter elements for human DNA
c) cDNA does not play a role in prediction
d) The program uses the information to search the human genome database for the position of the gene

Answer: c
Explanation: cDNA is the starting point for CONPRO. To use the program, a user supplies the transcript sequence of a gene (cDNA), which the program uses to locate the gene in the human genome database. It then uses the GENSCAN program to predict 5’ untranslated exons in the upstream region. Once the 5’-most exon is located, a further upstream region (1.5 kb) is used for promoter prediction, which relies on a combination of five promoter prediction programs: TSSG, TSSW, NNPP, PROSCAN, and PromFD.

10. In CONPRO, for each program, the highest score prediction is taken as the promoter in the region.
a) True
b) False

Answer: a
Explanation: If three predictions fall within a 100-bp region, this is considered a consensus prediction. If no three-way consensus is achieved, the TSSG and PromFD predictions are taken. Because no coding sequence is used in prediction, specificity is improved relative to each individual program.
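
The consensus rule described above can be sketched as follows. It assumes each program's top-scoring prediction has already been reduced to a single coordinate. The function name conpro_consensus, the input dictionary, and the returned (label, predictions) pair are hypothetical; the 100-bp span, the three-way requirement, and the TSSG/PromFD fallback come from the explanation.

```python
def conpro_consensus(predictions, span=100, fallback=("TSSG", "PromFD")):
    """predictions: dict mapping program name -> position of its
    highest-scoring promoter prediction.
    Returns ("consensus", {...}) if at least three predictions fall
    within `span` bp, otherwise ("fallback", {...}) with the fallback
    programs' predictions."""
    ranked = sorted(predictions.items(), key=lambda item: item[1])
    best = []
    for i, (_, start) in enumerate(ranked):
        # all predictions lying within `span` bp downstream of this one
        group = [(prog, pos) for prog, pos in ranked[i:] if pos - start <= span]
        if len(group) > len(best):
            best = group
    if len(best) >= 3:
        return "consensus", dict(best)
    return "fallback", {p: predictions[p] for p in fallback if p in predictions}
```

For instance, predictions at 1200, 1250 and 1290 from three programs span 90 bp and form a consensus, while five predictions spaced hundreds of base pairs apart fall back to TSSG and PromFD.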

Sanfoundry Global Education & Learning Series – Bioinformatics.