This set of Bioinformatics Problems focuses on “Prediction Algorithms – 3”.
1. Which of the following is incorrect regarding the phylogenetic footprinting-based method?
a) It is possible to obtain promoter sequences for a particular gene through comparative analysis
b) The conservation from closely related organisms is both at the sequence level and at the level of organization of the elements
c) The conservation from closely related organisms is mostly at the sequence level
d) It has been observed that promoter and regulatory elements from closely related organisms such as human and mouse are highly conserved
Explanation: The identification of conserved noncoding DNA elements that serve crucial functional roles is referred to as phylogenetic footprinting; the elements are called phylogenetic footprints. This type of method can apply to both prokaryotic and eukaryotic sequences.
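As a rough illustration of the idea of a phylogenetic footprint, the sketch below scans two already-aligned orthologous noncoding sequences and reports windows of high identity. The function name, window size, and cutoff are assumptions made for this example; they are not taken from any specific footprinting program.

```python
def conserved_windows(aligned_a, aligned_b, window=10, min_identity=0.8):
    """Return (start, identity) for each alignment window meeting the cutoff.

    Both inputs are assumed to be rows of one pairwise alignment,
    with '-' marking gaps. Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(aligned_a) - window + 1):
        columns = zip(aligned_a[start:start + window],
                      aligned_b[start:start + window])
        matches = sum(1 for x, y in columns if x == y and x != "-")
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Hypothetical aligned upstream regions: the first 8 columns are identical.
print(conserved_windows("ACGTACGTAAAA", "ACGTACGTCCCC",
                        window=8, min_identity=1.0))
```

A real analysis would start from a proper alignment of the orthologous upstream regions and would use a cutoff tuned to the evolutionary distance of the species compared.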
2. A caveat of phylogenetic footprinting is to extract noncoding sequences upstream of corresponding genes and focus the comparison on this region only, which helps to prevent false positives.
Explanation: The predictive value of this method also depends on the quality of the subsequent sequence alignments, so advanced alignment programs can be used. Even more sophisticated expectation maximization (EM) and Gibbs sampling algorithms can be used to detect weakly conserved motifs.
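To make the Gibbs sampling idea concrete, here is a minimal site-sampler sketch. It assumes exactly one motif occurrence per sequence, a fixed motif width, DNA input containing only A, C, G, and T, at least two input sequences, and a uniform background. All function names and defaults are illustrative and do not reproduce any published tool.

```python
import random

BASES = "ACGT"


def profile_from(sites, pseudocount=1.0):
    """Column-wise base probabilities, smoothed with a pseudocount."""
    profile = []
    for col in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        profile.append({b: counts[b] / total for b in BASES})
    return profile


def site_probability(site, profile):
    """Probability of a candidate site under the profile."""
    p = 1.0
    for base, column in zip(site, profile):
        p *= column[base]
    return p


def gibbs_motif_search(seqs, width, iterations=200, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        held_out = rng.randrange(len(seqs))
        # Build the profile from every sequence except the held-out one.
        others = [s[p:p + width]
                  for i, (s, p) in enumerate(zip(seqs, starts))
                  if i != held_out]
        profile = profile_from(others)
        seq = seqs[held_out]
        weights = [site_probability(seq[p:p + width], profile)
                   for p in range(len(seq) - width + 1)]
        # Sample a new start in proportion to its probability.
        starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Because each new position is sampled rather than chosen greedily, different seeds can give different answers, which is why such samplers are usually run several times.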
3. Which of the following is untrue about ConSite?
a) It is a web server that finds putative promoter elements
b) It includes comparing two orthologous sequences
c) The program does not accept pre-computed alignment
d) The program accepts pre-computed alignment
Explanation: The user provides two individual sequences which are aligned by ConSite using a global alignment algorithm. Conserved regions are identified by calculating identity scores, which are then used to compare against a motif database of regulatory sites (TRANSFAC). High-scoring sequence segments upstream of genes are returned as putative regulatory elements.
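The global alignment step mentioned above can be sketched with a basic Needleman-Wunsch implementation. The scoring values (match, mismatch, linear gap penalty) are assumptions chosen for illustration. ConSite's actual parameters and implementation are not given here.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]
```

The aligned pair returned here is the kind of input on which identity scores for conserved regions would then be calculated.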
4. rVISTA uses two orthologous sequences as input and first identifies all putative regulatory motifs based on TRANSFAC matches.
Explanation: rVISTA is a cross-species comparison tool for promoter recognition. It aligns the two sequences using a local alignment strategy. The motifs that have the highest percent identity in the pairwise comparison are presented graphically as regulatory elements.
5. Which of the following is untrue about Bayes Aligner?
a) Posterior probability values, which are considered estimates of the true alignment, are calculated for each alignment.
b) The method generates a single best alignment
c) It aligns two sequences using a Bayesian algorithm which is a unique sequence alignment method
d) It is a web-based footprinting program
Explanation: Instead of returning a single best alignment, the method generates a distribution of a large number of alignments using a full range of scoring matrices and gap penalties. By studying the distribution, the alignment that has the highest likelihood score, which is in the extreme margin of the distribution, is chosen. Based on this unique alignment searching algorithm, weakly conserved motifs can be identified with high probability scores.
6. Which of the following is untrue about FootPrinter?
a) It is a web-based program for phylogenetic footprinting using multiple input sequences
b) The motifs from organisms spanning over the widest evolutionary distances are identified as promoter or regulatory motifs
c) The program performs multiple alignment of the input sequences to identify conserved motifs
d) The user does not necessarily provide a phylogenetic tree that defines the evolutionary relationship of the input sequences
Explanation: The user also needs to provide a phylogenetic tree that defines the evolutionary relationship of the input sequences. One may obtain the tree information from the “Tree of Life” web site, which archives known phylogenetic trees using ribosomal RNAs as gene markers. It identifies unusually well-conserved motifs across a set of orthologous sequences.
7. Which of the following is untrue?
a) MEME is the EM based program only for protein motif discovery
b) AlignACE is a web-based program using the Gibbs sampling algorithm to find common motifs
c) AlignACE is optimized for DNA sequence motif extraction
d) Melina stands for Motif Elucidator In Nucleotide sequence Assembly
Explanation: MEME is an EM-based program for protein motif discovery that can also be used for DNA motif finding; its use is similar to that for protein sequences. AlignACE automatically determines the optimal number and lengths of motifs from the input sequences. Melina is a web-based program that runs four individual motif-finding algorithms (MEME, GIBBS sampling, CONSENSUS, and Core search) simultaneously. The user compares the results to determine the consensus of motifs predicted by all four prediction methods.
8. Which of the following is untrue about the expression profiling-based method?
a) Genes with similar expression profiles are considered coexpressed, which can be identified through a clustering approach
b) This approach appears to be less effective for finding transcription factor binding sites.
c) An advanced alignment-independent profile construction method such as EM and Gibbs motif sampling is often used in finding the subtle sequence motifs
d) The basis for coexpression is thought to be due to common promoters and regulatory elements.
Explanation: This approach is essentially experimentally based and appears to be robust for finding transcription factor binding sites. The problem is that the regulatory elements of coexpressed genes are usually short and weak. Their patterns are difficult to discern using simple multiple sequence alignment approaches.
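The notion of "similar expression profiles" can be illustrated with a Pearson correlation between profiles. The sketch below simply thresholds pairwise correlations, which is a deliberate simplification of a real clustering method. The gene names, values, and threshold are hypothetical, and profiles are assumed to be non-constant so the correlation is defined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    names = sorted(profiles)
    pairs = []
    for i, g1 in enumerate(names):
        for g2 in names[i + 1:]:
            if pearson(profiles[g1], profiles[g2]) >= threshold:
                pairs.append((g1, g2))
    return pairs


# Hypothetical expression values across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
print(coexpressed_pairs(profiles))
```

The upstream regions of genes grouped this way would then be searched for shared motifs.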
9. INCLUSive is a suite of web-based tools designed to streamline the process of microarray data collection and sequence motif detection.
Explanation: The pipeline processes microarray data, automatically clusters genes according to their expression patterns, retrieves upstream sequences of coregulated genes, and detects motifs using a Gibbs sampling approach (Motif Sampler). To further avoid the problem of getting stuck in a local optimum, each sequence dataset is submitted to Motif Sampler ten times. The results may vary in each run. The results from the ten runs are compiled to derive consensus motifs.
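Compiling results from repeated stochastic runs can be sketched as a simple tally. The example assumes each run reports its motifs as plain strings and keeps those seen in at least a chosen fraction of runs. The function name, the cutoff, and the motif strings are hypothetical and do not describe how INCLUSive itself merges runs.

```python
from collections import Counter


def consensus_motifs(runs, min_fraction=0.6):
    """Return motifs reported in at least min_fraction of the runs, sorted."""
    counts = Counter()
    for motifs in runs:
        counts.update(set(motifs))  # count each motif once per run
    needed = min_fraction * len(runs)
    return sorted(m for m, c in counts.items() if c >= needed)


# Hypothetical output of five sampler runs on the same dataset.
runs = [
    ["TGACTCA", "CACGTG"],
    ["TGACTCA"],
    ["TGACTCA", "GATAAG"],
    ["CACGTG", "TGACTCA"],
    ["TGACTCA"],
]
print(consensus_motifs(runs))
```

Exact string matching is the simplest possible comparison; real motif outputs are usually position weight matrices, which would need a similarity measure rather than equality.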
10. Which of the following is untrue about PhyloCon?
a) It stands for Phylogenetic Consensus
b) It is used to identify regulatory motifs
c) It is a UNIX program that combines phylogenetic footprinting with gene expression profiling analysis
d) No conservation among orthologous genes and conservation among coregulated genes is a disadvantage
Explanation: This approach takes advantage of conservation among orthologous genes as well as conservation among coregulated genes. For each individual gene in a set of coregulated genes, multiple sequence homologs are aligned to derive profiles. Based on the gene expression data, profiles between coregulated genes are further compared to identify functionally conserved motifs among evolutionarily conserved motifs.