This set of Tough Bioinformatics Questions and Answers focuses on “Sequence Assembly and Gene Identification – 2”.
1. In the program COGNITOR each protein in the proteome is used as a query of a database of protein clusters.
Explanation: The database was made by performing an all-by-all genome comparison across a spectrum of prokaryotic organisms and a portion of the yeast proteome. Orthologous pairs of sequences were then merged into clusters of orthologous groups (COGs) spanning multiple proteomes.
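The merging step described above can be pictured with a small sketch. It is a deliberate simplification: it joins any proteins linked by an orthologous pair into one cluster (single linkage via union-find), whereas the real COG procedure applies stricter rules. The function name and protein identifiers are hypothetical.

```python
from collections import defaultdict

def merge_pairs(pairs):
    """Group proteins into clusters by merging pairs that share a member.

    Simplified single-linkage illustration, not the actual COG algorithm.
    """
    parent = {}

    def find(x):
        # Locate the cluster representative, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for protein in parent:
        clusters[find(protein)].add(protein)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical orthologous pairs from three proteomes.
pairs = [
    ("ecoli_P1", "bsub_P7"),
    ("bsub_P7", "yeast_P3"),
    ("ecoli_P2", "hinf_P9"),
]
clusters = merge_pairs(pairs)
print(clusters)
```

A query protein would then be assigned to whichever cluster contains its best-matching members.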
2. WU-BLAST produces P scores and BLAST (NCBI) produces E scores where _____
a) E = ln (1 + P^2)
b) E = ln (1 – P^2)
c) E = ln (1 + P)
d) E = –ln (1 – P)
Explanation: Since P = 1 – e^(–E), it follows that E = –ln (1 – P). For P values less than 0.05, E is approximately equal to P. The choice of a cutoff score of < 10^-20 is a conservative one for the identification of orthologs that should have a similar domain structure.
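The relation between the two scores can be checked numerically. This is a minimal sketch based on the standard Poisson relation P = 1 – e^(–E); the function names are illustrative and not part of either BLAST program.

```python
import math

def p_from_e(e_value):
    """P = 1 - exp(-E), computed with expm1 for accuracy at small E."""
    return -math.expm1(-e_value)

def e_from_p(p_value):
    """E = -ln(1 - P), valid for 0 <= P < 1; log1p keeps small P accurate."""
    if not 0 <= p_value < 1:
        raise ValueError("P must satisfy 0 <= P < 1")
    return -math.log1p(-p_value)

# E and P diverge for large values and converge as they become small.
for p in (0.5, 0.05, 1e-5, 1e-20):
    print(f"P = {p:g}  ->  E = {e_from_p(p):.6g}")
```

At P = 0.05 the two scores already differ by less than 3 percent, and at the 10^-20 cutoff they are indistinguishable.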
3. In proteolysis and fragment sequencing, protein spots may be excised from a two-dimensional protein gel and subjected to a combination of amino acid sequencing and cleavage analyses using mass spectrometry and high-pressure liquid chromatography.
Explanation: Genome regions that encode these sequences can then be identified and the corresponding gene located. A similar method may be used to identify the gene that encodes a particular protein that has been purified and characterized in the laboratory.
4. In protein 2D gel electrophoresis, individual proteins produced by the genome can be separated to _____ by this method, and specific ones identified by various _______
a) smaller extent, biochemical and immunological tests
b) a large extent, biochemical and immunological tests
c) a large extent, biochemical tests only
d) smaller extent, purely mechanical tests
Explanation: Moreover, changes in the levels of proteins in response to an environmental signal can be monitored in much the same way as a microarray analysis is performed. Microarrays detect only mRNA transcripts, whereas a two-dimensional gel protein analysis detects translation products, thus revealing an additional level of regulation.
5. In metabolic pathways and regulation, as genes are identified in a new genome sequence, some will be found that are known to act sequentially in a metabolic pathway or to have a known role in gene regulation in other organisms.
Explanation: From this information, the metabolic pathways and metabolic activities of the organism will become apparent. In some cases, the apparent absence of a gene in a well-represented pathway may lead to a more detailed search for the gene. Clustering of genes in the pathway on the genome of a related organism can provide a further hint as to where the gene may be located.
6. _______ of the Drosophila sequence is composed of TEs and _____ is heterochromatic regions that do not include genes.
a) one-fourth, one-third
b) one-fifth, one-fourth
c) one-sixth, one-eighth
d) one-sixth, one-third
Explanation: Hence, in the euchromatic regions, the gene density in the Drosophila genome is one gene per 9 kb. Although the number of predicted genes in Drosophila is smaller than that of the other genomes, the amount of functional diversity, as evidenced by protein family representation, is similar.
7. Yeast is about ______ ______ compact than E. coli.
a) fivefold, less
b) threefold, more
c) twofold, less
d) twofold, more
Explanation: Of the remaining genomes, C. elegans and A. thaliana have approximately the same density of genes (one gene per 6 kb). Drosophila is the least dense in this comparison (one gene per 14 kb).
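The two Drosophila figures quoted in this set (one gene per 14 kb overall and one gene per 9 kb in euchromatin) can be reconciled with a little arithmetic. The sketch below assumes that one-third of the sequence is gene-free heterochromatin, so that all genes fall in the remaining two-thirds. The function name and that fraction are illustrative assumptions.

```python
def euchromatic_kb_per_gene(overall_kb_per_gene, gene_free_fraction):
    """Rescale an overall gene spacing to the gene-containing part of a genome.

    If a fraction of the sequence carries no genes, the same genes are
    spread over only (1 - fraction) of the sequence.
    """
    if not 0 <= gene_free_fraction < 1:
        raise ValueError("gene_free_fraction must satisfy 0 <= f < 1")
    return overall_kb_per_gene * (1 - gene_free_fraction)

# Assumed: one gene per 14 kb overall, one-third gene-free heterochromatin.
drosophila = euchromatic_kb_per_gene(14, 1 / 3)
print(round(drosophila, 1))  # about 9.3 kb per gene
```

The result rounds to the one gene per 9 kb quoted for the euchromatic regions.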
8. The ________ are _______ by genetic structure to retroviruses.
a) STR retrotransposons, related
b) LTR retrotransposons, related
c) STR transposons, related
d) LTR retrotransposons, not related
Explanation: Class I elements encode a reverse transcriptase and use RNA-mediated mechanisms of transposition. There are three main subclasses of these TEs: the long terminal repeat (LTR) retrotransposons, retroposons, and retrovirus-like elements with LTRs.
9. Which of the given statements is incorrect?
a) As in an all-by-all protein comparison within a proteome, a matrix of alignment scores with E values is made, and the most closely related sequences in the two organisms are identified
b) To perform a between-proteome analysis, proteome databases are made for the known and predicted genes of two or more genomes
c) Each protein of one proteome is selected in turn as a query of the proteome of another organism or the combined proteome of a group of organisms
d) Each protein of one proteome is selected in turn as a query of the proteome of another single organism only
Explanation: This analysis can predict orthologs, that is, proteins that have an identical function attributable to descent of the respective genes from a common ancestor.
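One common way to pick the "most closely related sequences" out of such a matrix is the reciprocal best hit heuristic, sketched below. It is named here as an illustration only, since the text above does not prescribe a specific rule. The E values and protein identifiers are hypothetical.

```python
def best_hits(evalues):
    """Map each query to the subject with the lowest E value."""
    best = {}
    for (query, subject), e_value in evalues.items():
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical E values: proteome A searched against B, and B against A.
a_vs_b = {
    ("A1", "B1"): 1e-50,
    ("A1", "B2"): 1e-5,
    ("A2", "B2"): 1e-30,
    ("A3", "B1"): 1e-10,
}
b_vs_a = {
    ("B1", "A1"): 1e-48,
    ("B2", "A2"): 1e-29,
    ("B2", "A1"): 1e-4,
}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here A3 also matches B1, but B1's best match is A1, so A3 is not reported as a candidate ortholog.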
10. The higher the E value, the more significant the alignment between a pair of matching sequences.
Explanation: The lower the E value, the more significant the alignment between a pair of matching sequences. The E value of an alignment score is the number of alignments with a score at least as good as the one found that would be expected by chance between random or unrelated sequences in a search of a database of the same size.
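As a rough illustration of why a lower E value is more significant, the sketch below uses the standard Karlin-Altschul relation E = m × n × 2^(–S'), where S' is the bit score, m the query length and n the database length. The function name and the example lengths are hypothetical, and real BLAST implementations use edge-corrected effective lengths, so reported values will differ.

```python
def expected_hits(bit_score, query_len, db_len):
    """Expected number of chance alignments scoring at least bit_score.

    Uses E = m * n * 2**(-S') without edge-effect corrections.
    """
    return query_len * db_len * 2.0 ** (-bit_score)

# Hypothetical search: a 300-residue query against a 10^9-residue database.
weak = expected_hits(30, 300, 1_000_000_000)
strong = expected_hits(80, 300, 1_000_000_000)
bigger_db = expected_hits(80, 300, 2_000_000_000)

print(f"bit score 30: E = {weak:.3g}")
print(f"bit score 80: E = {strong:.3g}")
print(f"bit score 80, database doubled: E = {bigger_db:.3g}")
```

A higher score gives a much smaller E value, and doubling the database doubles E, which is why an E value is only meaningful for a database of a given size.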
Sanfoundry Global Education & Learning Series – Bioinformatics.