This set of Bioinformatics online quiz questions focuses on “Protein Secondary Structure Prediction for Globular Proteins”.
1. The formation of _____ is determined by ______ interactions, whereas the formation of ____ is strongly influenced by ______ interactions.
a) α-helices, long-range, α-helices, short-range
b) α-helices, long-range, β-strands, short-range
c) α-helices, short-range, β-strands, long-range
d) β-strands, short-range, β-strands, long-range
Explanation: Predicting protein secondary structure with high accuracy is not a trivial task, and it remained a very difficult problem for decades. This is because protein secondary structure elements are context dependent. The formation of α-helices is determined by short-range interactions, whereas the formation of β-strands is strongly influenced by long-range interactions. Predicting long-range interactions is theoretically difficult. After more than three decades of effort, prediction accuracies have improved only from about 50% to about 75%.
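The accuracy figures above can be made concrete with a small sketch. It assumes that accuracy means the fraction of residues assigned the correct state (a per-residue measure often called Q3); the explanation does not say which measure was used, and the function name and the two state strings are hypothetical.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one residue is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical toy strings: H = helix, E = strand, C = coil.
predicted = "HHHECCCC"
observed = "HHEECCCH"
accuracy = per_residue_accuracy(predicted, observed)  # 6 of 8 positions agree
```

Here six of the eight positions agree, so the toy prediction scores 0.75, which is the level the explanation gives for the better methods.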
2. The secondary structure prediction methods can be either ab initio based, which make use of single sequence information only, or homology based, which make use of multiple sequence alignment information.
Explanation: The ab initio methods, which belong to early generation methods, predict secondary structures based on statistical calculations of the residues of a single query sequence. The homology-based methods do not rely on statistics of residues of a single sequence, but on common secondary structural patterns conserved among multiple homologous sequences.
3. Which of the following is untrue regarding Ab Initio–Based Methods?
a) This type of method predicts the secondary structure based on a single query sequence
b) This type of method predicts the secondary structure based on multiple query sequences
c) It measures the relative propensity of each amino acid belonging to a certain secondary structure element
d) The propensity scores are derived from known crystal structures
Explanation: Examples of ab initio prediction are the Chou–Fasman and Garnier, Osguthorpe, Robson (GOR) methods. The ab initio methods were developed in the 1970s when protein structural data were very limited. The statistics derived from the limited data sets can therefore be rather inaccurate. However, the methods are simple enough that they are often used to illustrate the basics of secondary structure prediction.
4. The Chou–Fasman algorithm determines the propensity or intrinsic tendency of each residue to be in the helix, strand, and β-turn conformation using observed frequencies found in protein crystal structures.
Explanation: It determines the propensity or intrinsic tendency of each residue using observed frequencies found in protein crystal structures (conformational values for coils are not considered). For example, it is known that alanine, glutamic acid, and methionine are commonly found in α-helices, whereas glycine and proline are much less likely to be found in such structures.
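As a rough illustration of deriving a propensity from observed frequencies, the sketch below divides the fraction of a residue's occurrences found in a given state by the fraction of all residues found in that state. This ratio is one common way to define a propensity and is an assumption here, as are the function name and the toy sequences and state strings, which are not real crystal structure data.

```python
from collections import Counter


def propensities(sequences, states, target="H"):
    """Return {residue: propensity} for one conformational state.

    propensity = (share of the residue's occurrences in the target state)
                 / (share of all residues in the target state)
    """
    total = Counter()
    in_state = Counter()
    for seq, ss in zip(sequences, states):
        if len(seq) != len(ss):
            raise ValueError("sequence and state string must have equal length")
        for aa, s in zip(seq, ss):
            total[aa] += 1
            if s == target:
                in_state[aa] += 1
    n_all = sum(total.values())
    n_state = sum(in_state.values())
    if n_state == 0:
        return {aa: 0.0 for aa in total}
    background = n_state / n_all
    return {aa: (in_state[aa] / total[aa]) / background for aa in total}


# Hypothetical toy data: H = helix, C = anything else.
toy_sequences = ["AEMGPA", "GAEP"]
toy_states = ["HHHCCH", "CHHC"]
helix_propensity = propensities(toy_sequences, toy_states, target="H")
```

With this toy data, alanine, glutamic acid, and methionine get a helix propensity above 1, while glycine and proline get 0, which mirrors the tendency described in the explanation.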
5. The GOR method is based on the “propensity” of each residue to be in one of the two conformational states, helix (H) and strand (E).
Explanation: The GOR method is based on the “propensity” of each residue to be in one of four conformational states: helix (H), strand (E), turn (T), and coil (C). However, instead of using the propensity value from a single residue to predict a conformational state, it takes short-range interactions of neighboring residues into account.
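The idea of letting neighboring residues contribute to the state of a central residue can be sketched as below. This is not the GOR implementation: the score table `info` is a hypothetical stand-in for statistics derived from known structures, the function name is made up, and the default half-window of 8 (a 17-residue window) is an assumption that the explanation does not state.

```python
def gor_like_predict(seq, info, half_window=8, states=("H", "E", "T", "C")):
    """Assign each residue the state with the highest summed window score.

    info maps (state, offset, residue) to a score; missing entries count as 0.
    """
    prediction = []
    for i in range(len(seq)):
        scores = {}
        for state in states:
            total = 0.0
            for offset in range(-half_window, half_window + 1):
                j = i + offset
                if 0 <= j < len(seq):  # ignore positions beyond the sequence ends
                    total += info.get((state, offset, seq[j]), 0.0)
            scores[state] = total
        prediction.append(max(states, key=lambda s: scores[s]))
    return "".join(prediction)


# Hypothetical toy scores: A favors helix, V favors strand, neighbors count half.
toy_info = {
    ("H", 0, "A"): 1.0, ("H", -1, "A"): 0.5, ("H", 1, "A"): 0.5,
    ("E", 0, "V"): 1.0, ("E", -1, "V"): 0.5, ("E", 1, "V"): 0.5,
}
toy_prediction = gor_like_predict("AAVV", toy_info, half_window=1)
```

With these toy scores the sequence "AAVV" is predicted as "HHEE". The difference from a single-residue propensity lookup is that each position's score is a sum over its neighbors as well as itself.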
6. Which of the following is untrue regarding Chou–Fasman and GOR methods?
a) Both are the first-generation methods
b) They were developed in the 1970s
c) They suffer from the fact that the prediction rules are somewhat arbitrary
d) They are based on single sequence statistics with clear relation to known protein-folding theories
Explanation: They are based on single sequence statistics without clear relation to known protein-folding theories. The predictions rely solely on local sequence information and fail to take long-range interactions into account. A Chou–Fasman–based prediction does not even consider the short-range environmental information.
7. Which of the following is untrue regarding Homology-Based Methods?
a) The third generation of algorithms was developed in the late 1990s
b) They were developed by making use of evolutionary information
c) This type of method uses the ab initio secondary structure prediction of individual sequences only
d) This type of method combines the ab initio secondary structure prediction of individual sequences and alignment information from multiple homologous sequences (>35% identity)
Explanation: The idea behind this approach is that close protein homologs should adopt the same secondary and tertiary structure. When each individual sequence is predicted for secondary structure using a method similar to the GOR method, errors and variations may occur. However, evolutionary conservation dictates that there should be no major variations for their secondary structure elements.
8. Because residues in the same aligned position are assumed to have the same secondary structure, any inconsistencies or errors in prediction of individual sequences can be corrected using a majority rule.
Explanation: Aligning multiple sequences reveals information on positional conservation. This homology-based method has helped improve the prediction accuracy by another 10% over the second-generation methods.
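A minimal sketch of the majority rule follows. It assumes the per-sequence predictions are already aligned, so that all strings have the same length and "-" marks a gap. The function name and the toy predictions are hypothetical.

```python
from collections import Counter


def consensus_by_majority(aligned_predictions):
    """Return the most frequent state in each aligned column."""
    lengths = {len(p) for p in aligned_predictions}
    if len(lengths) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    consensus = []
    for column in zip(*aligned_predictions):
        votes = Counter(s for s in column if s != "-")  # gaps do not vote
        consensus.append(votes.most_common(1)[0][0] if votes else "-")
    return "".join(consensus)


# Hypothetical predictions for three aligned homologs.
toy_predictions = ["HHHEEC", "HHCEEC", "HHHECC"]
toy_consensus = consensus_by_majority(toy_predictions)
```

In the toy input, the third column of the second sequence (C) and the fifth column of the third sequence (C) are outvoted, giving the consensus "HHHEEC".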
9. Which of the following is untrue regarding Prediction with Neural Networks?
a) The third-generation prediction algorithms extensively apply sophisticated neural networks
b) It is used to analyze substitution patterns in multiple sequence alignments
c) It is not a machine learning process
d) It requires a structure of multiple layers of interconnected variables or nodes
Explanation: A neural network is a machine learning process that requires a structure of multiple layers of interconnected variables or nodes. In secondary structure prediction, the input is an amino acid sequence and the output is the probability that a residue adopts a particular structure.
10. Which of the following is untrue regarding Prediction with Neural Networks?
a) It has to be first trained by sequences with known structures
b) Between input and output are many connected hidden layers
c) Between the connected hidden layers the machine learning takes place to adjust the mathematical weights of internal connections
d) It doesn’t have to be first trained by sequences with known structures
Explanation: The neural network has to be first trained by sequences with known structures so it can recognize the amino acid patterns and their relationships with known structures. During this process, the weight functions in hidden layers are optimized so they can relate input to output correctly. When the sufficiently trained network processes an unknown sequence, it applies the rules learned in training to recognize particular structural patterns.
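The weight adjustment described above can be illustrated with a deliberately reduced sketch: a single trainable unit with no hidden layers, trained by gradient descent on labeled residues. This is far simpler than the multilayer networks the explanation refers to and shows only the train-then-predict idea. The reduced alphabet, the labels, the learning rate, and all function names are assumptions made for the example.

```python
import math

ALPHABET = "AEMGP"  # hypothetical reduced alphabet for the toy example


def one_hot(residue):
    """Encode a residue as a vector with a single 1.0 at its alphabet index."""
    return [1.0 if residue == aa else 0.0 for aa in ALPHABET]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def helix_probability(residue, weights, bias):
    x = one_hot(residue)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, epochs=200, learning_rate=0.5):
    """Adjust weights so the output matches the known labels (1 = helix)."""
    weights = [0.0] * len(ALPHABET)
    bias = 0.0
    for _ in range(epochs):
        for residue, label in samples:
            x = one_hot(residue)
            error = helix_probability(residue, weights, bias) - label
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical training set: residues labeled from "known structures".
training = [("A", 1), ("E", 1), ("M", 1), ("G", 0), ("P", 0)]
weights, bias = train(training)
p_ala = helix_probability("A", weights, bias)
p_gly = helix_probability("G", weights, bias)
```

After training, the unit gives alanine a helix probability above 0.5 and glycine one below 0.5. A real predictor would take a window of residues as input and pass it through hidden layers, but the train-then-apply cycle is the same.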
11. Which of the following is untrue regarding Prediction with Neural Networks?
a) When multiple sequence alignments and neural networks are combined, the result is further improved accuracy
b) A neural network is trained by a single sequence
c) A neural network is trained by a sequence profile derived from the multiple sequence alignment
d) When the sufficiently trained network processes an unknown sequence, it applies the rules learned in training to recognize particular structural patterns
Explanation: A neural network is trained not by a single sequence but by a sequence profile derived from the multiple sequence alignment. This combined approach has been shown to improve the accuracy to above 75%, which is a breakthrough in secondary structure prediction. The improvement mainly comes from enhanced secondary structure signals through consensus drawing. Several frequently used third-generation prediction algorithms are available as web servers.
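One simple way to picture a sequence profile is as a table of amino acid frequencies per alignment column. The sketch below builds such a table, assuming equal-length aligned sequences with "-" for gaps. Real profiles usually add sequence weighting and pseudocounts, which are omitted here. The function name and the toy alignment are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_profile(alignment, alphabet=AMINO_ACIDS):
    """Return one frequency vector (ordered as `alphabet`) per alignment column."""
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa in alphabet)  # skip gaps
        n = sum(counts.values())
        profile.append([counts[aa] / n if n else 0.0 for aa in alphabet])
    return profile


# Hypothetical three-sequence alignment.
toy_alignment = ["AEM", "AEL", "GEM"]
toy_profile = sequence_profile(toy_alignment)
```

In the toy alignment the second column is fully conserved (E has frequency 1.0), while the first column is split between A (2/3) and G (1/3). A network trained on such vectors sees the conservation pattern, not just one residue per position.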
12. Which of the following is untrue regarding PHD?
a) It stands for Profile network from Heidelberg
b) It is a web-based program that uses a neural network only
c) It first performs a BLASTP search of the query sequence against a nonredundant protein sequence database
d) In initial steps it finds a set of homologous sequences, which are aligned with the MAXHOM program (a weighted dynamic programming algorithm performing global alignment)
Explanation: PHD is a web-based program that combines a neural network with multiple sequence alignment. After the initial steps, the resulting alignment in the form of a profile is fed into a neural network that contains three hidden layers. The first hidden layer makes a raw prediction based on the multiple sequence alignment by sliding a window of thirteen positions.
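The sliding window of thirteen positions can be sketched as follows. This shows only the windowing step, not PHD itself: padding the ends with "-" so that every residue sits at the center of a full window is an assumption, and the function name and example sequence are hypothetical.

```python
def sliding_windows(columns, size=13, pad="-"):
    """Return one window of `size` items centered on each input position."""
    if size % 2 == 0:
        raise ValueError("window size must be odd so that it has a center")
    half = size // 2
    padded = [pad] * half + list(columns) + [pad] * half
    return [padded[i:i + size] for i in range(len(columns))]


# Hypothetical query sequence; each window would be one network input.
toy_sequence = "MKTAYIAKQRQISFVK"
toy_windows = sliding_windows(toy_sequence, size=13)
```

Each window holds the central residue at index 6 with six neighbors on either side. The same function works on a list of profile columns in place of single residues.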