This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Using the Database Access Program ENTREZ”.
1. Which of the following is incorrect about ENTREZ?
a) It is a resource prepared only by the staff of the National Center for Biotechnology Information
b) It provides a series of forms that can be filled out to retrieve a Medline reference related to the molecular biology sequence databases
c) One straightforward way to access the sequence databases is through ENTREZ
d) It provides a series of forms that can be filled out to retrieve a DNA or protein sequence
Explanation: ENTREZ is a resource prepared by the staff of both the National Center for Biotechnology Information and the National Library of Medicine, Bethesda, Maryland. After a search for either a protein or a DNA sequence is chosen on the ENTREZ page, another Web page is provided with a form to fill out for the search.
2. The databases GenBank, EMBL and DDBJ are updated daily.
Explanation: These three databases are updated daily and exchange new sequences daily, so it is only necessary to access one of them. EMBL stands for European Molecular Biology Laboratory and DDBJ for DNA DataBank of Japan.
3. Using Boolean logic, the search looks for database entries that include the first term ____ the second, and so on for each subsequent term through the last.
Explanation: On the ENTREZ form, make a selection in the data entry window after the term “Search,” then enter search terms in the longer data entry window after “for.” The database will be searched for sequence database entries that contain all of these terms or related ones.
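The "contain all of these terms" behaviour can be sketched in a few lines of Python. This is only an illustration of Boolean AND matching, not how ENTREZ itself is implemented: the records and their accession-like keys are invented toy data, and the sketch uses plain substring matching, so it does not model the "related terms" that ENTREZ may also match.

```python
def matches_all(entry_text, terms):
    """Return True if every term occurs in the entry text (case-insensitive AND)."""
    text = entry_text.lower()
    return all(term.lower() in text for term in terms)

# Toy records, invented purely for illustration (not real database entries).
entries = {
    "TOY001": "human hemoglobin beta chain",
    "TOY002": "mouse hemoglobin alpha chain",
    "TOY003": "human insulin precursor",
}

def search(terms):
    """Return the keys of all toy entries that contain every search term."""
    return sorted(key for key, text in entries.items() if matches_all(text, terms))

print(search(["hemoglobin"]))           # one term: broader result
print(search(["human", "hemoglobin"]))  # two terms ANDed: narrower result
```

Each additional term can only keep or shrink the set of matches, which is why adding terms narrows a search.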
4. To assist in finding suitable terms, for each field, ENTREZ provides a list of index entries.
Explanation: When searching for terms in a particular field, some knowledge of the terms that are in the database can be helpful. The “Limits” link on the ENTREZ form page is used to limit the GenBank field to be searched, and various logical combinations of search terms may be designed by this method. These fields refer to the GenBank fields.
5. For a protein search, for example, current choices for fields include ______
Which of the following does not correctly fill the blank?
a) Accession (number)
b) E. C. number
d) Journal number
Explanation: The other fields are author name, journal name, keyword, and modification date, as well as organism, page number, primary accession (number), properties, protein name, publication date (of reference), seqID string, sequence length, substance name, text word, title word, volume, and sequence ID. Similar fields are shown for the DNA database search.
6. The results of searches in separate fields may be combined to narrow down the choices.
Explanation: The number of terms to be searched for and the field to be searched are the main decisions to be made. In doing so, it is important to be as specific as possible, or else there may be a great many matches.
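The questions above describe the ENTREZ Web form. The same idea of combining field-restricted terms can be sketched programmatically with NCBI's E-utilities `esearch` endpoint, which is a different interface from the form. The following is a minimal sketch that only builds the query string and URL and makes no network request; the helper names `fielded_query` and `esearch_url` are hypothetical, and the example terms and field tags are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# E-utilities search endpoint (a programmatic interface, not the Web form).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def fielded_query(pairs):
    """Join (term, field) pairs as 'term[Field]' clauses combined with AND."""
    return " AND ".join(f"{term}[{field}]" for term, field in pairs)

def esearch_url(db, query, retmax=20):
    """Build an esearch URL for the given database and query string."""
    return ESEARCH + "?" + urlencode({"db": db, "term": query, "retmax": retmax})

# Example terms and field tags are illustrative assumptions.
query = fielded_query([("hemoglobin", "Protein Name"), ("Homo sapiens", "Organism")])
print(query)
print(esearch_url("protein", query))
```

Restricting each term to a field is one way to be "as specific as possible": a term such as an organism name then matches only the organism field rather than any text in the entry.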
7. Knowing ________ should be enough to find the required entry quickly.
a) publication date, protein name, journal name
b) accession number, protein name, or name of gene
c) publication date, protein name, or volume
d) properties, protein name, or title word
Explanation: If the same protein has been sequenced in several organisms, providing an organism name is also helpful. Once the search terms and fields have been chosen and submitted, a database comprising all of the currently available sequences (called the nonredundant or NR database) is searched. Other database selections can also be made.
8. The program returns the number of matches found and provides an opportunity to narrow this list by including more terms.
Explanation: When the number of matching sequences has been narrowed to a reasonable number, the sequence may be retrieved in a chosen format in several straightforward steps. This helps in reaching the required data in fewer steps.
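Retrieval "in a chosen format" can likewise be sketched with the E-utilities `efetch` endpoint instead of the Web form. The sketch below only constructs the request URL and does not contact NCBI; the helper name `efetch_url` is hypothetical and the identifiers passed in are placeholders, not real accession numbers.

```python
from urllib.parse import urlencode

# E-utilities retrieval endpoint (a programmatic interface, not the Web form).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_url(db, ids, rettype="fasta"):
    """Build an efetch URL that requests the given records as plain text."""
    params = {
        "db": db,
        "id": ",".join(ids),   # several identifiers may be requested at once
        "rettype": rettype,    # output format, e.g. "fasta"
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)

# Placeholder identifiers; substitute the ones returned by a search.
print(efetch_url("protein", ["ACC_1", "ACC_2"]))
```

Changing `rettype` selects a different output format for the same records, for example `"gp"` for the GenPept flat file of a protein entry.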
9. Which of the following is incorrect about ENTREZ?
a) There is no simple way to find the correct sequence without manually checking the information provided in each sequence, but this usually takes a long time
b) Before leaving ENTREZ, it is often useful to check for sequence database entries that are similar to the one of interest, called “neighbors” by ENTREZ
c) The expanded query searches other database entries of interest, such as the same protein in another organism, a large chromosomal sequence that includes the gene, or members of the same gene family
d) While visiting the site, note that ENTREZ has been adapted to search through a number of other biological databases, and also through Medline, and these searches are available from the initial ENTREZ Web page
Explanation: Contrary to option a, manually checking the entries usually takes only a short time. It is important to look through the sequences to locate the one intended. There may be several different copies of the sequence because it may have been sequenced from more than one organism, or the sequence may be a mutant sequence, a particular clone, or a fragment.
10. Which of the following is incorrect about Retrieving a Specific Sequence?
a) It can be difficult to retrieve the sequence of a specific gene or protein simply because of the sheer number of sequences in the GenBank database and the complex problem of indexing them
b) Other projects may benefit from the availability of better curated and annotated protein sequence databases, but not PIR and SwissProt
c) For projects that require the most currently available sequences, the NR databases should be searched
d) The genomic databases can also provide the sequence of a particular gene or protein. Protein sequences in the Genpro database are generated by automatic translation of DNA sequences
Explanation: Curated and annotated protein sequence databases include PIR and SwissProt. Protein sequences read from cDNA copies of mRNA sequences are reliable, apart from some uncertainty as to the translational start site. Many protein sequences are now predicted by translation of genomic sequences, which requires a prediction of exons, a somewhat error-prone step.
Sanfoundry Global Education & Learning Series – Bioinformatics.