This set of Bioinformatics Multiple Choice Questions & Answers (MCQs) focuses on “Multiple Sequence Formats & Storage of Information in a Sequence Database”.
1. Which of the given statements is incorrect about the Block multiple sequence alignment format?
a) The ID (identification) line contains a short identifier for the group of sequences from which the block was made, which is often the original Prosite group ID
b) The identifier is terminated by a comma, and “BLOCK” indicates the entry type
c) AC contains the block number, a seven-character group number for sequences from which the block was made, followed by a letter (A–Z) indicating the order of the block in the sequences
d) The block number is a 5-digit number preceded by BL (BLOCKS database) or PR (PRINTS database)
Explanation: Option b is the incorrect statement. The identifier is terminated by a semicolon, not a comma, and “BLOCK” indicates the entry type. The values min and max are the minimum and maximum number of amino acids from the previous block or from the start of the sequence. DE describes the sequences from which the block was made.
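The block number layout described in options c and d (a BL or PR prefix, five digits, then a letter A–Z) can be checked with a short Python sketch. This is only an illustration: the function name `parse_block_accession` and the example value are made up, and real entries may carry extra text on the AC line.

```python
import re

# Two-letter database prefix, five digits, then one letter for block order.
_AC = re.compile(r"^(BL|PR)(\d{5})([A-Z])$")

_DATABASES = {"BL": "BLOCKS", "PR": "PRINTS"}


def parse_block_accession(accession: str):
    """Split an accession into (database, group number, block letter)."""
    match = _AC.match(accession)
    if match is None:
        raise ValueError(f"unexpected accession format: {accession!r}")
    prefix, digits, letter = match.groups()
    return _DATABASES[prefix], prefix + digits, letter


# Invented example value, used only to show the layout.
print(parse_block_accession("BL00001A"))  # ('BLOCKS', 'BL00001', 'A')
```

The prefix plus the five digits gives the seven-character group number, and the trailing letter gives the order of the block within the sequences.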
2. BL contains information about the block: xxx is the amino acids in the spaced triplet found by MOTIF upon which the block is based.
Explanation: In addition to this, w is the width of the sequence segments (columns) in the block, and s is the number of sequence segments (rows) in the block. Other values (n1, n2) describe statistical features of the block. The remaining lines list the sequences: each sequence line contains a sequence identifier, the offset from the beginning of the sequence to the block in parentheses, the sequence segment, and a weight for the segment.
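As a rough illustration of the sequence-line layout (identifier, offset in parentheses, segment, weight), the following Python sketch splits one such line into its four parts. The example line and the names `BlockSegment` and `parse_segment_line` are hypothetical, and the exact spacing in real block files may differ.

```python
import re
from typing import NamedTuple


class BlockSegment(NamedTuple):
    seq_id: str      # sequence identifier
    offset: int      # offset from the start of the sequence to the block
    segment: str     # the aligned sequence segment
    weight: float    # weight assigned to the segment


# identifier, "(" offset ")", segment, weight, separated by whitespace
_LINE = re.compile(r"^\s*(\S+)\s*\(\s*(\d+)\s*\)\s*(\S+)\s+(\S+)\s*$")


def parse_segment_line(line: str) -> BlockSegment:
    """Parse one sequence line of a block into its four fields."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"not a block sequence line: {line!r}")
    seq_id, offset, segment, weight = match.groups()
    return BlockSegment(seq_id, int(offset), segment, float(weight))


# Invented example line, not taken from a real database entry.
example = parse_segment_line("SEQ_A  (  12) ACDEFGHIKL  100")
print(example.seq_id, example.offset, example.segment, example.weight)
```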
3. Which of the given statements is incorrect about READSEQ?
a) It is an extremely useful sequence formatting program developed by D. G. Gilbert at Indiana University, Bloomington
b) It was developed at Indiana University, Bloomington
c) It can recognize a DNA or protein sequence file in any of the formats
d) It can recognize a DNA or protein sequence file in some particular formats
Explanation: READSEQ can identify the format of the input file and write a new file in an alternative format. Some of these formats are used for special types of analyses such as multiple sequence alignment and phylogenetic analysis.
4. Data files that have multiple sequences, such as those required for multiple sequence alignment and phylogenetic analysis using parsimony (PAUP), are not converted in READSEQ.
Explanation: The statement is false. READSEQ does convert data files that contain multiple sequences. Options to reverse-complement sequences and to remove gaps from them are also included. SEQIO is another sequence conversion program for UNIX machines.
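The two options mentioned here, reverse-complementing and gap removal, are simple string operations. The Python sketch below shows one way to do them; it is not READSEQ's own code, and it assumes that gaps are written as "-" or "." and that only the unambiguous bases A, C, G, and T need complementing.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
# Any other character (for example N) is left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def remove_gaps(seq: str, gap_chars: str = "-.") -> str:
    """Strip alignment gap characters from a sequence."""
    return seq.translate(str.maketrans("", "", gap_chars))


print(reverse_complement("ATGC"))   # GCAT
print(remove_gaps("AT--G.C"))       # ATGC
```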
5. The “from” programs convert sequence files from GCG format into the named format, and the “to” programs convert the alternative format into GCG format.
Explanation: In addition, the GCG programs include the following sequence formatting programs: (1) GETSEQ, which converts a simple ASCII file being received from a remote PC to GCG format; (2) REFORMAT, which will format a GCG file that has been edited, and will also perform other functions; and (3) SPEW, which sends a GCG sequence file as an ASCII file to a remote PC.
6. The Common Object Request Broker Architecture (CORBA) is the Object Management Group’s interface for objects.
Explanation: CORBA allows different computer applications to communicate with each other through a common language, the Interface Definition Language (IDL). To plan an object-oriented database by defining the classes of objects and the relationships among these objects, a specific set of procedures called the Unified Modeling Language (UML) has been devised by the OMG.
7. The FASTA format is readily converted into other formats and is also smaller and simpler.
Explanation: A FASTA entry contains just a line of sequence identifiers followed by the sequence without numbers, which makes it very useful for browsing and analysis. One browser window may retrieve sequences from a database and a second may analyze these sequences.
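Because the format is so simple, a FASTA file can be read with a few lines of code. The Python sketch below is a minimal reader that assumes each identifier line starts with ">" and that the sequence may wrap over several lines; the function name `parse_fasta` and the example records are made up for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each identifier line (without '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:]             # start a new record
            records[header] = ""
        elif header is not None:
            records[header] += line       # append wrapped sequence lines
    return records


# Invented example data.
example = """>seq1 hypothetical example
ATGCATGC
ATGC
>seq2
GGCC
"""
records = parse_fasta(example.splitlines())
print(records)
```

A dictionary keyed by identifier line is the simplest choice here; a reader for very large files would more likely yield one record at a time.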
8. Each DNA or protein sequence database entry has much information, including ______
a) an assigned accession number(s)
b) source organism
c) name of locus
d) reference number type(s)
Explanation: In addition to these, each entry includes keywords that apply to the sequence; features in the sequence such as coding regions, intron splice sites, and mutations; and finally the sequence itself. This information is organized into a tabular form very much like that found in a relational database.
9. Which of the following is an incorrect statement?
a) The last column contains the sequences themselves
b) It is quite tough making an index of the information in each of these fields so that a search query can locate all the occurrences through the index
c) If one imagines a large table with each sequence entry occupying one row, then each column will include one of the above types of information for each sequence, and each column is called a FIELD
d) The DNA, protein, and reference databases have all been cross-referenced so that moving between them is readily accomplished
Explanation: It is very easy to make an index of the information in each of these fields so that a search query can locate all the occurrences through the index. Even related sequences are cross-referenced. In addition, the information in one database can be cross-referenced to that in another database.
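The idea of indexing a field can be sketched in Python as follows. The table is modeled as a list of rows with one key per field, and the index maps each value of a chosen field to the rows that contain it. The entries, accession strings, and the function name `build_index` are all invented for illustration and do not reflect how any particular sequence database is implemented.

```python
from collections import defaultdict

# Hypothetical entries: one row per sequence, one key per field.
entries = [
    {"accession": "ACC0001", "organism": "Homo sapiens", "sequence": "ATGC"},
    {"accession": "ACC0002", "organism": "Mus musculus", "sequence": "GGCC"},
    {"accession": "ACC0003", "organism": "Homo sapiens", "sequence": "TTAA"},
]


def build_index(rows, field):
    """Map each value of `field` to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[field]].append(position)
    return index


# A query on the indexed field finds every occurrence without scanning rows.
organism_index = build_index(entries, "organism")
hits = [entries[i]["accession"] for i in organism_index["Homo sapiens"]]
print(hits)   # ['ACC0001', 'ACC0003']
```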
10. Which of the given statements is incorrect about Database Types?
a) Relational databases are more useful in the development of biological databases
b) The tables in a relational database are carefully indexed and cross-referenced with each other, sometimes using additional tables, so that each item in the database has a unique set of identifying features
c) The relational database orders data in tables made up of rows giving specific items in the database, and columns giving the features as attributes of those items
d) The two principal types of databases are the relational and the object-oriented databases
Explanation: The object-oriented database structure has been useful in the development of biological databases. The objects, such as genetic maps, genes, or proteins, each have an associated set of utilities for analysis and display of the object and a set of attributes such as identifying name or references.
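To make the idea of an object with attributes and associated utilities concrete, here is a small Python sketch. The class `Gene`, its fields, and its methods are hypothetical and only illustrate the pattern; they are not taken from any actual biological database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Gene:
    """A hypothetical database object: attributes plus utilities."""
    name: str                                   # identifying name
    sequence: str                               # DNA sequence
    references: List[str] = field(default_factory=list)

    def gc_content(self) -> float:
        """Analysis utility: fraction of G and C bases."""
        if not self.sequence:
            return 0.0
        seq = self.sequence.upper()
        return (seq.count("G") + seq.count("C")) / len(seq)

    def display(self) -> str:
        """Display utility: one-line summary of the object."""
        return f"{self.name}: {len(self.sequence)} bp, GC={self.gc_content():.2f}"


# Invented example object.
gene = Gene(name="exampleGene", sequence="ATGCGC", references=["ref1"])
print(gene.display())   # exampleGene: 6 bp, GC=0.67
```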