Detecting Overlapping Coding Sequences in Virus Genomes.
From: Department of Biochemistry, University of Otago, PO Box 56, Dunedin, New Zealand. aef@sanger.otago.ac.nz
BMC Bioinformatics
- Publish Date: 2006
- ISSN: 1471-2105
- Volume: 7
- Pages: 75
- Medium: Internet
- Language: English
- Citation (JAMA): Firth Andrew E, Brown Chris M, et al. Detecting Overlapping Coding Sequences in Virus Genomes. BMC Bioinformatics 2006;7:75
Abstract
BACKGROUND: Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret, especially within overlapping genes; and viruses often employ non-canonical translational mechanisms (e.g. frameshifting, stop codon read-through, leaky scanning and internal ribosome entry sites), which can conceal potentially coding open reading frames (ORFs).
RESULTS: In a previous paper we introduced a new statistic, MLOGD (Maximum Likelihood Overlapping Gene Detector), for detecting and analysing overlapping CDSs. Here we present (a) an improved MLOGD statistic, (b) a greatly extended suite of software using MLOGD, (c) a database of results for 640 virus sequence alignments, and (d) a web interface to the software and database. Tests show that, from an alignment with just 20 mutations, MLOGD can discriminate non-overlapping CDSs from non-coding ORFs with a typical accuracy of up to 98%, and can detect CDSs overlapping known CDSs with a typical accuracy of 90%. In addition, the software produces a variety of statistics and graphics, useful for analysing an input multiple sequence alignment.
CONCLUSION: MLOGD is an easy-to-use tool for virus genome annotation, detecting new CDSs (in particular overlapping or short CDSs), and for analysing overlapping CDSs following frameshift sites. The software, web-server, database and supplementary material are available at http://guinevere.otago.ac.nz/mlogd.html.
MeSH Headings (Keywords): Algorithms; Binding Sites; Codon; Codon, Terminator; Computational Biology; Conserved Sequence; Databases as Topic; Frameshift Mutation; Genes, Overlapping; Genes, Viral; Genome; Genome, Viral; Likelihood Functions; Models, Statistical; Mutation; Open Reading Frames; Protein Biosynthesis; Sequence Alignment; Sequence Analysis, DNA; Software; Viruses
PubMed Unique Identifier (PMID): 16483358
