Recovering Motifs from Biased Genomes: Application of Signal Correction.

Authors:
  • Hasan Samiul
  • Schreiber Mark

From: Novartis Institute for Tropical Diseases (NITD), 10 Biopolis Road, #05-01 Chromos, Singapore 138670.

Nucleic acids research

  • Publish Date: 2006
  • ISSN: 1362-4962
  • Volume: 34
  • Issue: 18
  • Pages: 5124-32
  • Medium: Internet
  • Language: English
  • Citation (JAMA): Hasan Samiul, Schreiber Mark. Recovering Motifs from Biased Genomes: Application of Signal Correction. Nucleic Acids Res. 2006;34(18):5124-32

Abstract

A significant problem in biological motif analysis arises when the background symbol distribution is biased (e.g. high/low GC content in the case of DNA sequences). This can lead to overestimation of the amount of information encoded in a motif. A motif can be depicted as a signal using information theory (IT). We apply two concepts from IT, distortion and patterned interference (a type of noise), to model genomic and codon bias respectively. This modeling approach allows us to correct a raw signal to recover signals that are weakened by compositional bias. The corrected signal is more likely to be discriminated from a biased background by a macromolecule. We apply this correction technique to recover ribosome-binding site (RBS) signals from available sequenced and annotated prokaryotic genomes having diverse compositional biases. We observed that linear correction was sufficient for recovering signals even at the extremes of these biases. Further comparative genomics studies were made possible upon correction of these signals. We find that the average Euclidean distance between RBS signal frequency matrices of different genomes can be significantly reduced by using the correction technique. Within this reduced average distance, we can find examples of class-specific RBS signals. Our results have implications for motif-based prediction, particularly with regard to the estimation of reliable inter-genomic model parameters.
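
The overestimation problem named in the abstract can be illustrated with a minimal sketch. This is not the authors' correction method (their distortion and patterned-interference model is not reproduced here); it only contrasts the standard information content of a motif measured against a uniform background with the relative entropy measured against a biased background, and computes a Euclidean distance between two frequency matrices. The motif frequencies, the 70% GC background, and all function names are hypothetical values chosen for illustration.

```python
import math

BASES = "ACGT"

def column_information(freqs, background):
    """Relative entropy (in bits) of one motif column against a background."""
    return sum(
        f * math.log2(f / background[b])
        for b, f in freqs.items()
        if f > 0  # terms with zero frequency contribute nothing
    )

def motif_information(matrix, background):
    """Sum of per-column relative entropies over the whole motif."""
    return sum(column_information(col, background) for col in matrix)

def euclidean_distance(m1, m2):
    """Euclidean distance between two equal-length frequency matrices."""
    return math.sqrt(
        sum((c1[b] - c2[b]) ** 2 for c1, c2 in zip(m1, m2) for b in BASES)
    )

# Hypothetical 3-column, GC-rich motif (made-up frequencies).
motif = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},
]

uniform = {b: 0.25 for b in BASES}
gc_rich = {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}  # assumed 70% GC genome

info_uniform = motif_information(motif, uniform)
info_biased = motif_information(motif, gc_rich)

print(f"bits vs uniform background: {info_uniform:.3f}")
print(f"bits vs GC-rich background: {info_biased:.3f}")
```

In this toy case the uniform-background figure is larger than the biased-background figure, because part of the apparent signal is just the genome's own GC richness. The same `euclidean_distance` helper could be applied to matrices from two genomes to compare them before and after whatever correction is used.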

MeSH Headings (Keywords): Archaea; Bacteria; Base Composition; Binding Sites; DNA; Escherichia coli; Genome, Archaeal; Genome, Bacterial; Genomics; Information Theory; Peptide Chain Initiation, Translational; Phylogeny; Principal Component Analysis; RNA, Messenger; Regulatory Sequences, Nucleic Acid; Ribosomes


PubMed Unique Identifier (PMID): 16990246


This abstract is part of PubMed, a service of the U.S. National Library of Medicine. PubMed includes more than 17 million citations from MEDLINE and other life science journals for biomedical articles. See Copyright and Disclaimers.

The data herein was last updated on July 8th, 2008 and may not reflect the most current and accurate data available from NLM.

