On Counting Position Weight Matrix Matches in a Sequence, with Application to Discriminative Motif Finding.

Authors:
  • Sinha Saurabh

From: Department of Computer Science, University of Illinois Urbana-Champaign, 201 N. Goodwin Ave, Urbana, IL 61801, USA. sinhas@cs.uiuc.edu

Bioinformatics (Oxford, England)

  • Publish Date: Jul 2006
  • ISSN: 1460-2059
  • Volume: 22
  • Issue: 14
  • Pages: e454-63
  • Medium: Internet
  • Language: English
  • Citation (JAMA): Sinha Saurabh. On Counting Position Weight Matrix Matches in a Sequence, with Application to Discriminative Motif Finding. Bioinformatics Jul 2006;22:e454-63

Abstract

MOTIVATION AND RESULTS: The position weight matrix (PWM) is a popular method to model transcription factor binding sites. A fundamental problem in cis-regulatory analysis is to “count” the occurrences of a PWM in a DNA sequence. We propose a novel probabilistic score to solve this problem of counting PWM occurrences. The proposed score has two important properties: (1) It gives appropriate weights to both strong and weak occurrences of the PWM, without using thresholds. (2) For any given PWM, this score can be computed while allowing for occurrences of other, a priori known PWMs, in a statistically sound framework. Additionally, the score is efficiently differentiable with respect to the PWM parameters, which has important consequences for designing search algorithms. The second problem we address is to find, ab initio, PWMs that have high counts in one set of sequences, and low counts in another. We develop a novel algorithm to solve this “discriminative motif-finding problem”, using the proposed score for counting a PWM in the sequences. The algorithm is a local search technique that exploits derivative information on an objective function to enhance speed and performance. It is extensively tested on synthetic data, and shown to perform better than other discriminative as well as non-discriminative PWM finding algorithms. It is then applied to cis-regulatory modules involved in development of the fruitfly embryo, to elicit known and novel motifs. We finally use the algorithm on genes predictive of social behavior in the honey bee, and find interesting motifs.

AVAILABILITY: The program is available upon request from the author.
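
To make the idea of a threshold-free count concrete, here is a minimal Python sketch. It is not the score defined in the paper, whose formula the abstract does not give. It assumes a simple two-component mixture in which each window of the sequence is either a binding site (drawn from the PWM) or background, and it sums the posterior probability of "site" over all windows, so that strong matches contribute close to 1 and weak matches contribute a small fraction. The `prior` parameter, the uniform background, the treatment of overlapping windows as independent, the mean-difference objective, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: a soft, threshold-free PWM count under an
# assumed two-component mixture model (site vs. background per window).

def window_likelihood(pwm, window):
    """Probability of `window` under the PWM: product of per-column base probabilities."""
    p = 1.0
    for column, base in zip(pwm, window):
        p *= column[base]
    return p


def background_likelihood(bg, window):
    """Probability of `window` under an independent-base background model."""
    p = 1.0
    for base in window:
        p *= bg[base]
    return p


def soft_count(pwm, seq, bg, prior=0.01):
    """Sum, over all windows, of the posterior probability that the window is a site.

    `prior` is an assumed per-window probability of a site. Overlapping
    windows are scored independently, which is a simplification.
    """
    width = len(pwm)
    total = 0.0
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        site = prior * window_likelihood(pwm, window)
        back = (1.0 - prior) * background_likelihood(bg, window)
        total += site / (site + back)
    return total


def discriminative_score(pwm, positives, negatives, bg, prior=0.01):
    """Assumed objective: mean soft count in positives minus mean soft count in negatives."""
    pos = sum(soft_count(pwm, s, bg, prior) for s in positives) / len(positives)
    neg = sum(soft_count(pwm, s, bg, prior) for s in negatives) / len(negatives)
    return pos - neg


# Hypothetical 3-column PWM that favors the word "TAG".
def _column(favored):
    return {b: (0.85 if b == favored else 0.05) for b in "ACGT"}

example_pwm = [_column("T"), _column("A"), _column("G")]
uniform_bg = {b: 0.25 for b in "ACGT"}

if __name__ == "__main__":
    print(soft_count(example_pwm, "CCCTAGCCC", uniform_bg))
    print(discriminative_score(example_pwm, ["CCCTAGCCC"], ["CCCCCCCCC"], uniform_bg))
```

Because every term in `soft_count` is a smooth function of the PWM entries, a score of this general shape can be differentiated with respect to the PWM parameters, which is the property the abstract says the local search exploits. The sketch uses no gradient; it only evaluates the objective.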

MeSH Headings (Keywords): Algorithms; Amino Acid Motifs; Artificial Intelligence; Base Sequence; Binding Sites; Chromosome Mapping; DNA; Discriminant Analysis; Molecular Sequence Data; Pattern Recognition, Automated; Protein Binding; Regulatory Elements, Transcriptional; Sequence Alignment; Sequence Analysis, DNA; Software; Transcription Factors


PubMed Unique Identifier (PMID): 16873507


This abstract is part of PubMed, a service of the U.S. National Library of Medicine. PubMed includes more than 17 million citations from MEDLINE and other life science journals for biomedical articles.

The data herein was last updated on July 8th, 2008 and may not reflect the most current and accurate data available from NLM.

