Using Evolutionary and Structural Information to Predict DNA-Binding Sites on DNA-Binding Proteins.

Authors:
  • Kuznetsov Igor B
  • Gou Zhenkun
  • Li Run
  • Hwang Seungwoo

From: Gen*NY*sis Center for Excellence in Cancer Genomics, Department of Epidemiology and Biostatistics, University at Albany, Rensselaer, New York 12144, USA. Ikuznetsov@albany.edu

Proteins

  • Publish Date: Jul 2006
  • ISSN: 1097-0134
  • Volume: 64
  • Issue: 1
  • Pages: 19-27
  • Medium: Internet
  • Language: English
  • Citation (JAMA): Kuznetsov Igor B, Gou Zhenkun, Li Run, et al. Using Evolutionary and Structural Information to Predict DNA-Binding Sites on DNA-Binding Proteins. Proteins Jul 2006;64:19-27

Abstract

Proteins that interact with DNA are involved in a number of fundamental biological activities such as DNA replication, transcription, and repair. Reliable identification of DNA-binding sites in DNA-binding proteins is important for functional annotation, site-directed mutagenesis, and modeling of protein-DNA interactions. We apply a Support Vector Machine (SVM), a supervised pattern recognition method, to predict DNA-binding sites in DNA-binding proteins using the following features: amino acid sequence, profile of evolutionary conservation of sequence positions, and low-resolution structural information. We use a rigorous statistical approach to study the performance of predictors that utilize different combinations of features and how this performance is affected by structural and sequence properties of proteins. Our results indicate that an SVM predictor based on a properly scaled profile of evolutionary conservation in the form of a position-specific scoring matrix (PSSM) significantly outperforms a PSSM-based neural network predictor. The highest accuracy is achieved by an SVM predictor that combines the profile of evolutionary conservation with low-resolution structural information. Our results also show that knowledge-based predictors of DNA-binding sites perform significantly better on proteins from the mainly-alpha structural class and that the performance of these predictors is significantly correlated with certain structural and sequence properties of proteins. These observations suggest that it may be possible to assign a reliability index to the overall accuracy of the prediction of DNA-binding sites in any given protein using its sequence and structural properties. A web-server implementation of the predictors is freely available online at http://lcg.rit.albany.edu/dp-bind/.
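
As an illustration of what a "scaled" PSSM-based feature encoding might look like, the following Python sketch squashes raw PSSM scores into the range (0, 1) and builds one feature vector per residue from a sliding window of neighboring positions. The abstract does not state the scaling function, the window size, or the padding scheme, so the logistic function, the half-width of 1, the zero padding, and the names `scale_pssm` and `window_features` are all assumptions made for this example, not the authors' published method. The toy matrix uses made-up numbers and 3 columns rather than the usual 20.

```python
import math

def scale_pssm(pssm):
    """Squash raw PSSM scores into (0, 1) with a logistic function.

    The logistic function is an assumed choice; the abstract only says
    the profile is "properly scaled".
    """
    return [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]

def window_features(scaled, center, half_width):
    """Concatenate scaled PSSM rows in a window around `center`.

    Positions that fall off either end of the sequence are zero-padded
    (an assumed convention for this sketch).
    """
    n_cols = len(scaled[0])
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(scaled):
            features.extend(scaled[pos])
        else:
            features.extend([0.0] * n_cols)
    return features

# Toy 4-residue "PSSM" with 3 columns (hypothetical values).
toy_pssm = [
    [2, -1, 0],
    [-3, 4, 1],
    [0, 0, -2],
    [5, -4, 3],
]
scaled = scale_pssm(toy_pssm)
# One fixed-length vector per residue: (2 * half_width + 1) * n_cols values.
vectors = [window_features(scaled, i, half_width=1) for i in range(len(scaled))]
```

Each resulting vector, paired with a binding or non-binding label for the central residue, could then be passed to an SVM library for training. Keeping every feature in a bounded range is a common reason to scale inputs before SVM training, since unscaled features with large magnitudes can dominate the kernel.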

MeSH Headings (Keywords): Amino Acid Sequence; Binding Sites; DNA; DNA-Binding Proteins; Databases, Nucleic Acid; Databases, Protein; Evolution, Molecular; Models, Molecular; Models, Theoretical; ROC Curve


PubMed Unique Identifier (PMID): 16568445


This abstract is part of PubMed, a service of the U.S. National Library of Medicine. PubMed includes more than 17 million citations from MEDLINE and other life science journals for biomedical articles. See Copyright and Disclaimers.

The data herein was last updated on July 8th, 2008 and may not reflect the most current and accurate data available from NLM.

