From Wikipedia, the free encyclopedia
 


PredictProtein
Original author(s): Burkhard Rost
Developer(s): Guy Yachdav, Laszlo Kajan
Initial release: 1992
Stable release: 1.0.88
Operating system: UNIX-based
Type: Bioinformatics
License: GPLv2
Website: www.predictprotein.org

PredictProtein (PP) is an automatic service that searches up-to-date public sequence databases, creates alignments, and predicts aspects of protein structure and function. Users submit a protein sequence and receive a single file with results from database comparisons and prediction methods. PP went online in 1992 at the European Molecular Biology Laboratory; it operated from Columbia University from 1999 and moved to the Technische Universität München in 2009. Although many servers have implemented particular aspects, PP remains the most widely used public server for structure prediction: it has handled over 1.5 million requests from users in 104 countries, and over 13,000 users have submitted 10 or more different queries. PP web pages are mirrored in 17 countries on 4 continents. The system is optimized to meet the demands of experimentalists not experienced in bioinformatics. To that end, it incorporates only high-quality methods and collates results while omitting less reliable or less important ones.

Attempt to simplify output by incorporating a hierarchy of thresholds

A distinctive feature of PP is its attempt to ‘pre-digest’ as much information as possible so that results are easier to interpret. For example, by default PP returns only those database proteins that are very likely to have a structure similar to that of the query protein.[1] Particular predictions, such as those for membrane helices, coiled-coil regions, signal peptides and nuclear localization signals, are not returned if they fall below given probability thresholds.
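The threshold idea can be illustrated with a minimal Python sketch. The feature names, the cut-off values and the `Prediction` record are assumptions made for illustration only; the text does not state the thresholds PP actually uses.

```python
from dataclasses import dataclass

# Hypothetical per-feature probability cut-offs (illustrative values only;
# the actual PP thresholds are not given in the text).
THRESHOLDS = {
    "membrane_helix": 0.8,
    "coiled_coil": 0.9,
    "signal_peptide": 0.7,
    "nuclear_localization_signal": 0.95,
}


@dataclass(frozen=True)
class Prediction:
    feature: str        # kind of prediction, e.g. "coiled_coil"
    start: int          # first residue of the predicted region
    end: int            # last residue of the predicted region
    probability: float  # confidence reported by the method


def filter_predictions(predictions, thresholds=THRESHOLDS):
    """Keep only predictions that reach the threshold for their feature."""
    kept = []
    for p in predictions:
        cutoff = thresholds.get(p.feature)
        # Features without a configured threshold are passed through unchanged.
        if cutoff is None or p.probability >= cutoff:
            kept.append(p)
    return kept
```

In this sketch a low-confidence coiled-coil prediction would be dropped from the report, while a feature with no configured threshold is always returned.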

Each request triggers the application of over 20 different methods

Users receive a single output file with the following results.

Database searches: similar sequences are reported and aligned by a standard pairwise BLAST[2] and an iterated PSI-BLAST search.[3] Although the pairwise BLAST searches are identical to those obtainable from the NCBI site, the iterated PSI-BLAST is performed on a carefully filtered database to avoid accumulating false positives during the iteration.[4][5] A standard search for functional motifs in the PROSITE database is also performed.[6] PP also identifies putative boundaries for structural domains through the CHOP procedure.

Structure prediction methods: secondary structure, solvent accessibility and membrane helices are predicted by the PHD and PROF programs,[7][8] membrane strands by PROFtmb,[9] coiled-coil regions by COILS,[10] and inter-residue contacts by PROFcon.[11] Low-complexity regions are marked by SEG,[12] and long regions with no regular secondary structure are identified by NORSp.[13][14] The PHD/PROF programs are available only through PP. The particular way in which PP automatically iterates PSI-BLAST searches and decides what to include in sequence families is also unique to PP.

The aspects of function currently embedded explicitly in PP are all in some way related to sub-cellular localization: nuclear localization signals are detected through PredictNLS,[15][16] localization independent of targeting signals is predicted through LOCnet,[17] and homology to proteins involved in cell-cycle control is annotated.[18]
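The way many methods feed one output file can be sketched as follows. This is a hypothetical illustration, not PP's implementation: the grouping, the `runners` mapping and the report layout are assumptions, and only the methods named above are listed.

```python
# Methods named in the text; the three-way grouping is an assumption
# made for this sketch.
METHODS = {
    "Database searches": ["BLAST", "PSI-BLAST", "PROSITE", "CHOP"],
    "Structure prediction": [
        "PHD", "PROF", "PROFtmb", "COILS", "PROFcon", "SEG", "NORSp",
    ],
    "Function": ["PredictNLS", "LOCnet"],
}


def build_report(sequence, runners):
    """Run every available method and collect the output in one text report.

    `runners` maps a method name to a callable that takes the sequence and
    returns a short result string (hypothetical stand-ins for the real tools).
    """
    lines = [f"Query length: {len(sequence)}"]
    for category, names in METHODS.items():
        lines.append(f"== {category} ==")
        for name in names:
            runner = runners.get(name)
            if runner is None:
                continue  # method not available in this sketch; skip it
            lines.append(f"{name}: {runner(sequence)}")
    return "\n".join(lines)
```

Calling `build_report` with stand-in runners for a few methods yields one report with a section per category, mirroring the single-file output described above.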

Availability

Web Service

The PredictProtein web service is available at www.predictprotein.org. Users submit an amino acid sequence and receive a set of automatic annotations for it. The service is backed by a database of pre-calculated results, which reduces response time.
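A store of pre-calculated results can be pictured as a cache keyed by the submitted sequence. The sketch below is an assumption-laden illustration: the hashing scheme, the `ResultCache` class and the `compute` callable are hypothetical and do not describe how the PP database is actually organized.

```python
import hashlib


def sequence_key(sequence: str) -> str:
    """Normalize a sequence (drop whitespace, upper-case) and hash it."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


class ResultCache:
    """In-memory stand-in for a database of pre-calculated results."""

    def __init__(self, compute):
        self._compute = compute  # callable: sequence -> annotations
        self._store = {}
        self.hits = 0
        self.misses = 0

    def annotate(self, sequence):
        key = sequence_key(sequence)
        if key in self._store:
            # Result was computed before: return it without re-running methods.
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._compute(sequence)
        self._store[key] = result
        return result
```

Normalizing before hashing means that the same sequence submitted with different case or line breaks maps to the same stored result.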

Cloud Solution

The PredictProtein cloud solution builds upon the open source operating system Debian[19] and provides its functionality as a set of free[20] Debian software packages. Bio-Linux is an operating system for bioinformatics and computational biology; its release 7 provides more than 500 bioinformatics programs on an Ubuntu Linux base.[21] Ubuntu is a Debian derivative, that is, an operating system based on Debian with its own additions. Cloud BioLinux is a comprehensive cloud solution derived from Bio-Linux and Ubuntu. Debian derivatives can easily share packages with each other. For example, Debian packages are automatically incorporated in Ubuntu[22] and are also usable in Cloud BioLinux (the procedure is described in [23]).

References

  1. Rost, B. (1999). "Twilight zone of protein sequence alignments". Protein Engineering. 12 (2): 85–94. doi:10.1093/protein/12.2.85. PMID 10195279.
  2. Altschul, S.F. and Gish, W. (1996) Local alignment statistics. Methods Enzymol., 266, 460–480.
  3. Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
  4. Przybylski, D. and Rost, B. (2002) Alignments grow, secondary structure prediction improves. Proteins, 46, 195–205.
  5. Jones, D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 292, 195–202.
  6. Hofmann, K., Bucher, P., Falquet, L. and Bairoch, A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res., 27, 215–219.
  7. Rost, B. (1996) PHD: predicting one-dimensional protein structure by profile based neural networks. Methods Enzymol., 266, 525–539.
  8. Rost, B. (2001) Protein secondary structure prediction continues to rise. J. Struct. Biol., 134, 204–218.
  9. Bigelow, H.; Rost, B. (2006). "PROFtmb: A web server for predicting bacterial transmembrane beta barrel proteins". Nucleic Acids Research. 34 (Web Server issue): W186–W188. doi:10.1093/nar/gkl262. PMC 1538807. PMID 16844988.
  10. Lupas, A., Van Dyke, M. and Stock, J. (1991) Predicting coiled coils from protein sequences. Science, 252, 1162–1164.
  11. Punta, M.; Rost, B. (2005). "PROFcon: Novel prediction of long-range contacts". Bioinformatics. 21 (13): 2960–2968. doi:10.1093/bioinformatics/bti454. PMID 15890748.
  12. Wootton, J.C. and Federhen, S. (1996) Analysis of compositionally biased regions in sequence databases. Methods Enzymol., 266, 554–571.
  13. Liu, J., Tan, H. and Rost, B. (2002) Loopy proteins appear conserved in evolution. J. Mol. Biol., 322, 53–64.
  14. Liu, J. and Rost, B. (2003) NORSp: predictions of long regions without regular secondary structure. Nucleic Acids Res., 31, 3833–3835.
  15. Cokol, M., Nair, R. and Rost, B. (2000) Finding nuclear localisation signals. EMBO Rep., 1, 411–415.
  16. Nair, R., Carter, P. and Rost, B. (2003) NLSdb: database of nuclear localization signals. Nucleic Acids Res., 31, 397–399.
  17. Nair, R. and Rost, B. (2003) Better prediction of sub-cellular localization by combining evolutionary and structural information. Proteins, 53, 917–930.
  18. Wrzeszczynski, K.O. and Rost, B. (2004) Cataloguing proteins in cell cycle control. Methods Mol. Biol., 241, 219–233.
  19. Amor, J.J., et al. (2005) From pigs to stripes: A travel through Debian. In Proceedings of the DebConf5 (Debian Annual Developers Meeting). Citeseer.
  20. The Debian Free Software Guidelines (DFSG). Available from: http://www.debian.org/social_contract#guidelines
  21. Dawn Field, B.T., Tim Booth, Stewart Houten, Dan Swan, Nicolas Bertrand, Milo Thurston (2012) Bio-Linux 7. Available from: http://nebc.nerc.ac.uk/tools/bio-linux/bio-linux-7-info
  22. NEW packages through Debian. Available from: https://wiki.ubuntu.com/UbuntuDevelopment/NewPackages#NEW_packages_through_Debian
  23. Krampis, K., et al. (2012) Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community. BMC Bioinformatics, 13, 42.

    This page was last edited on 12 June 2023, at 02:02 (UTC).
