The Pfam protein families database.

Punta, Marco; Coggill, Penny C; Eberhardt, Ruth Y; Mistry, Jaina; Tate, John; Boursnell, Chris; Pang, Ningze; Forslund, Kristoffer; Ceric, Goran; Clements, Jody; Heger, Andreas; Holm, Liisa; Sonnhammer, Erik L L; Eddy, Sean R; Bateman, Alex; Finn, Robert D.

Nucleic Acids Res ; 40(Database issue): D290-301, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22127870

ABSTRACT

Pfam is a widely used database of protein families, currently containing more than 13,000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.

Subject(s)

Databases, Protein; Proteins/classification; Encyclopedias as Topic; Internet; Protein Structure, Tertiary; Sequence Homology, Amino Acid

The Pfam protein families database.

Finn, Robert D; Mistry, Jaina; Tate, John; Coggill, Penny; Heger, Andreas; Pollington, Joanne E; Gavin, O Luke; Gunasekaran, Prasad; Ceric, Goran; Forslund, Kristoffer; Holm, Liisa; Sonnhammer, Erik L L; Eddy, Sean R; Bateman, Alex.

Nucleic Acids Res ; 38(Database issue): D211-22, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19920124

ABSTRACT

Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

Subject(s)

Computational Biology/methods; Databases, Nucleic Acid; Databases, Protein; Amino Acid Sequence; Animals; Computational Biology/trends; Genome, Archaeal; Genome, Fungal; Humans; Information Storage and Retrieval/methods; Internet; Molecular Sequence Data; Protein Structure, Tertiary; Sequence Alignment; Sequence Homology, Amino Acid; Software

The Pfam protein families database.

Finn, Robert D; Tate, John; Mistry, Jaina; Coggill, Penny C; Sammut, Stephen John; Hotz, Hans-Rudolf; Ceric, Goran; Forslund, Kristoffer; Eddy, Sean R; Sonnhammer, Erik L L; Bateman, Alex.

Nucleic Acids Res ; 36(Database issue): D281-8, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18039703

ABSTRACT

Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/), as well as from mirror sites in France (http://pfam.jouy.inra.fr/) and South Korea (http://pfam.ccbb.re.kr/).

Subject(s)

Databases, Protein; Protein Structure, Tertiary; Proteins/classification; Animals; Genomics; Internet; Proteins/genetics; Sequence Alignment; Sequence Analysis, Protein; User-Computer Interface