Results 1 - 20 of 23
1.
Nucleic Acids Res ; 50(D1): D765-D770, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34634797

ABSTRACT

The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser, making SARS-CoV-2 the first virus to be encompassed within the Ensembl platform. This resource incorporates a new Ensembl gene set, multiple variant sets, and annotation from several relevant resources aligned to the reference SARS-CoV-2 assembly. Since the first release in May 2020, the content has been regularly updated using our new rapid release workflow, and tools such as the Ensembl Variant Effect Predictor have been integrated. The Ensembl COVID-19 browser is freely available at https://covid-19.ensembl.org.


Subject(s)
COVID-19/virology , Databases, Genetic , SARS-CoV-2/genetics , Web Browser , Coronaviridae/genetics , Genetic Variation , Genome, Viral , Humans , Molecular Sequence Annotation
2.
Nucleic Acids Res ; 47(D1): D941-D947, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30371878

ABSTRACT

COSMIC, the Catalogue Of Somatic Mutations In Cancer (https://cancer.sanger.ac.uk) is the most detailed and comprehensive resource for exploring the effect of somatic mutations in human cancer. The latest release, COSMIC v86 (August 2018), includes almost 6 million coding mutations across 1.4 million tumour samples, curated from over 26 000 publications. In addition to coding mutations, COSMIC covers all the genetic mechanisms by which somatic mutations promote cancer, including non-coding mutations, gene fusions, copy-number variants and drug-resistance mutations. COSMIC is primarily hand-curated, ensuring quality, accuracy and descriptive data capture. Building on our manual curation processes, we are introducing new initiatives that allow us to prioritize key genes and diseases, and to react more quickly and comprehensively to new findings in the literature. Alongside improvements to the public website and data-download systems, new functionality in COSMIC-3D allows exploration of mutations within three-dimensional protein structures, their protein structural and functional impacts, and implications for druggability. In parallel with COSMIC's deep and broad variant coverage, the Cancer Gene Census (CGC) describes a curated catalogue of genes driving every form of human cancer. Currently describing 719 genes, the CGC has recently introduced functional descriptions of how each gene drives disease, summarized into the 10 cancer Hallmarks.


Subject(s)
Databases, Nucleic Acid , Mutation , Neoplasms/genetics , Genes , Humans , Protein Conformation
3.
Nucleic Acids Res ; 45(D1): D777-D783, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899578

ABSTRACT

COSMIC, the Catalogue of Somatic Mutations in Cancer (http://cancer.sanger.ac.uk) is a high-resolution resource for exploring targets and trends in the genetics of human cancer. Currently the broadest database of mutations in cancer, the information in COSMIC is curated by expert scientists, primarily by scrutinizing large numbers of scientific publications. Over 4 million coding mutations are described in v78 (September 2016), combining genome-wide sequencing results from 28 366 tumours with complete manual curation of 23 489 individual publications focused on 186 key genes and 286 key fusion pairs across all cancers. Molecular profiling of large tumour numbers has also allowed the annotation of more than 13 million non-coding mutations, 18 029 gene fusions, 187 429 genome rearrangements, 1 271 436 abnormal copy number segments, 9 175 462 abnormal expression variants and 7 879 142 differentially methylated CpG dinucleotides. COSMIC now details the genetics of drug resistance, novel somatic gene mutations which allow a tumour to evade therapeutic cancer drugs. Focusing initially on highly characterized drugs and genes, COSMIC v78 contains wide resistance mutation profiles across 20 drugs, detailing the recurrence of 301 unique resistance alleles across 1934 drug-resistant tumours. All information from the COSMIC database is available freely on the COSMIC website.


Subject(s)
Databases, Genetic , Mutation , Neoplasms/genetics , Computational Biology/methods , Drug Resistance, Neoplasm/genetics , Genome, Human , Genome-Wide Association Study/methods , Genomics/methods , Humans , Web Browser
4.
Nucleic Acids Res ; 44(D1): D279-85, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26673716

ABSTRACT

In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteome sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.


Subject(s)
Databases, Protein , Proteins/classification , Proteome/chemistry , Sequence Alignment , Sequence Analysis, Protein , Molecular Sequence Annotation
5.
Nucleic Acids Res ; 43(Database issue): D130-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392425

ABSTRACT

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , Nucleotide Motifs , RNA, Long Noncoding/chemistry , RNA, Untranslated/classification , Software
6.
Nature ; 517(7534): 272, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25592527
7.
Nucleic Acids Res ; 42(Database issue): D222-30, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24288371

ABSTRACT

Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.


Subject(s)
Databases, Protein , Sequence Alignment , Sequence Analysis, Protein , Internet , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Proteins/chemistry , Proteins/classification , Proteins/genetics , Proteome/chemistry , Sequence Analysis, DNA
8.
Nucleic Acids Res ; 41(Database issue): D226-32, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23125362

ABSTRACT

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , RNA, Untranslated/classification , Base Sequence , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA, Untranslated/genetics , Sequence Alignment , User-Computer Interface
9.
Nucleic Acids Res ; 40(Database issue): D290-301, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22127870

ABSTRACT

Pfam is a widely used database of protein families, currently containing more than 13,000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the 'sunburst' representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.
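The idea of a family-specific gathering threshold can be illustrated with a short sketch: each family carries its own curated bit-score cutoff, and a hit is accepted into the family only if it scores at or above that cutoff. The family names, scores and the `Hit` type below are hypothetical, and the sketch uses a single cutoff per family, whereas Pfam keeps separate sequence-level and domain-level thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One profile-HMM match; all field values used below are made up."""
    sequence_id: str
    family: str
    bit_score: float


# Hypothetical per-family cutoffs (bits). Real values are curated per family.
GATHERING_THRESHOLDS = {"PF_TOY_A": 25.0, "PF_TOY_B": 30.5}


def passes_gathering(hit: Hit, thresholds: dict) -> bool:
    """A hit is kept only if it reaches its own family's cutoff."""
    return hit.bit_score >= thresholds[hit.family]


def filter_hits(hits, thresholds):
    """Return the hits that meet their family-specific threshold."""
    return [h for h in hits if passes_gathering(h, thresholds)]


hits = [
    Hit("seq1", "PF_TOY_A", 27.3),  # above the 25.0 cutoff for family A
    Hit("seq2", "PF_TOY_B", 27.3),  # same score, but below the 30.5 cutoff for B
    Hit("seq3", "PF_TOY_B", 30.5),  # exactly at the cutoff, so it is kept
]
kept = filter_hits(hits, GATHERING_THRESHOLDS)
print([h.sequence_id for h in kept])
```

The point of the design is that one identical score can be significant for one family and not for another, which a single global cutoff could not express.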


Subject(s)
Databases, Protein , Proteins/classification , Encyclopedias as Topic , Internet , Protein Structure, Tertiary , Sequence Homology, Amino Acid
10.
Nucleic Acids Res ; 40(Database issue): D306-12, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096229

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Subject(s)
Databases, Protein , Protein Structure, Tertiary , Proteins/classification , Proteins/physiology , Sequence Analysis, Protein , Software , Terminology as Topic , User-Computer Interface
11.
Nucleic Acids Res ; 39(Database issue): D141-5, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21062808

ABSTRACT

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , Encyclopedias as Topic , Models, Statistical , Nucleic Acid Conformation , RNA, Untranslated/classification , Sequence Alignment , Sequence Analysis, RNA
12.
Nucleic Acids Res ; 38(Database issue): D211-22, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19920124

ABSTRACT

Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).
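As a rough illustration of why the forward algorithm can be more sensitive, the sketch below scores a sequence against a toy two-state HMM in two ways: the forward score sums over every state path, while the Viterbi score keeps only the single best path. The model, its probabilities and the two-letter alphabet (H for hydrophobic, P for polar) are invented for the example, and this is not how HMMER3 itself is implemented; it also omits the scaling or log-space arithmetic needed for long sequences.

```python
import math

# Toy model: every state name and probability here is an assumption.
STATES = ("family", "background")
START = {"family": 0.5, "background": 0.5}
TRANS = {
    "family": {"family": 0.9, "background": 0.1},
    "background": {"family": 0.1, "background": 0.9},
}
EMIT = {
    "family": {"H": 0.8, "P": 0.2},
    "background": {"H": 0.5, "P": 0.5},
}


def _score(obs, combine):
    """Dynamic programme shared by both scores.

    combine=sum gives the forward probability (all paths);
    combine=max gives the Viterbi probability (best path only).
    """
    alpha = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    for symbol in obs[1:]:
        alpha = {
            s: EMIT[s][symbol] * combine(alpha[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        }
    return math.log(combine(alpha.values()))


def forward_log_prob(obs):
    """Log probability of obs summed over all state paths."""
    return _score(obs, sum)


def viterbi_log_prob(obs):
    """Log probability of obs along the single most likely state path."""
    return _score(obs, max)


seq = "HHPH"
print(forward_log_prob(seq), viterbi_log_prob(seq))
```

Because the forward score adds up the contributions of alternative alignments instead of discarding all but one, it is never lower than the Viterbi score for the same sequence and model.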


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Databases, Protein , Amino Acid Sequence , Animals , Computational Biology/trends , Genome, Archaeal , Genome, Fungal , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Alignment , Sequence Homology, Amino Acid , Software
13.
Anticancer Drugs ; 22(1): 104-10, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20938339

ABSTRACT

Genotyping of putative determinants of temozolomide (TMZ)-induced life-threatening bone marrow suppression was performed in two patients with glioma treated with adjuvant TMZ and radiation therapy. DNA was extracted from the patients' mononuclear cells and genotyping of O(6)-methylguanine-DNA-methyltransferase (MGMT), multidrug resistance (MDR1; also known as ABCB1), NQO1, and GSTP1 genes and analysis for the epigenetic silencing of specific MGMT gene promoters were carried out to evaluate the possible genetic determinants of increased risk of severe TMZ-induced myelosuppression. Although both patients were heterozygous for all ABCB1 single nucleotide polymorphisms and for rs12917 and rs1803965 in the MGMT gene, patient 1 was additionally heterozygous for rs1695 in GSTP1 and rs2308327 in the MGMT gene. This patient also exhibited the GG genotype for the MGMT single nucleotide polymorphism rs2308321, which is noteworthy for its 0.7% frequency globally. Epigenetic silencing of the MGMT gene was not detected in either patient. Two single nucleotide polymorphisms identified in patient 1 (missense I143V and K178R polymorphisms; rs2308321 and rs2308327, respectively) have recently been shown to correlate with an increased risk of severe TMZ-induced myelosuppression. The polymorphisms identified in patient 2 have not been associated with an increased risk of severe TMZ-induced myelosuppression. Genotyping analyses of larger patient populations administered TMZ are required to validate the genetic determinants of severe TMZ-induced myelosuppression.


Subject(s)
Antineoplastic Agents, Alkylating/adverse effects , Dacarbazine/analogs & derivatives , Glioma/drug therapy , Myeloproliferative Disorders/chemically induced , Myeloproliferative Disorders/genetics , ATP Binding Cassette Transporter, Subfamily B , ATP Binding Cassette Transporter, Subfamily B, Member 1/genetics , Adult , Antineoplastic Agents, Alkylating/pharmacology , Dacarbazine/adverse effects , Dacarbazine/therapeutic use , Female , Glioma/enzymology , Glioma/genetics , Glutathione S-Transferase pi/genetics , Humans , Male , Middle Aged , NAD(P)H Dehydrogenase (Quinone)/genetics , Neoplasms, Neuroepithelial/complications , Neoplasms, Neuroepithelial/therapy , O(6)-Methylguanine-DNA Methyltransferase/genetics , Patients , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Temozolomide
14.
Nucleic Acids Res ; 37(Database issue): D136-40, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18953034

ABSTRACT

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.


Subject(s)
Databases, Nucleic Acid , RNA/chemistry , RNA/classification , Computer Graphics , Internet , Sequence Alignment , Sequence Analysis, RNA
15.
RNA ; 14(12): 2462-4, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18945806

ABSTRACT

The online encyclopedia Wikipedia has become one of the most important online references in the world and has a substantial and growing scientific content. A search of Google with many RNA-related keywords identifies a Wikipedia article as the top hit. We believe that the RNA community has an important and timely opportunity to maximize the content and quality of RNA information in Wikipedia. To this end, we have formed the RNA WikiProject (http://en.wikipedia.org/wiki/Wikipedia:WikiProject_RNA) as part of the larger Molecular and Cellular Biology WikiProject. We have created over 600 new Wikipedia articles describing families of noncoding RNAs based on the Rfam database, and invite the community to update, edit, and correct these articles. The Rfam database now redistributes this Wikipedia content as the primary textual annotation of its RNA families. Users can, therefore, for the first time, directly edit the content of one of the major RNA databases. We believe that this Wikipedia/Rfam link acts as a functioning model for incorporating community annotation into molecular biology databases.


Subject(s)
Databases, Nucleic Acid , RNA/genetics , Database Management Systems , RNA/chemistry
16.
Invest New Drugs ; 28(1): 91-7, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19238328

ABSTRACT

BACKGROUND: The objective of ECOG 1503 was to determine the response rate of Triapine plus gemcitabine in the second-line treatment of advanced NSCLC. METHODS: Patients received Triapine 105 mg/m² IV on days 1, 8, and 15, and gemcitabine 1,000 mg/m² on days 1, 8, and 15, of a 28-day cycle. RESULTS: Eighteen patients enrolled. Three patients were not eligible due to protocol violations. No objective antitumor responses were seen. Three patients (20%) experienced stable disease (90% CI 5.7-44%). Median overall survival: 5.4 months (95% CI 4.2-11.6 months); median time to progression: 1.8 months (95% CI 1.7-3.5 months). Five patients developed acute infusion reactions to Triapine related to elevated methemoglobinemia. Patients with MDR1 variant genotypes of C3435T experienced superior overall survival compared to non-variants (13.3 vs. 4.3 months, respectively, p = 0.023). CONCLUSION: This regimen did not demonstrate activity in relapsed NSCLC. Prolonged survival seen with MDR1 variant genotypes is hypothesis-generating.
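The stable-disease figures can be checked with a little arithmetic: 18 enrolled minus 3 ineligible leaves 15 evaluable patients, and 3 of 15 is 20%. The abstract does not say how the 90% confidence interval was computed; the sketch below assumes an exact binomial (Clopper-Pearson) interval, which should land near the reported 5.7-44%. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, lo=0.0, hi=1.0, iters=200):
    """Bisection for f(p) = target, where f is increasing in p on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, conf=0.90):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    alpha = 1 - conf
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2,
    # written as P(X > k) = 1 - alpha / 2 so the function is increasing in p.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


enrolled, ineligible, stable = 18, 3, 3
evaluable = enrolled - ineligible          # 15 patients
rate = stable / evaluable                  # 3 / 15
lower, upper = clopper_pearson(stable, evaluable, conf=0.90)
print(f"{rate:.0%} stable disease, 90% CI {lower:.1%} to {upper:.1%}")
```

With only 15 evaluable patients the interval is wide, which is consistent with the abstract treating the results as showing no activity.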


Subject(s)
Antineoplastic Agents/therapeutic use , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Carcinoma, Non-Small-Cell Lung/drug therapy , Cooperative Behavior , Deoxycytidine/analogs & derivatives , Lung Neoplasms/drug therapy , Medical Oncology , ATP Binding Cassette Transporter, Subfamily B , ATP Binding Cassette Transporter, Subfamily B, Member 1/genetics , ATP Binding Cassette Transporter, Subfamily B, Member 2 , ATP-Binding Cassette Transporters/genetics , Aged , Antineoplastic Agents/adverse effects , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Carcinoma, Non-Small-Cell Lung/pathology , Deoxycytidine/adverse effects , Deoxycytidine/therapeutic use , Female , Genotype , Humans , Lung Neoplasms/pathology , Male , Middle Aged , Neoplasm Staging , Pyridines/adverse effects , Pyridines/therapeutic use , Survival Analysis , Thiosemicarbazones/adverse effects , Thiosemicarbazones/therapeutic use , Time Factors , Treatment Outcome , Gemcitabine
17.
Nucleic Acids Res ; 36(Database issue): D281-8, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18039703

ABSTRACT

Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/), as well as from mirror sites in France (http://pfam.jouy.inra.fr/) and South Korea (http://pfam.ccbb.re.kr/).


Subject(s)
Databases, Protein , Protein Structure, Tertiary , Proteins/classification , Animals , Genomics , Internet , Proteins/genetics , Sequence Alignment , Sequence Analysis, Protein , User-Computer Interface
18.
Nucleic Acids Res ; 30(1): 409-11, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11752351

ABSTRACT

The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. CKAAPs may be important in protein folding and structural stability and function, and hence useful for protein engineering studies. This paper provides an update to the initial report of CKAAPs DB [Li et al. (2001) Nucleic Acids Res., 29, 329-331]. CKAAPs DB contains CKAAPs for the representative set of polypeptide chains derived from the CE and FSSP databases, as well as subdomains (conserved regions of the order of 100 residues within a domain) identified by CE. The new version now offers different perspectives on the CKAAPs. First, CKAAPs are mapped onto their respective Protein Data Bank (PDB) structures rendered by Molscript, providing a spatial context for the CKAAPs. Secondly, CKAAPs may be highlighted within a structure-based sequence alignment, as well as secondary structure alignment. Thirdly, the resulting sequence homologs from the structure alignment may be viewed in alignments colorized based on identities and property groups using Mview. New search capabilities have also been provided for searching by keyword combinations, PDB IDs, EC numbers, GI numbers, LocusLink ID, taxonomy, gene ontology and pathways. A new custom CKAAPs analysis interface has been implemented where a user may change the criteria for inclusion of chains, initiate CKAAPs analysis and retrieve results. CKAAPs DB is accessible through the web at http://ckaaps.sdsc.edu/. Plain text analysis results are available by FTP at ftp://ftp.sdsc.edu/pub/sdsc/biology/ckaap.


Subject(s)
Conserved Sequence , Databases, Protein , Amino Acid Sequence , Animals , Information Storage and Retrieval , Internet , Peptides/chemistry , Protein Folding , Protein Structure, Secondary , Protein Structure, Tertiary , Proteins/chemistry , Sequence Alignment , User-Computer Interface
19.
Biochim Biophys Acta ; 1565(2): 294-307, 2002 Oct 11.
Article in English | MEDLINE | ID: mdl-12409202

ABSTRACT

Potassium channels have been studied intensively in terms of the relationship between molecular structure and physiological function. They provide an opportunity to integrate structural and computational studies in order to arrive at an atomic resolution description of mechanism. We review recent progress in K channel structural studies, focussing on the bacterial channel KcsA. Structural studies can be extended via use of computational (i.e. molecular simulation) approaches in order to provide a perspective on aspects of channel function such as permeation, selectivity, block and gating. Results from molecular dynamics simulations are shown to be in good agreement with recent structural studies of KcsA in terms of the interactions of K(+) ions with binding sites within the selectivity filter of the channel, and in revealing the importance of filter flexibility in channel function. We discuss how the KcsA structure may be used as a template for developing structural models of other families of K channels. Progress in this area is explored via two examples: inward rectifier (Kir) and voltage-gated (Kv) potassium channels. A brief account of structural studies of ancillary domains and subunits of K channels is provided.


Subject(s)
Bacterial Proteins/chemistry , Bacterial Proteins/physiology , Membrane Proteins/chemistry , Potassium Channels/chemistry , Potassium Channels/physiology , Amino Acid Sequence , Animals , Binding Sites , Cations, Monovalent/chemistry , Computer Simulation , Crystallography, X-Ray , Humans , Membrane Proteins/physiology , Models, Molecular , Molecular Sequence Data , Molecular Structure , Potassium Channels, Inwardly Rectifying/chemistry , Potassium Channels, Voltage-Gated/chemistry , Protein Structure, Tertiary , Sequence Alignment , Structure-Activity Relationship
20.
Methods Mol Biol ; 1269: 349-63, 2015.
Article in English | MEDLINE | ID: mdl-25577390

ABSTRACT

The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.


Subject(s)
Computational Biology/methods , RNA, Untranslated/chemistry , Databases, Nucleic Acid , Sequence Analysis, RNA , Software