Results 1 - 12 of 12
1.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subjects
Databases, Protein; Humans; Amino Acid Sequence; Artificial Intelligence; Internet; Proteins/chemistry; Software
2.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subjects
COVID-19/prevention & control; Computational Biology/methods; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; SARS-CoV-2/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Epidemics; Humans; Internet; Mice; Pseudogenes/genetics; RNA, Long Noncoding/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Transcription, Genetic/genetics
3.
Nucleic Acids Res ; 49(D1): D884-D891, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33137190

ABSTRACT

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.


Subjects
Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; SARS-CoV-2/genetics; Vertebrates/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Humans; Internet; Molecular Sequence Annotation/methods; Pandemics; Vertebrates/classification
4.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subjects
Databases, Protein; Proteins/chemistry; Amino Acid Sequence; COVID-19/metabolism; Internet; Molecular Sequence Annotation; Protein Domains; Protein Interaction Maps; SARS-CoV-2/metabolism; Sequence Alignment
5.
Nucleic Acids Res ; 48(D1): D682-D688, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691826

ABSTRACT

Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license), with data updates made available four times a year.


Subjects
Computational Biology/methods; Databases, Genetic; Epigenome; Molecular Sequence Annotation; Algorithms; Animals; Computer Graphics; Databases, Protein; Genetic Variation; Genome-Wide Association Study; Genomics; Histones/metabolism; Humans; Imaging, Three-Dimensional; Internet; Ligands; Search Engine; Software; Species Specificity; Transcriptome; User-Computer Interface; Web Browser
6.
Nucleic Acids Res ; 47(D1): D745-D751, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407521

ABSTRACT

The Ensembl project (https://www.ensembl.org) makes key genomic data sets available to the entire scientific community without restrictions. Ensembl seeks to be a fundamental resource driving scientific progress by creating, maintaining and updating reference genome annotation and comparative genomics resources. This year we describe our new and expanded gene, variant and comparative annotation capabilities, which led to a 50% increase in the number of vertebrate genomes we support. We have also doubled the number of available human variants and added regulatory regions for many mouse cell types and developmental stages. Our data sets and tools are available via the Ensembl website as well as through a RESTful web service, a Perl application programming interface and as data files for download.


Subjects
Databases, Genetic; Genome/genetics; Genomics; Vertebrates/genetics; Animals; Computational Biology/trends; Humans; Mice; Molecular Sequence Annotation; Software
7.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subjects
Databases, Genetic; Genome, Human/genetics; Genomics; Pseudogenes/genetics; Animals; Computational Biology; Humans; Internet; Mice; Molecular Sequence Annotation; Software
8.
Nat Methods ; 14(3): 287-289, 2017 03.
Article in English | MEDLINE | ID: mdl-28135257

ABSTRACT

Loss-of-function studies are key for investigating gene function, and CRISPR technology has made genome editing widely accessible in model organisms and cells. However, conditional gene inactivation in diploid cells is still difficult to achieve. Here, we present CRISPR-FLIP, a strategy that provides an efficient, rapid and scalable method for biallelic conditional gene knockouts in diploid or aneuploid cells, such as pluripotent stem cells, 3D organoids and cell lines, by co-delivery of CRISPR-Cas9 and a universal conditional intronic cassette.


Subjects
CRISPR-Cas Systems/genetics; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Embryonic Stem Cells/cytology; Gene Editing/methods; Gene Knockout Techniques/methods; beta Catenin/genetics; Animals; Cell Line; Genome/genetics; HEK293 Cells; Humans; Mice
9.
Bioinformatics ; 31(18): 3078-80, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-25979474

ABSTRACT

The rapid development of CRISPR-Cas9-mediated genome editing techniques has given rise to a number of online and stand-alone tools to find and score CRISPR sites for whole genomes. Here we describe the Wellcome Trust Sanger Institute Genome Editing database (WGE), which uses novel methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks. WGE is open, extensible and can be set up to compute and present CRISPR sites for any genome. AVAILABILITY AND IMPLEMENTATION: The WGE database is freely available at www.sanger.ac.uk/htgt/wge. CONTACT: vvi@sanger.ac.uk or skarnes@sanger.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
CRISPR-Cas Systems/genetics; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Databases, Factual; Gene Expression Regulation; Genetic Vectors; Genome; RNA Editing/genetics; Animals; Humans; Mice; Software
10.
PLoS One ; 8(5): e62984, 2013.
Article in English | MEDLINE | ID: mdl-23658791

ABSTRACT

With the amount of chemical data being produced and reported in the literature growing at a fast pace, it is increasingly important to efficiently retrieve this information. To tackle this issue, text mining tools have been applied, but despite their good performance they still produce many errors that we believe can be filtered out using semantic similarity. Thus, this paper proposes a novel method that receives the results of chemical entity identification systems, such as Whatizit, and exploits the semantic relationships in ChEBI to measure the similarity between the entities found in the text. The method assigns a single validation score to each entity based on its similarities with the other entities also identified in the text. Then, by using a given threshold, the method selects a set of validated entities and a set of outlier entities. We evaluated our method using the results of two state-of-the-art chemical entity identification tools, three semantic similarity measures and two text window sizes. The method was able to increase precision without filtering a significant number of correctly identified entities. This means that the method can effectively discriminate the correctly identified chemical entities, while discarding a significant number of identification errors. For example, selecting a validation set with 75% of all identified entities, we were able to increase the precision by 28% for one of the chemical entity identification tools (Whatizit), maintaining in that subset 97% of the correctly identified entities. Our method can be directly used as an add-on by any state-of-the-art entity identification tool that provides mappings to a database, in order to improve its results. The proposed method is included in a freely accessible web tool at www.lasige.di.fc.ul.pt/webtools/ice/.
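The scoring-and-threshold step described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how an entity's similarities are combined into one validation score (the maximum is used here), the function names are hypothetical, and the similarity table is a made-up toy stand-in rather than any of the ChEBI-based measures evaluated in the paper.

```python
from typing import Callable, Dict, List, Tuple


def validation_scores(
    entities: List[str],
    similarity: Callable[[str, str], float],
) -> Dict[str, float]:
    """Give each entity one score from its similarities to the other entities.

    Assumption: the score is the highest similarity to any other entity
    found in the same text window; the abstract does not state the rule.
    """
    scores: Dict[str, float] = {}
    for i, entity in enumerate(entities):
        others = [similarity(entity, other)
                  for j, other in enumerate(entities) if j != i]
        scores[entity] = max(others, default=0.0)
    return scores


def split_by_threshold(
    scores: Dict[str, float],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Split entities into validated (score >= threshold) and outliers."""
    validated = [e for e, s in scores.items() if s >= threshold]
    outliers = [e for e, s in scores.items() if s < threshold]
    return validated, outliers


# Toy similarity values, invented for this example only.
TOY_SIMILARITY = {
    frozenset({"glucose", "fructose"}): 0.8,
    frozenset({"glucose", "lead"}): 0.1,
    frozenset({"fructose", "lead"}): 0.1,
}


def toy_similarity(a: str, b: str) -> float:
    """Look up a symmetric toy similarity; unknown pairs score 0.0."""
    return TOY_SIMILARITY.get(frozenset({a, b}), 0.0)


scores = validation_scores(["glucose", "fructose", "lead"], toy_similarity)
validated, outliers = split_by_threshold(scores, threshold=0.5)
print(validated, outliers)
```

In this toy run, "lead" stands for a word wrongly tagged as a chemical: it is dissimilar to the other entities in the window, so it falls below the threshold and lands in the outlier set.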


Subjects
Data Mining/methods; Databases, Chemical; Semantics; Data Mining/standards; Patents as Topic; Reference Standards; Reproducibility of Results
11.
ISRN Bioinform ; 2012: 619427, 2012.
Article in English | MEDLINE | ID: mdl-25937941

ABSTRACT

Chemical entities are ubiquitous throughout the biomedical literature, and the development of text-mining systems that can efficiently identify those entities is required. Due to the lack of available corpora and data resources, the community has focused its efforts on the development of gene and protein named entity recognition systems, but with the release of ChEBI and the availability of an annotated corpus, this task can be addressed. We developed a machine-learning-based method for chemical entity recognition and a lexical-similarity-based method for chemical entity resolution and compared them with Whatizit, a popular dictionary-based method. Our methods outperformed the dictionary-based method in all tasks, yielding an improvement in F-measure of 20% for the entity recognition task, 2-5% for the entity resolution task, and 15% for the combined entity recognition and resolution tasks.

12.
Mol Pharmacol ; 71(1): 366-76, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17065237

ABSTRACT

According to previous reports, flavonoids and nutraceuticals correct defective electrolyte transport in cystic fibrosis (CF) airways. Traditional medicinal plants from China and Thailand contain phytoflavonoids and other bioactive compounds. We examined herbal extracts of the common Thai medicinal euphorbiaceous plant Phyllanthus acidus for their potential effects on epithelial transport. Functional assays by Ussing chamber, patch-clamping, double-electrode voltage-clamp and Ca2+ imaging demonstrate activation of Cl- secretion and inhibition of Na+ absorption by P. acidus. No cytotoxic effects of P. acidus could be detected. Mucosal application of P. acidus to native mouse trachea suggested transient and steady-state activation of Cl- secretion by increasing both intracellular Ca2+ and cAMP. These effects were mimicked by a mix of the isolated components adenosine, kaempferol, and hypogallic acid. Additional experiments in human airway cells and CF transmembrane conductance regulator (CFTR)-expressing BHK cells and Xenopus laevis oocytes confirm the results obtained in native tissues. Cl- secretion was also induced in tracheas of CF mice homozygous for Phe508del-CFTR and in Phe508del-CFTR homozygous human airway epithelial cells. Taken together, P. acidus corrects defective electrolyte transport in CF airways by parallel mechanisms including 1) increasing the intracellular levels of second messengers cAMP and Ca2+, thereby activating Ca2+-dependent Cl- channels and residual CFTR-Cl- conductance; 2) stimulating basolateral K+ channels; 3) redistributing cellular localization of CFTR; 4) directly activating CFTR; and 5) inhibiting ENaC through activation of CFTR. These combinatorial effects on epithelial transport may provide a novel complementary nutraceutical treatment for the CF lung disease.


Subjects
Chlorides/metabolism; Phyllanthus; Plant Extracts/pharmacology; Plants, Medicinal; Calcium/metabolism; Cell Culture Techniques; Cell Survival; Cystic Fibrosis Transmembrane Conductance Regulator/genetics; Epithelial Sodium Channels/genetics; Patch-Clamp Techniques; Plant Leaves