Results 1 - 16 of 16
1.
Microb Genom ; 10(2), 2024 Feb.
Article in English | MEDLINE | ID: mdl-38358325

ABSTRACT

The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts become part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets at scale via the SARS-CoV-2 Data Hubs, and (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set-up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.


Subjects
COVID-19; SARS-CoV-2; Humans; SARS-CoV-2/genetics; Pandemics; COVID-19/epidemiology; Genomics; Information Dissemination
2.
Nucleic Acids Res ; 50(D1): D106-D110, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850158

ABSTRACT

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena), maintained at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), provides freely accessible services, both for deposition of, and access to, open nucleotide sequencing data. Open scientific data are of paramount importance to the scientific community and contribute daily to the acceleration of scientific advance. Here, we outline the major updates to ENA's services and infrastructure that have been delivered over the past year.


Subjects
Computational Biology; Databases, Nucleic Acid; Nucleotides/genetics; Software; High-Throughput Nucleotide Sequencing; Humans; Internet; Molecular Sequence Annotation; Nucleotides/classification
3.
PLoS One ; 16(1): e0245475, 2021.
Article in English | MEDLINE | ID: mdl-33476328

ABSTRACT

INTRODUCTION: Depression, cardiovascular diseases and diabetes are among the major non-communicable diseases, leading to significant disability and mortality worldwide. These diseases may share environmental and genetic determinants associated with multimorbid patterns. Stressful early-life events are among the primary factors associated with the development of mental and physical diseases. However, possible causative mechanisms linking early life stress (ELS) with psycho-cardio-metabolic (PCM) multi-morbidity are not well understood. This prevents a full understanding of causal pathways towards the shared risk of these diseases and the development of coordinated preventive and therapeutic interventions. METHODS AND ANALYSIS: This paper describes the study protocol for EarlyCause, a large-scale and inter-disciplinary research project funded by the European Union's Horizon 2020 research and innovation programme. The project takes advantage of human longitudinal birth cohort data, animal studies and cellular models to test the hypothesis of shared mechanisms and molecular pathways by which ELS shapes an individual's physical and mental health in adulthood. The study will research in detail how ELS converts into biological signals embedded simultaneously or sequentially in the brain, the cardiovascular and metabolic systems. The research will mainly focus on four biological processes including possible alterations of the epigenome, neuroendocrine system, inflammatome, and the gut microbiome. Life-course models will integrate the role of modifying factors such as sex, socioeconomics, and lifestyle with the goal of better identifying groups at risk as well as informing promising strategies to reverse the possible mechanisms and/or reduce the impact of ELS on multi-morbidity development in high-risk individuals. These strategies will help better manage the impact of multi-morbidity on human health and the associated risk.


Subjects
Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/etiology; Depression/epidemiology; Depression/etiology; Diabetes Mellitus/epidemiology; Diabetes Mellitus/etiology; Stress, Psychological/complications; Adult; Adverse Childhood Experiences/psychology; Biomarkers/metabolism; Cardiovascular Diseases/metabolism; Cardiovascular Diseases/psychology; Child; Depression/metabolism; Depression/psychology; Diabetes Mellitus/metabolism; Diabetes Mellitus/psychology; Environment; Humans; Longitudinal Studies; Morbidity; Risk Factors
4.
Nucleic Acids Res ; 49(D1): D82-D85, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33175160

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.


Subjects
Computational Biology/methods; Databases, Nucleic Acid/trends; Nucleic Acids/genetics; Nucleotides/genetics; Databases, Nucleic Acid/statistics & numerical data; Europe; High-Throughput Nucleotide Sequencing; Humans; Internet; Molecular Sequence Annotation; Nucleic Acids/chemistry; Nucleotides/chemistry; Sequence Analysis, DNA; Sequence Analysis, RNA
5.
G3 (Bethesda) ; 10(4): 1361-1374, 2020 04 09.
Article in English | MEDLINE | ID: mdl-32071071

ABSTRACT

Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Database Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.


Subjects
Genome; Software; Sequence Analysis, DNA
6.
Nucleic Acids Res ; 48(D1): D70-D76, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31722421

ABSTRACT

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.


Subjects
Computational Biology; Databases, Nucleic Acid; Genomics; Computational Biology/methods; Europe; Genomics/methods; Molecular Sequence Annotation; Software; User-Computer Interface; Web Browser
7.
Nucleic Acids Res ; 47(D1): D84-D88, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395270

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided from EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.


Subjects
Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; Europe; Genome; Humans; Molecular Sequence Annotation; Search Engine; Software; Transcriptome; User-Computer Interface; Web Browser
8.
Nucleic Acids Res ; 46(D1): D36-D40, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140475

ABSTRACT

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.


Subjects
Databases, Nucleic Acid; Computational Biology; Databases, Nucleic Acid/trends; Europe; High-Throughput Nucleotide Sequencing; Humans; Information Storage and Retrieval; Internet; Molecular Sequence Annotation
9.
J Eukaryot Microbiol ; 64(3): 407-411, 2017 05.
Article in English | MEDLINE | ID: mdl-28337822

ABSTRACT

Universal taxonomic frameworks have been critical tools to structure the fields of botany, zoology, mycology, and bacteriology as well as their large research communities. Animals, plants, and fungi have relatively solid, stable morpho-taxonomies built over the last three centuries, while bacteria have been classified for the last three decades under a coherent molecular taxonomic framework. By contrast, no such common language exists for microbial eukaryotes, even though environmental '-omics' surveys suggest that protists make up most of the organismal and genetic complexity of our planet's ecosystems! With the current deluge of eukaryotic meta-omics data, we urgently need to build up a universal eukaryotic taxonomy bridging the protist -omics age to the fragile, centuries-old body of classical knowledge that has effectively linked protist taxa to morphological, physiological, and ecological information. UniEuk is an open, inclusive, community-based and expert-driven international initiative to build a flexible, adaptive universal taxonomic framework for eukaryotes. It unites three complementary modules, EukRef, EukBank, and EukMap, which use phylogenetic markers, environmental metabarcoding surveys, and expert knowledge to inform the taxonomic framework. The UniEuk taxonomy is directly implemented in the European Nucleotide Archive at EMBL-EBI, ensuring its broad use and long-term preservation as a reference taxonomy for eukaryotes.


Subjects
Classification; Eukaryota/classification; Animals; Bacteria/classification; Biodiversity; Databases, Nucleic Acid; Ecosystem; Environment; Eukaryota/cytology; Eukaryota/genetics; Eukaryota/physiology; Eukaryotic Cells; Fungi/classification; Phylogeny
10.
Nucleic Acids Res ; 45(D1): D32-D36, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899630

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.


Subjects
Databases, Nucleic Acid; Sequence Analysis, DNA; Sequence Analysis, RNA; Genomics; Internet; Molecular Sequence Annotation
11.
Nucleic Acids Res ; 44(D1): D58-66, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26615190

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the submission, maintenance and presentation of nucleotide sequence data and related sample and experimental information. In this article we report on ENA in 2015 regarding general activity, notable published data sets and major achievements. This is followed by a focus on sustainable biocuration of functional annotation, an area which has particularly felt the pressure of sequencing growth. The importance of functional annotation, how it can be submitted and the shifting role of the biocurator in the context of increasing volumes of data are all discussed.


Subjects
Databases, Nucleic Acid; Molecular Sequence Annotation; Sequence Analysis, DNA; Sequence Analysis, RNA; Data Curation
12.
Nucleic Acids Res ; 42(Database issue): D865-72, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217909

ABSTRACT

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.


Subjects
Databases, Genetic; Proteins/genetics; Animals; Exons; Genomics; Humans; Internet; Mice; Molecular Sequence Annotation; Sequence Analysis
13.
Genome Res ; 22(9): 1760-74, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955987

ABSTRACT

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.


Subjects
Databases, Genetic; Genome, Human; Genomics/methods; Molecular Sequence Annotation; Animals; Computational Biology/methods; DNA, Complementary/chemistry; DNA, Complementary/genetics; Evolution, Molecular; Exons; Genetic Loci; Humans; Internet; Models, Molecular; Open Reading Frames; Pseudogenes; Quality Control; RNA Splice Sites; RNA, Long Noncoding; Reproducibility of Results; Untranslated Regions
14.
Genome Res ; 19(7): 1316-23, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19498102

ABSTRACT

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.


Subjects
Consensus Sequence; Genome; Open Reading Frames/genetics; Animals; Humans; Mice; Sequence Alignment
15.
Genome Res ; 15(10): 1402-10, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16204193

ABSTRACT

The Hedgehog (Hh) signaling pathway promotes pattern formation and cell proliferation in Drosophila and vertebrates. Hh is a ligand that binds and represses the Patched (Ptc) receptor and thereby releases the latent activity of the multipass membrane protein Smoothened (Smo), which is essential for transducing the Hh signal. In Caenorhabditis elegans, the Hh signaling pathway has undergone considerable divergence. Surprisingly, obvious Smo and Hh homologs are absent whereas PTC, PTC-related (PTR), and a large family of nematode Hh-related (Hh-r) proteins are present. We find that the number of PTC-related and Hh-r proteins has expanded in C. elegans, and that this expansion occurred early in Nematoda. Moreover, the function of these proteins appears to be conserved in Caenorhabditis briggsae. Given our present understanding of the Hh signaling pathway, the absence of Hh and Smo raises many questions about the evolution and the function of the PTC, PTR, and Hh-r proteins in C. elegans. To gain insights into their roles, we performed a global survey of the phenotypes produced by RNA-mediated interference (RNAi). Our study reveals that these genes do not require Smo for activity and that they function in multiple aspects of C. elegans development, including molting, cytokinesis, growth, and pattern formation. Moreover, a subset of the PTC, PTR, and Hh-r proteins have the same RNAi phenotypes, indicating that they have the potential to participate in the same processes.


Subjects
Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans/metabolism; Receptors, Cell Surface/genetics; Animals; Caenorhabditis elegans/genetics; Caenorhabditis elegans/physiology; Endocytosis/genetics; Exocytosis/genetics; Molting/genetics; Patched Receptors; RNA Interference
16.
Brief Funct Genomic Proteomic ; 3(1): 26-34, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15163357

ABSTRACT

The nematode Caenorhabditis elegans is widely used as a model organism for studying many fundamental aspects of development and cell biology, including processes underlying human disease. The genome of C. elegans encodes over 19,000 protein-coding genes and hundreds of non-coding RNAs. The availability of whole genome sequence has facilitated the development of high throughput techniques for elucidating the function of individual genes and gene products. Furthermore, attempts can now be made to integrate these substantial functional genomics data collections and to understand at a global level how the flow of genomic information that is at the core of the central dogma leads to the development of a multicellular organism.


Subjects
Caenorhabditis elegans/genetics; Genome; Animals; Caenorhabditis elegans/physiology; Genes, Reporter; Oligonucleotide Array Sequence Analysis; Proteome; RNA/chemistry; RNA Interference; Transcription, Genetic