1.
Microb Genom ; 10(2)2024 Feb.
Article in English | MEDLINE | ID: mdl-38358325

ABSTRACT

The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.


Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , Genomics , Information Dissemination
2.
Nucleic Acids Res ; 52(D1): D92-D97, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956313

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide via the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus upon three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. In order to support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.


Subjects
Genomics , Nucleotides , Computational Biology , Nucleic Acid Databases , Internet , Reproducibility of Results , Europe
3.
Nucleic Acids Res ; 51(D1): D121-D125, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399492

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), offers those producing data an open and supported platform for the management, archiving, publication, and dissemination of data; and to the scientific community as a whole, it offers a globally comprehensive data set through a host of data discovery and retrieval tools. Here, we describe recent updates to the ENA's submission and retrieval services as well as focused efforts to improve connectivity, reusability, and interoperability of ENA data and metadata.


Subjects
Nucleic Acid Databases , Academies and Institutes , Computational Biology , Internet , Software , Datasets as Topic
4.
Nucleic Acids Res ; 50(D1): D106-D110, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850158

ABSTRACT

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena), maintained at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) provides freely accessible services, both for deposition of, and access to, open nucleotide sequencing data. Open scientific data are of paramount importance to the scientific community and contribute daily to the acceleration of scientific advance. Here, we outline the major updates to ENA's services and infrastructure that have been delivered over the past year.


Subjects
Computational Biology , Nucleic Acid Databases , Nucleotides/genetics , Software , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Nucleotides/classification
5.
Bioinformatics ; 38(1): 299-300, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34260694

ABSTRACT

MOTIVATION: Reference sequences are essential in creating a baseline of knowledge for many common bioinformatics methods, especially those using genomic sequencing. RESULTS: We have created refget, a Global Alliance for Genomics and Health API specification to access reference sequences and sub-sequences using an identifier derived from the sequence itself. We present four reference implementations across in-house and cloud infrastructure, a compliance suite and a web report used to ensure specification conformity across implementations. AVAILABILITY AND IMPLEMENTATION: The refget specification can be found at: https://w3id.org/ga4gh/refget. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Software
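
The refget abstract in record 5 describes retrieving reference sequences and sub-sequences "using an identifier derived from the sequence itself". A minimal Python sketch of that idea follows. It assumes, based on my reading of the refget v1 specification rather than on anything stated in the abstract, that identifiers are checksums of the uppercased sequence (MD5, or SHA-512 truncated to 24 bytes and hex-encoded) and that sub-sequence coordinates are 0-based with an exclusive end. The function names are hypothetical and are not part of any refget implementation; consult the specification at https://w3id.org/ga4gh/refget before relying on these details.

```python
import hashlib


def normalize(sequence: str) -> str:
    """Remove whitespace and uppercase the sequence so that formatting
    differences (line breaks, lowercase bases) do not change the identifier."""
    return "".join(sequence.split()).upper()


def md5_id(sequence: str) -> str:
    """Hypothetical helper: MD5 hex digest of the normalized sequence."""
    return hashlib.md5(normalize(sequence).encode("ascii")).hexdigest()


def trunc512_id(sequence: str) -> str:
    """Hypothetical helper: first 24 bytes of the SHA-512 digest, hex-encoded.
    Assumed to correspond to the truncated SHA-512 scheme in refget v1."""
    digest = hashlib.sha512(normalize(sequence).encode("ascii")).digest()
    return digest[:24].hex()


def subsequence(sequence: str, start: int, end: int) -> str:
    """Return bases in [start, end), assuming 0-based, end-exclusive coordinates."""
    return normalize(sequence)[start:end]


if __name__ == "__main__":
    seq = "acgtACGT\n"
    print("MD5 identifier:     ", md5_id(seq))
    print("TRUNC512 identifier:", trunc512_id(seq))
    print("Sub-sequence 2..5:  ", subsequence(seq, 2, 5))
```

Because the identifier is computed from the sequence content, two servers holding the same sequence would report the same identifier without coordinating names, which is the property the abstract highlights.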
6.
Nucleic Acids Res ; 49(D1): D82-D85, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33175160

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.


Subjects
Computational Biology/methods , Nucleic Acid Databases/trends , Nucleic Acids/genetics , Nucleotides/genetics , Nucleic Acid Databases/statistics & numerical data , Europe , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Nucleic Acids/chemistry , Nucleotides/chemistry , DNA Sequence Analysis , RNA Sequence Analysis
7.
Nucleic Acids Res ; 48(D1): D70-D76, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31722421

ABSTRACT

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.


Subjects
Computational Biology , Nucleic Acid Databases , Genomics , Computational Biology/methods , Europe , Genomics/methods , Molecular Sequence Annotation , Software , User-Computer Interface , Web Browser
8.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31868882

ABSTRACT

Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as lack of suitable platforms which provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.


Subjects
Factual Databases , Information Dissemination , Bacteria/classification , Metagenomics , Phylogeny , User-Computer Interface
9.
Nucleic Acids Res ; 47(D1): D84-D88, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395270

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided from EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.


Subjects
Computational Biology/methods , Nucleic Acid Databases , Genomics/methods , Europe , Genome , Humans , Molecular Sequence Annotation , Search Engine , Software , Transcriptome , User-Computer Interface , Web Browser
10.
Nucleic Acids Res ; 46(D1): D36-D40, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140475

ABSTRACT

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.


Subjects
Nucleic Acid Databases , Computational Biology , Nucleic Acid Databases/trends , Europe , High-Throughput Nucleotide Sequencing , Humans , Information Storage and Retrieval , Internet , Molecular Sequence Annotation
11.
Nucleic Acids Res ; 45(D1): D32-D36, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899630

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.


Subjects
Nucleic Acid Databases , DNA Sequence Analysis , RNA Sequence Analysis , Genomics , Internet , Molecular Sequence Annotation
12.
Nucleic Acids Res ; 44(D1): D58-66, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26615190

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the submission, maintenance and presentation of nucleotide sequence data and related sample and experimental information. In this article we report on ENA in 2015 regarding general activity, notable published data sets and major achievements. This is followed by a focus on sustainable biocuration of functional annotation, an area which has particularly felt the pressure of sequencing growth. The importance of functional annotation, how it can be submitted and the shifting role of the biocurator in the context of increasing volumes of data are all discussed.


Subjects
Nucleic Acid Databases , Molecular Sequence Annotation , DNA Sequence Analysis , RNA Sequence Analysis , Data Curation
14.
Nucleic Acids Res ; 43(Database issue): D23-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25404130

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.


Subjects
Nucleic Acid Databases , Base Sequence , Genomics , Molecular Sequence Annotation , Sequence Analysis
15.
Nucleic Acids Res ; 42(Database issue): D38-43, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24214989

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.


Subjects
Nucleic Acid Databases , Genomics , Europe , Internet
16.
Nucleic Acids Res ; 42(Database issue): D600-6, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24165880

ABSTRACT

Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.


Subjects
Genetic Databases , Metagenomics , Gene Expression Profiling , Internet , Metabolomics , Proteomics , Software
17.
PLoS One ; 8(2): e56225, 2013.
Article in English | MEDLINE | ID: mdl-23437094

ABSTRACT

BACKGROUND: Bacterial non-necrotizing erysipelas and cellulitis are often recurring, diffusely spreading infections of the skin and subcutaneous tissues caused most commonly by streptococci. Host genetic factors influence infection susceptibility but no extensive studies on the genetic determinants of human erysipelas exist. METHODS: We performed genome-wide linkage with the 10,000 variant Human Mapping Array (HMA10K) array on 52 Finnish families with multiple erysipelas cases followed by microsatellite fine mapping of suggestive linkage peaks. A scan with the HMA250K array was subsequently performed with a subset of cases and controls. RESULTS: Significant linkage was found at 9q34 (nonparametric multipoint linkage score (NPL(all)) 3.84, p=0.026), which is syntenic to a quantitative trait locus for susceptibility to group A streptococci infections on chromosome 2 in mouse. Sequencing of candidate genes in the 9q34 region did not conclusively associate any to erysipelas/cellulitis susceptibility. Suggestive linkage (NPL(all)>3.0) was found at three loci: 3q22-24, 21q22, and 22q13. A subsequent denser genome scan with the HMA250K array supported the 3q22 locus, in which several SNPs in the promoter of AGTR1 (Angiotensin II receptor type I) suggestively associated with erysipelas/cellulitis susceptibility. CONCLUSIONS: Specific host genetic factors may cause erysipelas/cellulitis susceptibility in humans.


Subjects
Cellulitis/genetics , Erysipelas/genetics , Genetic Predisposition to Disease , Animals , Human Chromosome Pair 9/genetics , Family , Female , Genetic Linkage , Genetic Markers , Human Genome/genetics , Genotyping Techniques , Humans , Male , Mice , Microsatellite Repeats/genetics , Oligonucleotide Array Sequence Analysis , Pedigree , Physical Chromosome Mapping , Reproducibility of Results
18.
Nucleic Acids Res ; 41(Database issue): D30-5, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203883

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.


Subjects
Base Sequence , Nucleic Acid Databases , Data Compression , Genomics , High-Throughput Nucleotide Sequencing , Internet , User-Computer Interface
19.
Nat Methods ; 9(5): 459-62, 2012 Apr 27.
Article in English | MEDLINE | ID: mdl-22543379

ABSTRACT

The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.


Subjects
Genetic Databases , Human Genome , Genomics/methods , DNA Sequence Analysis/methods , Computational Biology/methods , Genetic Variation , Humans
20.
Nucleic Acids Res ; 40(Database issue): D54-6, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22009675

ABSTRACT

New generation sequencing platforms are producing data with significantly higher throughput and lower cost. A portion of this capacity is devoted to individual and community scientific projects. As these projects reach publication, raw sequencing datasets are submitted into the primary next-generation sequence data archive, the Sequence Read Archive (SRA). Archiving experimental data is the key to the progress of reproducible science. The SRA was established as a public repository for next-generation sequence data as a part of the International Nucleotide Sequence Database Collaboration (INSDC). INSDC is composed of the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ). The SRA is accessible at www.ncbi.nlm.nih.gov/sra from NCBI, at www.ebi.ac.uk/ena from EBI and at trace.ddbj.nig.ac.jp from DDBJ. In this article, we present the content and structure of the SRA and report on updated metadata structures, submission file formats and supported sequencing platforms. We also briefly outline our various responses to the challenge of explosive data growth.


Subjects
Nucleic Acid Databases , High-Throughput Nucleotide Sequencing , Genomics , Internet