ABSTRACT
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide via the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus upon three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. In order to support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.
Subject(s)
Genomics , Nucleotides , Computational Biology , Nucleic Acid Databases , Internet , Reproducibility of Results , Europe
ABSTRACT
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), offers those producing data an open and supported platform for the management, archiving, publication, and dissemination of data; and to the scientific community as a whole, it offers a globally comprehensive data set through a host of data discovery and retrieval tools. Here, we describe recent updates to the ENA's submission and retrieval services as well as focused efforts to improve connectivity, reusability, and interoperability of ENA data and metadata.
Subject(s)
Nucleic Acid Databases , Academies and Institutes , Computational Biology , Internet , Software , Datasets as Topic
ABSTRACT
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), maintained at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), provides freely accessible services both for deposition of, and access to, open nucleotide sequencing data. Open scientific data are of paramount importance to the scientific community and contribute daily to the acceleration of scientific advance. Here, we outline the major updates to ENA's services and infrastructure that have been delivered over the past year.
Subject(s)
Computational Biology , Nucleic Acid Databases , Nucleotides/genetics , Software , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Nucleotides/classification
ABSTRACT
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.
Subject(s)
Computational Biology/methods , Nucleic Acid Databases/trends , Nucleic Acids/genetics , Nucleotides/genetics , Nucleic Acid Databases/statistics & numerical data , Europe , High-Throughput Nucleotide Sequencing , Humans , Internet , Molecular Sequence Annotation , Nucleic Acids/chemistry , Nucleotides/chemistry , DNA Sequence Analysis , RNA Sequence Analysis
ABSTRACT
Retinal blood vessels supply oxygen and nutrients to the retina, and any change in their normal structure may lead to retinal abnormalities. Automated detection of the vascular structure is therefore very important when designing a computer-aided diagnostic system for retinal diseases. The most popular methods for vessel segmentation are based on matched filters and Gabor wavelets, which respond well to blood vessels. One major drawback of these techniques is that they also respond strongly to lesion boundaries (exudates, hemorrhages), which gives rise to false vessels. These false vessels may lead to incorrect detection of vascular changes. In this paper, we propose a new hybrid feature set along with a new classification technique for accurate detection of blood vessels. The main motivation is to lower the number of false positives, especially in retinal images with severe disease. A novel region-based hybrid feature set is presented for proper discrimination between true and false vessels. A new modified m-mediods based classification is also presented, which uses the most discriminating features to categorize vessel regions into true and false vessels. The proposed system is evaluated thoroughly on publicly available databases along with a locally gathered database containing images of advanced retinal disease. The results demonstrate the validity of the proposed system compared to existing state-of-the-art techniques.
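As a rough illustration of the classification idea in this abstract, the sketch below assigns a candidate vessel region to the class whose medoid lies closest in feature space. It is a plain nearest-medoid classifier, not the modified m-mediods method the authors propose, and the two features and all sample values are hypothetical placeholders.

```python
from math import dist


def medoid(points):
    """Return the point with the smallest total distance to all points in the set."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def classify(sample, medoids_by_label):
    """Return the label whose closest medoid is nearest to the sample."""
    return min(
        medoids_by_label,
        key=lambda label: min(dist(sample, m) for m in medoids_by_label[label]),
    )


# Hypothetical 2-D region features (for example: filter response, elongation).
true_vessels = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
false_vessels = [(0.9, 0.2), (0.8, 0.1), (0.85, 0.15)]

# One medoid per class here; a multi-medoid scheme would store several per label.
medoids = {
    "true_vessel": [medoid(true_vessels)],
    "false_vessel": [medoid(false_vessels)],
}

label = classify((0.82, 0.75), medoids)
print(label)
```

Keeping a list of medoids per label leaves room for several representatives per class, which is the general direction a multi-medoid scheme takes.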
Subject(s)
Computer-Assisted Image Processing/methods , Retinal Diseases/diagnosis , Retinal Diseases/pathology , Retinal Vessels/pathology , Algorithms , False Positive Reactions , Fundus Oculi , Humans , Retina/pathology
ABSTRACT
As public health laboratories expand their genomic sequencing and bioinformatics capacity for the surveillance of different pathogens, labs must carry out robust validation, training, and optimization of wet- and dry-lab procedures. Achieving these goals for algorithms, pipelines and instruments often requires that lower quality datasets be made available for analysis and comparison alongside those of higher quality. This range of data quality in reference sets can complicate the sharing of sub-optimal datasets that are vital for the community and for the reproducibility of assays. Sharing useful but sub-optimal datasets requires careful annotation and documentation of known issues so that the data can be interpreted appropriately, are not mistaken for better quality information, and (along with their derivatives) are easily identifiable in repositories. Unfortunately, there are currently no standardized attributes or mechanisms for tagging poor-quality datasets, or datasets generated for a specific purpose, to maximize their utility, searchability, accessibility and reuse. The Public Health Alliance for Genomic Epidemiology (PHA4GE) is an international community of scientists from public health, industry and academia focused on improving the reproducibility, interoperability, portability, and openness of public health bioinformatic software, skills, tools and data. To address the challenges of sharing lower quality datasets, PHA4GE has developed a set of standardized contextual data tags, namely fields and terms, that can be included in public repository submissions as a means of flagging pathogen sequence data with known quality issues, increasing their discoverability.
The contextual data tags were developed through consultations with the community, including input from the International Nucleotide Sequence Database Collaboration (INSDC), and have been standardized using ontologies - community-based resources for defining the tag properties and the relationships between them. The standardized tags are agnostic to the organism and the sequencing technique used and thus can be applied to data generated from any pathogen using an array of sequencing techniques. The tags can also be applied to synthetic (lab-created) data. The list of standardized tags is maintained by PHA4GE and can be found at https://github.com/pha4ge/contextual_data_QC_tags. Definitions, ontology IDs, examples of use, as well as a JSON representation, are provided. The PHA4GE QC tags were tested, and are now implemented, by the FDA's GenomeTrakr laboratory network as part of its routine submission process for SARS-CoV-2 wastewater surveillance. We hope that these simple, standardized tags will help improve communication regarding quality control in public repositories, in addition to making datasets of variable quality more easily identifiable. Suggestions for additional tags can be submitted to PHA4GE via the New Term Request Form in the GitHub repository. By providing a mechanism for feedback and suggestions, we also expect that the tags will evolve with the needs of the community.
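To make the idea of field-and-term tagging concrete, the following sketch checks a record's tags against a small controlled vocabulary and lists the records that carry a quality flag. The field names, terms, and accessions are hypothetical placeholders invented for illustration; the actual standardized fields and terms are those published in the PHA4GE repository.

```python
# Hypothetical controlled vocabulary (NOT the real PHA4GE fields or terms).
ALLOWED_TAGS = {
    "quality_control_issues": {"low quality sequence", "sequence contaminated"},
    "dataset_purpose": {"validation", "training", "proficiency testing"},
}


def invalid_tags(record):
    """Return (field, value) pairs that are not in the controlled vocabulary."""
    problems = []
    for field_name, allowed in ALLOWED_TAGS.items():
        for value in record.get(field_name, []):
            if value not in allowed:
                problems.append((field_name, value))
    return problems


def flagged(records):
    """Return accessions of records that carry at least one QC issue tag."""
    return [r["accession"] for r in records if r.get("quality_control_issues")]


# Hypothetical submission records.
records = [
    {"accession": "SAMPLE_A", "quality_control_issues": ["low quality sequence"]},
    {"accession": "SAMPLE_B", "dataset_purpose": ["validation"]},
    {"accession": "SAMPLE_C", "quality_control_issues": ["looks odd"]},
]

print(flagged(records))
print(invalid_tags(records[2]))
```

Restricting values to a fixed term list is what makes flagged datasets searchable: a free-text note such as "looks odd" is rejected rather than stored in a form no query would find.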
Subject(s)
Computational Biology , Public Health , Quality Control , Humans , Computational Biology/methods , Information Dissemination/methods , Reproducibility of Results , Molecular Sequence Annotation/methods , Genomics/methods , Software
ABSTRACT
During the COVID-19 pandemic, large-scale pathogen genomic sequencing efforts became part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development as well as drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCID), (3) the systematic analysis of these datasets at scale via the SARS-CoV-2 Data Hubs, and (4) lessons learnt. The SARS-CoV-2 Data Hubs are a component of the Platform; they enable the extension and set-up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , Genomics , Information Dissemination
ABSTRACT
The development of the mouse salivary gland involves a tip-driven process of branching morphogenesis that takes place in concert with differentiation into acinar, myoepithelial, and ductal (basal and luminal) sub-lineages. By combining clonal lineage tracing with a three-dimensional (3D) reconstruction of the branched epithelial network and single-cell RNA-seq analysis, we show that in tips, a heterogeneous population of renewing progenitors transition from a Krt14+ multipotent state to unipotent states via two transcriptionally distinct bipotent states, one restricted to the Krt14+ basal and myoepithelial lineage and the other to the Krt8+ acinar and luminal lineage. Using genetic perturbations, we show how the differential expression of Notch signaling correlates with spatial segregation, exits from multipotency, and promotes the Krt8+ lineage, whereas Kras activation promotes proacinar fate. These findings provide a mechanistic basis for how positional cues within growing tips regulate the process of lineage segregation and ductal patterning.
Subject(s)
Signal Transduction , Stem Cells , Mice , Animals , Cell Lineage , Cell Differentiation/physiology , Epithelial Cells/metabolism , Salivary Glands
ABSTRACT
Obesity has long been linked to adverse health effects over time. As the prevalence of obesity continues to rise, it is important to anticipate and minimize the complications that obesity brings in the anesthesia setting during surgery. Anesthetic departments must recognize the many risks involved in managing patients with obesity undergoing surgery, including anatomical and physiological changes as well as comorbidities such as diabetes, cardiovascular diseases, and malignancies. The purpose of this review is therefore to analyze the current literature and evaluate recent advances in the anesthetic care of obese patients undergoing surgery, to better understand the specific challenges this patient population faces. A greater understanding of how anesthetic care differs for obese patients can help to improve patient care and the specificity of treatment. The examination of the literature focuses on differing patient outcomes and safety precautions in obese patients as compared to the general population. It specifically highlights the differences in pre-operative, intra-operative, and post-operative care, with the aim of identifying issues and presenting possible solutions.
ABSTRACT
The COVID-19 pandemic has exemplified the importance of interoperable and equitable data sharing for global surveillance and to support research. While many challenges could be overcome, at least in some countries, many hurdles within the organizational, scientific, technical and cultural realms still remain to be tackled to be prepared for future threats. We propose to (i) continue supporting global efforts that have proven to be efficient and trustworthy toward addressing challenges in pathogen molecular data sharing; (ii) establish a distributed network of Pathogen Data Platforms to (a) ensure high quality data, metadata standardization and data analysis, (b) perform data brokering on behalf of data providers both for research and surveillance, (c) foster capacity building and continuous improvements, also for pandemic preparedness; (iii) establish an International One Health Pathogens Portal, connecting pathogen data isolated from various sources (human, animal, food, environment), in a truly One Health approach and following FAIR principles. To address these challenging endeavors, we have started an ELIXIR Focus Group where we invite all interested experts to join in a concerted, expert-driven effort toward sustaining and ensuring high-quality data for global surveillance and research.
Subject(s)
COVID-19 , Animals , Humans , COVID-19/epidemiology , Pandemics , Capacity Building , Information Dissemination
ABSTRACT
Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need not only to access the whole genomes of multiple pathogens but also to make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose that contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasitic), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.
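The notion of a data object model that fixes a minimum set of sequence and contextual fields can be sketched as a small typed record with a completeness check. The field names below are hypothetical placeholders chosen for illustration, and the accession is invented; the abstract does not list the actual Pathogen DOM fields.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PathogenRecord:
    """Hypothetical minimal record; NOT the actual Pathogen DOM definition."""
    organism: str
    collection_date: str
    geographic_location: str
    isolation_source: str
    sequence_accessions: list = field(default_factory=list)


def missing_fields(record):
    """Return the names of fields that are empty and so fail the minimum."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


complete = PathogenRecord(
    organism="Salmonella enterica",
    collection_date="2023-05-01",
    geographic_location="Canada",
    isolation_source="food",
    sequence_accessions=["EXAMPLE0001"],  # made-up accession
)
incomplete = PathogenRecord(
    organism="Salmonella enterica",
    collection_date="2023-05-01",
    geographic_location="Canada",
    isolation_source="",
)

print(missing_fields(complete))
print(missing_fields(incomplete))
```

A submission tool built around such a structure could run this kind of check before upload, so that every pathogen record, whatever the organism, reaches the archive with the same minimum fields populated.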