ABSTRACT
Spatially resolved omics technologies are transforming our understanding of biological tissues. However, the handling of uni- and multimodal spatial omics datasets remains a challenge owing to large data volumes, heterogeneity of data types and the lack of flexible, spatially aware data structures. Here we introduce SpatialData, a framework that establishes a unified and extensible multiplatform file format, lazy representation of larger-than-memory data, transformations and alignment to common coordinate systems. SpatialData facilitates spatial annotations and cross-modal aggregation and analysis, the utility of which is illustrated in multiple vignettes, including an integrative analysis of a multimodal Xenium and Visium breast cancer study.
ABSTRACT
The increasing technical complexity of all aspects involving bioimages, ranging from their acquisition to their analysis, has led to a diversification in the expertise of scientists engaged at the different stages of the discovery process. Although this diversity of profiles comes with the major challenge of establishing fruitful interdisciplinary collaboration, such collaboration also offers a superb opportunity for scientific discovery. In this Perspective, we review the different actors within the bioimaging research universe and identify the primary obstacles that hinder their interactions. We advocate that data sharing, which lies at the heart of innovation, is finally within reach after decades of being viewed as next to impossible in bioimaging. Building on recent community efforts, we propose actions to consolidate the development of a truly interdisciplinary bioimaging culture based on open data exchange and highlight the promising outlook of bioimaging as an example of multidisciplinary scientific endeavour.
Subjects
Information Dissemination , Humans , Animals , Interdisciplinary Communication
ABSTRACT
The rapid pace of innovation in biological imaging and the diversity of its applications have prevented the establishment of a community-agreed standardized data format. We propose that complementing established open formats such as OME-TIFF and HDF5 with a next-generation file format such as Zarr will satisfy the majority of use cases in bioimaging. Critically, a common metadata format used in all these vessels can deliver truly findable, accessible, interoperable and reusable bioimaging data.
Subjects
Computational Biology/instrumentation , Computational Biology/standards , Metadata , Microscopy/instrumentation , Microscopy/standards , Software , Benchmarking , Computational Biology/methods , Data Compression , Databases, Factual , Information Storage and Retrieval , Internet , Microscopy/methods , Programming Languages , SARS-CoV-2
ABSTRACT
A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself, OME-Zarr, along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain: the file format that underlies so many personal, institutional, and global data management and analysis tasks.
Subjects
Microscopy , Software , Humans , Community Support
ABSTRACT
This paper was originally published under standard Nature America Inc. copyright. As of the date of this correction, the Resource is available online as an open-access paper with a CC-BY license. No other part of the paper has been changed.
ABSTRACT
Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR) that collects and integrates imaging data acquired across many different imaging modalities. IDR links data from several imaging modalities, including high-content screening, super-resolution and time-lapse microscopy, digital pathology, public genetic or chemical databases, and cell and tissue phenotypes expressed using controlled ontologies. Using this integration, IDR facilitates the analysis of gene networks and reveals functional interactions that are inaccessible to individual studies. To enable re-analysis, we also established a computational resource based on Jupyter notebooks that allows remote access to the entire IDR. IDR is also an open source platform that others can use to publish their own image data. Thus IDR provides both a novel on-line resource and a software infrastructure that promotes and extends publication and re-analysis of scientific image data.
Subjects
Database Management Systems , Databases, Factual , Image Interpretation, Computer-Assisted/methods , Information Dissemination/methods , Software , User-Computer Interface , Algorithms , Publishing , Systems Integration
Subjects
Computational Biology/methods , Image Processing, Computer-Assisted/methods , Metadata , Algorithms , Animals , Congresses as Topic , Cryoelectron Microscopy/methods , Data Mining/methods , Databases, Factual , Diagnostic Imaging/methods , Humans , Microscopy , Software
ABSTRACT
High content screening (HCS) experiments create a classic data management challenge: multiple, large sets of heterogeneous structured and unstructured data that must be integrated and linked to produce a set of "final" results. These different data include images, reagents, protocols, analytic output, and phenotypes, all of which must be stored, linked and made accessible for users, scientists, collaborators and, where appropriate, the wider community. The OME Consortium has built several open source tools for managing, linking and sharing these different types of data. The OME Data Model is a metadata specification that supports the image data and metadata recorded in HCS experiments. Bio-Formats is a Java library that reads recorded image data and metadata and includes support for several HCS screening systems. OMERO is an enterprise data management application that integrates image data with experimental and analytic metadata and makes them accessible for visualization, mining, sharing and downstream analysis. We discuss how Bio-Formats and OMERO handle these different data types, and how they can be used to integrate, link and share HCS experiments in facilities and public data repositories. OME specifications and software are open source and are available at https://www.openmicroscopy.org.
Subjects
Computational Biology/statistics & numerical data , Data Mining/statistics & numerical data , High-Throughput Screening Assays/statistics & numerical data , Information Storage and Retrieval/statistics & numerical data , Software , Computational Biology/methods , Datasets as Topic , High-Throughput Screening Assays/methods , Humans , Information Dissemination , Information Storage and Retrieval/methods , Internet
ABSTRACT
Imaging data are used in the life and biomedical sciences to measure the molecular and structural composition and dynamics of cells, tissues, and organisms. Datasets range in size from megabytes to terabytes and usually contain a combination of binary pixel data and metadata that describe the acquisition process and any derived results. The OMERO image data management platform allows users to securely share image datasets according to specific permission levels: data can be held privately, shared with a set of colleagues, or made available via a public URL. Users control access by assigning data to specific Groups with defined membership and access rights. OMERO's permission system supports simple data sharing in a lab, collaborative data analysis, and even teaching environments. OMERO software is open source and released by the OME Consortium at www.openmicroscopy.org.
Subjects
Information Dissemination , Molecular Imaging , Software , Animals , Internet , Publishing
ABSTRACT
Data-intensive research depends on tools that manage multidimensional, heterogeneous datasets. We built OME Remote Objects (OMERO), a software platform that enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light microscopy, high-content screening, electron microscopy and even non-image genotype data. OMERO is open-source software, available at http://openmicroscopy.org/.
Subjects
Database Management Systems , Databases, Factual , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Models, Biological , Software , User-Computer Interface , Animals , Biology/methods , Computer Simulation , Humans
ABSTRACT
Diabetes is a complex chronic condition that affects the body's ability to produce or use insulin effectively, resulting in elevated blood glucose levels. It is associated with various complications and comorbidities, significantly impacting both individuals and the health care system. Effective management involves a combination of lifestyle adjustments, medication adherence, monitoring, education, and support. The expanding use of continuous glucose monitoring (CGM) has been transformative in diabetes care, providing valuable real-time data and insights for better management. To understand the opportunity for health plans to support improved patient outcomes with CGM, AMCP sponsored a multifaceted approach to identify best practices consisting of expert interviews, a national payer survey, an expert panel workshop with clinical experts and managed care stakeholders, and a national webcast to communicate the program findings. This article summarizes current evidence for CGM to support managed care and payer professionals in making collaborative, evidence-based decisions to optimize outcomes among patients with diabetes. In addition, this review also presents the findings of a national payer survey and describes expert-supported health plan best practices around coverage and access to CGM.
Subjects
Continuous Glucose Monitoring , Diabetes Mellitus , Humans , Blood Glucose , Blood Glucose Self-Monitoring , Diabetes Mellitus/drug therapy , Decision Making
ABSTRACT
Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential for assessing the validity of the research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, this information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually only provided within the organization and not interoperably. At the same time, the collection and sharing of biological and environmental specimens increasingly require definition and documentation of benefit sharing and compliance with regulatory requirements rather than consideration of pure scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy machine-actionable documentation of the lineage of data and specimens. We would like to invite experts from the biotechnology and biomedical fields to further contribute to the standard.
ABSTRACT
Together with the molecular knowledge of genes and proteins, biological images promise to significantly enhance the scientific understanding of complex cellular systems and to advance predictive and personalized therapeutic products for human health. For this potential to be realized, quality-assured bioimage data must be shared among labs at a global scale to be compared, pooled, and reanalyzed, thus unleashing untold potential beyond the original purpose for which the data was generated. There are two broad sets of requirements to enable bioimage data sharing in the life sciences. One set of requirements is articulated in the companion White Paper entitled "Enabling Global Image Data Sharing in the Life Sciences," which is published in parallel and addresses the need to build the cyberinfrastructure for sharing bioimage data (arXiv:2401.13023 [q-bio.OT], https://doi.org/10.48550/arXiv.2401.13023). Here, we detail a broad set of requirements, which involves collecting, managing, presenting, and propagating contextual information essential to assess the quality, understand the content, interpret the scientific implications, and reuse bioimage data in the context of the experimental details. We start by providing an overview of the main lessons learned to date through international community activities, which have recently made progress toward generating community standard practices for imaging Quality Control (QC) and metadata (Faklaris et al., 2022; Hammer et al., 2021; Huisman et al., 2021; Microscopy Australia, 2016; Montero Llopis et al., 2021; Rigano et al., 2021; Sarkans et al., 2021). We then provide a clear set of recommendations for amplifying this work. The driving goal is to address remaining challenges and democratize access to common practices and tools for a spectrum of biomedical researchers, regardless of their expertise, access to resources, and geographical location.
ABSTRACT
Background: Knowing the needs of the bioimaging community with respect to research data management (RDM) is essential for identifying measures that enable adoption of the FAIR (findable, accessible, interoperable, reusable) principles for microscopy and bioimage analysis data across disciplines. As an initiative within Germany's National Research Data Infrastructure, we conducted this community survey in summer 2021 to assess the state of the art of bioimaging RDM and the community needs. Methods: An online survey was conducted with a mixed question-type design. We created a questionnaire tailored to relevant topics of the bioimaging community, including specific questions on bioimaging methods and bioimage analysis, as well as more general questions on RDM principles and tools. A total of 203 survey entries were included in the analysis, covering the perspectives of various life and biomedical science disciplines and of participants at different career levels. Results: The results highlight the importance and value of bioimaging RDM and data sharing. However, the practical implementation of FAIR practices is impeded by technical hurdles, lack of knowledge, and insecurity about the legal aspects of data sharing. The survey participants request metadata guidelines and annotation tools and endorse the usage of image data management platforms. At present, OMERO (Open Microscopy Environment Remote Objects) is the best known and most widely used platform. Most respondents rely on image processing and analysis, which they regard as the most time-consuming step of the bioimage data workflow. While knowledge about and implementation of electronic lab notebooks and data management plans are limited, respondents acknowledge their potential value for data handling and publication. Conclusion: The bioimaging community acknowledges and endorses the value of RDM and data sharing. Still, there is a need for information, guidance, and standardization to foster the adoption of FAIR data handling. This survey may help inspire targeted measures to close this gap.
Subjects
Data Management , Metadata , Humans , Information Dissemination , Surveys and Questionnaires , Workflow
ABSTRACT
Cell migration research has become a high-content field. However, the quantitative information encapsulated in these complex and high-dimensional datasets is not fully exploited owing to the diversity of experimental protocols and non-standardized output formats. In addition, typically the datasets are not open for reuse. Making the data open and Findable, Accessible, Interoperable, and Reusable (FAIR) will enable meta-analysis, data integration, and data mining. Standardized data formats and controlled vocabularies are essential for building a suitable infrastructure for that purpose but are not available in the cell migration domain. We here present standardization efforts by the Cell Migration Standardisation Organisation (CMSO), an open community-driven organization to facilitate the development of standards for cell migration data. This work will foster the development of improved algorithms and tools and enable secondary analysis of public datasets, ultimately unlocking new knowledge of the complex biological process of cell migration.
Subjects
Biomarkers , Cell Movement , Research/standards , Computational Biology/methods , Computational Biology/standards , Data Analysis , Databases, Factual , Metadata
ABSTRACT
Faced with the need to support a growing number of whole slide imaging (WSI) file formats, our team has extended a long-standing community file format (OME-TIFF) for use in digital pathology. The format makes use of the core TIFF specification to store multi-resolution (or "pyramidal") representations of a single slide in a flexible, performant manner. Here we describe the structure of this format, its performance characteristics, and open-source library support for reading and writing pyramidal OME-TIFFs.
ABSTRACT
Cell migration research has recently become both a high content and a high throughput field thanks to technological, computational, and methodological advances. Simultaneously, however, urgent bioinformatics needs regarding data management, standardization, and dissemination have emerged. To address these concerns, we propose to establish an open data ecosystem for cell migration research.