Search | BVS IEC

1.

Best practices for data management and sharing in experimental biomedical research.

Cunha-Oliveira, Teresa; Ioannidis, John P A; Oliveira, Paulo J.

Physiol Rev ; 104(3): 1387-1408, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38451234

ABSTRACT

Effective data management is crucial for scientific integrity and reproducibility, a cornerstone of scientific progress. Well-organized and well-documented data enable validation and building on results. Data management encompasses activities including organization, documentation, storage, sharing, and preservation. Robust data management establishes credibility, fostering trust within the scientific community and benefiting researchers' careers. In experimental biomedicine, comprehensive data management is vital due to the typically intricate protocols, extensive metadata, and large datasets. Low-throughput experiments, in particular, require careful management to address variations and errors in protocols and raw data quality. Transparent and accountable research practices rely on accurate documentation of procedures, data collection, and analysis methods. Proper data management ensures long-term preservation and accessibility of valuable datasets. Well-managed data can be revisited, contributing to cumulative knowledge and potential new discoveries. Publicly funded research has an added responsibility for transparency, resource allocation, and avoiding redundancy. Meeting funding agency expectations increasingly requires rigorous methodologies, adherence to standards, comprehensive documentation, and widespread sharing of data, code, and other auxiliary resources. This review provides critical insights into raw and processed data, metadata, high-throughput versus low-throughput datasets, a common language for documentation, experimental and reporting guidelines, efficient data management systems, sharing practices, and relevant repositories. We systematically present available resources and optimal practices for wide use by experimental biomedical researchers.

Subjects

Biomedical Research , Data Management , Information Dissemination , Biomedical Research/standards , Biomedical Research/methods , Information Dissemination/methods , Humans , Animals , Data Management/methods

2.

OMD Curation Toolkit: a workflow for in-house curation of public omics datasets.

Piquer-Esteban, Samuel; Arnau, Vicente; Diaz, Wladimiro; Moya, Andrés.

BMC Bioinformatics ; 25(1): 184, 2024 May 09.

Article in English | MEDLINE | ID: mdl-38724907

ABSTRACT

BACKGROUND: Major advances in sequencing technologies and the sharing of data and metadata in science have resulted in a wealth of publicly available datasets. However, working with and especially curating public omics datasets remains challenging despite these efforts. While a growing number of initiatives aim to re-use previous results, these present limitations that often lead to the need for further in-house curation and processing. RESULTS: Here, we present the Omics Dataset Curation Toolkit (OMD Curation Toolkit), a python3 package designed to accompany and guide the researcher during the curation process of metadata and fastq files of public omics datasets. This workflow provides a standardized framework with multiple capabilities (collection, control check, treatment and integration) to facilitate the arduous task of curating public sequencing data projects. While centered on the European Nucleotide Archive (ENA), the majority of the provided tools are generic and can be used to curate datasets from different sources. CONCLUSIONS: Thus, it offers valuable tools for the in-house curation previously needed to re-use public omics data. Due to its workflow structure and capabilities, it can be easily used and benefit investigators in developing novel omics meta-analyses based on sequencing data.

Subjects

Data Curation , Software , Workflow , Data Curation/methods , Metadata , Databases, Genetic , Genomics/methods , Computational Biology/methods

3.

DOMAS: a data management software framework for advanced light sources.

Hu, Hao; Lei, Lei; Wang, Haofan; Zhuang, Bo; Zhang, Ruojin; Luo, Qi; Sun, Xiaokang; Qi, Fazhi.

J Synchrotron Radiat ; 31(Pt 2): 312-321, 2024 Mar 01.

Article in English | MEDLINE | ID: mdl-38300131

ABSTRACT

In recent years, China's advanced light sources have entered a period of rapid construction and development. As modern X-ray detectors and data acquisition technologies advance, these facilities are expected to generate massive volumes of data annually, presenting significant challenges in data management and utilization. These challenges encompass data storage, metadata handling, data transfer and user data access. In response, the Data Organization Management Access Software (DOMAS) has been designed as a framework to address these issues. DOMAS encapsulates four fundamental modules of data management software, including metadata catalogue, metadata acquisition, data transfer and data service. For light source facilities, building a data management system only requires parameter configuration and minimal code development within DOMAS. This paper firstly discusses the development of advanced light sources in China and the associated demands and challenges in data management, prompting a reconsideration of data management software framework design. It then outlines the architecture of the framework, detailing its components and functions. Lastly, it highlights the application progress and effectiveness of DOMAS when deployed for the High Energy Photon Source (HEPS) and Beijing Synchrotron Radiation Facility (BSRF).

4.

RefXAS: an open access database of X-ray absorption spectra.

Paripsa, Sebastian; Gaur, Abhijeet; Förste, Frank; Doronkin, Dmitry E; Malzer, Wolfgang; Schlesiger, Christopher; Kanngießer, Birgit; Welter, Edmund; Grunwaldt, Jan Dierk; Lützenkirchen-Hecht, Dirk.

J Synchrotron Radiat ; 31(Pt 5): 1105-1117, 2024 Sep 01.

Article in English | MEDLINE | ID: mdl-39190503

ABSTRACT

Under DAPHNE4NFDI, the X-ray absorption spectroscopy (XAS) reference database, RefXAS, has been set up. For this purpose, we developed a method to enable users to submit a raw dataset, with its associated metadata, via a dedicated website for inclusion in the database. Implementation of the database includes an upload of metadata to the scientific catalogue and an upload of files via object storage, with automated query capabilities through a web server and visualization of the data and files. Based on the mode of measurements, quality criteria have been formulated for the automated check of any uploaded data. In the present work, the significant metadata fields for reusability, as well as reproducibility of results (FAIR data principles), are discussed. Quality criteria for the data uploaded to the database have been formulated and assessed. Moreover, the usability and interoperability of available XAS data/file formats have been explored. The first version of the RefXAS database prototype is presented, which features a human verification procedure, currently being tested with a new user interface designed specifically for curators; a user-friendly landing page; a full list of datasets; advanced search capabilities; a streamlined upload process; and, finally, a server-side automatic authentication and (meta-) data storage via MongoDB, PostgreSQL and (data-) files via relevant APIs.

5.

Effects of organic fertilizers on plant growth and the rhizosphere microbiome.

Yu, Yitian; Zhang, Qi; Kang, Jian; Xu, Nuohan; Zhang, Zhenyan; Deng, Yu; Gillings, Michael; Lu, Tao; Qian, Haifeng.

Appl Environ Microbiol ; 90(2): e0171923, 2024 Feb 21.

Article in English | MEDLINE | ID: mdl-38193672

ABSTRACT

Application of organic fertilizers is an important strategy for sustainable agriculture. The biological source of organic fertilizers determines their specific functional characteristics, but few studies have systematically examined these functions or assessed their health risk to soil ecology. To fill this gap, we analyzed 16S rRNA gene amplicon sequencing data from 637 soil samples amended with plant- and animal-derived organic fertilizers (hereafter plant fertilizers and animal fertilizers). Results showed that animal fertilizers increased the diversity of soil microbiome, while plant fertilizers maintained the stability of soil microbial community. Microcosm experiments verified that plant fertilizers were beneficial to plant root development and increased carbon cycle pathways, while animal fertilizers enriched nitrogen cycle pathways. Compared with animal fertilizers, plant fertilizers harbored a lower abundance of risk factors such as antibiotic resistance genes and viruses. Consequently, plant fertilizers might be more suitable for long-term application in agriculture. This work provides a guide for organic fertilizer selection from the perspective of soil microecology and promotes sustainable development of organic agriculture. IMPORTANCE: This study provides valuable guidance for use of organic fertilizers in agricultural production from the perspective of the microbiome and ecological risk.

Subjects

Microbiota , Rhizosphere , Animals , Fertilizers , RNA, Ribosomal, 16S/genetics , Microbiota/genetics , Soil , Plants/genetics , Soil Microbiology , Plant Roots

6.

Associations between keystroke and stylus metadata and depressive symptoms in adolescents.

Jang, Moonyoung; Cho, Youngeun; Kim, Do Hyung; Park, Sunghyun; Park, Seonghyeon; Hur, Ji-Won; Kim, Minah; Cho, Kwangsu; Lee, Chang-Gun; Kwon, Jun Soo.

Psychol Med ; : 1-6, 2024 Sep 05.

Article in English | MEDLINE | ID: mdl-39233471

ABSTRACT

BACKGROUND: Adolescents often experience a heightened incidence of depressive symptoms, which can persist without early intervention. However, adolescents often struggle to identify depressive symptoms, and even when they are aware of these symptoms, seeking help is not always their immediate response. This study aimed to explore the relationship between passively collected digital data, specifically keystroke and stylus data collected via mobile devices, and the manifestation of depressive symptoms. METHODS: A total of 927 first-year middle school students from schools in Seoul solved Korean language and math problems. Throughout this study, 77 types of keystroke and stylus data were collected, including parameters such as the number of key presses, tap pressure, stroke speed, and stroke acceleration. Depressive symptoms were measured using the self-rated Patient Health Questionnaire-9 (PHQ-9). RESULTS: Multiple regression analysis highlighted the significance of stroke length, speed, and acceleration, the average y-coordinate, the tap pressure, and the number of incorrect answers in relation to PHQ-9 scores. The keystroke and stylus metadata were able to reflect mood, energy, cognitive abilities, and psychomotor symptoms among adolescents with depressive symptoms. CONCLUSIONS: This study demonstrates the potential of automatically collected data during school exams or classes for the early screening of clinical depressive symptoms in students. This study has the potential to serve as a cornerstone in the development of digital data frameworks for the early detection of depressive symptoms in adolescents.

7.

Creating and troubleshooting microscopy analysis workflows: Common challenges and common solutions.

Cimini, Beth A.

J Microsc ; 295(2): 93-101, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38532662

ABSTRACT

As microscopy diversifies and becomes ever more complex, the problem of quantification of microscopy images has emerged as a major roadblock for many researchers. All researchers must face certain challenges in turning microscopy images into answers, independent of their scientific question and the images they have generated. Challenges may arise at many stages throughout the analysis process, including handling of the image files, image pre-processing, object finding, or measurement, and statistical analysis. While the exact solution required for each obstacle will be problem-specific, by keeping analysis in mind, optimizing data quality, understanding tools and tradeoffs, breaking workflows and data sets into chunks, talking to experts, and thoroughly documenting what has been done, analysts at any experience level can learn to overcome these challenges and create better and easier image analyses.

8.

Assessing inclusion and representativeness on digital platforms for health education: Evidence from YouTube.

Pothugunta, Krishna; Liu, Xiao; Susarla, Anjana; Padman, Rema.

J Biomed Inform ; 157: 104669, 2024 Jun 15.

Article in English | MEDLINE | ID: mdl-38880237

ABSTRACT

BACKGROUND: Studies confirm that significant biases exist in online recommendation platforms, exacerbating pre-existing disparities and leading to less-than-optimal outcomes for underrepresented demographics. We study issues of bias in inclusion and representativeness in the context of healthcare information disseminated via videos on the YouTube social media platform, a widely used online channel for multi-media rich information. With one in three US adults using the Internet to learn about a health concern, it is critical to assess inclusivity and representativeness regarding how health information is disseminated by digital platforms such as YouTube. METHODS: Leveraging methods from fair machine learning (ML), natural language processing and voice and facial recognition methods, we examine inclusivity and representativeness of video content presenters using a large corpus of videos and their metadata on a chronic condition (diabetes) extracted from the YouTube platform. Regression models are used to determine whether presenter demographics impact video popularity, measured by the video's average daily view count. A video that generates a higher view count is considered to be more popular. RESULTS: The voice and facial recognition methods predicted the gender and race of the presenter with reasonable success. Gender is predicted through voice recognition (accuracy = 78%, AUC = 76%), while the gender and race predictions use facial recognition (accuracy = 93%, AUC = 92% and accuracy = 82%, AUC = 80%, respectively). The gender of the presenter is more significant for video views only when the face of the presenter is not visible while videos with male presenters with no face visibility have a positive relationship with view counts. Furthermore, videos with white and male presenters have a positive influence on view counts while videos with female and non-white group have high view counts. CONCLUSION: Presenters' demographics do have an influence on average daily view count of videos viewed on social media platforms as shown by advanced voice and facial recognition algorithms used for assessing inclusion and representativeness of the video content. Future research can explore short videos and those at the channel level because popularity of the channel name and the number of videos associated with that channel do have an influence on view counts.

9.

Metadata for Data dIscoverability aNd Study rEplicability in obseRVAtional Studies (MINERVA): Development and Pilot of a Metadata List and Catalogue in Europe.

Pajouheshnia, Romin; Gini, Rosa; Gutierrez, Lia; Swertz, Morris A; Hyde, Eleanor; Sturkenboom, Miriam; Arana, Alejandro; Franzoni, Carla; Ehrenstein, Vera; Roberto, Giuseppe; Gil, Miguel; Maciá, Miguel Angel; Schäfer, Wiebke; Haug, Ulrike; Thurin, Nicolas H; Lassalle, Régis; Droz-Perroteau, Cécile; Zaccagnino, Silvia; Busto, Maria Paula; Middelkoop, Bas; Gembert, Karin; Sanchez-Saez, Francisco; Rodriguez-Bernal, Clara; Sanfélix-Gimeno, Gabriel; Hurtado, Isabel; Acosta, Manuel Barreiro-de; Poblador-Plou, Beatriz; Carmona-Pírez, Jonás; Gimeno-Miguel, Antonio; Prados-Torres, Alexandra; Schultze, Anna; Jansen, Ella; Herings, Ron; Kuiper, Josine; Locatelli, Igor; Jazbar, Janja; Zerovnik, Spela; Kos, Mitja; Smit, Steven; Lind, Sirje; Metspalu, Andres; Simou, Stefania; Hedenmalm, Karin; Cochino, Ana; Alcini, Paolo; Kurz, Xavier; Perez-Gutthann, Susana.

Pharmacoepidemiol Drug Saf ; 33(8): e5871, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39145406

ABSTRACT

PURPOSE: Metadata for data dIscoverability aNd study rEplicability in obseRVAtional studies (MINERVA), a European Medicines Agency-funded project (EUPAS39322), defined a set of metadata to describe real-world data sources (RWDSs) and piloted metadata collection in a prototype catalogue to assist investigators from data source discoverability through study conduct. METHODS: A list of metadata was created from a review of existing metadata catalogues and recommendations, structured interviews, a stakeholder survey, and a technical workshop. The prototype was designed to comply with the FAIR principles (findable, accessible, interoperable, reusable), using MOLGENIS software. Metadata collection was piloted by 15 data access partners (DAPs) from across Europe. RESULTS: A total of 442 metadata variables were defined in six domains: institutions (organizations connected to a data source); data banks (data collections sustained by an organization); data sources (collections of linkable data banks covering a common underlying population); studies; networks (of institutions); and common data models (CDMs). A total of 26 institutions were recorded in the prototype. Each DAP populated the metadata of one data source and its selected data banks. The number of data banks varied by data source; the most common data banks were hospital administrative records and pharmacy dispensation records (10 data sources each). Quantitative metadata were successfully extracted from three data sources conforming to different CDMs and entered into the prototype. CONCLUSIONS: A metadata list was finalized, a prototype was successfully populated, and a good practice guide was developed. Setting up and maintaining a metadata catalogue on RWDSs will require substantial effort to support discoverability of data sources and reproducibility of studies in Europe.

Subjects

Metadata , Observational Studies as Topic , Europe , Humans , Pilot Projects , Reproducibility of Results , Observational Studies as Topic/methods , Data Collection/methods , Data Collection/standards , Databases, Factual/statistics & numerical data , Software , Pharmacoepidemiology/methods

10.

Poor data stewardship will hinder global genetic diversity surveillance.

Toczydlowski, Rachel H; Liggins, Libby; Gaither, Michelle R; Anderson, Tanner J; Barton, Randi L; Berg, Justin T; Beskid, Sofia G; Davis, Beth; Delgado, Alonso; Farrell, Emily; Ghoojaei, Maryam; Himmelsbach, Nan; Holmes, Ann E; Queeno, Samantha R; Trinh, Thienthanh; Weyand, Courtney A; Bradburd, Gideon S; Riginos, Cynthia; Toonen, Robert J; Crandall, Eric D.

Proc Natl Acad Sci U S A ; 118(34), 2021 Aug 24.

Article in English | MEDLINE | ID: mdl-34404731

ABSTRACT

Genomic data are being produced and archived at a prodigious rate, and current studies could become historical baselines for future global genetic diversity analyses and monitoring programs. However, when we evaluated the potential utility of genomic data from wild and domesticated eukaryote species in the world's largest genomic data repository, we found that most archived genomic datasets (86%) lacked the spatiotemporal metadata necessary for genetic biodiversity surveillance. Labor-intensive scouring of a subset of published papers yielded geospatial coordinates and collection years for only 33% (39% if place names were considered) of these genomic datasets. Streamlined data input processes, updated metadata deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity and to plug the growing metadata gap.
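The kind of completeness check described in this abstract can be sketched in a few lines of Python. This is only an illustration and not the authors' pipeline: the field names (`latitude`, `longitude`, `collection_year`), the accession values, and the example records are all hypothetical.

```python
def has_spatiotemporal_metadata(record):
    """Return True if a record carries usable coordinates and a collection year.

    The keys used here are hypothetical; real repositories use their own
    attribute names and formats.
    """
    lat = record.get("latitude")
    lon = record.get("longitude")
    year = record.get("collection_year")
    has_coords = (
        isinstance(lat, (int, float)) and -90 <= lat <= 90
        and isinstance(lon, (int, float)) and -180 <= lon <= 180
    )
    has_year = isinstance(year, int)
    return has_coords and has_year


def fraction_missing(records):
    """Fraction of records that lack coordinates, collection year, or both."""
    if not records:
        raise ValueError("records must not be empty")
    missing = sum(1 for r in records if not has_spatiotemporal_metadata(r))
    return missing / len(records)


# Made-up records for illustration only.
example_records = [
    {"accession": "EX0001", "latitude": -36.85, "longitude": 174.76, "collection_year": 2015},
    {"accession": "EX0002", "latitude": None, "longitude": None, "collection_year": 2012},
    {"accession": "EX0003", "collection_year": None},
    {"accession": "EX0004", "latitude": 21.3, "longitude": -157.8},
]

print(fraction_missing(example_records))
```

A check like this only flags absent or out-of-range values; it cannot tell whether the coordinates that are present are accurate.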

Subjects

Biodiversity , Data Accuracy , Eukaryota/genetics , Genetic Variation , Genome , Genomics/methods , Population Dynamics

11.

Recommended data elements for health registries: a survey from a German funding initiative.

Harkener, Sonja; Jenetzky, Ekkehart; Rupp, Rüdiger; Dell, Jennifer; Engel, Christoph; von Bargen, Maximilian Ferry; Finger, Robert; Glienke, Maximilian; Heinz, Carsten; Jersch, Patrick; Martin, David; Schmutzler, Rita; Schönthaler, Martin; Suwelack, Barbara; Wegner, Jeannine; Stausberg, Jürgen.

BMC Med Inform Decis Mak ; 24(1): 136, 2024 May 27.

Article in English | MEDLINE | ID: mdl-38802886

ABSTRACT

BACKGROUND: The selection of data elements is a decisive task within the development of a health registry. Having the right metadata is crucial for answering the particular research questions. Furthermore, the set of data elements determines the registries' readiness of interoperability and data reusability to a major extent. Six health registries shared and published their metadata within a German funding initiative. As one step in the direction of a common set of data elements, a selection of those metadata was evaluated with regard to their appropriateness for a broader usage. METHODS: Each registry was asked to contribute a 10%-selection of their data elements to an evaluation sample. The survey was set up with the online survey tool "LimeSurvey Cloud". The registries and an accompanying project participated in the survey with one vote for each project. The data elements were offered in content groups along with the question of whether the data element is appropriate for health registries on a broader scale. The question could be answered using a Likert scale with five options. Furthermore, "no answer" was allowed. The level of agreement was assessed using weighted Cohen's kappa and Kendall's coefficient of concordance. RESULTS: The evaluation sample consisted of 269 data elements. With a grade of "perhaps recommendable" or higher in the mean, 169 data elements were selected. These data elements belong preferably to the groups demography, education/occupation, medication, and nutrition. Half of the registries lost significance compared with their percentage of data elements in the evaluation sample, one remained stable. The level of concordance was adequate. CONCLUSIONS: The survey revealed a set of 169 data elements recommended for health registries. When developing a registry, this set could be valuable help in selecting the metadata appropriate to answer the registry's research questions. However, due to the high specificity of research questions, data elements beyond this set will be needed to cover the whole range of interests of a register. A broader discussion and subsequent surveys are needed to establish a common set of data elements on an international scale.
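For readers unfamiliar with the agreement statistic named in the methods, the following is a minimal sketch of a linearly weighted Cohen's kappa for two raters on a five-point ordinal scale. The abstract does not state which weighting scheme the authors used, so the linear weights are an assumption, and the ratings below are invented for illustration.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for ratings coded 0..n_categories-1.

    kappa_w = 1 - (weighted observed disagreement) / (weighted expected disagreement),
    with disagreement weight |i - j| / (n_categories - 1).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    marg_a = Counter(rater_a)                  # marginal counts for rater A
    marg_b = Counter(rater_b)                  # marginal counts for rater B
    max_dist = n_categories - 1

    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) / max_dist
            obs_dis += w * observed[(i, j)] / n
            exp_dis += w * (marg_a[i] / n) * (marg_b[j] / n)

    if exp_dis == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - obs_dis / exp_dis


# Invented Likert ratings (0 = lowest, 4 = highest) from two hypothetical voters.
votes_a = [4, 3, 3, 2, 0, 1, 4, 2]
votes_b = [4, 3, 2, 2, 1, 1, 3, 2]
print(round(weighted_kappa(votes_a, votes_b, n_categories=5), 3))
```

With more than two raters, as in the survey described here, pairwise kappas are usually reported alongside an overall measure such as Kendall's coefficient of concordance.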

Subjects

Registries , Registries/standards , Germany , Humans , Surveys and Questionnaires , Metadata

12.

Detection and Localization of Small Moving Objects in the Presence of Sensor and Platform Movement.

Cuellar, Adam; Mahalanobis, Abhijit; Renshaw, C Kyle; Mikhael, Wasfy.

Sensors (Basel) ; 24(4), 2024 Feb 14.

Article in English | MEDLINE | ID: mdl-38400376

ABSTRACT

In this paper, we address the challenge of detecting small moving targets in dynamic environments characterized by the concurrent movement of both platform and sensor. In such cases, simple image-based frame registration and optical flow analysis cannot be used to detect moving targets. To tackle this, it is necessary to use sensor and platform meta-data in addition to image analysis for temporal and spatial anomaly detection. To this end, we investigate techniques that utilize inertial data to enhance frame-to-frame registration, consistently yielding improved detection outcomes when compared against purely feature-based techniques. For cases where image registration is not possible even with metadata, we propose single-frame spatial anomaly detection and then estimate the range to the target using the platform velocity. The behavior of the estimated range over time helps us to discern targets from clutter. Finally, we show that a KNN classifier can be used to further reduce the false alarm rate without a significant reduction in detection performance. The proposed strategies offer a robust solution for the detection of moving targets in dynamically challenging settings.

13.

Schema Playground: a tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data.

Cano, Marco A; Tsueng, Ginger; Zhou, Xinghua; Xin, Jiwen; Hughes, Laura D; Mullen, Julia L; Su, Andrew I; Wu, Chunlei.

BMC Bioinformatics ; 24(1): 159, 2023 Apr 20.

Article in English | MEDLINE | ID: mdl-37081398

ABSTRACT

BACKGROUND: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging Schema.org could benefit biomedical research resource providers, but it can be challenging to apply Schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize Schema.org or other biomedical schema projects. RESULTS: Our browser-based tool includes features which can help address many of the barriers towards Schema.org-compliance, such as the ability to easily browse for relevant Schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema, a large multi-class schema for harmonizing various COVID-19 related resources. CONCLUSIONS: We have created a browser-based tool to empower biomedical research resource providers to leverage Schema.org classes to make their research outputs more FAIR.
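To make the idea of validating a record against a customized class concrete, here is a small Python sketch. It is not how Schema Playground implements validation. The `Dataset` type and the `name`, `description`, and `identifier` properties do exist in Schema.org, but treating those three as required is an assumption made for this example, and the record itself is invented.

```python
# Properties that a hypothetical customized Dataset class marks as required.
REQUIRED_PROPERTIES = ("name", "description", "identifier")

SCHEMA_ORG_CONTEXTS = ("https://schema.org", "https://schema.org/", "http://schema.org")


def is_dataset(record):
    """Check that the JSON-LD record declares the Schema.org Dataset type."""
    return (
        record.get("@context") in SCHEMA_ORG_CONTEXTS
        and record.get("@type") == "Dataset"
    )


def missing_properties(record, required=REQUIRED_PROPERTIES):
    """List required properties that are absent or empty in the record."""
    return [prop for prop in required if not record.get(prop)]


# Invented JSON-LD record; the identifier is deliberately left out.
example = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example assay results",
    "description": "Illustrative record used to demonstrate a required-property check.",
}

print(is_dataset(example), missing_properties(example))
```

A fuller validator would typically express the same constraints as a JSON Schema document so that they can be shared and reused alongside the class definition.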

Subjects

Biomedical Research , COVID-19 , Humans , Metadata

14.

The metabolomics workbench file status website: a metadata repository promoting FAIR principles of metabolomics data.

Powell, Christian D; Moseley, Hunter N B.

BMC Bioinformatics ; 24(1): 299, 2023 Jul 24.

Article in English | MEDLINE | ID: mdl-37482620

ABSTRACT

BACKGROUND: An updated version of the mwtab Python package for programmatic access to the Metabolomics Workbench (MetabolomicsWB) data repository was released at the beginning of 2021. Along with updating the package to match the changes to MetabolomicsWB's 'mwTab' file format specification and enhancing the package's functionality, the included validation facilities were used to detect and catalog file inconsistencies and errors across all publicly available datasets in MetabolomicsWB. RESULTS: The MetabolomicsWB File Status website was developed to provide continuous validation of MetabolomicsWB data files and a useful interface to all found inconsistencies and errors. This list of detectable issues/errors includes format parsing errors, format compliance issues, access problems via MetabolomicsWB's REST interface, and other small inconsistencies that can hinder reusability. The website uses the mwtab Python package to pull down and validate each available analysis file and then generates an HTML report. The website is updated on a weekly basis. Moreover, the Python website design utilizes GitHub and GitHub.io, providing an easy-to-replicate template for implementing other metadata, virtual, and meta- repositories. CONCLUSIONS: The MetabolomicsWB File Status website provides a metadata repository of validation metadata to promote the FAIR use of existing metabolomics datasets from the MetabolomicsWB data repository.

Subjects

Metadata , Software , Metabolomics , Information Storage and Retrieval

15.

Setting up a data management infrastructure for bioimaging.

Kunis, Susanne; Bernhardt, Karen; Hensel, Michael.

Biol Chem ; 404(5): 433-439, 2023 Apr 25.

Article in English | MEDLINE | ID: mdl-36853922

ABSTRACT

While the FAIR (Findable, Accessible, Interoperable, and Re-usable) principles are well accepted in the scientific community, there are still many challenges in implementing them in the day-to-day scientific process. Data management of microscopy images poses special challenges due to the volume, variety, and many proprietary formats. In particular, appropriate metadata collection, a basic requirement for FAIR data, is a real challenge for scientists due to the technical and content-related aspects. Researchers benefit here from interdisciplinary research network with centralized data management. The typically multimodal structure requires generalized data management and the corresponding acquisition of metadata. Here we report on the establishment of an appropriate infrastructure for the research network by a Core Facility and the development and integration of a software tool MDEmic that allows easy and convenient processing of metadata of microscopy images while providing high flexibility in terms of customization of metadata sets. Since it is also in the interest of the core facility to apply standards regarding the scope and serialization formats to realize successful and sustainable data management for bioimaging, we report on our efforts within the community to define standards in metadata, interfaces, and to reduce the barriers of daily data management.

Subjects

Data Management , Software , Metadata

16.

A review on viral data sources and search systems for perspective mitigation of COVID-19.

Bernasconi, Anna; Canakoglu, Arif; Masseroli, Marco; Pinoli, Pietro; Ceri, Stefano.

Brief Bioinform ; 22(2): 664-675, 2021 Mar 22.

Article in English | MEDLINE | ID: mdl-33348368

ABSTRACT

With the outbreak of the COVID-19 disease, the research community is producing unprecedented efforts dedicated to better understand and mitigate the effects of the pandemic. In this context, we review the data integration efforts required for accessing and searching genome sequences and metadata of SARS-CoV2, the virus responsible for the COVID-19 disease, which have been deposited into the most important repositories of viral sequences. Organizations that were already present in the virus domain are now dedicating special interest to the emergence of COVID-19 pandemics, by emphasizing specific SARS-CoV2 data and services. At the same time, novel organizations and resources were born in this critical period to serve specifically the purposes of COVID-19 mitigation while setting the research ground for contrasting possible future pandemics. Accessibility and integration of viral sequence data, possibly in conjunction with the human host genotype and clinical data, are paramount to better understand the COVID-19 disease and mitigate its effects. Few examples of host-pathogen integrated datasets exist so far, but we expect them to grow together with the knowledge of COVID-19 disease; once such datasets will be available, useful integrative surveillance mechanisms can be put in place by observing how common variants distribute in time and space, relating them to the phenotypic impact evidenced in the literature.

Subjects

COVID-19/therapy , COVID-19/epidemiology , COVID-19/virology , Genes, Viral , Humans , Information Storage and Retrieval , Pandemics , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification

17.

The road towards data integration in human genomics: players, steps and interactions.

Bernasconi, Anna; Canakoglu, Arif; Masseroli, Marco; Ceri, Stefano.

Brief Bioinform ; 22(1): 30-44, 2021 Jan 18.

Article in English | MEDLINE | ID: mdl-32496509

ABSTRACT

Thousands of new experimental datasets are becoming available every day; in many cases, they are produced within the scope of large cooperative efforts, involving a variety of laboratories spread all over the world, and typically open for public use. Although the potential collective amount of available information is huge, the effective combination of such public sources is hindered by data heterogeneity, as the datasets exhibit a wide variety of notations and formats, concerning both experimental values and metadata. Thus, data integration is becoming a fundamental activity, to be performed prior to data analysis and biological knowledge discovery, consisting of subsequent steps of data extraction, normalization, matching and enrichment; once applied to heterogeneous data sources, it builds multiple perspectives over the genome, leading to the identification of meaningful relationships that could not be perceived by using incompatible data formats. In this paper, we first describe a technological pipeline from data production to data integration; we then propose a taxonomy of genomic data players (based on the distinction between contributors, repository hosts, consortia, integrators and consumers) and apply the taxonomy to describe about 30 important players in genomic data management. We specifically focus on the integrator players and analyse the issues in solving the genomic data integration challenges, as well as evaluate the computational environments that they provide to follow up data integration by means of visualization and analysis tools.

Subjects

Data Management/methods , Genome, Human , Genomics/methods , Humans , Metadata

18.

Providing open imaging data at scale: An EMBL-EBI perspective.

Hartley, Matthew; Iudin, Andrii; Patwardhan, Ardan; Sarkans, Ugis; Yoldas, Aybüke Küpcü; Kleywegt, Gerard J.

Histochem Cell Biol ; 160(3): 211-221, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37537341

ABSTRACT

Biological imaging is one of the primary tools by which we understand living systems across scales from atoms to organisms. Rapid advances in imaging technology have increased both the spatial and temporal resolutions at which we examine those systems, as well as enabling visualisation of larger tissue volumes. These advances have huge potential but also generate ever increasing amounts of imaging data that must be stored and analysed. Public image repositories provide a critical scientific service through open data provision, supporting reproducibility of scientific results, access to reference imaging datasets and reuse of data for new scientific discovery and acceleration of image analysis methods development. The scale and scope of imaging data provide both challenges and opportunities for open sharing of image data. In this article, we provide a perspective influenced by decades of provision of open data resources for biological information, suggesting areas to focus on and a path towards global interoperability.

Subjects

Image Processing, Computer-Assisted , Reproducibility of Results

19.

Building a FAIR image data ecosystem for microscopy communities.

Kemmer, Isabel; Keppler, Antje; Serrano-Solano, Beatriz; Rybina, Arina; Özdemir, Bugra; Bischof, Johanna; El Ghadraoui, Ayoub; Eriksson, John E; Mathur, Aastha.

Histochem Cell Biol ; 160(3): 199-209, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37341795

ABSTRACT

Bioimaging has now entered the era of big data with faster-than-ever development of complex microscopy technologies leading to increasingly complex datasets. This enormous increase in data size and informational complexity within those datasets has brought with it several difficulties in terms of common and harmonized data handling, analysis, and management practices, which are currently hampering the full potential of image data being realized. Here, we outline a wide range of efforts and solutions currently being developed by the microscopy community to address these challenges on the path towards FAIR bioimaging data. We also highlight how different actors in the microscopy ecosystem are working together, creating synergies that develop new approaches, and how research infrastructures, such as Euro-BioImaging, are fostering these interactions to shape the field.

Subjects

Ecosystem , Microscopy

20.

Big data in contemporary electron microscopy: challenges and opportunities in data transfer, compute and management.

Poger, David; Yen, Lisa; Braet, Filip.

Histochem Cell Biol ; 160(3): 169-192, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37052655

ABSTRACT

The second decade of the twenty-first century witnessed a new challenge in the handling of microscopy data. Big data, data deluge, large data, data compliance, data analytics, data integrity, data interoperability, data retention and data lifecycle are terms that have introduced themselves to the electron microscopy sciences. This is largely attributed to the booming development of new microscopy hardware tools. As a result, large digital image files with an average size of one terabyte within a single acquisition session are not uncommon nowadays, especially in the field of cryogenic electron microscopy. This brings along numerous challenges in data transfer, compute and management. In this review, we will discuss in detail the current state of international knowledge on big data in contemporary electron microscopy and how big data can be transferred, computed and managed efficiently and sustainably. Workflows, solutions, approaches and suggestions will be provided, with the example of the latest experiences in Australia. Finally, important principles such as data integrity, data lifetime and the FAIR and CARE principles will be considered.

Subjects

Big Data , Microscopy, Electron

