Results 1 - 20 of 717
1.
Stud Health Technol Inform ; 317: 40-48, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234705

ABSTRACT

INTRODUCTION: The Local Data Hub (LDH) is a platform for FAIR sharing of medical research (meta-)data. To promote the usage of LDH in different research communities, it is important to understand the domain-specific needs and the solutions currently used for data organization, and to provide support for seamless uploads to an LDH. In this work, we analyze the use case of microneurography, which is an electrophysiological technique for analyzing neural activity. METHODS: After performing a requirements analysis in dialogue with microneurography researchers, we propose a concept mapping and a workflow for the researchers to transform and upload their metadata. Further, we implemented a semi-automatic upload extension to odMLtables, a template-based tool for handling metadata in the electrophysiological community. RESULTS: The open-source implementation enables the odML-to-LDH concept mapping, allows data anonymization from within the tool and supports the creation of custom-made summaries of the underlying data sets. DISCUSSION: This concludes a first step towards integrating improved FAIR processes into the research laboratory's daily workflow. In future work, we will extend this approach to other use cases to disseminate the usage of LDHs in a larger research community.


Subjects
Metadata; Humans; Information Dissemination/methods; Information Storage and Retrieval/methods
2.
Stud Health Technol Inform ; 317: 115-122, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234713

ABSTRACT

INTRODUCTION: NFDI4Health is a consortium funded by the German Research Foundation to make structured health data findable and accessible internationally according to the FAIR principles. Its goal is to bring data users and Data Holding Organizations (DHOs) together. It mainly considers DHOs conducting epidemiological and public health studies or clinical trials. METHODS: Local data hubs (LDH) are provided for such DHOs to connect decentralized local research data management within their organizations with the option of publishing shareable metadata via centralized NFDI4Health services such as the German central Health Study Hub. The LDH platform is based on FAIRDOM SEEK and provides a complete and flexible, locally controlled data and information management platform for health research data. A tailored NFDI4Health metadata schema (MDS) for studies and their corresponding resources has been developed, which is fully supported by the LDH software, e.g. for metadata transfer to other NFDI4Health services. RESULTS: The SEEK platform has been technically enhanced to support extended metadata structures tailored to the needs of the user communities in addition to the existing metadata structuring of SEEK. CONCLUSION: With the LDH and the MDS, NFDI4Health provides all DHOs with a standardized, free and open-source research data management platform for the FAIR exchange of structured health data.


Subjects
Metadata; Germany; Humans; Data Management; Information Dissemination; Software
3.
Stud Health Technol Inform ; 317: 146-151, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234717

ABSTRACT

INTRODUCTION: The reuse of clinical data from clinical routine is a topic of research within the field of medical informatics under the term secondary use. To ensure the correct use and interpretation of data, there is a need for context information about data collection and a general understanding of the data. The use of metadata as an effective method of defining and maintaining context is well established, particularly in the field of clinical trials. The objective of this paper is to examine a method for integrating routine clinical data using metadata. METHODS: To this end, clinical forms extracted from a hospital information system will be converted into the FHIR format. A particular focus is placed on the consistent use of a metadata repository (MDR). RESULTS: A metadata-based approach using an MDR system was developed to simplify data integration and mapping of structured forms into FHIR resources, while offering many advantages in terms of flexibility and data quality. This facilitated the management and configuration of logic and definitions in one place, enabling the reusability and secondary use of data. DISCUSSION: This work allows the transfer of data elements without loss of detail and simplifies integration with target formats. The approach is adaptable for other ETL processes and eliminates the need for formatting concerns in the target profile.


Subjects
Metadata; Pilot Projects; United Kingdom; Electronic Health Records; Humans; Hospital Information Systems; Systems Integration
4.
PLoS One ; 19(9): e0295662, 2024.
Article in English | MEDLINE | ID: mdl-39240878

ABSTRACT

Stable isotope data have made pivotal contributions to nearly every discipline of the physical and natural sciences. As the generation and application of stable isotope data continues to grow exponentially, so does the need for a unifying data repository to improve accessibility and promote collaborative engagement. This paper provides an overview of the design, development, and implementation of IsoBank (www.isobank.org), a community-driven initiative to create an open-access repository for stable isotope data implemented online in 2021. A central goal of IsoBank is to provide a web-accessible database supporting interdisciplinary stable isotope research and educational opportunities. To achieve this goal, we convened a multi-disciplinary group of over 40 analytical experts, stable isotope researchers, database managers, and web developers to collaboratively design the database. This paper outlines the main features of IsoBank and provides a focused description of the core metadata structure. We present plans for future database and tool development and engagement across the scientific community. These efforts will help facilitate interdisciplinary collaboration among the many users of stable isotopic data while also offering useful data resources and standardization of metadata reporting across eco-geoinformatics landscapes.


Subjects
Databases, Factual; Metadata; Isotopes; Internet
5.
Sci Rep ; 14(1): 20842, 2024 09 06.
Article in English | MEDLINE | ID: mdl-39242690

ABSTRACT

Melanoma of the skin is the 17th most common cancer worldwide. Early detection of suspicious skin lesions (melanoma) can increase 5-year survival rates by 20%. The 7-point checklist (7PCL) has been extensively used to suggest urgent referrals for patients with a possible melanoma. However, the 7PCL method only considers seven meta-features to calculate a risk score and is only relevant for patients with suspected melanoma. There are limited studies on the extensive use of patient metadata for the detection of all skin cancer subtypes. This study investigates artificial intelligence (AI) models that utilise patient metadata consisting of 23 attributes for suspicious skin lesion detection. We have identified a new set of most important risk factors, namely "C4C risk factors", which is not just for melanoma, but for all types of skin cancer. The performance of the C4C risk factors for suspicious skin lesion detection is compared to that of the 7PCL and the Williams risk factors that predict the lifetime risk of melanoma. Our proposed AI framework ensembles five machine learning models and identifies seven new skin cancer risk factors: lesion pink, lesion size, lesion colour, lesion inflamed, lesion shape, lesion age, and natural hair colour, which achieved a sensitivity of 80.46 ± 2.50% and a specificity of 62.09 ± 1.90% in detecting suspicious skin lesions when evaluated using the metadata of 53,601 skin lesions collected from different skin cancer diagnostic clinics across the UK, significantly outperforming the 7PCL-based method (sensitivity 68.09 ± 2.10%, specificity 61.07 ± 0.90%) and the Williams risk factors (sensitivity 66.32 ± 1.90%, specificity 61.71 ± 0.6%). Furthermore, through weighting the seven new risk factors we came up with a new risk score, namely "C4C risk score", which alone achieved a sensitivity of 76.09 ± 1.20% and a specificity of 61.71 ± 0.50%, significantly outperforming the 7PCL-based risk score (sensitivity 73.91 ± 1.10%, specificity 49.49 ± 0.50%) and the Williams risk score (sensitivity 60.68 ± 1.30%, specificity 60.87 ± 0.80%). Finally, fusing the C4C risk factors with the 7PCL and Williams risk factors achieved the best performance, with a sensitivity of 85.24 ± 2.20% and a specificity of 61.12 ± 0.90%. We believe that fusing these newly found risk factors and new risk score with image data will further boost the AI model performance for suspicious skin lesion detection. Hence, the new set of skin cancer risk factors has the potential to be used to modify current skin cancer referral guidelines for all skin cancer subtypes, including melanoma.
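
The sensitivity and specificity figures quoted in this abstract follow the standard definitions: true positives over all actual positives, and true negatives over all actual negatives. A minimal Python sketch of that arithmetic is given here for orientation only. The function name and the 0/1 label encoding are assumptions made for illustration and are not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumption for this sketch: 1 marks a suspicious lesion, 0 a
    non-suspicious one. This is not the paper's evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)
```

Called on three positives and two negatives where one of each is misclassified, the function returns a sensitivity of 2/3 and a specificity of 1/2.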


Subjects
Artificial Intelligence; Melanoma; Skin Neoplasms; Humans; Skin Neoplasms/diagnosis; Melanoma/diagnosis; Risk Factors; Male; Middle Aged; Female; Metadata; Early Detection of Cancer/methods; Adult; Aged; Machine Learning; Risk Assessment/methods
6.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(4): 807-817, 2024 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-39218608

ABSTRACT

High-grade serous ovarian cancer has a high degree of malignancy, and at detection, it is prone to infiltration of surrounding soft tissues, as well as metastasis to the peritoneum and lymph nodes, peritoneal seeding, and distant metastasis. Whether recurrence occurs becomes an important reference for surgical planning and treatment methods for this disease. Current recurrence prediction models do not consider the potential pathological relationships between internal tissues of the entire ovary. They use convolutional neural networks to extract local region features for judgment, but the accuracy is low, and the cost is high. To address this issue, this paper proposes a new lightweight deep learning algorithm model for predicting recurrence of high-grade serous ovarian cancer. The model first uses ghost convolution (Ghost Conv) and coordinate attention (CA) to establish ghost counter residual (SCblock) modules to extract local feature information from images. Then, it captures global information and integrates multi-level information through proposed layered fusion Transformer (STblock) modules to enhance interaction between different layers. The Transformer module unfolds the feature map to compute corresponding region blocks, then folds it back to reduce computational cost. Finally, each STblock module fuses deep and shallow layer depth information and incorporates patient's clinical metadata for recurrence prediction. Experimental results show that compared to the mainstream lightweight mobile visual Transformer (MobileViT) network, the proposed slicer visual Transformer (SlicerViT) network improves accuracy, precision, sensitivity, and F1 score, with only 1/6 of the computational cost and half the parameter count. This research confirms that the proposed algorithm model is more accurate and efficient in predicting recurrence of high-grade serous ovarian cancer. 
In the future, it can serve as an auxiliary diagnostic technique to improve patient survival rates and facilitate the application of the model in embedded devices.


Subjects
Algorithms; Deep Learning; Neoplasm Recurrence, Local; Neural Networks, Computer; Ovarian Neoplasms; Humans; Female; Ovarian Neoplasms/pathology; Metadata; Cystadenocarcinoma, Serous/pathology
7.
Bioinformatics ; 40(Suppl 2): ii4-ii10, 2024 09 01.
Article in English | MEDLINE | ID: mdl-39230700

ABSTRACT

With the development of high-throughput technologies, genomics datasets rapidly grow in size, including functional genomics data. This has allowed the training of large Deep Learning (DL) models to predict epigenetic readouts, such as protein binding or histone modifications, from genome sequences. However, large dataset sizes come at the price of data consistency, often aggregating results from a large number of studies conducted under varying experimental conditions. While data from large-scale consortia are useful as they allow studying the effects of different biological conditions, they can also contain unwanted biases from confounding experimental factors. Here, we introduce Metadata-guided Feature Disentanglement (MFD), an approach that allows disentangling biologically relevant features from potential technical biases. MFD incorporates target metadata into model training by conditioning weights of the model output layer on different experimental factors. It then separates the factors into disjoint groups and enforces independence of the corresponding feature subspaces with an adversarially learned penalty. We show that the metadata-driven disentanglement approach allows for better model introspection, by connecting latent features to experimental factors, without compromising, and in some cases even improving, performance in downstream tasks such as enhancer prediction or genetic variant discovery. The code will be made available at https://github.com/HealthML/MFD.


Subjects
Genomics; Metadata; Genomics/methods; Deep Learning; Humans
8.
Med Image Anal ; 98: 103325, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39208560

ABSTRACT

Recent advances in generative models have paved the way for enhanced generation of natural and medical images, including synthetic brain MRIs. However, the mainstay of current AI research focuses on optimizing synthetic MRIs with respect to visual quality (such as signal-to-noise ratio) while lacking insights into their relevance to neuroscience. To generate high-quality T1-weighted MRIs relevant for neuroscience discovery, we present a two-stage Diffusion Probabilistic Model (called BrainSynth) to synthesize high-resolution MRIs conditionally dependent on metadata (such as age and sex). We then propose a novel procedure to assess the quality of BrainSynth according to how well its synthetic MRIs capture macrostructural properties of brain regions and how accurately they encode the effects of age and sex. Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically plausible, i.e., the effect size between real and synthetic MRIs is small relative to biological factors such as age and sex. Moreover, the anatomical plausibility varies across cortical regions according to their geometric complexity. As is, the MRIs generated by BrainSynth significantly improve the training of a predictive model to identify accelerated aging effects in an independent study. These results indicate that our model accurately captures the brain's anatomical information and thus could enrich the data of underrepresented samples in a study. The code of BrainSynth will be released as part of the MONAI project at https://github.com/Project-MONAI/GenerativeModels.


Subjects
Imaging, Three-Dimensional; Magnetic Resonance Imaging; Humans; Magnetic Resonance Imaging/methods; Imaging, Three-Dimensional/methods; Female; Male; Metadata; Brain/diagnostic imaging; Adult; Middle Aged; Signal-To-Noise Ratio
9.
Database (Oxford) ; 2024, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39167718

ABSTRACT

Microbiome research has made significant gains with the evolution of sequencing technologies. Ensuring comparability between studies and enhancing the findability, accessibility, interoperability and reproducibility of microbiome data are crucial for maximizing the value of this growing body of research. Addressing the challenges of standardized metadata reporting, collection and curation, the Microbiome Working Group of the Human Heredity and Health in Africa (H3Africa) consortium aimed to develop a comprehensive solution. In this paper, we present the Microbiome Research Data Toolkit, a versatile tool designed to standardize microbiome research metadata, facilitate MIxS-MIMS and PhenX reporting, standardize prospective collection of participant biological and lifestyle data, and retrospectively harmonize such data. This toolkit enables past, present and future microbiome research endeavors to collaborate effectively, fostering novel collaborations and accelerating knowledge discovery in the field. Database URL: https://doi.org/10.25375/uct.24218999.v2.


Subjects
Metadata; Microbiota; Humans; Databases, Factual
10.
Med Image Anal ; 97: 103296, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39154616

ABSTRACT

Deep learning has potential to automate screening, monitoring and grading of disease in medical images. Pretraining with contrastive learning enables models to extract robust and generalisable features from natural image datasets, facilitating label-efficient downstream image analysis. However, the direct application of conventional contrastive methods to medical datasets introduces two domain-specific issues. Firstly, several image transformations which have been shown to be crucial for effective contrastive learning do not translate from the natural image to the medical image domain. Secondly, the assumption made by conventional methods, that any two images are dissimilar, is systematically misleading in medical datasets depicting the same anatomy and disease. This is exacerbated in longitudinal image datasets that repeatedly image the same patient cohort to monitor their disease progression over time. In this paper we tackle these issues by extending conventional contrastive frameworks with a novel metadata-enhanced strategy. Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships. To this end we employ records for patient identity, eye position (i.e. left or right) and time series information. In experiments using two large longitudinal datasets containing 170,427 retinal optical coherence tomography (OCT) images of 7912 patients with age-related macular degeneration (AMD), we evaluate the utility of using metadata to incorporate the temporal dynamics of disease progression into pretraining. Our metadata-enhanced approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks related to AMD. We find benefits in both a low-data and high-data regime across tasks ranging from AMD stage and type classification to prediction of visual acuity. 
Due to its modularity, our method can be quickly and cost-effectively tested to establish the potential benefits of including available metadata in contrastive pretraining.
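
The idea of using patient identity, eye position and acquisition time to decide which image pairs count as similar can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' method: the abstract does not say how the three metadata fields are combined, so the rule used here (same patient, same eye, scans within a fixed time window) and the `max_gap_days` parameter are hypothetical.

```python
def metadata_positive_mask(patient_ids, eyes, visit_days, max_gap_days=90):
    """Return an n x n list of booleans marking metadata-defined positive pairs.

    Assumed rule for this sketch: two images are positives when they share
    patient and eye and were acquired at most max_gap_days apart.
    """
    n = len(patient_ids)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # an image is never its own positive
            mask[i][j] = (
                patient_ids[i] == patient_ids[j]
                and eyes[i] == eyes[j]
                and abs(visit_days[i] - visit_days[j]) <= max_gap_days
            )
    return mask
```

A contrastive loss could then treat the `True` entries as positives instead of assuming that every other image in the batch is a negative.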


Subjects
Deep Learning; Metadata; Tomography, Optical Coherence; Humans; Tomography, Optical Coherence/methods; Macular Degeneration/diagnostic imaging; Image Interpretation, Computer-Assisted/methods; Retina/diagnostic imaging
11.
Comput Methods Programs Biomed ; 256: 108382, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39213898

ABSTRACT

OBJECTIVE: In patients with diabetes mellitus, hyperuricemia may lead to the development of diabetic complications, including macrovascular and microvascular dysfunction. However, the blood uric acid level in diabetic patients is obtained by sampling peripheral blood, which is an invasive procedure and not conducive to routine monitoring. Therefore, we developed a deep learning algorithm to noninvasively detect hyperuricemia from retinal photographs and metadata of patients with diabetes and evaluated its performance in multiethnic populations and different subgroups. MATERIALS AND METHODS: To achieve non-invasive detection of hyperuricemia in diabetic patients, and given that blood uric acid metabolism is directly related to the estimated glomerular filtration rate (eGFR), we first performed a regression task for the eGFR value before the classification task for hyperuricemia and reintroduced the eGFR regression values into the baseline information. We trained 3 deep learning models: (1) a metadata model adjusted for sex, age, body mass index, duration of diabetes, HbA1c, systolic blood pressure and diastolic blood pressure; (2) an image model based on fundus photographs; (3) a hybrid model combining the image and metadata models. Data from the Shanghai General Hospital Diabetes Management Center (ShDMC) were used to develop (6091 participants with diabetes) and internally validate (using 5-fold cross-validation) the models. External testing was performed on an independent dataset (UK Biobank dataset) consisting of 9327 participants with diabetes. RESULTS: For the regression task of eGFR, in the ShDMC dataset, the coefficient of determination (R²) was 0.684±0.07 (95% CI) for the image model, 0.501±0.04 for the metadata model, and 0.727±0.002 for the hybrid model. In the external UK Biobank dataset, the coefficient of determination (R²) was 0.647±0.06 for the image model, 0.627±0.03 for the metadata model, and 0.697±0.07 for the hybrid model. Our method was demonstrably superior to previous methods. For the classification of hyperuricemia, in ShDMC validation, the area under the curve (AUC) was 0.86±0.013 for the image model, 0.86±0.013 for the metadata model, and 0.92±0.026 for the hybrid model. Estimates with UK Biobank were 0.82±0.017 for the image model, 0.79±0.024 for the metadata model, and 0.89±0.032 for the hybrid model. CONCLUSION: A deep learning algorithm using fundus photographs has potential as a noninvasive screening adjunct for hyperuricemia among individuals with diabetes. Combining it with patient metadata enables higher screening accuracy. Applying a visualization tool showed that the deep learning network for identifying hyperuricemia mainly focuses on the optic disc region of the fundus.


Subjects
Algorithms; Deep Learning; Diabetes Mellitus; Glomerular Filtration Rate; Hyperuricemia; Metadata; Neural Networks, Computer; Humans; Middle Aged; Hyperuricemia/complications; Male; Female; Diabetes Mellitus/blood; Fundus Oculi; Aged; Adult; Uric Acid/blood; Image Processing, Computer-Assisted/methods
12.
Stud Health Technol Inform ; 316: 1689-1693, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176535

ABSTRACT

Multicentre studies become possible with the current strategies to solve the interoperability problems between databases. With the great adoption of those strategies, new problems regarding data discovery were raised. Some were solved using database catalogues and graphical dashboards for data analysis and comparison. However, when these communities grow, these strategies become obsolete. In this work, we addressed those challenges by proposing a platform with a chatbot-like mechanism to help medical researchers identify databases of interest. The tool was developed using the metadata extracted from OMOP CDM databases.


Subjects
Databases, Factual; Humans; Metadata; Electronic Health Records
13.
Stud Health Technol Inform ; 316: 1120-1124, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176578

ABSTRACT

Secondary use of health data has become an emerging topic in medical informatics. Many initiatives focus on clinical routine data, but clinical trial data has complementary strengths regarding highly structured documentation and mandatory data quality (DQ) reviews during the implementation. Clinical imaging trials investigate new imaging methods and procedures. Recently, DQ frameworks for structured data were proposed for harmonized quality assessments (QA). In this article, we investigate the application of these concepts to imaging trials and how a DQ framework could be defined for secondary use scenarios. We conclude that image quality can be assessed through both pixel data and metadata, and the latter can mostly be handled like structured study documentation in QA. For pixel data, typical quality indicators can be mapped to existing frameworks, but require additional image processing. Specific attention needs to be drawn to complete de-identification of imaging data, both on pixel data and metadata level.


Subjects
Data Accuracy; Diagnostic Imaging; Humans; Clinical Trials as Topic; Metadata; Quality Assurance, Health Care
14.
Stud Health Technol Inform ; 316: 1269-1273, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176613

ABSTRACT

The results and details of the clinical studies and research must be securely stored to ensure reliability, accountability, and prevent malicious misuse. To accomplish this, a secure method for storing metadata and study results is crucial. Also, a mechanism to ensure accountability for both data owners and researchers is needed. In this way, data owners and the scientific community can rely on and verify results and methods presented by researchers, while researchers can check the validity of the analyzed data and have proof of authorship for their work. A modular framework is presented in this paper, which utilizes blockchain and cryptography to store study results and metadata, along with proof of accountability. The framework has been tested within a privacy-preserving distributed analytics infrastructure.
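
The tamper-evidence property that blockchain-style storage gives to study metadata can be illustrated with a plain hash chain. The Python sketch below is generic and is not the framework described in the abstract, whose internals are not given there: it has no consensus, signatures or distribution, and all function and field names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first record


def record_hash(payload, previous_hash):
    """SHA-256 over a canonical JSON encoding of payload and predecessor hash."""
    body = json.dumps(
        {"payload": payload, "previous_hash": previous_hash}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a metadata record that is linked to the previous record's hash."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {
            "payload": payload,
            "previous_hash": previous_hash,
            "hash": record_hash(payload, previous_hash),
        }
    )
    return chain


def verify_chain(chain):
    """Return True only if every link and every stored hash is intact."""
    previous_hash = GENESIS_HASH
    for record in chain:
        if record["previous_hash"] != previous_hash:
            return False
        if record["hash"] != record_hash(record["payload"], previous_hash):
            return False
        previous_hash = record["hash"]
    return True
```

Because each hash covers the previous one, changing any stored payload after the fact makes `verify_chain` fail from that record onward.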


Subjects
Blockchain; Computer Security; Social Responsibility; Reproducibility of Results; Humans; Confidentiality; Information Storage and Retrieval; Metadata
15.
Stud Health Technol Inform ; 316: 1413-1417, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176645

ABSTRACT

The National Research Data Infrastructure for Personal Health Data (NFDI4Health) uses Local Data Hubs (LDHs) to locally manage research studies, documents and sensitive personal data and to support controlled data sharing. While research data management (RDM) systems facilitate the storage and preparation of data and metadata as well as organizational access, they often lack support for interoperability standards of the application domain. To support the exchange with external registries of research studies, we chose 17 attributes to characterize the most relevant aspects of clinical trials (in the following named "metadata profile"). We implemented the metadata profile in the RDM system FAIRDOM SEEK using core attributes and SEEK's extended metadata feature and created a mapping conforming to the Health Level 7 Fast Healthcare Interoperability Resources (FHIR) standard version R4. Finally, we implemented a prototype application interface for exports in FHIR-JSON format. We plan to extend the interface to serve central registries and support specific FHIR Implementation Guides from various use cases.
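
To make the notion of a FHIR-JSON export concrete, the following Python sketch serializes a small study record as a FHIR R4 `ResearchStudy` resource. It is an assumption-laden illustration: the abstract does not list the 17 attributes or name the target resource, so the input keys (`id`, `id_system`, `title`, `status`, `description`) and the choice of `ResearchStudy` are hypothetical, and the code does not use or imitate the SEEK API.

```python
import json


def study_to_fhir_json(study):
    """Serialize a study dict as a minimal FHIR R4 ResearchStudy in JSON.

    The input keys are hypothetical names chosen for this sketch.
    """
    resource = {
        "resourceType": "ResearchStudy",
        "identifier": [{"system": study["id_system"], "value": study["id"]}],
        "title": study["title"],
        "status": study["status"],  # must be a ResearchStudy status code, e.g. "completed"
        "description": study.get("description", ""),
    }
    return json.dumps(resource, indent=2)


example = study_to_fhir_json(
    {
        "id": "study-001",
        "id_system": "https://example.org/studies",  # placeholder URI
        "title": "Example cohort study",
        "status": "completed",
    }
)
```

A real export would also need validation against the relevant FHIR profile, which this sketch leaves out.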


Subjects
Metadata; Metadata/standards; Data Management; Health Information Interoperability/standards; Humans; Registries; Information Dissemination; Health Information Exchange/standards
16.
Stud Health Technol Inform ; 316: 358-359, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176750

ABSTRACT

This work aims to improve the FAIRness of microneurography research by integrating local (meta)data into existing research data infrastructures. In previous work, we developed an odML-based solution for local metadata storage of microneurography data. However, this solution is limited to a narrow community. As a next step, we propose integration into the Local Data Hubs, data-sharing services within the NFDI4Health infrastructure. We outline a first concept that streams chosen data from the established odMLtables GUI.


Subjects
Metadata; Humans; Information Storage and Retrieval/methods; Information Dissemination
17.
Stud Health Technol Inform ; 316: 1960-1961, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176876

ABSTRACT

This work presents the Fast Healthcare Interoperability Resources (FHIR®) specification of the NFDI4Health Metadata schema based on FHIR Version 4: We created 16 profiles to facilitate the integration of clinical, epidemiological, and public health study data. Despite challenges arising from the extensive MDS as well as missing concepts in semantic standards, it marks a significant advance in applying information technology standards to health research.


Subjects
Health Information Interoperability; Health Level Seven; Metadata; Humans; Electronic Health Records; Epidemiologic Studies; Public Health; Biomedical Research
18.
Genome Biol ; 25(1): 205, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090672

ABSTRACT

Many datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While biospecimen and experimental information is often captured, detailed metadata standards related to data matrices and analysis workflows are currently lacking. To address this, we develop the matrix and analysis metadata standards (MAMS) to serve as a resource for data centers, repositories, and tool developers. We define metadata fields for matrices and parameters commonly utilized in analytical workflows and developed the rmams package to extract MAMS from single-cell objects. Overall, MAMS promotes the harmonization, integration, and reproducibility of single-cell data across platforms.


Subjects
Metadata; Single-Cell Analysis; Single-Cell Analysis/methods; Single-Cell Analysis/standards; Reproducibility of Results; Humans; Software
20.
Pharmacoepidemiol Drug Saf ; 33(8): e5871, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39145406

ABSTRACT

PURPOSE: Metadata for data dIscoverability aNd study rEplicability in obseRVAtional studies (MINERVA), a European Medicines Agency-funded project (EUPAS39322), defined a set of metadata to describe real-world data sources (RWDSs) and piloted metadata collection in a prototype catalogue to assist investigators from data source discoverability through study conduct. METHODS: A list of metadata was created from a review of existing metadata catalogues and recommendations, structured interviews, a stakeholder survey, and a technical workshop. The prototype was designed to comply with the FAIR principles (findable, accessible, interoperable, reusable), using MOLGENIS software. Metadata collection was piloted by 15 data access partners (DAPs) from across Europe. RESULTS: A total of 442 metadata variables were defined in six domains: institutions (organizations connected to a data source); data banks (data collections sustained by an organization); data sources (collections of linkable data banks covering a common underlying population); studies; networks (of institutions); and common data models (CDMs). A total of 26 institutions were recorded in the prototype. Each DAP populated the metadata of one data source and its selected data banks. The number of data banks varied by data source; the most common data banks were hospital administrative records and pharmacy dispensation records (10 data sources each). Quantitative metadata were successfully extracted from three data sources conforming to different CDMs and entered into the prototype. CONCLUSIONS: A metadata list was finalized, a prototype was successfully populated, and a good practice guide was developed. Setting up and maintaining a metadata catalogue on RWDSs will require substantial effort to support discoverability of data sources and reproducibility of studies in Europe.


Subjects
Metadata; Observational Studies as Topic; Europe; Humans; Pilot Projects; Reproducibility of Results; Observational Studies as Topic/methods; Data Collection/methods; Data Collection/standards; Databases, Factual/statistics & numerical data; Software; Pharmacoepidemiology/methods