Results 1 - 20 of 851
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38752856

ABSTRACT

Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. This is accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.


Subjects
Computational Biology , Software , Humans , Computational Biology/methods , Reproducibility of Results , Receptors, Immunologic/genetics , High-Throughput Nucleotide Sequencing/methods , Adaptive Immunity/genetics , Guidelines as Topic
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38836701

ABSTRACT

Biomedical data are generated and collected from various sources, including medical imaging, laboratory tests and genome sequencing. Sharing these data for research can help address unmet health needs, contribute to scientific breakthroughs, accelerate the development of more effective treatments and inform public health policy. Due to the potential sensitivity of such data, however, privacy concerns have led to policies that restrict data sharing. In addition, sharing sensitive data requires a secure and robust infrastructure with appropriate storage solutions. Here, we examine and compare the centralized and federated data sharing models through the prism of five large-scale and real-world use cases of strategic significance within the European data sharing landscape: the French Health Data Hub, the BBMRI-ERIC Colorectal Cancer Cohort, the federated European Genome-phenome Archive, the Observational Medical Outcomes Partnership/OHDSI network and the EBRAINS Medical Informatics Platform. Our analysis indicates that centralized models facilitate data linkage, harmonization and interoperability, while federated models facilitate scaling up and legal compliance, as the data typically reside on the data generator's premises, allowing for better control of how data are shared. This comparative study thus offers guidance on the selection of the most appropriate sharing strategy for sensitive datasets and provides key insights for informed decision-making in data sharing efforts.


Subjects
Biological Science Disciplines , Information Dissemination , Humans , Medical Informatics/methods
3.
Proc Natl Acad Sci U S A ; 120(43): e2206981120, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37831745

ABSTRACT

In January 2023, a new NIH policy on data sharing went into effect. The policy applies to both quantitative and qualitative research (QR) data such as data from interviews or focus groups. QR data are often sensitive and difficult to deidentify, and thus have rarely been shared in the United States. Over the past 5 years, our research team has engaged stakeholders on QR data sharing, developed software to support data deidentification, produced guidance, and collaborated with the ICPSR data repository to pilot the deposit of 30 QR datasets. In this perspective article, we share important lessons learned by addressing eight clusters of questions on issues such as where, when, and what to share; how to deidentify data and support high-quality secondary use; budgeting for data sharing; and the permissions needed to share data. We also offer a brief assessment of the state of preparedness of data repositories, QR journals, and QR textbooks to support data sharing. While QR data sharing could yield important benefits to the research community, we quickly need to develop enforceable standards, expertise, and resources to support responsible QR data sharing. Absent these resources, we risk violating participant confidentiality and wasting a significant amount of time and funding on data that are not useful for either secondary use or data transparency and verification.

4.
Syst Biol ; 73(1): 158-182, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38102727

ABSTRACT

Phylogenetic metrics are essential tools used in the study of ecology, evolution and conservation. Phylogenetic diversity (PD) in particular is one of the most prominent measures of biodiversity and is based on the idea that biological features accumulate along the edges of phylogenetic trees that are summed. We argue that PD and many other phylogenetic biodiversity metrics fail to capture an essential process that we term attrition. Attrition is the gradual loss of features through causes other than extinction. Here we introduce "EvoHeritage", a generalization of PD that is founded on the joint processes of accumulation and attrition of features. We argue that while PD measures evolutionary history, EvoHeritage is required to capture a more pertinent subset of evolutionary history including only components that have survived attrition. We show that EvoHeritage is not the same as PD on a tree with scaled edges; instead, accumulation and attrition interact in a more complex non-monophyletic way that cannot be captured by edge lengths alone. This leads us to speculate that the one-dimensional edge lengths of classic trees may be insufficiently flexible to capture the nuances of evolutionary processes. We derive a measure of EvoHeritage and show that it elegantly reproduces species richness and PD at opposite ends of a continuum based on the intensity of attrition. We demonstrate the utility of EvoHeritage in ecology as a predictor of community productivity compared with species richness and PD. We also show how EvoHeritage can quantify living fossils and resolve their associated controversy. We suggest how the existing calculus of PD-based metrics and other phylogenetic biodiversity metrics can and should be recast in terms of EvoHeritage accumulation and attrition.
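The classic PD calculation described in this abstract, summing the edge lengths that connect a set of species to the root, can be sketched in a few lines of Python. This is only an illustration of PD on a rooted tree; the abstract does not give the EvoHeritage formulas, so attrition is not modelled here. The tree, node names, edge lengths and the function name are all hypothetical.

```python
# Toy rooted tree: each node maps to (parent, length of the edge to that parent).
# Node names and edge lengths are hypothetical, for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("root", 3.0),
    "n1": ("root", 2.0),
}


def phylogenetic_diversity(tree, tips):
    """Sum the lengths of all edges on the paths from the given tips to the root."""
    counted = set()  # nodes whose parent edge has already been added
    total = 0.0
    for tip in tips:
        node = tip
        # Walk towards the root, counting each edge only once.
        while node in tree and node not in counted:
            parent, length = tree[node]
            total += length
            counted.add(node)
            node = parent
    return total


print(phylogenetic_diversity(TREE, ["A", "B"]))
print(phylogenetic_diversity(TREE, ["A", "B", "C"]))
```

Because shared edges are counted once, adding a close relative raises PD less than adding a distant one; under this sketch every unit of edge length counts equally, which is the assumption the authors argue attrition breaks.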


Subjects
Biodiversity , Phylogeny , Biological Evolution , Classification/methods , Models, Biological
5.
Cereb Cortex ; 34(2)2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38342691

ABSTRACT

Third-party punishment occurs in interpersonal interactions to sustain social norms, and is strongly influenced by the characteristics of the interacting individuals. During social interactions, height is among the first striking physical features observed, and a height disadvantage may critically influence men's behavior and mental health. Herein, we explored the influence of height disadvantage on third-party punishment through time-frequency analysis and electroencephalography hyperscanning. Two participants were randomly designated as the recipient and third party after height comparison and instructed to complete a third-party punishment task. The punishment rate and transfer amount were significantly higher when the third party was shorter than the recipient than when the third party was taller. Only for highly unfair offers, the theta power was significantly greater when the third party's height was lower. The inter-brain synchronization between the recipient and the third party was significantly stronger when the third party's height was lower. Compared with the fair and medium unfair offers, the inter-brain synchronization was strongest for highly unfair offers. Our findings indicate that height disadvantage-induced anger and reputation concern promote third-party punishment and inter-brain synchronization. This study enriches the research perspective and expands the application of the theory of the Napoleon complex.


Subjects
Electroencephalography , Punishment , Male , Humans , Punishment/psychology , Interpersonal Relations , Social Interaction , Brain
6.
Proteomics ; 24(3-4): e2300354, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38088481

ABSTRACT

In recent years, there has been a tremendous evolution in the high-throughput, tandem mass spectrometry-based analysis of intact proteins, also known as top-down proteomics (TDP). Both hardware and software have developed to the point that the technique has largely entered the mainstream, and large-scale, ambitious, multi-laboratory initiatives have started to make their appearance in the literature. For this, however, more convenient and robust data sharing and reuse will be required. Walzer et al. have created TopDownApp, a customisable, open platform for visualisation and analysis of TDP data, which they hope will be a step in this direction. As they point out, other benefits of such data sharing and interoperability would include reanalysis of published datasets, as well as the prospect of using large amounts of data to train machine learning algorithms. In time, this work could prove to be a valuable resource in the move towards a future of greater TDP data findability, accessibility, interoperability and reusability.


Subjects
Proteomics , Software , Proteomics/methods , Algorithms , Tandem Mass Spectrometry , DNA-Binding Proteins
7.
BMC Bioinformatics ; 25(1): 138, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553675

ABSTRACT

Even though high-throughput transcriptome sequencing is routinely performed in many laboratories, computational analysis of such data remains a cumbersome process often executed manually, hence error-prone and lacking reproducibility. For corresponding data processing, we introduce Curare, an easy-to-use yet versatile workflow builder for analyzing high-throughput RNA-Seq data focusing on differential gene expression experiments. Data analysis with Curare is customizable and subdivided into preprocessing, quality control, mapping, and downstream analysis stages, providing multiple options for each step while ensuring the reproducibility of the workflow. For a fast and straightforward exploration and visualization of differential gene expression results, we provide the gene expression visualizer software GenExVis. GenExVis can create various charts and tables from simple gene expression tables and DESeq2 results without the requirement to upload data or install software packages. In combination, Curare and GenExVis provide a comprehensive software environment that supports the entire data analysis process, from the initial handling of raw RNA-Seq data to the final DGE analyses and result visualizations, thereby significantly easing data processing and subsequent interpretation.


Subjects
Curare , RNA-Seq , Reproducibility of Results , Sequence Analysis, RNA/methods , Transcriptome , Software , High-Throughput Nucleotide Sequencing/methods , Gene Expression Profiling/methods
8.
J Lipid Res ; 65(9): 100621, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39151590

ABSTRACT

The rapid increase in lipidomic studies has led to a collaborative effort within the community to establish standards and criteria for producing, documenting, and disseminating data. Creating a dynamic easy-to-use checklist that condenses key information about lipidomic experiments into common terminology will enhance the field's consistency, comparability, and repeatability. Here, we describe the structure and rationale of the established Lipidomics Minimal Reporting Checklist to increase transparency in lipidomics research.


Subjects
Checklist , Lipidomics , Lipidomics/methods , Lipidomics/standards , Humans , Lipids/analysis , Lipids/chemistry
9.
Plant J ; 116(4): 974-988, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37818860

ABSTRACT

In modern reproducible, hypothesis-driven plant research, scientists are increasingly relying on research data management (RDM) services and infrastructures to streamline the processes of collecting, processing, sharing, and archiving research data. FAIR (i.e., findable, accessible, interoperable, and reusable) research data play a pivotal role in enabling the integration of interdisciplinary knowledge and facilitating the comparison and synthesis of a wide range of analytical findings. The PLANTdataHUB offers a solution that realizes RDM of scientific (meta)data as evolving collections of files in a directory - yielding FAIR digital objects called ARCs - with tools that enable scientists to plan, communicate, collaborate, publish, and reuse data on the same platform while gaining continuous quality control insights. The centralized platform is scalable from personal use to global communities and provides advanced federation capabilities for institutions that prefer to host their own satellite instances. This approach borrows many concepts from software development and adapts them to fit the challenges of the field of modern plant science undergoing digital transformation. The PLANTdataHUB supports researchers in each stage of a scientific project with adaptable continuous quality control insights, from the early planning phase to data publication. The central live instance of PLANTdataHUB is accessible at (https://git.nfdi4plants.org), and it will continue to evolve as a community-driven and dynamic resource that serves the needs of contemporary plant science.


Subjects
Databases as Topic , Information Dissemination , Plants
10.
Brief Bioinform ; 23(1)2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34472587

ABSTRACT

Chemosensitivity assays are commonly used for preclinical drug discovery and clinical trial optimization. However, data from independent assays are often discordant, largely attributed to uncharacterized variation in the experimental materials and protocols. We report here the launching of Minimal Information for Chemosensitivity Assays (MICHA), accessed via https://micha-protocol.org. Distinguished from existing efforts that are often lacking support from data integration tools, MICHA can automatically extract publicly available information to facilitate the assay annotation including: 1) compounds, 2) samples, 3) reagents and 4) data processing methods. For example, MICHA provides an integrative web server and database to obtain compound annotation including chemical structures, targets and disease indications. In addition, the annotation of cell line samples, assay protocols and literature references can be greatly eased by retrieving manually curated catalogues. Once the annotation is complete, MICHA can export a report that conforms to the FAIR principle (Findable, Accessible, Interoperable and Reusable) of drug screening studies. To consolidate the utility of MICHA, we provide FAIRified protocols from five major cancer drug screening studies as well as six recently conducted COVID-19 studies. With the MICHA web server and database, we envisage a wider adoption of a community-driven effort to improve the open access of drug sensitivity assays.

11.
J Exp Biol ; 227(18)2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39287119

ABSTRACT

JEB has broadened its scope to include non-hypothesis-led research. In this Perspective, based on our lab's lived experience, I argue that this is excellent news, because truly novel insights can occur from 'blue skies' idea-led experiments. Hypothesis-led and hypothesis-free experimentation are not philosophically antagonistic; rather, the latter can provide a short-cut to an unbiased view of organism function, and is intrinsically hypothesis generating. Insights derived from hypothesis-free research are commonly obtained by the generation and analysis of big datasets - for example, by genetic screens - or from omics-led approaches (notably transcriptomics). Furthermore, meta-analyses of existing datasets can also provide a lower-cost means to formulating new hypotheses, specifically if researchers take advantage of the FAIR principles (findability, accessibility, interoperability and reusability) to access relevant, publicly available datasets. The broadened scope will thus bring new, original work and novel insights to our journal, by expanding the range of fundamental questions that can be asked.


Subjects
Big Data
12.
J Microsc ; 294(3): 350-371, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38752662

ABSTRACT

Bioimage data are generated in diverse research fields throughout the life and biomedical sciences. Their potential for advancing scientific progress via modern, data-driven discovery approaches reaches beyond disciplinary borders. To fully exploit this potential, it is necessary to make bioimaging data (in general, multidimensional microscopy images and image series) FAIR, that is, findable, accessible, interoperable and reusable. These FAIR principles for research data management are now widely accepted in the scientific community and have been adopted by funding agencies, policymakers and publishers. To remain competitive and at the forefront of research, implementing the FAIR principles into daily routines is an essential but challenging task for researchers and research infrastructures. Imaging core facilities, well-established providers of access to imaging equipment and expertise, are in an excellent position to lead this transformation in bioimaging research data management. They are positioned at the intersection of research groups, IT infrastructure providers, the institution's administration, and microscope vendors. In the frame of German BioImaging - Society for Microscopy and Image Analysis (GerBI-GMB), cross-institutional working groups and third-party funded projects were initiated in recent years to advance the bioimaging community's capability and capacity for FAIR bioimage data management. Here, we provide an imaging-core-facility-centric perspective outlining the experience and current strategies in Germany to facilitate the practical adoption of the FAIR principles closely aligned with the international bioimaging community. We highlight which tools and services are ready to be implemented and what the future directions for FAIR bioimage data have to offer.


Subjects
Microscopy , Biomedical Research/methods , Data Management/methods , Image Processing, Computer-Assisted/methods , Microscopy/methods
13.
J Microsc ; 2024 Sep 14.
Article in English | MEDLINE | ID: mdl-39275979

ABSTRACT

Modern bioimaging core facilities at research institutions are essential for managing and maintaining high-end instruments, providing training and support for researchers in experimental design, image acquisition and data analysis. An important task for these facilities is the professional management of complex multidimensional bioimaging data, which are often produced in large quantity and very different file formats. This article details the process that led to successfully implementing the OME Remote Objects system (OMERO) for bioimage-specific research data management (RDM) at the Core Facility Cellular Imaging (CFCI) at the Technische Universität Dresden (TU Dresden). Ensuring compliance with the FAIR (findable, accessible, interoperable, reusable) principles, we outline here the challenges that we faced in adapting data handling and storage to a new RDM system. These challenges included the introduction of a standardised group-specific naming convention, metadata curation with tagging and Key-Value pairs, and integration of existing image processing workflows. By sharing our experiences, this article aims to provide insights and recommendations for both individual researchers and educational institutions intending to implement OMERO as a management system for bioimaging data. We showcase how tailored decisions and structured approaches lead to successful outcomes in RDM practices.

14.
Value Health ; 27(10): 1348-1357, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39154910

ABSTRACT

OBJECTIVES: By September 2024, the Centers for Medicare and Medicaid Services (CMS) will publicly report the negotiated prices (Maximum Fair Prices) for the first 10 drugs selected for price negotiation. We estimate initial price offers based on net prices, statutorily defined ceilings, and comparative effectiveness data for the 10 drugs and their therapeutic alternatives. METHODS: We utilized net prices and other price benchmarks for the 10 drugs and their therapeutic alternatives. We searched for data on comparative clinical effectiveness for the primary indications. We outlined a range of plausible initial price offers based on CMS guidance and our interpretation of regulatory intent. RESULTS: For ibrutinib and ustekinumab, statutorily defined ceiling prices will likely determine the initial price offers. The integration of net pricing and clinical evidence from comparator branded products will inform the initial price offers for apixaban, empagliflozin, etanercept, and insulin aspart. Rivaroxaban and sacubitril/valsartan have therapeutic alternatives that are generics; therefore, CMS may apply a discount to current net prices. To achieve savings in the negotiation of dapagliflozin and sitagliptin, CMS will have to leverage additional negotiation factors because statutorily defined ceilings and net prices of therapeutic alternatives are similar or higher. CONCLUSIONS: This analysis sheds light on important price benchmarks and clinical evidence factors for the determination of the initial price offers. Although we were not able to simulate the offer and counter-offer process, our findings provide a transparent and systematic way to produce initial offers that are consistent with CMS guidance.


Subjects
Benchmarking , Drug Costs , United States , Humans , Negotiating , Centers for Medicare and Medicaid Services, U.S. , Medicare/economics , Comparative Effectiveness Research
15.
BMC Infect Dis ; 24(1): 338, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38515014

ABSTRACT

BACKGROUND: Studies have shown that infectious diseases cause the majority of deaths among under-five children. Worldwide, Acute Respiratory Infection (ARI) continues to be the second most frequent cause of illness and mortality among children under the age of five. The paramount disease burden in developing nations, including Ethiopia, is still ARI. OBJECTIVE: This study aims to determine the magnitude and predictors of ARI among under-five children in Ethiopia using state-of-the-art machine learning algorithms. METHODS: Data for this study were derived from the 2016 Ethiopian Demographic and Health Survey. To predict the determinants of acute respiratory infections, we performed several experiments on ten machine learning algorithms (random forests, decision trees, support vector machines, Naïve Bayes, K-nearest neighbors, Lasso regression, GBoost, and XGBoost), including one classic logistic regression model and an ensemble of the best performing models. The prediction ability of each machine-learning model was assessed using receiver operating characteristic curves, precision-recall curves, and classification metrics. RESULTS: The total ARI prevalence rate among 9501 under-five children in Ethiopia was 7.2%, according to the findings of the study. The ensemble model of SVM, GBoost, and XGBoost showed an improved performance in classifying ARI cases with an accuracy of 86%, a sensitivity of 84.6%, and an AUC-ROC of 0.87. The highest performing predictive model (the ensemble model) showed that the child's age, history of diarrhea, wealth index, type of toilet, mother's educational level, number of living children, mother's occupation, and type of fuel they used were important predictors of acute respiratory infection among under-five children. CONCLUSION: The intricate web of factors contributing to ARI among under-five children was identified using an advanced machine learning algorithm. The child's age, history of diarrhea, wealth index, and type of toilet were among the top factors identified using the ensemble model that registered a performance of 86% accuracy. This study stands as a testament to the potential of advanced data-driven methodologies in unraveling the complexities of ARI in low-income settings.
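The classification metrics named in this abstract (accuracy, sensitivity) follow directly from the confusion matrix. A minimal Python sketch, assuming binary labels where 1 marks an ARI case; the function name and the toy labels are hypothetical and are not data from the study:

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (recall) and precision for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
print(classification_metrics(y_true, y_pred))
```

With a prevalence as low as the 7.2% reported here, a model that predicts "no ARI" for every child would already be about 93% accurate, so sensitivity and the precision-recall curve are the more informative figures.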


Subjects
Child Health , Respiratory Tract Infections , Child , Humans , Bayes Theorem , Respiratory Tract Infections/diagnosis , Respiratory Tract Infections/epidemiology , Machine Learning , Diarrhea/epidemiology , Demography , Power, Psychological
16.
Int J Equity Health ; 23(1): 82, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664773

ABSTRACT

BACKGROUND: In South Korea, Korean Chinese workers experience ethnic discrimination although they share physical similarities and ethnic heritage with native-born Koreans. This study aimed to examine whether perceived ethnic discrimination is associated with poor self-rated health and whether the association differs by gender among Korean Chinese waged workers in South Korea. METHODS: We conducted a pooled cross-sectional analysis using data of 13,443 Korean Chinese waged workers from the Survey on Immigrants' Living Conditions and Labor Force conducted in 2018, 2020, and 2022. Based on perceived ethnic discrimination, asking for fair treatment, and subsequent situational improvement, respondents were classified into the following four groups: "Not experienced," "Experienced, not asked for fair treatment," "Experienced, asked for fair treatment, not improved," and "Experienced, asked for fair treatment, improved." Poor self-rated health was assessed using a single question "How is your current overall health?" We applied logistic regression to examine the association between perceived ethnic discrimination and poor self-rated health, with gender-stratified analyses. RESULTS: We found an association between ethnic discrimination and poor self-rated health among Korean Chinese waged workers. In the gender-stratified analysis, the "Experienced, not asked for fair treatment" group was more likely to report poor self-rated health compared to the "Not experienced" group, regardless of gender. However, gender differences were observed in the group stratified by situational improvements. For male workers, no statistically significant association was found in the "Experienced, asked for fair treatment, improved" group with poor self-rated health (odds ratio: 0.87, 95% confidence interval: 0.30-2.53). Conversely, among female workers, a statistically significant association was observed (odds ratio: 2.63, 95% confidence interval: 1.29-5.38). CONCLUSIONS: This study is the first to find an association between perceived ethnic discrimination and poor self-rated health, along with gender differences in the association between situational improvements after asking for fair treatment and poor self-rated health among Korean Chinese waged workers in South Korea.
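For readers less familiar with odds ratios and confidence intervals, the following Python sketch shows how an unadjusted odds ratio and its Wald 95% interval are obtained from a 2x2 table. It is a simplification: the study used logistic regression, whose covariates the abstract does not list, so this is not the authors' calculation. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
print(odds_ratio_with_ci(20, 80, 10, 90))
```

An interval that contains 1, such as the 0.30-2.53 reported for male workers, is compatible with no association; the interval for female workers (1.29-5.38) excludes 1, which is why that association is reported as statistically significant.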


Subjects
Diagnostic Self Evaluation , Health Status , Perceived Discrimination , Adult , Female , Humans , Male , Middle Aged , Young Adult , China/ethnology , Cross-Sectional Studies , East Asian People , Racism , Republic of Korea , Sex Factors , Surveys and Questionnaires
17.
Philos Trans A Math Phys Eng Sci ; 382(2275): 20230121, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38910400

ABSTRACT

The Facility for Antiproton and Ion Research (FAIR) is in its final construction stage next to the campus of the Gesellschaft für Schwerionenforschung Helmholtzzentrum for heavy-ion research in Darmstadt, Germany. Once it starts operation, it will be the main nuclear physics research facility in many basic sciences and their applications in Europe for the coming decades. Owing to the ability of the new fragment separator, Super-FRagment Separator, to produce high-intensity radioactive ion beams in the energy range up to about 2 GeV/nucleon, these can be used in various nuclear reactions. This opens a unique opportunity for various nuclear structure studies across a range of fields and scales: from low-energy physics via the investigation of multi-neutron systems and halos to high-density nuclear matter and the equation of state, following heavy-ion collisions, fission and study of short-range correlations in nuclei and hypernuclei. The newly developed reactions with relativistic radioactive beams (R3B) setup at FAIR would be the most suitable and versatile for such studies. An overview of highlighted physics cases foreseen at R3B is given, along with possible future opportunities, at FAIR. This article is part of the theme issue 'The liminal position of Nuclear Physics: from hadrons to neutron stars'.

18.
J Biomed Inform ; 151: 104614, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38395099

ABSTRACT

OBJECTIVES: The objective of this study is to describe how OCRx (Canadian Drug Ontology) has been built to address the dual need for local drug information integration in Canada and alignment with international standards requirements. METHODS: This paper delves into (i) the implementation efforts to meet the Identification of Medicinal Product (IDMP) requirements in OCRx, alongside the ontology update strategy, (ii) the structure of the ontology itself, (iii) the alignment approach with several reference Knowledge Organization Systems, including SNOMED CT, RxNorm, and the list of "Code Identifiant de Spécialité" (CIS-Code), and (iv) the look-up services developed to facilitate its access and utilization. RESULTS: Each OCRx release contains two distinct versions: the full and the up-to-date version. The full version encompasses all drugs with a DIN code sanctioned by Health Canada, while the up-to-date version is limited to drugs currently marketed in Canada. In the last release of OCRx, the full version comprises 162,400 classes; meanwhile, the up-to-date version consists of 36,909 classes. In terms of mappings with OCRx, substances in RxNorm and SNOMED CT fall below 40%, registering at 37% and 22% respectively. Meanwhile, mappings for CIS-Code achieve coverage of 61%. The strength mappings are notably low for RxNorm at 40% and for CIS-Code at 28%. This affects the mapping of clinical drugs, which are predominantly alignable through post-coordinated expressions: 56% for RxNorm, 80% for SNOMED CT, and 35% for CIS-Code. The main support service of OCRx is a look-up service known as PaperRx that displays OCRx's entities based on description logic queries (DL-queries) performed through the classified structure of OCRx. The look-up services also contain a SPARQL endpoint, an OCRx OWL file downloader, and a RESTful API. DISCUSSION: The OCRx ontology demonstrates a significant effort towards integrating Canadian drug information with international standards. However, there are areas for improvement. In the future, our focus will be on refining the structure of OCRx for better classification capability and improvement of dosage conversion. Additionally, we aim to harness OCRx in constructing an ontology-based annotator, setting our sights on its deployment in real-world data integration scenarios.


Subjects
Systematized Nomenclature of Medicine , Vocabulary, Controlled , Canada , Reference Standards , Internationality
19.
J Biomed Inform ; 151: 104622, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38452862

ABSTRACT

OBJECTIVE: The integration of artificial intelligence (AI) and machine learning (ML) in health care to aid clinical decisions is widespread. However, as AI and ML take important roles in health care, there are concerns about AI- and ML-associated fairness and bias. That is, an AI tool may have a disparate impact, with its benefits and drawbacks unevenly distributed across societal strata and subpopulations, potentially exacerbating existing health inequities. Thus, the objectives of this scoping review were to summarize existing literature and identify gaps in the topic of tackling algorithmic bias and optimizing fairness in AI/ML models using real-world data (RWD) in health care domains. METHODS: We conducted a thorough review of techniques for assessing and optimizing AI/ML model fairness in health care when using RWD in health care domains. The focus lies on appraising different quantification metrics for assessing fairness, publicly accessible datasets for ML fairness research, and bias mitigation approaches. RESULTS: We identified 11 papers that are focused on optimizing model fairness in health care applications. The current research on mitigating bias issues in RWD is limited, both in terms of disease variety and health care applications, as well as the accessibility of public datasets for ML fairness research. Existing studies often indicate positive outcomes when using pre-processing techniques to address algorithmic bias. There remain unresolved questions within the field that require further research, including pinpointing the root causes of bias in ML models, broadening fairness research in AI/ML with the use of RWD and exploring its implications in healthcare settings, and evaluating and addressing bias in multi-modal data. CONCLUSION: This paper provides useful reference material and insights to researchers regarding AI/ML fairness in real-world health care data and reveals the gaps in the field. Fair AI/ML in health care is a burgeoning field that requires a heightened research focus to cover diverse applications and different types of RWD.
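As an illustration of what a fairness quantification metric looks like, the Python sketch below computes the demographic parity difference, one commonly used group fairness measure. The abstract does not say which metrics the review appraises, so the choice of metric is an assumption, and the function name, predictions and group labels are hypothetical.

```python
def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest positive-prediction rate across groups.

    y_pred: binary predictions (0 or 1); groups: group label for each prediction.
    A value of 0 means every group receives positive predictions at the same rate.
    """
    rates = {}
    for group in set(groups):
        preds = [p for p, g in zip(y_pred, groups) if g == group]
        rates[group] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())


# Toy predictions for two hypothetical subpopulations "a" and "b".
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, groups))
```

This metric looks only at predictions, not at true outcomes, so it can flag a disparate impact even when a model is equally accurate in every group; other metrics trade that property for sensitivity to error rates.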


Subjects
Artificial Intelligence , Machine Learning , Humans , Benchmarking , Research Personnel
20.
J Biomed Inform ; 154: 104647, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38692465

ABSTRACT

OBJECTIVE: To use software, datasets, and data formats in the domain of Infectious Disease Epidemiology as a test collection to evaluate a novel M1 use case, which we introduce in this paper. M1 is a machine that upon receipt of a new digital object of research exhaustively finds all valid compositions of it with existing objects. METHOD: We implemented a data-format-matching-only M1 using exhaustive search, which we refer to as M1DFM. We then ran M1DFM on the test collection and used error analysis to identify needed semantic constraints. RESULTS: Precision of M1DFM search was 61.7%. Error analysis identified needed semantic constraints and needed changes in handling of data services. Most semantic constraints were simple, but one data format was sufficiently complex to be practically impossible to represent semantic constraints over, from which we conclude limitatively that software developers will have to meet the machines halfway by engineering software whose inputs are sufficiently simple that their semantic constraints can be represented, akin to the simple APIs of services. We summarize these insights as M1-FAIR guiding principles for composability and suggest a roadmap for progressively capable devices in the service of reuse and accelerated scientific discovery. CONCLUSION: Algorithmic search of digital repositories for valid workflow compositions has potential to accelerate scientific discovery but requires a scalable solution to the problem of knowledge acquisition about semantic constraints on software inputs. Additionally, practical limitations on the logical complexity of semantic constraints must be respected, which has implications for the design of software.
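The data-format-matching search described in this abstract can be pictured with a small Python sketch: when a new digital object arrives, every existing object is checked for a shared format on the producing and consuming side. This is an illustration of format matching only, not the authors' M1DFM implementation; the class, function, object names and formats are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalObject:
    name: str
    inputs: frozenset   # data formats the object consumes
    outputs: frozenset  # data formats the object produces


def find_compositions(new_obj, existing):
    """Exhaustively list (producer, format, consumer) links involving new_obj."""
    compositions = []
    for other in existing:
        # The new object feeds an existing one.
        for fmt in sorted(new_obj.outputs & other.inputs):
            compositions.append((new_obj.name, fmt, other.name))
        # An existing object feeds the new one.
        for fmt in sorted(other.outputs & new_obj.inputs):
            compositions.append((other.name, fmt, new_obj.name))
    return compositions


# Hypothetical repository contents, for illustration only.
repository = [
    DigitalObject("case_counts", frozenset(), frozenset({"csv"})),
    DigitalObject("map_viewer", frozenset({"json"}), frozenset()),
    DigitalObject("archive_tool", frozenset({"xml"}), frozenset()),
]
model = DigitalObject("epidemic_model", frozenset({"csv"}), frozenset({"json"}))
print(find_compositions(model, repository))
```

Matching on format alone accepts any CSV as input to any CSV-reading program, whether or not its columns mean what the program expects. That is the kind of false positive behind the 61.7% precision reported above, and the reason the authors call for semantic constraints on software inputs.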


Subjects
Software , Humans , Semantics , Machine Learning , Algorithms , Databases, Factual