Results 1 - 20 of 58
1.
Bioinformatics ; 40(7), 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38913850

ABSTRACT

MOTIVATION: Human Phenotype Ontology (HPO)-based phenotype concept recognition (CR) underpins a faster and more effective mechanism to create patient phenotype profiles or to document novel phenotype-centred knowledge statements. While the increasing adoption of large language models (LLMs) for natural language understanding has led to several LLM-based solutions, we argue that their intrinsic resource-intensive nature is not suitable for realistic management of the phenotype CR lifecycle. Consequently, we propose to go back to the basics and adopt a dictionary-based approach that enables both an immediate refresh of the ontological concepts and efficient re-analysis of past data. RESULTS: We developed a dictionary-based approach that uses a pre-built large collection of clusters of morphologically equivalent tokens to address lexical variability, together with a more effective CR step that restricts entity boundary detection strictly to candidates consisting of tokens belonging to ontology concepts. Our method achieves state-of-the-art results (0.76 F1 on the GSC+ corpus) and a processing efficiency of 10 000 publication abstracts in 5 s. AVAILABILITY AND IMPLEMENTATION: FastHPOCR is available as a Python package installable via pip. The source code is available at https://github.com/tudorgroza/fast_hpo_cr. A Java implementation of FastHPOCR will be made available as part of the Fenominal Java library available at https://github.com/monarch-initiative/fenominal. The up-to-date GSC-2024 corpus is available at https://github.com/tudorgroza/code-for-papers/tree/main/gsc-2024.
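
The dictionary-based idea in the abstract above can be illustrated with a minimal sketch, assuming a toy two-concept ontology and a hand-written table of token clusters. This is not the FastHPOCR implementation or its API: `ONTOLOGY`, `TOKEN_CLUSTERS`, `normalise`, `build_index` and `recognise` are hypothetical names introduced here for illustration only.

```python
import re

# Hypothetical toy ontology: concept ID -> label.
ONTOLOGY = {
    "HP:0001250": "Seizure",
    "HP:0001263": "Global developmental delay",
}

# Hypothetical clusters of morphologically equivalent tokens,
# stored as variant -> canonical form.
TOKEN_CLUSTERS = {
    "seizures": "seizure",
    "delays": "delay",
    "delayed": "delay",
}


def normalise(token):
    """Lower-case a token and map it to its cluster's canonical form."""
    token = token.lower()
    return TOKEN_CLUSTERS.get(token, token)


def build_index(ontology):
    """Map the set of normalised tokens of each label to its concept ID."""
    index = {}
    for concept_id, label in ontology.items():
        key = frozenset(normalise(t) for t in re.findall(r"[A-Za-z]+", label))
        index[key] = concept_id
    return index


def recognise(text, index, max_window=4):
    """Return concept IDs found in text, scanning left to right.

    Candidate spans are built only from tokens that occur in some
    ontology label, and matching ignores word order inside a span.
    """
    vocabulary = set().union(*index.keys())
    tokens = [normalise(t) for t in re.findall(r"[A-Za-z]+", text)]
    hits = []
    i = 0
    while i < len(tokens):
        if tokens[i] not in vocabulary:
            i += 1
            continue
        # Extend the run while tokens stay inside the ontology vocabulary.
        j = i
        while j < len(tokens) and tokens[j] in vocabulary and j - i < max_window:
            j += 1
        # Try the longest span first, then shorter ones starting at i.
        for end in range(j, i, -1):
            concept_id = index.get(frozenset(tokens[i:end]))
            if concept_id is not None:
                hits.append(concept_id)
                i = end
                break
        else:
            i += 1
    return hits


index = build_index(ONTOLOGY)
found = recognise("The patient had seizures and global developmental delays.", index)
```

Because the index is a plain dictionary, refreshing it after an ontology release only requires calling `build_index` again, which is the property the abstract emphasises.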


Subject(s)
Biological Ontologies , Phenotype , Humans , Natural Language Processing , Software , Algorithms
2.
Nucleic Acids Res ; 51(D1): D1360-D1366, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399494

ABSTRACT

PDCM Finder (www.cancermodels.org) is a cancer research platform that aggregates clinical, genomic and functional data from patient-derived xenografts, organoids and cell lines. It was launched in April 2022 as a successor of the PDX Finder portal, which focused solely on patient-derived xenograft models. Currently the portal has over 6200 models across 13 cancer types, including rare paediatric models (17%) and models from minority ethnic backgrounds (33%), making it the largest free to consumer and open access resource of this kind. The PDCM Finder standardises, harmonises and integrates the complex and diverse data associated with PDCMs for the cancer community and displays over 90 million data points across a variety of data types (clinical metadata, molecular and treatment-based). PDCM data is FAIR and underpins the generation and testing of new hypotheses in cancer mechanisms and personalised medicine development.


Subject(s)
Neoplasms , Humans , Child , Neoplasms/genetics , Neoplasms/therapy , Organoids , Xenograft Model Antitumor Assays
3.
Nucleic Acids Res ; 51(D1): D1038-D1045, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36305825

ABSTRACT

The International Mouse Phenotyping Consortium (IMPC; https://www.mousephenotype.org/) web portal makes available curated, integrated and analysed knockout mouse phenotyping data generated by the IMPC project consisting of 85M data points and over 95,000 statistically significant phenotype hits mapped to human diseases. The IMPC portal delivers a substantial reference dataset that supports the enrichment of various domain-specific projects and databases, as well as the wider research and clinical community, where the IMPC genotype-phenotype knowledge contributes to the molecular diagnosis of patients affected by rare disorders. Data from 9,000 mouse lines and 750 000 images provides vital resources enabling the interpretation of the ignorome, and advancing our knowledge on mammalian gene function and the mechanisms underlying phenotypes associated with human diseases. The resource is widely integrated and the lines have been used in over 4,600 publications indicating the value of the data and the materials.


Subject(s)
Databases, Factual , Disease Models, Animal , Mice, Knockout , Animals , Humans , Mice , Phenotype
4.
Nucleic Acids Res ; 51(D1): D977-D985, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350656

ABSTRACT

The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to >200 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for >45 000 published GWAS across >5000 human traits, and >40 000 full P-value summary statistics datasets. Content is curated from publications or acquired via author submission of prepublication summary statistics through a new submission portal and validation tool. GWAS data volume has vastly increased in recent years. We have updated our software to meet this scaling challenge and to enable rapid release of submitted summary statistics. The scope of the repository has expanded to include additional data types of high interest to the community, including sequencing-based GWAS, gene-based analyses and copy number variation analyses. Community outreach has increased the number of shared datasets from under-represented traits, e.g. cancer, and we continue to contribute to awareness of the lack of population diversity in GWAS. Interoperability of the Catalog has been enhanced through links to other resources including the Polygenic Score Catalog and the International Mouse Phenotyping Consortium, refinements to GWAS trait annotation, and the development of a standard format for GWAS data.


Subject(s)
Genome-Wide Association Study , Knowledge Bases , Animals , Humans , Mice , DNA Copy Number Variations , National Human Genome Research Institute (U.S.) , Phenotype , Polymorphism, Single Nucleotide , Software , United States
5.
Bioinformatics ; 39(12), 2023 12 01.
Article in English | MEDLINE | ID: mdl-38001031

ABSTRACT

MOTIVATION: Methods for concept recognition (CR) in clinical texts have largely been tested on abstracts or articles from the medical literature. However, texts from electronic health records (EHRs) frequently contain spelling errors, abbreviations, and other nonstandard ways of representing clinical concepts. RESULTS: Here, we present a method inspired by the BLAST algorithm for biosequence alignment that screens texts for potential matches on the basis of matching k-mer counts and scores candidates based on conformance to typical patterns of spelling errors derived from 2.9 million clinical notes. Our method, the Term-BLAST-like alignment tool (TBLAT), leverages a gold standard corpus for typographical errors to implement a sequence alignment-inspired method for efficient entity linkage. We present a comprehensive experimental comparison of TBLAT with five widely used tools. Experimental results show an increase of 10% in recall on scientific publications and a 20% increase in recall on EHR records (when compared against the next best method), hence supporting a significant enhancement of the entity linking task. The method can be used stand-alone or as a complement to existing approaches. AVAILABILITY AND IMPLEMENTATION: Fenominal is a Java library that implements TBLAT for named CR of Human Phenotype Ontology terms and is available at https://github.com/monarch-initiative/fenominal under the GNU General Public License v3.0.
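
The screening step described above (selecting candidate terms by shared k-mer counts) can be sketched roughly as follows. This is only an illustration of k-mer screening in general, not the TBLAT code in Fenominal: the function names, the choice of k = 3 and the 0.5 threshold are assumptions, and the second stage that scores candidates against learned spelling-error patterns is not reproduced.

```python
from collections import Counter


def kmers(text, k=3):
    """Count all overlapping k-mers of a lower-cased string."""
    text = text.lower()
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def shared_kmers(a, b, k=3):
    """Number of k-mers two strings have in common (multiset intersection)."""
    return sum((kmers(a, k) & kmers(b, k)).values())


def screen(window, terms, k=3, min_fraction=0.5):
    """Return (term, fraction) pairs for terms that pass the k-mer screen.

    fraction is the share of the term's k-mers that also occur in the
    text window; terms below min_fraction are discarded.
    """
    candidates = []
    for term in terms:
        total = sum(kmers(term, k).values())
        if total == 0:
            continue  # term shorter than k, nothing to compare
        fraction = shared_kmers(window, term, k) / total
        if fraction >= min_fraction:
            candidates.append((term, round(fraction, 2)))
    return sorted(candidates, key=lambda c: -c[1])


# "hypertelorsim" is a transposition typo of "hypertelorism".
result = screen("hypertelorsim", ["hypertelorism", "hypotonia"])
```

The misspelled window still shares 8 of the 11 trigrams of the correct term, so it survives the screen, while the unrelated term shares only one trigram and is dropped.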


Subject(s)
Algorithms , Language , Humans , Sequence Alignment , Electronic Health Records , Publications
6.
BMC Med Inform Decis Mak ; 24(1): 30, 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38297371

ABSTRACT

OBJECTIVE: Clinical deep phenotyping and phenotype annotation play a critical role both in the diagnosis of patients with rare disorders and in building computationally-tractable knowledge in the rare disorders field. These processes rely on using ontology concepts, often from the Human Phenotype Ontology, in conjunction with a phenotype concept recognition task (supported usually by machine learning methods) to curate patient profiles or existing scientific literature. With the significant shift in the use of large language models (LLMs) for most NLP tasks, we examine the performance of the latest Generative Pre-trained Transformer (GPT) models underpinning ChatGPT as a foundation for the tasks of clinical phenotyping and phenotype annotation. MATERIALS AND METHODS: The experimental setup of the study included seven prompts of various levels of specificity, two GPT models (gpt-3.5-turbo and gpt-4.0) and two established gold standard corpora for phenotype recognition, one consisting of publication abstracts and the other clinical observations. RESULTS: The best run, using in-context learning, achieved 0.58 document-level F1 score on publication abstracts and 0.75 document-level F1 score on clinical observations, as well as a mention-level F1 score of 0.7, which surpasses the current best-in-class tool. Without in-context learning, however, performance is significantly below the existing approaches. CONCLUSION: Our experiments show that gpt-4.0 surpasses the state-of-the-art performance if the task is constrained to a subset of the target ontology where there is prior knowledge of the terms that are expected to be matched. While the results are promising, the non-deterministic nature of the outcomes, the high cost and the lack of concordance between different runs using the same prompt and input make the use of these LLMs challenging for this particular task.
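
For readers unfamiliar with the metric, a document-level F1 score can be computed along the following lines. The sketch assumes that each document is reduced to the set of unique concept IDs it contains and that counts are pooled over all documents (micro-averaging); the abstract does not state the exact protocol, so this definition and the name `document_level_f1` are assumptions.

```python
def document_level_f1(gold, predicted):
    """Micro-averaged F1 over per-document sets of concept IDs.

    gold and predicted map a document ID to a set of concept IDs.
    """
    tp = fp = fn = 0
    for doc_id in set(gold) | set(predicted):
        g = gold.get(doc_id, set())
        p = predicted.get(doc_id, set())
        tp += len(g & p)  # concepts correctly recognised
        fp += len(p - g)  # concepts predicted but not in the gold standard
        fn += len(g - p)  # gold concepts that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder concept IDs, for illustration only.
gold = {"doc1": {"C1", "C2"}, "doc2": {"C3"}}
predicted = {"doc1": {"C1"}, "doc2": {"C3", "C4"}}
score = document_level_f1(gold, predicted)
```

In this toy case there are two true positives, one false positive and one false negative, so precision and recall are both 2/3. A mention-level score would instead compare individual text spans, which is why the two figures reported above can differ.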


Subject(s)
Knowledge , Language , Humans , Machine Learning , Phenotype , Rare Diseases
7.
Mamm Genome ; 34(3): 379-388, 2023 09.
Article in English | MEDLINE | ID: mdl-37154937

ABSTRACT

Experiments in which data are collected by multiple independent resources, including multicentre data, different laboratories within the same centre or with different operators, are challenging in design, data collection and interpretation. Indeed, inconsistent results across the resources are possible. In this paper, we propose a statistical solution for the problem of multi-resource consensus inferences when statistical results from different resources show variation in magnitude, directionality, and significance. Our proposed method allows combining the corrected p-values, effect sizes and the total number of centres into a global consensus score. We apply this method to obtain a consensus score for data collected by the International Mouse Phenotyping Consortium (IMPC) across 11 centres. We show the application of this method to detect sexual dimorphism in haematological data and discuss the suitability of the methodology.


Subject(s)
Consensus , Mice , Animals , Data Collection/methods
8.
J Med Genet ; 57(7): 479-486, 2020 07.
Article in English | MEDLINE | ID: mdl-31980565

ABSTRACT

BACKGROUND: This study provides an integrated assessment of the economic and social impacts of genomic sequencing for the detection of monogenic disorders resulting in intellectual disability (ID). METHODS: Multiple knowledge bases were cross-referenced and analysed to compile a reference list of monogenic disorders associated with ID. Multiple literature searches were used to quantify the health and social costs for the care of people with ID. Health and social expenditures and the current cost of whole-exome sequencing and whole-genome sequencing were quantified in relation to the more common causes of ID and their impact on lifespan. RESULTS: On average, individuals with ID incur annual costs in terms of health costs, disability support, lost income and other social costs of US$172 000, accumulating to many millions of dollars over a lifetime. CONCLUSION: The diagnosis of monogenic disorders through genomic testing provides the opportunity to improve the diagnosis and management, and to reduce the costs of ID through informed reproductive decisions, reductions in unproductive diagnostic tests and increasingly targeted therapies.


Subject(s)
Exome Sequencing/economics , Genomics/economics , Intellectual Disability/economics , Intellectual Disability/genetics , Health Care Costs/statistics & numerical data , Humans , Intellectual Disability/diagnosis , Intellectual Disability/epidemiology
9.
Am J Hum Genet ; 99(3): 595-606, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27569544

ABSTRACT

The interpretation of non-coding variants still constitutes a major challenge in the application of whole-genome sequencing in Mendelian disease, especially for single-nucleotide and other small non-coding variants. Here we present Genomiser, an analysis framework that is able not only to score the relevance of variation in the non-coding genome, but also to associate regulatory variants with specific Mendelian diseases. Genomiser scores variants through either existing methods such as CADD or a bespoke machine learning method and combines these with allele frequency, regulatory sequences, chromosomal topological domains, and phenotypic relevance to discover variants associated with specific Mendelian disorders. Overall, Genomiser is able to identify causal regulatory variants as the top candidate in 77% of simulated whole genomes, allowing effective detection and discovery of regulatory variants in Mendelian disease.


Subject(s)
Algorithms , Genetic Diseases, Inborn/genetics , Genome, Human/genetics , Mutation/genetics , Gene Frequency , Genome-Wide Association Study , Humans , Machine Learning , Open Reading Frames/genetics , Phenotype , Point Mutation/genetics
10.
Nucleic Acids Res ; 45(D1): D712-D722, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899636

ABSTRACT

The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype-phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms. Advanced informatics tools can identify phenotypically relevant disease models in research and diagnostic contexts. Large-scale integration of model organism and clinical research data can provide a breadth of knowledge not available from individual sources and can provide contextualization of data back to these sources. The Monarch Initiative (monarchinitiative.org) is a collaborative, open science effort that aims to semantically integrate genotype-phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. Our integrated knowledge graph, analytic tools, and web services enable diverse users to explore relationships between phenotypes and genotypes across species.


Subject(s)
Databases, Genetic , Genetic Association Studies/methods , Genotype , Phenotype , Animals , Biological Evolution , Computational Biology/methods , Data Curation , Humans , Search Engine , Software , Species Specificity , User-Computer Interface , Web Browser
11.
Am J Hum Genet ; 97(1): 111-24, 2015 Jul 02.
Article in English | MEDLINE | ID: mdl-26119816

ABSTRACT

The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.


Subject(s)
Gene Ontology/trends , Genetic Diseases, Inborn/classification , Genetic Diseases, Inborn/genetics , Phenotype , Terminology as Topic , Genetic Diseases, Inborn/pathology , Humans , MEDLINE , Models, Biological
12.
Brief Bioinform ; 17(5): 819-30, 2016 09.
Article in English | MEDLINE | ID: mdl-26420780

ABSTRACT

Phenotypes have gained increased notoriety in the clinical and biological domain owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges that lead to a translation of experimental findings into clinical applications and thereby support 'bench to bedside' efforts. However, to build this translational bridge, a common and universal understanding of phenotypes is required that goes beyond domain-specific definitions. To achieve this ambitious goal, a digital revolution is ongoing that enables the encoding of data in computer-readable formats and the data storage in specialized repositories, ready for integration, enabling translational research. While phenome research is an ongoing endeavor, the true potential hidden in the currently available data still needs to be unlocked, offering exciting opportunities for the forthcoming years. Here, we provide insights into the state-of-the-art in digital phenotyping, by means of representing, acquiring and analyzing phenotype data. In addition, we provide visions of this field for future research work that could enable better applications of phenotype data.


Subject(s)
Phenotype , Humans , Information Storage and Retrieval , Research Design , Translational Research, Biomedical
13.
BMC Med Inform Decis Mak ; 18(1): 47, 2018 06 25.
Article in English | MEDLINE | ID: mdl-29941004

ABSTRACT

BACKGROUND: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time) to address deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low cost structured and unstructured information retrieval and extraction architecture within King's College Hospital, the management of governance concerns and the associated use cases and cost saving opportunities that such components present. RESULTS: To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King's College London. On generated data designed to simulate real world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall. CONCLUSION: We describe a toolkit which we feel is of huge value to the UK (and beyond) healthcare community. It is the only open source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.


Subject(s)
Electronic Health Records , Hospitals , Information Storage and Retrieval/methods , National Health Programs , Natural Language Processing , Humans , United Kingdom
14.
Adv Exp Med Biol ; 1031: 55-94, 2017.
Article in English | MEDLINE | ID: mdl-29214566

ABSTRACT

Public health relies on technologies to produce and analyse data, as well as effectively develop and implement policies and practices. An example is the public health practice of epidemiology, which relies on computational technology to monitor the health status of populations, identify disadvantaged or at risk population groups and thereby inform health policy and priority setting. Critical to achieving health improvements for the underserved population of people living with rare diseases is early diagnosis and best care. In the rare diseases field, the vast majority of diseases are caused by destructive but previously difficult to identify protein-coding gene mutations. The reduction in cost of genetic testing and advances in the clinical use of genome sequencing, data science and imaging are converging to provide more precise understandings of the 'person-time-place' triad. That is: who is affected (people); when the disease is occurring (time); and where the disease is occurring (place). Consequently we are witnessing a paradigm shift in public health policy and practice towards 'precision public health'. Patient and stakeholder engagement has informed the need for a national public health policy framework for rare diseases. The engagement approach in different countries has produced highly comparable outcomes and objectives. Knowledge and experience sharing across the international rare diseases networks and partnerships has informed the development of the Western Australian Rare Diseases Strategic Framework 2015-2018 (RD Framework) and Australian government health briefings on the need for a National plan. The RD Framework is guiding the translation of genomic and other technologies into the Western Australian health system, leading to greater precision in diagnostic pathways and care, and is an example of how a precision public health framework can improve health outcomes for the rare diseases population. Five vignettes are used to illustrate how policy decisions provide the scaffolding for translation of new genomics knowledge, and catalyze transformative change in delivery of clinical services. The vignettes presented here are from an Australian perspective and are not intended to be comprehensive, but rather to provide insights into how a new and emerging 'precision public health' paradigm can improve the experiences of patients living with rare diseases, their caregivers and families. The conclusion is that genomic public health is informed by the individual and family needs, and the population health imperatives of an early and accurate diagnosis; which is the portal to best practice care. Knowledge sharing is critical for public health policy development and improving the lives of people living with rare diseases.


Subject(s)
Genomics/methods , Health Policy , Precision Medicine , Public Health , Rare Diseases/therapy , Genetic Predisposition to Disease , Genomics/organization & administration , Health Policy/legislation & jurisprudence , Humans , Phenotype , Policy Making , Predictive Value of Tests , Prognosis , Program Development , Program Evaluation , Public Health/legislation & jurisprudence , Rare Diseases/diagnosis , Rare Diseases/epidemiology , Rare Diseases/genetics
15.
Hum Mutat ; 36(10): 979-84, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26269093

ABSTRACT

The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.


Subject(s)
Databases, Genetic , Disease/genetics , Genetic Predisposition to Disease/genetics , Animals , Disease Models, Animal , Genetic Variation , Humans , Information Dissemination , Phenotype , User-Computer Interface
16.
J Paediatr Child Health ; 51(4): 381-6, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25109851

ABSTRACT

There are many current and evolving tools to assist clinicians in their daily work of phenotyping. In medicine, the term 'phenotype' is usually taken to mean some deviation from normal morphology, physiology and behaviour. It is ascertained via history, examination and investigations, and a primary aim is diagnosis. Therefore, doctors are, by necessity, expert 'phenotypers'. There is an inherent and partially realised power in phenotypic information that when harnessed can improve patient care. Furthermore, phenotyping developments are increasingly important in an era of rapid advances in genomic technology. Fortunately, there is an expanding network of phenotyping tools that are poised for clinical translation. These tools will preferentially be implemented to mirror clinical workflows and to integrate with advances in genomic and information-sharing technologies. This will synergise with and augment the clinical acumen of medical practitioners. We outline key enablers of the ascertainment, integration and interrogation of clinical phenotype by using genetic diseases, particularly rare ones, as a theme. Successes from the test bed of rare diseases will support approaches to common disease.


Subject(s)
Genetic Diseases, Inborn/diagnosis , Genotype , Phenotype , Genetic Diseases, Inborn/genetics , Humans , Medical History Taking , Physical Examination , Precision Medicine
17.
J Biomed Inform ; 49: 159-70, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24530879

ABSTRACT

Evidence Based Medicine (EBM) provides a framework that makes use of the current best evidence in the domain to support clinicians in the decision making process. In most cases, the underlying foundational knowledge is captured in scientific publications that detail specific clinical studies or randomised controlled trials. Over the course of the last two decades, research has been performed on modelling key aspects described within publications (e.g., aims, methods, results), to enable the successful realisation of the goals of EBM. A significant outcome of this research has been the PICO (Population/Problem-Intervention-Comparison-Outcome) structure, and its refined version PIBOSO (Population-Intervention-Background-Outcome-Study Design-Other), both of which provide a formalisation of these scientific artefacts. Subsequently, using these schemes, diverse automatic extraction techniques have been proposed to streamline the knowledge discovery and exploration process in EBM. In this paper, we present a Machine Learning approach that aims to classify sentences according to the PIBOSO scheme. We use a discriminative set of features that do not rely on any external resources to achieve results comparable to the state of the art. A corpus of 1000 structured and unstructured abstracts - i.e., the NICTA-PIBOSO corpus - is used for training and testing. Our best CRF classifier achieves a micro-average F-score of 90.74% and 87.21%, respectively, over structured and unstructured abstracts, which represents an increase of 25.48 percentage points and 26.6 percentage points in F-score when compared to the best existing approaches.


Subject(s)
Artifacts , Evidence-Based Medicine , Publishing
18.
J Biomed Inform ; 48: 73-83, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24333481

ABSTRACT

Finding, capturing and describing characteristic features represents a key aspect in disorder definition, diagnosis and management. This process is particularly challenging in the case of rare disorders, due to the sparse nature of data and expertise. From a computational perspective, finding characteristic features is associated with some additional major challenges, such as formulating a computationally tractable definition, devising appropriate inference algorithms or defining sound validation mechanisms. In this paper we aim to deal with each of these problems in the context provided by the skeletal dysplasia domain. We propose a clear definition for characteristic phenotypes, we experiment with a novel, class association rule mining algorithm and we discuss our lessons learned from both an automatic and human-based validation of our approach.
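
As background for the class association rule mining mentioned above, the two standard quantities used to rank a rule "features -> class" are its support and confidence. The sketch below computes them on invented toy records; it is the textbook formulation rather than the novel algorithm the authors describe, and `rule_metrics` together with the example data is hypothetical.

```python
def rule_metrics(records, antecedent, target_class):
    """Support and confidence of the rule antecedent -> target_class.

    records is a list of (feature_set, class_label) pairs.
    support    = share of all records that match both sides of the rule
    confidence = share of records matching the antecedent that also
                 carry the target class
    """
    n = len(records)
    covered = [label for features, label in records if antecedent <= features]
    both = sum(1 for label in covered if label == target_class)
    support = both / n if n else 0.0
    confidence = both / len(covered) if covered else 0.0
    return support, confidence


# Invented toy data: phenotype sets labelled with a made-up disorder class.
records = [
    ({"short stature", "bowed legs"}, "disorder A"),
    ({"short stature"}, "disorder A"),
    ({"short stature", "bowed legs"}, "disorder B"),
    ({"scoliosis"}, "disorder B"),
]
support, confidence = rule_metrics(
    records, {"short stature", "bowed legs"}, "disorder A"
)
```

With sparse rare-disorder data, support values are necessarily small, which is one reason a tailored definition of "characteristic" features is needed instead of fixed global thresholds.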


Subject(s)
Bone Diseases, Developmental/diagnosis , Data Mining/methods , Medical Informatics/methods , Algorithms , Automation , Bone Diseases, Developmental/pathology , Databases, Factual , Humans , Information Storage and Retrieval , Phenotype , Reproducibility of Results , Software
19.
medRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-37503093

ABSTRACT

Objective: Large Language Models such as GPT-4 previously have been applied to differential diagnostic challenges based on published case reports. Published case reports have a sophisticated narrative style that is not readily available from typical electronic health records (EHR). Furthermore, even if such a narrative were available in EHRs, privacy requirements would preclude sending it outside the hospital firewall. We therefore tested a method for parsing clinical texts to extract ontology terms and programmatically generating prompts that by design are free of protected health information. Materials and Methods: We investigated different methods to prepare prompts from 75 recently published case reports. We transformed the original narratives by extracting structured terms representing phenotypic abnormalities, comorbidities, treatments, and laboratory tests and creating prompts programmatically. Results: Performance of all of these approaches was modest, with the correct diagnosis ranked first in only 5.3-17.6% of cases. The performance of the prompts created from structured data was substantially worse than that of the original narrative texts, even if additional information was added following manual review of term extraction. Moreover, different versions of GPT-4 demonstrated substantially different performance on this task. Discussion: The sensitivity of the performance to the form of the prompt and the instability of results over two GPT-4 versions represent important current limitations to the use of GPT-4 to support diagnosis in real-life clinical settings. Conclusion: Research is needed to identify the best methods for creating prompts from typically available clinical data to support differential diagnostics.
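
Generating a prompt programmatically from structured terms, as described above, might look like the following minimal sketch. The template wording, the function name `build_prompt` and its parameters are assumptions made for illustration and are not the prompts used in the study. Because the prompt is assembled only from coded fields, no free-text narrative is copied into it.

```python
def build_prompt(age_years, sex, observed_terms, excluded_terms=()):
    """Assemble a diagnostic prompt from structured fields only."""
    lines = [
        "Please provide a ranked differential diagnosis for the following case.",
        f"The patient is a {age_years}-year-old {sex}.",
        "Observed phenotypic abnormalities: "
        + "; ".join(sorted(observed_terms)) + ".",
    ]
    if excluded_terms:
        lines.append(
            "Excluded phenotypic abnormalities: "
            + "; ".join(sorted(excluded_terms)) + "."
        )
    return "\n".join(lines)


prompt = build_prompt(7, "female", ["Seizure", "Ataxia"], ["Fever"])
```

Sorting the terms makes the output deterministic for a given term set, which helps when comparing model versions on the same cases.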

20.
Lancet Glob Health ; 12(7): e1192-e1199, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38876765

ABSTRACT

Rare diseases affect over 300 million people worldwide and are gaining recognition as a global health priority. Their inclusion in the UN Sustainable Development Goals, the UN Resolution on Addressing the Challenges of Persons Living with a Rare Disease, and the anticipated WHO Global Network for Rare Diseases and WHO Resolution on Rare Diseases, which is yet to be announced, emphasise their significance. People with rare diseases often face unmet health needs, including access to screening, diagnosis, therapy, and comprehensive health care. These challenges highlight the need for awareness and targeted interventions, including comprehensive education, especially in primary care. The majority of rare disease research, clinical services, and health systems are addressed with specialist care. WHO Member States have committed to focusing on primary health care in both universal health coverage and health-related Sustainable Development Goals. Recognising this opportunity, the International Rare Diseases Research Consortium (IRDiRC) assembled a global, multistakeholder task force to identify key barriers and opportunities for empowering primary health-care providers in addressing rare disease challenges.


Subject(s)
Global Health , Primary Health Care , Rare Diseases , Humans , Health Services Accessibility , Primary Health Care/organization & administration , Rare Diseases/therapy , Rare Diseases/epidemiology , World Health Organization , Health Policy