Results 1 - 20 of 35
1.
Am J Hum Genet ; 109(9): 1591-1604, 2022 09 01.
Article in English | MEDLINE | ID: mdl-35998640

ABSTRACT

Diagnosis for rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations. Here, we present open annotation for rare diseases (OARD), a real-world-data-derived resource with annotation for rare-disease-related phenotypes. This resource is derived from the EHRs of two academic health institutions containing more than 10 million individuals spanning wide age ranges and different disease subgroups. By leveraging ontology mapping and advanced natural-language-processing (NLP) methods, OARD automatically and efficiently extracts concepts for both rare diseases and their phenotypic traits from billing codes and lab tests as well as over 100 million clinical narratives. The rare disease prevalence derived by OARD is highly correlated with those annotated in the original rare disease knowledgebase. By performing association analysis, we identified more than 1 million novel disease-phenotype association pairs that were previously missed by human annotation, and >60% were confirmed true associations via manual review of a list of sampled pairs. Compared to the manually curated annotation, OARD is 100% data driven and its pipeline can be shared across different institutions. By supporting privacy-preserving sharing of aggregated summary statistics, such as term frequencies and disease-phenotype associations, it fills an important gap to facilitate data-driven research in the rare disease community.


Subjects
Natural Language Processing , Rare Diseases , Electronic Health Records , Humans , Phenotype , Rare Diseases/genetics
2.
J Biomed Inform ; 155: 104659, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38777085

ABSTRACT

OBJECTIVE: This study aims to promote interoperability in precision medicine and translational research by aligning the Observational Medical Outcomes Partnership (OMOP) and Phenopackets data models. Phenopackets is an expert knowledge-driven schema designed to facilitate the storage and exchange of multimodal patient data, and support downstream analysis. The first goal of this paper is to explore model alignment by characterizing the common data models using a newly developed data transformation process and evaluation method. Second, using OMOP normalized clinical data, we evaluate the mapping of real-world patient data to Phenopackets. We evaluate the suitability of Phenopackets as a patient data representation for real-world clinical cases. METHODS: We identified mappings between OMOP and Phenopackets and applied them to a real patient dataset to assess the transformation's success. We analyzed gaps between the models and identified key considerations for transforming data between them. Further, to improve ambiguous alignment, we incorporated Unified Medical Language System (UMLS) semantic type-based filtering to direct individual concepts to their most appropriate domain and conducted a domain-expert evaluation of the mapping's clinical utility. RESULTS: The OMOP to Phenopacket transformation pipeline was executed for 1,000 Alzheimer's disease patients and successfully mapped all required entities. However, due to missing values in OMOP for required Phenopacket attributes, 10.2 % of records were lost. The use of UMLS-semantic type filtering for ambiguous alignment of individual concepts resulted in 96 % agreement with clinical thinking, increased from 68 % when mapping exclusively by domain correspondence. CONCLUSION: This study presents a pipeline to transform data from OMOP to Phenopackets. We identified considerations for the transformation to ensure data quality, handling restrictions for successful Phenopacket validation and discrepant data formats. 
We identified unmappable Phenopacket attributes that focus on specialty use cases, such as genomics or oncology, which OMOP does not currently support. We introduce UMLS semantic type filtering to resolve ambiguous alignment to Phenopacket entities to be most appropriate for real-world interpretation. We provide a systematic approach to align OMOP and Phenopackets schemas. Our work facilitates future use of Phenopackets in clinical applications by addressing key barriers to interoperability when deriving a Phenopacket from real-world patient data.
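
The semantic-type filtering step described in this abstract can be pictured with a small sketch. It is not the authors' pipeline: the routing tables, the `Concept` class, and the `route` function below are hypothetical, and the assignment of particular UMLS semantic types to Phenopacket fields is an assumption made only for illustration. The idea shown is that a concept is first routed by its semantic type and only falls back to its OMOP domain when no semantic type rule applies.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical routing table (assumption): UMLS semantic type -> Phenopacket field.
SEMANTIC_TYPE_TO_FIELD = {
    "Disease or Syndrome": "diseases",
    "Sign or Symptom": "phenotypic_features",
    "Finding": "phenotypic_features",
}

# Hypothetical fallback (assumption): OMOP domain -> Phenopacket field.
DOMAIN_TO_FIELD = {
    "Condition": "diseases",
    "Measurement": "measurements",
    "Drug": "medical_actions",
    "Procedure": "medical_actions",
}


@dataclass
class Concept:
    name: str
    omop_domain: str
    semantic_type: Optional[str] = None


def route(concept: Concept) -> str:
    """Prefer the semantic-type rule; fall back to plain domain correspondence."""
    if concept.semantic_type in SEMANTIC_TYPE_TO_FIELD:
        return SEMANTIC_TYPE_TO_FIELD[concept.semantic_type]
    return DOMAIN_TO_FIELD.get(concept.omop_domain, "unmapped")
```

Under these assumed rules, a symptom stored in the OMOP Condition domain is routed to phenotypic features rather than to diseases, which is the kind of ambiguity the abstract says domain-only mapping leaves unresolved.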


Subjects
Unified Medical Language System , Humans , Semantics , Electronic Health Records , Precision Medicine/methods , Translational Research, Biomedical , Medical Informatics/methods , Natural Language Processing , Alzheimer Disease
3.
J Am Soc Nephrol ; 34(6): 1105-1119, 2023 06 01.
Article in English | MEDLINE | ID: mdl-36995132

ABSTRACT

SIGNIFICANCE STATEMENT: Congenital obstructive uropathy (COU) is a prevalent human developmental defect with highly heterogeneous clinical presentations and outcomes. Genetics may refine diagnosis, prognosis, and treatment, but the genomic architecture of COU is largely unknown. A comprehensive genomic screening study of 733 cases with three distinct COU subphenotypes revealed disease etiology in 10.0% of them. We detected no significant differences in the overall diagnostic yield among COU subphenotypes, with characteristic variable expressivity of several mutant genes. Our findings therefore may legitimize a genetic-first diagnostic approach for COU, especially when burdening clinical and imaging characterization is not complete or available. BACKGROUND: Congenital obstructive uropathy (COU) is a common cause of developmental defects of the urinary tract, with heterogeneous clinical presentation and outcome. Genetic analysis has the potential to elucidate the underlying diagnosis and help risk stratification. METHODS: We performed a comprehensive genomic screen of 733 independent COU cases, which consisted of individuals with ureteropelvic junction obstruction (n=321), ureterovesical junction obstruction/congenital megaureter (n=178), and COU not otherwise specified (COU-NOS; n=234). RESULTS: We identified pathogenic single nucleotide variants (SNVs) in 53 (7.2%) cases and genomic disorders (GDs) in 23 (3.1%) cases. We detected no significant differences in the overall diagnostic yield between COU sub-phenotypes, and pathogenic SNVs in several genes were associated with any of the three categories. Hence, although COU may appear phenotypically heterogeneous, COU phenotypes are likely to share common molecular bases. On the other hand, mutations in TNXB were more often identified in COU-NOS cases, demonstrating the diagnostic challenge in discriminating COU from hydronephrosis secondary to vesicoureteral reflux, particularly when diagnostic imaging is incomplete. 
Pathogenic SNVs in only six genes were found in more than one individual, supporting high genetic heterogeneity. Finally, convergence between data on SNVs and GDs suggests MYH11 as a dosage-sensitive gene possibly correlating with severity of COU. CONCLUSIONS: We established a genomic diagnosis in 10.0% of COU individuals. The findings underscore the urgent need to identify novel genetic susceptibility factors to COU to better define the natural history of the remaining 90% of cases without a molecular diagnosis.


Subjects
Hydronephrosis , Ureteral Obstruction , Vesico-Ureteral Reflux , Humans , DNA Copy Number Variations , Ureteral Obstruction/complications , Ureteral Obstruction/genetics , Vesico-Ureteral Reflux/diagnosis , Vesico-Ureteral Reflux/genetics , Kidney Pelvis/pathology
4.
Am J Kidney Dis ; 81(3): 318-328.e1, 2023 03.
Article in English | MEDLINE | ID: mdl-36191724

ABSTRACT

RATIONALE & OBJECTIVE: The effects of race, ethnicity, socioeconomic status (SES), and disease severity on acute care utilization in patients with glomerular disease are unknown. STUDY DESIGN: Prospective cohort study. SETTING & PARTICIPANTS: 1,456 adults and 768 children with biopsy-proven glomerular disease enrolled in the Cure Glomerulonephropathy (CureGN) cohort. EXPOSURE: Race and ethnicity as a participant-reported social factor. OUTCOME: Acute care utilization defined as hospitalizations or emergency department visits. ANALYTICAL APPROACH: Multivariable recurrent event proportional rate models were used to estimate associations between race and ethnicity and acute care utilization. RESULTS: Black or Hispanic participants had lower SES and more severe glomerular disease than White or Asian participants. Acute care utilization rates were 45.6, 29.5, 25.8, and 19.2 per 100 person-years in Black, Hispanic, White, and Asian adults, respectively, and 55.8, 42.5, 40.8, and 13.0, respectively, for children. Compared with the White race (reference group), Black race was significantly associated with acute care utilization in adults (rate ratio [RR], 1.76 [95% CI, 1.37-2.27]), although this finding was attenuated after multivariable adjustment (RR, 1.31 [95% CI, 1.03-1.68]). Black race was not significantly associated with acute care utilization in children; Asian race was significantly associated with lower acute care utilization in children (RR, 0.32 [95% CI, 0.14-0.70]); no significant associations between Hispanic ethnicity and acute care utilization were identified. LIMITATIONS: We used proxies for SES and lacked direct information on income, household unemployment, or disability. CONCLUSIONS: Significant differences in acute care utilization rates were observed across racial and ethnic groups in persons with prevalent glomerular disease, although many of these differences were explained by differences in SES and disease severity. 
Measures to combat socioeconomic disadvantage in Black patients and to more effectively prevent and treat glomerular disease are needed to reduce disparities in acute care utilization, improve patient wellbeing, and reduce health care costs.


Subjects
Ethnicity , Healthcare Disparities , Kidney Diseases , Patient Acceptance of Health Care , Adult , Child , Humans , Black People , Hispanic or Latino , Prospective Studies , Social Class , Asian People , White People , Patient Acceptance of Health Care/ethnology
5.
Am J Med Genet C Semin Med Genet ; 190(3): 289-301, 2022 09.
Article in English | MEDLINE | ID: mdl-36161695

ABSTRACT

Studies have shown that as many as 1 in 10 adults with chronic kidney disease has a monogenic form of disease. However, genetic services in adult nephrology are limited. An adult Kidney Genetics Clinic was established within the nephrology division at a large urban academic medical center to increase access to genetic services and testing in adults with kidney disease. Between June 2019 and December 2021, a total of 363 patients were referred to the adult Kidney Genetics Clinic. Of those who completed genetic testing, a positive diagnostic finding was identified in 27.1%, a candidate diagnostic finding was identified in 6.7% of patients, and a nondiagnostic positive finding was identified in an additional 8.6% of patients, resulting in an overall yield of 42.4% for clinically relevant genetic findings in tested patients. A genetic diagnosis had implications for medical management, family member testing, and eligibility for clinical trials. With the utilization of telemedicine, genetic services reached a diverse geographic and patient population. Genetic education efforts were integral to the clinic's success, as they increased visibility and helped providers identify appropriate referrals. Ongoing access to genomic services will remain a fundamental component of patient care in adults with kidney disease.


Subjects
Nephrology , Renal Insufficiency, Chronic , Adult , Humans , Genetic Services , Nephrology/methods , Genetic Testing/methods , Referral and Consultation , Renal Insufficiency, Chronic/diagnosis , Renal Insufficiency, Chronic/genetics , Renal Insufficiency, Chronic/therapy
6.
N Engl J Med ; 380(2): 142-151, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30586318

ABSTRACT

BACKGROUND: Exome sequencing is emerging as a first-line diagnostic method in some clinical disciplines, but its usefulness has yet to be examined for most constitutional disorders in adults, including chronic kidney disease, which affects more than 1 in 10 persons globally. METHODS: We conducted exome sequencing and diagnostic analysis in two cohorts totaling 3315 patients with chronic kidney disease. We assessed the diagnostic yield and, among the patients for whom detailed clinical data were available, the clinical implications of diagnostic and other medically relevant findings. RESULTS: In all, 3037 patients (91.6%) were over 21 years of age, and 1179 (35.6%) were of self-identified non-European ancestry. We detected diagnostic variants in 307 of the 3315 patients (9.3%), encompassing 66 different monogenic disorders. Of the disorders detected, 39 (59%) were found in only a single patient. Diagnostic variants were detected across all clinically defined categories, including congenital or cystic renal disease (127 of 531 patients [23.9%]) and nephropathy of unknown origin (48 of 281 patients [17.1%]). Of the 2187 patients assessed, 34 (1.6%) had genetic findings for medically actionable disorders that, although unrelated to their nephropathy, would also lead to subspecialty referral and inform renal management. CONCLUSIONS: Exome sequencing in a combined cohort of more than 3000 patients with chronic kidney disease yielded a genetic diagnosis in just under 10% of cases. (Funded by the National Institutes of Health and others.).
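
The percentages in this abstract follow directly from the counts it reports, and a short check makes the arithmetic explicit. The snippet below is only a reader-side sanity check, not part of the study; the helper names `percent` and `check_reported` and the row labels are illustrative choices. Each row holds a numerator, a denominator, and the percentage as printed in the abstract.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# (numerator, denominator, percentage as printed in the abstract)
REPORTED = {
    "patients over 21 years of age": (3037, 3315, 91.6),
    "self-identified non-European ancestry": (1179, 3315, 35.6),
    "diagnostic variants detected": (307, 3315, 9.3),
    "congenital or cystic renal disease": (127, 531, 23.9),
    "nephropathy of unknown origin": (48, 281, 17.1),
    "medically actionable findings": (34, 2187, 1.6),
}


def check_reported(table, tolerance=0.05):
    """Map each label to True when the recomputed percentage matches the printed one."""
    return {
        label: abs(percent(n, d) - stated) <= tolerance
        for label, (n, d, stated) in table.items()
    }


if __name__ == "__main__":
    for label, ok in check_reported(REPORTED).items():
        print(f"{label}: {'consistent' if ok else 'check'}")
```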


Subjects
Exome , Genetic Predisposition to Disease , Mutation , Renal Insufficiency, Chronic/genetics , Sequence Analysis, DNA/methods , Adult , Aged , Cohort Studies , Genetic Variation , Humans , Male , Middle Aged , Renal Insufficiency, Chronic/ethnology , Young Adult
7.
Clin Transplant ; 36(1): e14516, 2022 01.
Article in English | MEDLINE | ID: mdl-34661305

ABSTRACT

It is unknown how providing prospective living donors with information about APOL1, including the benefits and drawbacks of testing, influences their desire for testing. In this study, we surveyed 102 participants with self-reported African ancestry and positive family history of kidney disease, recruited from our nephrology waiting room. We assessed views on APOL1 testing before and after presentation of a set of potential benefits and drawbacks of testing and quantified the self-reported level of influence individual benefits and drawbacks had on participants' desire for testing in the proposed context of living donation. The majority of participants (92%) were aware of organ donation and more than half (56%) had considered living donation. Although we found no significant change in response by study end following presentation of the potential benefits and drawbacks of APOL1 testing, across all participants, "becoming aware of the potential risk of kidney disease among your immediate family" was the benefit with the highest mean influence (3.3±1.4), while the drawback with the highest mean influence (2.9±1.5) was "some transplant centers may not allow you to donate to a loved one". This study provides insights into the priorities of prospective living donors and suggests that concern for how the information affects family members may strongly influence desires for testing. It also highlights the need for greater community engagement to gain a deeper understanding of the priorities that influence decision making on APOL1 testing.


Subjects
Apolipoprotein L1 , Kidney Transplantation , Black or African American , Apolipoprotein L1/genetics , Attitude , Genetic Testing , Humans , Living Donors , Prospective Studies
8.
Int J Obes (Lond) ; 45(1): 155-169, 2021 01.
Article in English | MEDLINE | ID: mdl-32952152

ABSTRACT

BACKGROUND/OBJECTIVES: Melanocortin-4 receptor (MC4R) plays an essential role in food intake and energy homeostasis. More than 170 MC4R variants have been described over the past two decades, with conflicting reports regarding the prevalence and phenotypic effects of these variants in diverse cohorts. To determine the frequency of MC4R variants in a large cohort of different ancestries, we evaluated the MC4R coding region for 20,537 eMERGE participants with sequencing data plus an additional 77,454 independent individuals with genome-wide genotyping data at this locus. SUBJECTS/METHODS: The sequencing data were obtained from the eMERGE phase III study, in which multisample variant call format calls have been generated, curated, and annotated. In addition to penetrance estimation using body mass index (BMI) as a binary outcome, GWAS and PheWAS were performed using median BMI in linear regression analyses. All results were adjusted for principal components, age, sex, and sites of genotyping. RESULTS: Targeted sequencing data of MC4R revealed 125 coding variants in 1839 eMERGE participants, including 30 unreported coding variants that were predicted to be functionally damaging. Highly penetrant unreported variants included L325I, E308K, D298N, S270F, F261L, T248A, D111V, and Y80F, for which seven participants had obesity class III, defined as BMI ≥ 40 kg/m². In GWAS analysis, in addition to the known risk haplotype upstream of MC4R (best variant rs6567160; P = 5.36 × 10⁻²⁵, Beta = 0.37), a novel rare haplotype was detected which was protective against obesity and encompassed the V103I variant with known gain-of-function properties (P = 6.23 × 10⁻⁸, Beta = -0.62). PheWAS analyses extended this protective effect of V103I to type 2 diabetes, diabetic nephropathy, and chronic renal failure independent of BMI. 
CONCLUSIONS: MC4R screening in a large eMERGE cohort confirmed many previous findings, extended the MC4R pleiotropic effects, and discovered additional MC4R rare alleles that probably contribute to obesity.


Subjects
Genetic Variation/genetics , Genome-Wide Association Study , Obesity , Receptor, Melanocortin, Type 4/genetics , Adult , Aged , Body Mass Index , Cohort Studies , Female , Humans , Male , Middle Aged , Obesity/epidemiology , Obesity/genetics
9.
J Biomed Inform ; 118: 103795, 2021 06.
Article in English | MEDLINE | ID: mdl-33930535

ABSTRACT

Structured representation of clinical genetic results is necessary for advancing precision medicine. The Electronic Medical Records and Genomics (eMERGE) Network's Phase III program initially used a commercially developed XML message format for standardized and structured representation of genetic results for electronic health record (EHR) integration. In a desire to move towards a standard representation, the network created a new standardized format based upon Health Level Seven Fast Healthcare Interoperability Resources (HL7® FHIR®), to represent clinical genomics results. These new standards improve the utility of HL7® FHIR® as an international healthcare interoperability standard for management of genetic data from patients. This work advances the establishment of standards that are being designed for broad adoption in the current health information technology landscape.


Subjects
Electronic Health Records , Medical Informatics , Genomics , Health Level Seven , Humans , Precision Medicine
10.
Genet Med ; 21(10): 2371-2380, 2019 10.
Article in English | MEDLINE | ID: mdl-30930462

ABSTRACT

PURPOSE: Recruitment of participants from diverse backgrounds is crucial to the generalizability of genetic research, but has proven challenging. We retrospectively evaluated recruitment methods used for a study on return of genetic results. METHODS: The costs of study design, development, and participant enrollment were calculated, and the characteristics of the participants enrolled through the seven recruitment methods were examined. RESULTS: A total of 1118 participants provided consent, a blood sample, and questionnaire data. The estimated cost across recruitment methods ranged from $579 to $1666 per participant and required a large recruitment team. Recruitment methods using flyers and staff networks were the most cost-efficient and resulted in the highest completion rate. Targeted sampling that emphasized the importance of Latino/a participation, utilization of translated materials, and in-person recruitments contributed to enrolling a demographically diverse sample. CONCLUSIONS: Although all methods were deployed in the same hospital or neighborhood and shared the same staff, each recruitment method was different in terms of cost and characteristics of the enrolled participants, suggesting the importance of carefully choosing the recruitment methods based on the desired composition of the final study sample. This analysis provides information about the effectiveness and cost of different methods to recruit adults for genetic research.


Subjects
Clinical Trials as Topic/economics , Genetic Testing/economics , Patient Selection/ethics , Adult , Clinical Trials as Topic/methods , Costs and Cost Analysis , Ethnicity , Female , Genomics/economics , Genomics/methods , Humans , Male , Mass Screening/economics , Middle Aged , Research Design , Retrospective Studies
12.
Ann Intern Med ; 168(2): 100-109, 2018 01 16.
Article in English | MEDLINE | ID: mdl-29204651

ABSTRACT

Background: The utility of whole-exome sequencing (WES) for the diagnosis and management of adult-onset constitutional disorders has not been adequately studied. Genetic diagnostics may be advantageous in adults with chronic kidney disease (CKD), in whom the cause of kidney failure often remains unknown. Objective: To study the diagnostic utility of WES in a selected referral population of adults with CKD. Design: Observational cohort. Setting: A major academic medical center. Patients: 92 adults with CKD of unknown cause or familial nephropathy or hypertension. Measurements: The diagnostic yield of WES and its potential effect on clinical management. Results: Whole-exome sequencing provided a diagnosis in 22 of 92 patients (24%), including 9 probands with CKD of unknown cause and encompassing 13 distinct genetic disorders. Among these, loss-of-function mutations were identified in PARN in 2 probands with tubulointerstitial fibrosis. PARN mutations have been implicated in a short telomere syndrome characterized by lung, bone marrow, and liver fibrosis; these findings extend the phenotype of PARN mutations to renal fibrosis. In addition, review of the American College of Medical Genetics actionable genes identified a pathogenic BRCA2 mutation in a proband who was diagnosed with breast cancer on follow-up. The results affected clinical management in most identified cases, including initiation of targeted surveillance, familial screening to guide donor selection for transplantation, and changes in therapy. Limitation: The small sample size and recruitment at a tertiary care academic center limit generalizability of findings among the broader CKD population. Conclusion: Whole-exome sequencing identified diagnostic mutations in a substantial number of adults with CKD of many causes. Further study of the utility of WES in the evaluation and care of patients with CKD in additional settings is warranted. 
Primary Funding Source: New York State Empire Clinical Research Investigator Program, Renal Research Institute, and National Human Genome Research Institute of the National Institutes of Health.


Subjects
Exome/genetics , Renal Insufficiency, Chronic/genetics , Sequence Analysis, DNA/methods , Adult , Female , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Mutation , New York City
14.
Kidney Int Rep ; 9(8): 2420-2431, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39156149

ABSTRACT

Introduction: Genomic medicine holds transformative potential for personalized nephrology care; however, its clinical integration poses challenges. Automated clinical decision support (CDS) systems in the electronic health record (EHR) offer a promising solution but have shown limited impact. This study aims to glean practical insights into nephrologists' challenges using genomic resources, informing precision nephrology decision support tools. Methods: We conducted an anonymous electronic survey among US nephrologists from January 19, 2021 to May 19, 2021, guided by the Consolidated Framework for Implementation Research. It assessed practice characteristics, genomic resource utilization, attitudes, perceived knowledge, self-efficacy, and factors influencing genetic testing decisions. Survey links were primarily shared with National Kidney Foundation members. Results: We analyzed 319 surveys, with most respondents specializing in adult nephrology. Although respondents generally acknowledged the clinical use of genomic resources, varying levels of perceived knowledge and self-efficacy were evident regarding precision nephrology workflows. Barriers to genetic testing included cost/insurance coverage and limited genomics experience. Conclusion: The study illuminates specific hurdles nephrologists face using genomic resources. The findings are a valuable contribution to genomic implementation research, highlighting the significance of developing tailored interventions to support clinicians in using genomic resources effectively. These findings can guide the future development of CDS systems in the EHR. Addressing unmet informational and workflow support needs can enhance the integration of genomics into clinical practice, advancing personalized nephrology care and improving kidney disease outcomes. Further research should focus on interventions promoting seamless precision nephrology care integration.

15.
JAMIA Open ; 7(1): ooae021, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38455840

ABSTRACT

Objective: To automate scientific claim verification using PubMed abstracts. Materials and Methods: We developed CliVER, an end-to-end scientific Claim VERification system that leverages retrieval-augmented techniques to automatically retrieve relevant clinical trial abstracts, extract pertinent sentences, and use the PICO framework to support or refute a scientific claim. We also created an ensemble of three state-of-the-art deep learning models to classify rationale of support, refute, and neutral. We then constructed CoVERt, a new COVID VERification dataset comprising 15 PICO-encoded drug claims accompanied by 96 manually selected and labeled clinical trial abstracts that either support or refute each claim. We used CoVERt and SciFact (a public scientific claim verification dataset) to assess CliVER's performance in predicting labels. Finally, we compared CliVER to clinicians in the verification of 19 claims from 6 disease domains, using 189 648 PubMed abstracts extracted from January 2010 to October 2021. Results: In the evaluation of label prediction accuracy on CoVERt, CliVER achieved a notable F1 score of 0.92, highlighting the efficacy of the retrieval-augmented models. The ensemble model outperforms each individual state-of-the-art model by an absolute increase from 3% to 11% in the F1 score. Moreover, when compared with four clinicians, CliVER achieved a precision of 79.0% for abstract retrieval, 67.4% for sentence selection, and 63.2% for label prediction, respectively. Conclusion: CliVER demonstrates its early potential to automate scientific claim verification using retrieval-augmented strategies to harness the wealth of clinical trial abstracts in PubMed. Future studies are warranted to further test its clinical utility.
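
For readers less familiar with the F1 score used to evaluate label prediction here, the following sketch shows how a per-label F1 is computed from gold and predicted labels. It is a generic illustration, not the CliVER evaluation code: the function name `f1_for_label` and the six toy labels are made up for the example, and the abstract does not state whether its F1 is averaged across labels or how.

```python
def f1_for_label(gold, predicted, label):
    """Per-label F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == label and p == label)
    false_pos = sum(1 for g, p in pairs if g != label and p == label)
    false_neg = sum(1 for g, p in pairs if g == label and p != label)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy labels (illustrative only, not data from the study).
gold = ["support", "support", "refute", "neutral", "refute", "support"]
pred = ["support", "refute", "refute", "neutral", "refute", "support"]

if __name__ == "__main__":
    for label in ("support", "refute", "neutral"):
        print(label, round(f1_for_label(gold, pred, label), 2))
```

In the toy data one "support" item is mislabeled as "refute", which lowers recall for "support" and precision for "refute" while leaving "neutral" untouched.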

16.
ArXiv ; 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39371088

ABSTRACT

Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short compared to proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance in summarizing medical evidence. Utilizing a benchmark dataset, MedReview, consisting of 8,161 pairs of systematic reviews and summaries, we fine-tuned three broadly-used, open-sourced LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the fine-tuned LLMs obtained an increase of 9.89 in ROUGE-L (95% confidence interval: 8.94-10.81), 13.21 in METEOR score (95% confidence interval: 12.05-14.37), and 15.82 in CHRF score (95% confidence interval: 13.89-16.44). The performance of fine-tuned LongT5 is close to GPT-3.5 with zero-shot settings. Furthermore, smaller fine-tuned models sometimes even demonstrated superior performance compared to larger zero-shot models. The above trends of improvement were also manifested in both human and GPT4-simulated evaluations. Our results can be applied to guide model selection for tasks demanding particular domain knowledge, such as medical evidence summarization.
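
ROUGE-L, one of the metrics reported above, scores a candidate summary by the longest common subsequence (LCS) of tokens it shares with a reference. The sketch below is a simplified version under stated assumptions: whitespace tokenization, lowercasing, no stemming, and an unweighted harmonic mean of LCS precision and recall. The function names are illustrative, and published evaluations normally rely on an established implementation rather than code like this. It returns a value between 0 and 1, whereas the abstract appears to report scores on a 0 to 100 scale.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """Simplified ROUGE-L F-measure on lowercased, whitespace-split tokens."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Because the LCS preserves word order without requiring adjacency, the metric rewards summaries that keep the reference's sequence of content even when words are inserted or dropped in between.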

17.
NPJ Digit Med ; 7(1): 239, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251804

ABSTRACT

Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short compared to the proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance. Utilizing a benchmark dataset, MedReview, consisting of 8161 pairs of systematic reviews and summaries, we fine-tuned three broadly-used, open-sourced LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the performance of open-source models was all improved after fine-tuning. The performance of fine-tuned LongT5 is close to GPT-3.5 with zero-shot settings. Furthermore, smaller fine-tuned models sometimes even demonstrated superior performance compared to larger zero-shot models. The above trends of improvement were manifested in both a human evaluation and a larger-scale GPT4-simulated evaluation.

18.
medRxiv ; 2023 May 10.
Article in English | MEDLINE | ID: mdl-37214819

ABSTRACT

Background: Chronic kidney disease (CKD) is a genetically complex disease determined by an interplay of monogenic, polygenic, and environmental risks. Most forms of monogenic kidney diseases have incomplete penetrance and variable expressivity. It is presently unknown if some of the variability in penetrance can be attributed to polygenic factors. Methods: Using the UK Biobank (N=469,835 participants) and the All of Us (N=98,622 participants) datasets, we examined the two most common forms of monogenic kidney disorders, autosomal dominant polycystic kidney disease (ADPKD) caused by deleterious variants in the PKD1 or PKD2 genes, and COL4A-associated nephropathy (COL4A-AN) caused by deleterious variants in the COL4A3, COL4A4, or COL4A5 genes. We used the eMERGE-III electronic CKD phenotype to define cases (estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m² or kidney failure) and controls (eGFR >90 mL/min/1.73 m² in the absence of kidney disease diagnoses). The effects of the genome-wide polygenic score (GPS) for CKD were tested in monogenic variant carriers and non-carriers using logistic regression controlling for age, sex, diabetes, and genetic ancestry. Results: As expected, the carriers of known pathogenic and rare predicted loss-of-function variants in PKD1 or PKD2 had a high risk of CKD (ORmeta=17.1, 95% CI: 11.1-26.4, P=1.8E-37). The GPS was comparably predictive of CKD in both ADPKD variant carriers (ORmeta=2.28 per SD, 95% CI: 1.55-3.37, P=2.6E-05) and non-carriers (ORmeta=1.72 per SD, 95% CI: 1.69-1.76, P< E-300) independent of age, sex, diabetes, and genetic ancestry. Compared to the middle tertile of the GPS distribution for non-carriers, ADPKD variant carriers in the top tertile had a 54-fold increased risk of CKD, while ADPKD variant carriers in the bottom tertile had only a 3-fold increased risk of CKD. 
Similarly, the GPS was predictive of CKD in both COL4-AN variant carriers (ORmeta=1.78, 95% CI=1.22-2.58, P=2.38E-03) and non-carriers (ORmeta=1.70, 95%CI: 1.68-1.73 P
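
The odds ratios, confidence intervals, and P values in this abstract are linked by standard logistic regression arithmetic, which the sketch below makes explicit. It assumes a Wald-type 95% interval that is symmetric on the log-odds scale and a normal approximation for the test statistic; the abstract does not say how its meta-analytic intervals were built, so this is an approximation and not the authors' method. The function name `wald_from_odds_ratio` is illustrative.

```python
import math


def wald_from_odds_ratio(odds_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Recover log-odds, standard error, z statistic, and two-sided P value.

    Assumes the interval is beta +/- z_crit * SE on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    std_err = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = beta / std_err
    # Two-sided P value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, std_err, z, p_value


if __name__ == "__main__":
    # Values taken from the abstract: PKD1/PKD2 carriers, then GPS in ADPKD carriers.
    for label, args in [
        ("PKD1/PKD2 carriers", (17.1, 11.1, 26.4)),
        ("GPS per SD, ADPKD carriers", (2.28, 1.55, 3.37)),
    ]:
        beta, se, z, p = wald_from_odds_ratio(*args)
        print(f"{label}: beta={beta:.3f} SE={se:.3f} z={z:.2f} P={p:.2g}")
```

Because the published odds ratios and interval bounds are rounded, the recovered P values should fall near the reported ones without matching them exactly.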

19.
Nat Commun ; 14(1): 8318, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38097619

ABSTRACT

Chronic kidney disease (CKD) is determined by an interplay of monogenic, polygenic, and environmental risks. Autosomal dominant polycystic kidney disease (ADPKD) and COL4A-associated nephropathy (COL4A-AN) represent the most common forms of monogenic kidney diseases. These disorders have incomplete penetrance and variable expressivity, and we hypothesize that polygenic factors explain some of this variability. By combining SNP array, exome/genome sequence, and electronic health record data from the UK Biobank and All-of-Us cohorts, we demonstrate that the genome-wide polygenic score (GPS) significantly predicts CKD among ADPKD monogenic variant carriers. Compared to the middle tertile of the GPS for noncarriers, ADPKD variant carriers in the top tertile have a 54-fold increased risk of CKD, while ADPKD variant carriers in the bottom tertile have only a 3-fold increased risk of CKD. Similarly, the GPS significantly predicts CKD in COL4A-AN carriers. The carriers in the top tertile of the GPS have a 2.5-fold higher risk of CKD, while the risk for carriers in the bottom tertile is not different from the average population risk. These results suggest that accounting for polygenic risk improves risk stratification in monogenic kidney disease.


Subjects
Polycystic Kidney, Autosomal Dominant , Renal Insufficiency, Chronic , Humans , Penetrance , Renal Insufficiency, Chronic/genetics , Renal Insufficiency, Chronic/complications , Multifactorial Inheritance/genetics , Risk Factors
20.
NPJ Digit Med ; 6(1): 158, 2023 Aug 24.
Article in English | MEDLINE | ID: mdl-37620423

ABSTRACT

Recent advances in large language models (LLMs) have demonstrated remarkable successes in zero- and few-shot performance on various downstream tasks, paving the way for applications in high-stakes domains. In this study, we systematically examine the capabilities and limitations of LLMs, specifically GPT-3.5 and ChatGPT, in performing zero-shot medical evidence summarization across six clinical domains. We conduct both automatic and human evaluations, covering several dimensions of summary quality. Our study demonstrates that automatic metrics often do not strongly correlate with the quality of summaries. Furthermore, informed by our human evaluations, we define a terminology of error types for medical evidence summarization. Our findings reveal that LLMs could be susceptible to generating factually inconsistent summaries and making overly convincing or uncertain statements, leading to potential harm due to misinformation. Moreover, we find that models struggle to identify the salient information and are more error-prone when summarizing over longer textual contexts.
