ABSTRACT
Large language models (LLMs) are generating interest in medical settings. For example, LLMs can respond coherently to medical queries by providing plausible differential diagnoses based on clinical notes. However, there are many questions to explore, such as evaluating differences between open- and closed-source LLMs as well as LLM performance on queries from both medical and non-medical users. In this study, we assessed multiple LLMs, including Llama-2-chat, Vicuna, Medllama2, Bard/Gemini, Claude, ChatGPT3.5, and ChatGPT-4, as well as non-LLM approaches (Google search and Phenomizer) regarding their ability to identify genetic conditions from textbook-like clinician questions and their corresponding layperson translations related to 63 genetic conditions. For open-source LLMs, larger models were more accurate than smaller LLMs: 7b, 13b, and larger than 33b parameter models obtained accuracy ranges from 21%-49%, 41%-51%, and 54%-68%, respectively. Closed-source LLMs outperformed open-source LLMs, with ChatGPT-4 performing best (89%-90%). Three of 11 LLMs and Google search had significant performance gaps between clinician and layperson prompts. We also evaluated how in-context prompting and keyword removal affected open-source LLM performance. Models were provided with 2 types of in-context prompts: list-type prompts, which improved LLM performance, and definition-type prompts, which did not. We further analyzed removal of rare terms from descriptions, which decreased accuracy for 5 of 7 evaluated LLMs. Finally, we observed much lower performance with real individuals' descriptions; LLMs answered these questions with a maximum 21% accuracy.
Subjects
Self Report; Humans; Language; Genetic Diseases, Inborn/genetics
ABSTRACT
The precise regulation of DNA replication is vital for cellular division and genomic integrity. Central to this process is the replication factor C (RFC) complex, encompassing five subunits, which loads proliferating cell nuclear antigen onto DNA to facilitate the recruitment of replication and repair proteins and enhance DNA polymerase processivity. While RFC1's role in cerebellar ataxia, neuropathy, and vestibular areflexia syndrome (CANVAS) is known, the contributions of the RFC2-5 subunits to human Mendelian disorders are largely unexplored. Our research links bi-allelic variants in RFC4, encoding a core RFC complex subunit, to an undiagnosed disorder characterized by incoordination and muscle weakness, hearing impairment, and decreased body weight. Across nine affected individuals, we discovered rare, conserved, predicted pathogenic variants in RFC4, all likely to disrupt the C-terminal domain indispensable for RFC complex formation. Analysis of a previously determined cryo-EM structure of RFC bound to proliferating cell nuclear antigen suggested that the variants disrupt interactions within RFC4 and/or destabilize the RFC complex. Cellular studies using RFC4-deficient HeLa cells and primary fibroblasts demonstrated decreased RFC4 protein, compromised stability of the other RFC complex subunits, and perturbed RFC complex formation. Additionally, functional studies of the RFC4 variants affirmed diminished RFC complex formation, and cell cycle studies suggested perturbation of DNA replication and cell cycle progression. Our integrated approach of combining in silico, structural, cellular, and functional analyses establishes compelling evidence that bi-allelic loss-of-function RFC4 variants contribute to the pathogenesis of this multisystemic disorder. These insights broaden our understanding of the RFC complex and its role in human health and disease.
Subjects
Replication Protein C; Humans; Replication Protein C/genetics; Replication Protein C/metabolism; Male; HeLa Cells; Female; Phenotype; DNA Replication/genetics; Adult; Mutation; Proliferating Cell Nuclear Antigen/metabolism; Proliferating Cell Nuclear Antigen/genetics; Alleles
ABSTRACT
Balanced chromosomal abnormalities (BCAs) represent a relatively untapped reservoir of single-gene disruptions in neurodevelopmental disorders (NDDs). We sequenced BCAs in patients with autism or related NDDs, revealing disruption of 33 loci in four general categories: (1) genes previously associated with abnormal neurodevelopment (e.g., AUTS2, FOXP1, and CDKL5), (2) single-gene contributors to microdeletion syndromes (MBD5, SATB2, EHMT1, and SNURF-SNRPN), (3) novel risk loci (e.g., CHD8, KIRREL3, and ZNF507), and (4) genes associated with later-onset psychiatric disorders (e.g., TCF4, ZNF804A, PDE10A, GRIN2B, and ANK3). We also discovered among neurodevelopmental cases a profoundly increased burden of copy-number variants from these 33 loci and a significant enrichment of polygenic risk alleles from genome-wide association studies of autism and schizophrenia. Our findings suggest a polygenic risk model of autism and reveal that some neurodevelopmental genes are sensitive to perturbation by multiple mutational mechanisms, leading to variable phenotypic outcomes that manifest at different life stages.
Subjects
Child Development Disorders, Pervasive/genetics; Chromosome Aberrations; Autistic Disorder/diagnosis; Autistic Disorder/genetics; Child; Child Development Disorders, Pervasive/diagnosis; Chromosome Breakage; Chromosome Deletion; DNA Copy Number Variations; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Nervous System/growth & development; Schizophrenia/genetics; Sequence Analysis, DNA; Signal Transduction
ABSTRACT
Artificial intelligence (AI) for facial diagnostics is increasingly used in the genetics clinic to evaluate patients with potential genetic conditions. Current approaches focus on one type of AI called Deep Learning (DL). While DL-based facial diagnostic platforms have a high accuracy rate for many conditions, less is understood about how this technology assesses and classifies (categorizes) images, and how this compares to humans. To compare human and computer attention, we performed eye-tracking analyses of geneticist clinicians (n = 22) and non-clinicians (n = 22) who viewed images of people with 10 different genetic conditions, as well as images of unaffected individuals. We calculated the Intersection-over-Union (IoU) and Kullback-Leibler divergence (KL) to compare the visual attention of the two participant groups, and then to compare the clinician group against the saliency maps of our deep learning classifier. We found that human visual attention differs greatly from the DL model's saliency results. Averaging over all the test images, the IoU and KL metrics for the successful (accurate) clinicians' visual attention versus the saliency maps were 0.15 and 11.15, respectively. Individuals also tend to have a specific pattern of image inspection, and clinicians demonstrate different visual attention patterns than non-clinicians (IoU and KL of clinicians versus non-clinicians were 0.47 and 2.73, respectively). This study shows that humans (at different levels of expertise) and a computer vision model examine images differently. Understanding these differences can improve the design and use of AI tools, and lead to more meaningful interactions between clinicians and AI technologies.
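As an illustration of the two metrics named in this abstract, the following is a minimal sketch of how IoU and KL divergence could be computed for a pair of attention maps. The flat-list representation, the `binarize` threshold, the `eps` smoothing term, and the toy values are all assumptions made for the example; the abstract does not describe the study's actual preprocessing.

```python
import math

def binarize(heatmap, threshold):
    """Turn a heatmap (flat list of floats) into a 0/1 mask. The threshold is an assumption."""
    return [1 if v >= threshold else 0 for v in heatmap]

def iou(mask_a, mask_b):
    """Intersection-over-Union of two binary masks of equal length."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return inter / union if union else 0.0

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) after normalizing both maps to sum to 1.

    eps avoids division by zero where q is 0; it is a hypothetical smoothing choice.
    """
    p_sum, q_sum = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        pi, qi = pi / p_sum, qi / q_sum
        if pi > 0:
            total += pi * math.log(pi / (qi + eps))
    return total

# Toy 4-pixel "attention maps" (invented values, not study data).
human = [0.0, 0.2, 0.5, 0.3]
model = [0.1, 0.4, 0.4, 0.1]

print(iou(binarize(human, 0.25), binarize(model, 0.25)))
print(kl_divergence(human, model))
```

A higher IoU means more overlap between the attended regions, while a higher KL value means the two attention distributions differ more. Note that KL is asymmetric, so the order of the arguments matters.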
Subjects
Artificial Intelligence; Computers; Humans; Computer Simulation
ABSTRACT
Starting with the launch of the Human Genome Project three decades ago, and continuing after its completion in 2003, genomics has progressively come to have a central and catalytic role in basic and translational research. In addition, studies increasingly demonstrate how genomic information can be effectively used in clinical care. In the future, the anticipated advances in technology development, biological insights, and clinical applications (among others) will lead to more widespread integration of genomics into almost all areas of biomedical research, the adoption of genomics into mainstream medical and public-health practices, and an increasing relevance of genomics for everyday life. On behalf of the research community, the National Human Genome Research Institute recently completed a multi-year process of strategic engagement to identify future research priorities and opportunities in human genomics, with an emphasis on health applications. Here we describe the highest-priority elements envisioned for the cutting edge of human genomics going forward, that is, at 'The Forefront of Genomics'.
Subjects
Biomedical Research/trends; Genome, Human/genetics; Genomics/trends; Public Health/standards; Translational Research, Biomedical/trends; Biomedical Research/economics; COVID-19/genetics; Genomics/economics; Humans; National Human Genome Research Institute (U.S.)/economics; Social Change; Translational Research, Biomedical/economics; United States
ABSTRACT
Artificial intelligence (AI) is increasingly used in genomics research and practice, and generative AI has garnered significant recent attention. In clinical applications of generative AI, aspects of the underlying datasets can impact results, and confounders should be studied and mitigated. One example involves the facial expressions of people with genetic conditions. Stereotypically, Williams (WS) and Angelman (AS) syndromes are associated with a "happy" demeanor, including a smiling expression. Clinical geneticists may be more likely to identify these conditions in images of smiling individuals. To study the impact of facial expression, we analyzed publicly available facial images of approximately 3500 individuals with genetic conditions. Using a deep learning (DL) image classifier, we found that WS and AS images with non-smiling expressions had significantly lower prediction probabilities for the correct syndrome labels than those with smiling expressions. This was not seen for 22q11.2 deletion and Noonan syndromes, which are not associated with a smiling expression. To further explore the effect of facial expressions, we computationally altered the facial expressions for these images. We trained HyperStyle, a GAN-inversion technique compatible with StyleGAN2, to determine the vector representations of our images. Then, following the concept of InterfaceGAN, we edited these vectors to recreate the original images in a phenotypically accurate way but with a different facial expression. Through online surveys and an eye-tracking experiment, we examined how altered facial expressions affect the performance of human experts. We overall found that facial expression is associated with diagnostic accuracy variably in different genetic conditions.
Subjects
Facial Expression; Humans; Deep Learning; Artificial Intelligence; Genetics, Medical/methods; Williams Syndrome/genetics
ABSTRACT
BACKGROUND: The rate of diagnosis of mast cell activation syndrome (MCAS) has increased since the disorder's original description as a mastocytosis-like phenotype. While a set of consortium MCAS criteria is well described and widely accepted, this increase occurs in the setting of a broader set of proposed alternative MCAS criteria. OBJECTIVE: Effective diagnostic criteria must minimize the range of unrelated diagnoses that can be erroneously classified as the condition of interest. We sought to determine if the symptoms associated with alternative MCAS criteria result in less concise or consistent diagnostic alternatives, reducing diagnostic specificity. METHODS: We used multiple large language models, including ChatGPT, Claude, and Gemini, to bootstrap the probabilities of diagnoses that are compatible with consortium or alternative MCAS criteria. We utilized diversity and network analyses to quantify diagnostic precision and specificity compared to control diagnostic criteria including systemic lupus erythematosus, Kawasaki disease, and migraines. RESULTS: Compared to consortium MCAS criteria, alternative MCAS criteria are associated with more variable (Shannon diversity 5.8 vs 4.6, respectively; P = .004) and less precise (mean Bray-Curtis similarity 0.07 vs 0.19, respectively; P = .004) diagnoses. The diagnosis networks derived from consortium and alternative MCAS criteria had lower between-network similarity compared to the similarity between diagnosis networks derived from 2 distinct systemic lupus erythematosus criteria (cosine similarity 0.55 vs 0.86, respectively; P = .0022). CONCLUSION: Alternative MCAS criteria are associated with a distinct set of diagnoses compared to consortium MCAS criteria and have lower diagnostic consistency. 
This lack of specificity is pronounced in relation to multiple control criteria, raising the concern that alternative criteria could disproportionately contribute to MCAS overdiagnosis, to the exclusion of more appropriate diagnoses.
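The three summary statistics reported in this abstract (Shannon diversity, Bray-Curtis similarity, and cosine similarity) can be sketched on diagnosis counts as follows. This is only an illustration: the dictionary-of-counts representation, the natural-log base, and the example diagnosis tallies are assumptions, and the abstract does not state how the authors aggregated model outputs.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over observed categories (natural log assumed)."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)

def bray_curtis_similarity(a, b):
    """1 - Bray-Curtis dissimilarity: 2 * sum(min counts) / (sum of all counts)."""
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    total = sum(a.values()) + sum(b.values())
    return 2 * shared / total if total else 0.0

def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors indexed by diagnosis name."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented tallies of diagnoses returned across repeated prompts (not study data).
run_1 = {"mastocytosis": 6, "anaphylaxis": 3, "carcinoid syndrome": 1}
run_2 = {"mastocytosis": 4, "anaphylaxis": 4, "migraine": 2}

print(shannon_diversity(run_1))
print(bray_curtis_similarity(run_1, run_2))
print(cosine_similarity(run_1, run_2))
```

Under these definitions, a higher Shannon value indicates a more varied set of returned diagnoses, and lower Bray-Curtis or cosine similarity indicates less agreement between two sets, which matches the direction of the comparisons reported in the abstract.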
ABSTRACT
Genetic conditions affect people throughout their entire lifespan; however, many clinical geneticists focus on the care of pediatric individuals. We analyzed the medical literature and related resources to help assess to what extent adults with genetic diseases were represented. This included general literature searches of PubMed (from 2001 through 2022), specific databases (the FDA orphan drug list and the Clinical Genomic Database) related to management and direct treatment of genetic conditions, and textbooks and morphology guides relevant to the diagnosis of genetic conditions. In the field of genetics/genomics in general, we overall detected a statistically significant emphasis on pediatric populations in the medical literature compared to select other disciplines and compared with the global population distribution. Clinical genetics articles about adults tended to focus on younger adult ages. In clinical genetics, management and treatments, as well as illustrations in several educational/diagnostic resources tended to focus on pediatric populations.
Subjects
Genetics, Medical; Genomics; Adult; Humans; Child
ABSTRACT
Virtually all areas of biomedicine will be increasingly affected by applications of artificial intelligence (AI). We discuss how AI may affect fields of medical genetics, including both clinicians and laboratorians. In addition to reviewing the anticipated impact, we provide recommendations for ways in which these groups may want to evolve in light of the influence of AI. We also briefly discuss how educational and training programs can play a key role in preparing the future workforce given these anticipated changes.
Subjects
Artificial Intelligence; Genetics, Medical; Humans
ABSTRACT
The SARS-CoV-2 pandemic raises many scientific and clinical questions. These include how host genetic factors affect disease susceptibility and pathogenesis. New work is emerging related to SARS-CoV-2; previous work has been conducted on other coronaviruses that affect different species. We reviewed the literature on host genetic factors related to coronaviruses, systematically focusing on human studies. We identified 1,832 articles of potential relevance. Seventy-five involved human host genetic factors, 36 of which involved analysis of specific genes or loci; aside from one meta-analysis, all were candidate-driven studies, typically investigating small numbers of research subjects and loci. Three additional case reports were described. Multiple significant loci were identified, including 16 related to susceptibility (seven of which identified protective alleles) and 16 related to outcomes (three of which identified protective alleles). The types of cases and controls used varied considerably; four studies used traditional replication/validation cohorts. Among other studies, 30 involved both human and non-human host genetic factors related to coronavirus, 178 involved study of non-human (animal) host genetic factors related to coronavirus, and 984 involved study of non-genetic host factors related to coronavirus, including involving immunopathogenesis. Previous human studies have been limited by issues that may be less impactful now, including low numbers of eligible participants and limited availability of advanced genomic methods; however, these may raise additional considerations. We outline key genes and loci from animal and human host genetic studies that may bear investigation in the study of COVID-19. We also discuss how previous studies may direct current lines of inquiry.
Subjects
Coronavirus Infections/genetics; Genetic Predisposition to Disease; Pneumonia, Viral/genetics; Animals; Betacoronavirus; COVID-19; Disease Reservoirs/veterinary; Humans; Pandemics; Receptors, Virus/genetics; SARS-CoV-2; Species Specificity
ABSTRACT
Secondary genomic findings are increasingly being returned to individuals as opportunistic screening results. A secondary finding offers the chance to identify and mitigate disease that may otherwise be unrecognized in an individual. As a form of screening, secondary findings must be considered differently from sequencing results in a diagnostic setting. For these reasons, clinicians should employ an evaluation and long-term management strategy that accounts for both the increased disease risk associated with a secondary finding and the lower positive predictive value of a screening result compared to an indication-based testing result. Here we describe an approach to the clinical evaluation and management of an individual who presents with a secondary finding. This approach enumerates five domains of evaluation, namely (1) medical history, (2) physical exam, (3) family history, (4) diagnostic phenotypic testing, and (5) variant correlation, through which a clinician can distinguish a molecular finding from a clinicomolecular diagnosis of genomic disease. With this framework, both geneticists and non-geneticist clinicians can optimize their ability to detect and mitigate genomic disease while avoiding the pitfalls of overdiagnosis. Our goal with this approach is to help clinicians translate secondary findings into meaningful recognition, treatment, and prevention of disease.
Subjects
Genetic Diseases, Inborn/genetics; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/prevention & control; Genomics/methods; Humans; Medical History Taking
ABSTRACT
Social media has become ubiquitous in daily life, and increasingly impacts medical and scientific fields, including related to clinical genetics. Recent events have led to questions about the use of certain social media platforms, as well as social media more generally. We discuss these considerations, including alternative and emerging platforms that can offer forums for the clinical genetics and related communities.
Subjects
Genetics, Medical; Social Media; Humans
ABSTRACT
The field of clinical genetics and genomics continues to evolve. In the past few decades, milestones like the initial sequencing of the human genome, dramatic changes in sequencing technologies, and the introduction of artificial intelligence, have upended the field and offered fascinating new insights. Though difficult to predict the precise paths the field will follow, rapid change may continue to be inevitable. Within genetics, the practice of dysmorphology, as defined by pioneering geneticist David W. Smith in the 1960s as "the study of, or general subject of abnormal development of tissue form" has also been affected by technological advances as well as more general trends in biomedicine. To address possibilities, potential, and perils regarding the future of dysmorphology, a group of clinical geneticists, representing different career stages, areas of focus, and geographic regions, have contributed to this piece by providing insights about how the practice of dysmorphology will develop over the next several decades.
Subjects
Artificial Intelligence; Genomics; Humans; Genome, Human
ABSTRACT
PURPOSE OF REVIEW: There are thousands of different clinical genetic tests currently available. Genetic testing and its applications continue to change rapidly for multiple reasons. These reasons include technological advances, accruing evidence about the impact and effects of testing, and many complex financial and regulatory factors. RECENT FINDINGS: This article considers a number of key issues and axes related to the current and future state of clinical genetic testing, including targeted versus broad testing, simple/Mendelian versus polygenic and multifactorial testing models, genetic testing for individuals with high suspicion of genetic conditions versus ascertainment through population screening, the rise of artificial intelligence in multiple aspects of the genetic testing process, and how developments such as rapid genetic testing and the growing availability of new therapies for genetic conditions may affect the field. SUMMARY: Genetic testing is expanding and evolving, including into new clinical applications. Developments in the field of genetics will likely result in genetic testing becoming increasingly in the purview of a very broad range of clinicians, including general paediatricians as well as paediatric subspecialists.
Subjects
Artificial Intelligence; Genetic Testing; Humans; Child
ABSTRACT
Deep learning (DL) is applied in many biomedical areas. We performed a scoping review on DL in medical genetics. We first assessed 14,002 articles, of which 133 involved DL in medical genetics. DL in medical genetics increased rapidly during the studied period. In medical genetics, DL has largely been applied to small data sets of affected individuals (mean = 95, median = 29) with genetic conditions (71 different genetic conditions were studied; 24 articles studied multiple conditions). A variety of data types have been used in medical genetics, including radiologic (20%), ophthalmologic (14%), microscopy (8%), and text-based data (4%); the most common data type was patient facial photographs (46%). DL authors and research subjects overrepresent certain geographic areas (United States, Asia, and Europe). Convolutional neural networks (89%) were the most common method. Results were compared with human performance in 31% of studies. In total, 51% of articles provided data access; 16% released source code. To further explore DL in genomics, we conducted an additional analysis, the results of which highlight future opportunities for DL in medical genetics. Finally, we expect DL applications to increase in the future. To aid data curation, we evaluated a DL, random forest, and rule-based classifier at categorizing article abstracts.
Subjects
Deep Learning; Genetics, Medical; Asia; Genomics; Humans; Neural Networks, Computer
ABSTRACT
Preterm birth (PTB) complications are the leading cause of long-term morbidity and mortality in children. By using whole blood samples, we integrated whole-genome sequencing (WGS), RNA sequencing (RNA-seq), and DNA methylation data for 270 PTB and 521 control families. We analyzed this combined dataset to identify genomic variants associated with PTB and secondary analyses to identify variants associated with very early PTB (VEPTB) as well as other subcategories of disease that may contribute to PTB. We identified differentially expressed genes (DEGs) and methylated genomic loci and performed expression and methylation quantitative trait loci analyses to link genomic variants to these expression and methylation changes. We performed enrichment tests to identify overlaps between new and known PTB candidate gene systems. We identified 160 significant genomic variants associated with PTB-related phenotypes. The most significant variants, DEGs, and differentially methylated loci were associated with VEPTB. Integration of all data types identified a set of 72 candidate biomarker genes for VEPTB, encompassing genes and those previously associated with PTB. Notably, PTB-associated genes RAB31 and RBPJ were identified by all three data types (WGS, RNA-seq, and methylation). Pathways associated with VEPTB include EGFR and prolactin signaling pathways, inflammation- and immunity-related pathways, chemokine signaling, IFN-γ signaling, and Notch1 signaling. Progress in identifying molecular components of a complex disease is aided by integrated analyses of multiple molecular data types and clinical data. With these data, and by stratifying PTB by subphenotype, we have identified associations between VEPTB and the underlying biology.
Subjects
Genetic Predisposition to Disease/genetics; Premature Birth/genetics; DNA Methylation/genetics; Female; Genomics/methods; Humans; Infant, Newborn; Male; Phenotype; Polymorphism, Single Nucleotide/genetics; Signal Transduction/genetics; Whole Genome Sequencing/methods
ABSTRACT
The conserved oligomeric Golgi (COG) complex is involved in intracellular vesicular transport, and is composed of eight subunits distributed in two lobes, lobe A (COG1-4) and lobe B (COG5-8). We describe fourteen individuals with Saul-Wilson syndrome, a rare form of primordial dwarfism with characteristic facial and radiographic features. All affected subjects harbored heterozygous de novo variants in COG4, giving rise to the same recurrent amino acid substitution (p.Gly516Arg). Affected individuals' fibroblasts, whose COG4 mRNA and protein were not decreased, exhibited delayed anterograde vesicular trafficking from the ER to the Golgi and accelerated retrograde vesicular recycling from the Golgi to the ER. This altered steady-state equilibrium led to a decrease in Golgi volume, as well as morphologic abnormalities with collapse of the Golgi stacks. Despite these abnormalities of the Golgi apparatus, protein glycosylation in sera and fibroblasts from affected subjects was not notably altered, but decorin, a proteoglycan secreted into the extracellular matrix, showed altered Golgi-dependent glycosylation. In summary, we define a specific heterozygous COG4 substitution as the molecular basis of Saul-Wilson syndrome, a rare skeletal dysplasia distinct from biallelic COG4-CDG.
Subjects
Fragile X Syndrome/genetics; Protein Transport/genetics; Proteoglycans/genetics; Vesicular Transport Proteins/genetics; Adult; Amino Acid Substitution/genetics; Animals; Animals, Genetically Modified/genetics; Cell Line; Child; Child, Preschool; Endoplasmic Reticulum/genetics; Extracellular Matrix/genetics; Female; Fibroblasts/pathology; Glycosylation; Golgi Apparatus/genetics; Heterozygote; Humans; Infant; Male; Zebrafish
ABSTRACT
PURPOSE: Reports have questioned the dogma of exclusive maternal transmission of human mitochondrial DNA (mtDNA), including the recent report of an admixture of two mtDNA haplogroups in individuals from three multigeneration families. This was interpreted as being consistent with biparental transmission of mtDNA in an autosomal dominant-like mode. The authenticity and frequency of these findings are debated. METHODS: We retrospectively analyzed individuals with two mtDNA haplogroups from 2017 to 2019 and selected four families for further study. RESULTS: We identified this phenomenon in 104/27,388 (approximately 1/263) unrelated individuals. Further study revealed (1) a male with two mitochondrial haplogroups transmits only one haplogroup to some of his offspring, consistent with nuclear transmission; (2) the heteroplasmy level of paternally transmitted variants is highest in blood, lower in buccal, and absent in muscle or urine of the same individual, indicating it is inversely correlated with mtDNA content; and (3) paternally transmitted apparent large-scale mtDNA deletions/duplications are not associated with a disease phenotype. CONCLUSION: These findings strongly suggest that the observed mitochondrial haplogroup of paternal origin resulted from coamplification of rare, concatenated nuclear mtDNA segments with genuine mtDNA during testing. Evaluation of additional specimen types can help clarify the clinical significance of the observed results.
Subjects
DNA, Mitochondrial; Mitochondria; DNA, Mitochondrial/genetics; Haplotypes; Humans; Male; Mitochondria/genetics; Phenotype; Retrospective Studies
ABSTRACT
The study objective was to test the hypothesis that having histocompatible children increases the risk of rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE), possibly by contributing to the persistence of fetal cells acquired during pregnancy. We conducted a case control study using data from the UC San Francisco Mother Child Immunogenetic Study and studies at the Inova Translational Medicine Institute. We imputed human leukocyte antigen (HLA) alleles and minor histocompatibility antigens (mHags). We created a variable of exposure to histocompatible children. We estimated an average sequence similarity matching (SSM) score for each mother based on discordant mother-child alleles as a measure of histocompatibility. We used logistic regression models to estimate odds ratios (ORs) and 95% confidence intervals. A total of 138 RA, 117 SLE, and 913 control mothers were analyzed. Increased risk of RA was associated with having any child compatible at HLA-B (OR 1.9; 1.2-3.1), DPB1 (OR 1.8; 1.2-2.6) or DQB1 (OR 1.8; 1.2-2.7). Compatibility at mHag ZAPHIR was associated with reduced risk of SLE among mothers carrying the HLA-restriction allele B*07:02 (n = 262; OR 0.4; 0.2-0.8). Our findings support the hypothesis that mother-child histocompatibility is associated with risk of RA and SLE.
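To make the reported "OR (lower-upper)" figures easier to read, here is a simplified sketch of an odds ratio with a Wald 95% confidence interval computed from a 2x2 table. It is not the study's method: the abstract reports logistic regression models, whereas this unadjusted calculation, the function name `odds_ratio_ci`, and the example counts are assumptions chosen only to show how such an interval is formed.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 gives ~95%).

    All four cell counts must be positive for the log and standard error to be defined.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Invented counts (not study data): mothers with vs. without a compatible child.
or_, lo, hi = odds_ratio_ci(exposed_cases=40, unexposed_cases=60,
                            exposed_controls=100, unexposed_controls=300)
print(f"OR {or_:.1f}; {lo:.1f}-{hi:.1f}")
```

An interval that excludes 1, as in the HLA-B, DPB1, and DQB1 results quoted above, is what is conventionally read as a statistically significant association at the 5% level.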