1.
J Neurodev Disord ; 16(1): 17, 2024 Apr 17.
Article En | MEDLINE | ID: mdl-38632549

Monogenic disorders account for a large proportion of population-attributable risk for neurodevelopmental disabilities. However, the data necessary to infer a causal relationship between a given genetic variant and a particular neurodevelopmental disorder is often lacking. Recognizing this scientific roadblock, 13 Intellectual and Developmental Disabilities Research Centers (IDDRCs) formed a consortium to create the Brain Gene Registry (BGR), a repository pairing clinical genetic data with phenotypic data from participants with variants in putative brain genes. Phenotypic profiles are assembled from the electronic health record (EHR) and a battery of remotely administered standardized assessments collectively referred to as the Rapid Neurobehavioral Assessment Protocol (RNAP), which include cognitive, neurologic, and neuropsychiatric assessments, as well as assessments for attention deficit hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). Co-enrollment of BGR participants in the Clinical Genome Resource's (ClinGen's) GenomeConnect enables display of variant information in ClinVar. The BGR currently contains data on 479 participants who are 55% male, 6% Asian, 6% Black or African American, 76% white, and 12% Hispanic/Latine. Over 200 genes are represented in the BGR, with 12 or more participants harboring variants in each of these genes: CACNA1A, DNMT3A, SLC6A1, SETD5, and MYT1L. More than 30% of variants are de novo and 43% are classified as variants of uncertain significance (VUSs). Mean standard scores on cognitive or developmental screens are below average for the BGR cohort. EHR data reveal developmental delay as the earliest and most common diagnosis in this sample, followed by speech and language disorders, ASD, and ADHD. BGR data has already been used to accelerate gene-disease validity curation of 36 genes evaluated by ClinGen's BGR Intellectual Disability (ID)-Autism (ASD) Gene Curation Expert Panel. 
In summary, the BGR is a resource for use by stakeholders interested in advancing translational research for brain genes and continues to recruit participants with clinically reported variants to establish a rich and well-characterized national resource to promote research on neurodevelopmental disorders.


Autism Spectrum Disorder , Autistic Disorder , Intellectual Disability , Neurodevelopmental Disorders , Humans , Male , Female , Autism Spectrum Disorder/genetics , Brain , Registries , Methyltransferases
2.
Genome Res ; 34(2): 243-255, 2024 Mar 20.
Article En | MEDLINE | ID: mdl-38355306

Dozens of variants in the gene for the homeodomain transcription factor (TF) cone-rod homeobox (CRX) are linked with human blinding diseases that vary in their severity and age of onset. How different variants in this single TF alter its function in ways that lead to a range of phenotypes is unclear. We characterized the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in mouse retina explants carrying knock-ins of two variants, one in the DNA-binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation in these mutant Crx retinas corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, and p.E168d2 has distinct effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are derepressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci are partially predictive of episomal MPRA activity, and distal elements whose accessibility increases later in retinal development are enriched for CREs with silencer activity. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers while having a qualitatively different impact on silencers.


Homeodomain Proteins , Trans-Activators , Animals , Humans , Mice , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Regulatory Sequences, Nucleic Acid , Retina/metabolism , Retinal Cone Photoreceptor Cells/metabolism , Trans-Activators/genetics , Trans-Activators/metabolism , Transcription Factors/genetics
3.
Am J Transplant ; 24(3): 458-467, 2024 Mar.
Article En | MEDLINE | ID: mdl-37468109

Primary graft dysfunction (PGD) is the leading cause of morbidity and mortality in the first 30 days after lung transplantation. Risk factors for the development of PGD include donor and recipient characteristics, but how multiple variables interact to impact the development of PGD and how clinicians should consider these in making decisions about donor acceptance remain unclear. This was a single-center retrospective cohort study to develop and evaluate machine learning pipelines to predict the development of PGD grade 3 within the first 72 hours of transplantation using donor and recipient variables that are known at the time of donor offer acceptance. Among 576 bilateral lung recipients, 173 (30%) developed PGD grade 3. The cohort underwent a 75% to 25% train-test split, and lasso regression was used to identify 11 variables for model development. A K-nearest neighbors model showing the best calibration and performance with relatively small confidence intervals was selected as the final predictive model, with an area under the receiver operating characteristic curve of 0.65. Machine learning models can predict the risk for development of PGD grade 3 based on data available at the time of donor offer acceptance. This may improve donor-recipient matching and donor utilization in the future.
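
The metric reported in this abstract, the area under the receiver operating characteristic curve (AUROC), can be illustrated with a minimal sketch. The function name `auroc` and the toy labels and scores below are assumptions made for illustration and are not data or code from the study. The sketch computes AUROC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as one half.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data (1 = developed PGD grade 3, 0 = did not).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]
print(auroc(toy_labels, toy_scores))
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported value of 0.65.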


Lung Transplantation , Primary Graft Dysfunction , Humans , Retrospective Studies , Primary Graft Dysfunction/diagnosis , Primary Graft Dysfunction/etiology , Lung Transplantation/adverse effects , Risk Factors , Lung
4.
bioRxiv ; 2024 Apr 06.
Article En | MEDLINE | ID: mdl-37808763

Objective: Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments, and progression using GPT-4, and to compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, and two rule-based and machine learning-based methods, namely, scispaCy and medspaCy. Materials and Methods: Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13,646 records for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model was evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, medspaCy, and scispaCy by comparing precision, recall, and micro-F1 scores. Results: GPT-4 achieved a higher F1 score, precision, and recall than the Flan-T5-xl, Flan-T5-xxl, medspaCy, and scispaCy models. GPT-3.5-turbo performed similarly to GPT-4. GPT and Flan-T5 models were not constrained by explicit rule requirements for contextual pattern recognition. SpaCy models relied on predefined patterns, leading to their suboptimal performance. Discussion and Conclusion: GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text and robust clinical phenotype extraction.
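
The comparison metrics named in this abstract (precision, recall, and micro-F1) can be sketched as follows. This is an illustrative example only: the function name `micro_scores`, the phenotype keys, and all counts are hypothetical and do not come from the study. Micro-averaging pools true positives, false positives, and false negatives across all phenotype classes before computing the scores.

```python
from typing import Dict, Tuple

# Per-phenotype counts: (true positives, false positives, false negatives).
Counts = Dict[str, Tuple[int, int, int]]


def micro_scores(counts: Counts) -> Tuple[float, float, float]:
    """Pool counts across classes, then compute precision, recall, and F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, not results from the study.
toy_counts = {
    "initial_stage": (8, 2, 1),
    "initial_treatment": (6, 1, 3),
    "recurrence": (6, 2, 1),
}
precision, recall, f1 = micro_scores(toy_counts)
print(precision, recall, f1)
```

Because the counts are pooled, phenotypes with more annotated instances contribute more to the micro-averaged scores than rare ones.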

5.
Neurology ; 101(14): e1424-e1433, 2023 Oct 03.
Article En | MEDLINE | ID: mdl-37532510

BACKGROUND AND OBJECTIVES: The capacity of specialty memory clinics in the United States is very limited. If lower socioeconomic status or minoritized racial group is associated with reduced use of memory clinics, this could exacerbate health care disparities, especially if more effective treatments of Alzheimer disease become available. We aimed to understand how use of a memory clinic is associated with neighborhood-level measures of socioeconomic factors and the intersectionality of race. METHODS: We conducted an observational cross-sectional study using electronic health record data to compare the neighborhood advantage of patients seen at the Washington University Memory Diagnostic Center with the catchment area using a geographical information system. Furthermore, we compared the severity of dementia at the initial visit between patients who self-identified as Black or White. We used a multinomial logistic regression model to assess the Clinical Dementia Rating at the initial visit and t tests to compare neighborhood characteristics, including Area Deprivation Index, with those of the catchment area. RESULTS: A total of 4,824 patients seen at the memory clinic between 2008 and 2018 were included in this study (mean age 72.7 [SD 11.0] years, 2,712 [56%] female, 543 [11%] Black). Most of the memory clinic patients lived in more advantaged neighborhoods within the overall catchment area. The percentage of patients self-identifying as Black (11%) was lower than the average percentage of Black individuals by census tract in the catchment area (16%) (p < 0.001). Black patients lived in less advantaged neighborhoods, and Black patients were more likely than White patients to have moderate or severe dementia at their initial visit (odds ratio 1.59, 95% CI 1.11-2.25). DISCUSSION: This study demonstrates that patients living in less affluent neighborhoods were less likely to be seen in one large memory clinic. 
Black patients were under-represented in the clinic, and Black patients had more severe dementia at their initial visit. These findings suggest that patients with a lower socioeconomic status and who identify as Black are less likely to be seen in memory clinics, which are likely to be a major point of access for any new Alzheimer disease treatments that may become available.


Alzheimer Disease , Aged , Female , Humans , Male , Alzheimer Disease/complications , Alzheimer Disease/diagnosis , Alzheimer Disease/epidemiology , Alzheimer Disease/ethnology , Alzheimer Disease/therapy , Black People , Cross-Sectional Studies , Racial Groups , Socioeconomic Factors , United States , Memory Disorders/epidemiology , Memory Disorders/ethnology , Memory Disorders/etiology , White People , Neighborhood Characteristics , Middle Aged , Aged, 80 and over
6.
bioRxiv ; 2023 Dec 02.
Article En | MEDLINE | ID: mdl-37292699

Dozens of variants in the photoreceptor-specific transcription factor (TF) CRX are linked with human blinding diseases that vary in their severity and age of onset. It is unclear how different variants in this single TF alter its function in ways that lead to a range of phenotypes. We examined the effects of human disease-causing variants on CRX cis-regulatory function by deploying massively parallel reporter assays (MPRAs) in live mouse retinas carrying knock-ins of two variants, one in the DNA binding domain (p.R90W) and the other in the transcriptional effector domain (p.E168d2). The degree of reporter gene dysregulation caused by the variants corresponds with their phenotypic severity. The two variants affect similar sets of enhancers, while p.E168d2 has stronger effects on silencers. Cis-regulatory elements (CREs) near cone photoreceptor genes are enriched for silencers that are de-repressed in the presence of p.E168d2. Chromatin environments of CRX-bound loci were partially predictive of episomal MPRA activity, and silencers were notably enriched among distal elements whose accessibility increases later in retinal development. We identified a set of potentially pleiotropic regulatory elements that convert from silencers to enhancers in retinas that lack a functional CRX effector domain. Our findings show that phenotypically distinct variants in different domains of CRX have partially overlapping effects on its cis-regulatory function, leading to misregulation of similar sets of enhancers, while having a qualitatively different impact on silencers.

7.
Spine (Phila Pa 1976) ; 48(16): 1138-1147, 2023 Aug 15.
Article En | MEDLINE | ID: mdl-37249385

STUDY DESIGN: Retrospective cohort. OBJECTIVE: The aim of this study was to design a risk-stratified benchmarking tool for adolescent idiopathic scoliosis (AIS) surgeries. SUMMARY OF BACKGROUND DATA: Machine learning (ML) is an emerging method for prediction modeling in orthopedic surgery. Benchmarking is an established method of process improvement and is an area of opportunity for ML methods. Current surgical benchmark tools often use ranks, and no "gold standards" for comparisons exist. MATERIALS AND METHODS: Data from 6076 AIS surgeries were collected from a multicenter registry and divided into three datasets encompassing surgeries performed during (1) the entire registry, (2) the past 10 years, and (3) the last 5 years of the registry. We trained three ML regression models (baseline linear regression, gradient boosting, and eXtreme gradient boosting) on each data subset to predict each of the five outcome variables: length of stay (LOS), estimated blood loss (EBL), operative time, Scoliosis Research Society (SRS)-Pain, and SRS-Self-Image. Performance was categorized as "below expected" if more than 1 standard deviation (SD) worse than the mean, "as expected" if within 1 SD, and "better than expected" if more than 1 SD better than the mean. RESULTS: Ensemble ML methods classified performance better than traditional regression techniques for LOS, EBL, and operative time. The best performing models for predicting LOS and EBL were trained on data collected in the last 5 years, while operative time used the entire 10-year dataset. No models were able to predict SRS-Pain or SRS-Self-Image in any useful manner. Point-precise estimates for continuous variables were subject to high average errors. CONCLUSIONS: Classification of benchmark outcomes is improved with ensemble ML techniques and may provide much needed case-adjustment for a surgeon performance program. 
Precise estimates of health-related quality of life scores and continuous variables were not possible, suggesting that performance classification is a better method of performance evaluation.
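
The three-level performance categorization described in the methods of this abstract can be sketched as below. This is a minimal illustration under stated assumptions: the function name `classify_performance`, the `lower_is_better` flag, the treatment of a value exactly 1 SD from the mean as "as expected", and the example numbers are all hypothetical, and the abstract does not specify how the reference mean and SD were derived.

```python
def classify_performance(observed: float, mean: float, sd: float,
                         lower_is_better: bool = True) -> str:
    """Label an outcome by its distance from a reference mean in SD units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (observed - mean) / sd
    if lower_is_better:
        z = -z  # flip sign so that positive z always means better performance
    if z > 1:
        return "better than expected"
    if z < -1:
        return "below expected"
    return "as expected"  # assumption: exactly 1 SD counts as within range


# Hypothetical length-of-stay example: reference mean 5.0 days, SD 1.5 days.
for los_days in (3.0, 5.5, 7.0):
    print(los_days, classify_performance(los_days, mean=5.0, sd=1.5))
```

For outcomes where higher values are better, such as a quality-of-life score, the caller would pass `lower_is_better=False`.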


Kyphosis , Scoliosis , Humans , Adolescent , Scoliosis/surgery , Benchmarking , Retrospective Studies , Quality of Life , Pain
8.
JAMIA Open ; 6(1): ooad014, 2023 Apr.
Article En | MEDLINE | ID: mdl-36844369

Objectives: There is much interest in utilizing clinical data for developing prediction models for Alzheimer's disease (AD) risk, progression, and outcomes. Existing studies have mostly utilized curated research registries, image analysis, and structured electronic health record (EHR) data. However, much critical information resides in relatively inaccessible unstructured clinical notes within the EHR. Materials and Methods: We developed a natural language processing (NLP)-based pipeline to extract AD-related clinical phenotypes, documenting strategies for success and assessing the utility of mining unstructured clinical notes. We evaluated the pipeline against gold-standard manual annotations performed by 2 clinical dementia experts for AD-related clinical phenotypes including medical comorbidities, biomarkers, neurobehavioral test scores, behavioral indicators of cognitive decline, family history, and neuroimaging findings. Results: Documentation rates for each phenotype varied in the structured versus unstructured EHR. Interannotator agreement was high (Cohen's kappa = 0.72-1) and positively correlated with the NLP-based phenotype extraction pipeline's performance (average F1-score = 0.65-0.99) for each phenotype. Discussion: We developed an automated NLP-based pipeline to extract informative phenotypes that may improve the performance of eventual machine learning predictive models for AD. In the process, we examined documentation practices for each phenotype relevant to the care of AD patients and identified factors for success. Conclusion: Success of our NLP-based phenotype extraction pipeline depended on domain-specific knowledge and focus on a specific clinical domain instead of maximizing generalizability.
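
The interannotator agreement statistic reported in this abstract, Cohen's kappa, can be illustrated with a short sketch. The function name `cohens_kappa` and the two annotation lists are assumptions for illustration and are not data from the study. Kappa compares the observed agreement between two annotators with the agreement expected by chance from each annotator's label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labeling the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("annotation lists must be non-empty and equal in length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the two annotators' marginal label rates.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of whether a phenotype is documented in 10 notes.
annotator_1 = ["yes", "yes", "yes", "no", "no", "no", "yes", "no", "yes", "no"]
annotator_2 = ["yes", "yes", "no", "no", "no", "no", "yes", "no", "yes", "yes"]
print(cohens_kappa(annotator_1, annotator_2))
```

A kappa of 1 indicates complete agreement and 0 indicates agreement no better than chance, which frames the reported range of 0.72-1.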

9.
JAMIA Open ; 4(3): ooab052, 2021 Jul.
Article En | MEDLINE | ID: mdl-34350389

OBJECTIVE: Alzheimer disease (AD) is the most common cause of dementia, a syndrome characterized by cognitive impairment severe enough to interfere with activities of daily life. We aimed to conduct a systematic literature review (SLR) of studies that applied machine learning (ML) methods to clinical data derived from electronic health records in order to model risk for progression of AD dementia. MATERIALS AND METHODS: We searched for articles published between January 1, 2010, and May 31, 2020, in PubMed, Scopus, ScienceDirect, IEEE Xplore Digital Library, Association for Computing Machinery Digital Library, and arXiv. We used predefined criteria to select relevant articles and summarized them according to key components of ML analysis such as data characteristics, computational algorithms, and research focus. RESULTS: There has been a considerable rise over the past 5 years in the number of research papers using ML-based analysis for AD dementia modeling. We reviewed 64 relevant articles in our SLR. The results suggest that the majority of existing research has focused on predicting progression of AD dementia using publicly available datasets containing both neuroimaging and clinical data (neurobehavioral status exam scores, patient demographics, neuroimaging data, and laboratory test values). DISCUSSION: Identifying individuals at risk for progression of AD dementia could potentially help to personalize disease management to plan future care. Clinical data consisting of both structured data tables and clinical notes can be effectively used in ML-based approaches to model risk for AD dementia progression. Data sharing and reproducibility of results can enhance the impact, adaptation, and generalizability of this research.

10.
Nat Commun ; 12(1): 2557, 2021 May 07.
Article En | MEDLINE | ID: mdl-33963188

The genetic modules that contribute to human evolution are poorly understood. Here we investigate positive selection in the Epidermal Differentiation Complex locus for skin barrier adaptation in diverse HapMap human populations (CEU, JPT/CHB, and YRI). Using Composite of Multiple Signals and iSAFE, we identify selective sweeps for LCE1A-SMCP and involucrin (IVL) haplotypes associated with human migration out-of-Africa, reaching near fixation in European populations. CEU-IVL is associated with increased IVL expression and a known epidermis-specific enhancer. CRISPR/Cas9 deletion of the orthologous mouse enhancer in vivo reveals a functional requirement for the enhancer to regulate Ivl expression in cis. Reporter assays confirm increased regulatory and additive enhancer effects of CEU-specific polymorphisms identified at predicted IRF1 and NFIC binding sites in the IVL enhancer (rs4845327) and its promoter (rs1854779). Together, our results identify a selective sweep for a cis regulatory module for CEU-IVL, highlighting human skin barrier evolution for increased IVL expression out-of-Africa.


Enhancer Elements, Genetic , Gene Expression Regulation/genetics , Protein Precursors/genetics , Skin/metabolism , Africa , Alleles , Animals , CRISPR-Cas Systems , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation Sequencing , Databases, Genetic , Gene Frequency , Haplotypes , Humans , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL , Polymorphism, Genetic , Polymorphism, Single Nucleotide , Protein Precursors/metabolism , Quantitative Trait Loci , RNA-Seq , Regulatory Sequences, Nucleic Acid
11.
Adv Exp Med Biol ; 1185: 359-364, 2019.
Article En | MEDLINE | ID: mdl-31884638

Inherited retinal degenerations are diverse and debilitating blinding diseases. Genetic tests and exome sequencing have identified mutations in many protein-coding genes associated with such diseases, but causal sequence variants remain to be found in many retinopathy cases. Since 99% of our genome does not code for protein but contains cis-regulatory elements (CREs) that regulate the expression of essential genes, CRE variants might hold the answer for some of these cases. However, identifying functional CREs within the noncoding genome and predicting the pathogenicity of CRE variants pose a significant challenge. Here, we review the development of massively parallel reporter assays in the mouse retina, its use in dissecting retinal cis-regulatory networks, and its potential application for developing therapies.


Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Regulatory Sequences, Nucleic Acid , Retina , Retinal Diseases/genetics , Animals , Mice
12.
J Vis Exp ; (140), 2018 Oct 05.
Article En | MEDLINE | ID: mdl-30346381

The identification of regulatory elements for a given target gene poses a significant technical challenge owing to the variability in the positioning and effect sizes of regulatory elements relative to a target gene. Some progress has been made with the bioinformatic prediction of the existence and function of proximal epigenetic modifications associated with activated gene expression using conserved transcription factor binding sites. Chromatin conformation capture studies have revolutionized our ability to discover physical chromatin contacts between sequences and even within an entire genome. Circular chromatin conformation capture coupled with next-generation sequencing (4C-seq), in particular, is designed to discover all possible physical chromatin interactions for a given sequence of interest (viewpoint), such as a target gene or a regulatory enhancer. Current 4C-seq strategies directly sequence from within the viewpoint but require numerous and diverse viewpoints to be simultaneously sequenced to avoid the technical challenges of uniform base calling (imaging) with next-generation sequencing platforms. This volume of experiments may not be practical for many laboratories. Here, we report a modified approach to the 4C-seq protocol that incorporates both an additional restriction enzyme digest and qPCR-based amplification steps that are designed to facilitate a greater capture of diverse sequence reads and mitigate the potential for PCR bias, respectively. Our modified 4C method is amenable to the standard molecular biology lab for assessing chromatin architecture.


Chromosomes/chemistry , High-Throughput Nucleotide Sequencing , Regulatory Sequences, Nucleic Acid/genetics , Sequence Analysis, DNA , Chromatin/chemistry , Chromatin/metabolism , DNA Restriction Enzymes/metabolism , Epigenesis, Genetic , Genome/genetics , Nucleic Acid Conformation , Polymerase Chain Reaction
13.
J Invest Dermatol ; 137(5): e101-e104, 2017 May.
Article En | MEDLINE | ID: mdl-28411839

The epidermal differentiation complex (EDC) locus consists of a cluster of genes important for the terminal differentiation of the epidermis. While early studies identified the functional importance of individual EDC genes, the recognition of the EDC genes as a cluster with its shared biology, homology, and physical linkage was pivotal to later studies that investigated the transcriptional regulation of the locus. Evolutionary conservation of the EDC and the transcriptional activation during epidermal differentiation suggested a cis-regulatory mechanism via conserved noncoding elements or enhancers. This line of pursuit led to the identification of CNE 923, an epidermal-specific enhancer that was found to mediate chromatin remodeling of the EDC in an AP-1 dependent manner. These genomic studies, as well as the advent of high-throughput sequencing and genome engineering techniques, have paved the way for future investigation into enhancer-mediated regulatory networks in cutaneous biology.


Cell Differentiation/genetics , Enhancer Elements, Genetic/genetics , Epidermis/physiology , Animals , Chromatin Assembly and Disassembly/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Biology/methods , Transcription Factor AP-1/genetics
14.
J Invest Dermatol ; 134(9): 2371-2380, 2014 Sep.
Article En | MEDLINE | ID: mdl-24468747

The epidermal differentiation complex (EDC) locus comprises a syntenic and linear cluster of genes whose concomitant expression is a hallmark feature of differentiation in the developing skin epidermis. Many of the EDC proteins are cross-linked together to form the cornified envelope, an essential and discrete unit of the mammalian skin barrier. The mechanism underlying coordinate transcriptional activation of the EDC is unknown. Within the human EDC, we identified an epidermal-specific regulatory enhancer, 923, which responded to the developmental and spatiotemporal cues at the onset of epidermal differentiation in the mouse embryo. Comparative chromosomal conformation capture assays in proliferating and differentiated primary mouse keratinocytes revealed multiple physiologically sensitive chromatin interactions between the 923 enhancer and EDC gene promoters, thus depicting the dynamic chromatin topology of the EDC. We elucidate a mechanistic link between c-Jun/AP-1 and 923, whereby AP-1- and 923-mediated EDC chromatin remodeling are required for functional EDC gene activation. Thus, we identify a critical enhancer/transcription factor axis governing the dynamic regulation of the EDC chromatin architecture and gene expression and provide a framework for future studies toward understanding gene regulation in cutaneous diseases.


Chromatin/physiology , Enhancer Elements, Genetic/genetics , Epidermis/physiology , Gene Expression Regulation, Developmental/genetics , Transcription Factor AP-1/genetics , Animals , Animals, Newborn , Cell Differentiation/genetics , Epidermal Cells , Epidermis/embryology , Female , Humans , Lac Operon , Mice, Inbred Strains , Mice, Transgenic , Multigene Family/genetics , Pregnancy , RNA/genetics , Transcription Factor AP-1/metabolism
...