ABSTRACT
Acute onset of severe psychiatric symptoms or regression may occur in children with premorbid neurodevelopmental disorders, although typically developing children can also be affected. Infections or other stressors are likely triggers. The underlying causes are unclear, but a current hypothesis suggests the convergence of genes that influence neuronal and immunological function. We previously identified 11 genes in Pediatric Acute-Onset Neuropsychiatric Syndrome (PANS), which fell into two classes related to either synaptic function or the immune system. Among the latter, three affect the DNA damage response (DDR): PPM1D, CHK2, and RAG1. We now report an additional 17 cases with mutations in PPM1D and other DDR genes in patients with acute onset of psychiatric symptoms and/or regression that their clinicians classified as PANS or another inflammatory brain condition. The genes include clusters affecting p53 DNA repair (PPM1D, ATM, ATR, 53BP1, and RMRP) and the Fanconi Anemia Complex (FANCE, SLX4/FANCP, FANCA, FANCI, and FANCC). We hypothesize that defects in DNA repair genes, in the context of infection or other stressors, could contribute to decompensated states through an increase in genomic instability with a concomitant accumulation of cytosolic DNA in immune cells triggering DNA sensors, such as cGAS-STING and AIM2 inflammasomes, as well as central deficits in neuroplasticity. In addition, increased senescence and defective apoptosis affecting immunological responses could be playing a role. These compelling preliminary findings motivate further genetic and functional characterization, as the downstream impact of DDR deficits may point to novel treatment strategies.
ABSTRACT
22q11.2 deletion is one of the strongest known genetic risk factors for schizophrenia. Recent whole-genome sequencing of schizophrenia cases and controls with this deletion provided an unprecedented opportunity to identify risk-modifying genetic variants and investigate their contribution to the pathogenesis of schizophrenia in 22q11.2 deletion syndrome. Here, we apply a novel analytic framework that integrates gene network and phenotype data to investigate the aggregate effects of rare coding variants and identify modifier genes in this etiologically homogeneous cohort (223 schizophrenia cases and 233 controls of European descent). Our analyses revealed significant additive genetic components of rare nonsynonymous variants in 110 modifier genes (adjusted P = 9.4E-04) that overall accounted for 4.6% of the variance in schizophrenia status in this cohort, of which 4.0% was independent of the common polygenic risk for schizophrenia. The modifier genes affected by rare coding variants were enriched with genes involved in synaptic function and developmental disorders. Spatiotemporal transcriptomic analyses identified an enrichment of coexpression between modifier and 22q11.2 genes in cortical brain regions from late infancy to young adulthood. Corresponding gene coexpression modules are enriched with brain-specific protein-protein interactions of SLC25A1, COMT, and PI4KA in the 22q11.2 deletion region. Overall, our study highlights the contribution of rare coding variants to schizophrenia risk. They not only complement common variants in disease genetics but also pinpoint brain regions and developmental stages critical to the etiology of syndromic schizophrenia.
Subject(s)
DiGeorge Syndrome, Schizophrenia, Humans, Young Adult, Adult, Schizophrenia/genetics, DiGeorge Syndrome/genetics, Brain, Gene Expression Profiling, Whole Genome Sequencing
ABSTRACT
BACKGROUND: Bipolar disorder (BD) is associated with cognitive impairment and mitochondrial dysfunction. However, the associations among mitochondrial DNA copy number (MCN), treatment response, and cognitive function remain elusive in BD patients. METHODS: Sixty euthymic BD patients receiving valproate (VPA) and 66 healthy controls from the community were recruited. The indices of metabolic syndrome (MetS) were measured. Quantitative polymerase chain reaction analysis of blood leukocytes was used to measure the MCN. Cognitive function was measured by calculating perseverative errors and completed categories on the Wisconsin Card Sorting Test (WCST). The VPA treatment response was measured using the Alda scale. RESULTS: BD patients had significantly higher MCN, triglyceride, and C-reactive protein (CRP) levels, larger waist circumference, and worse performance on the WCST than the controls. Regression models showed that BD itself and the VPA concentration exerted significant effects on increased MCN levels. Moreover, the receiver operating characteristic curve analysis showed that an MCN of 2.05 distinguished VPA responders from nonresponders, with an area under the curve of 0.705 and a sensitivity and specificity of 0.529 and 0.816, respectively. An MCN level ≥2.05 was associated with 5.39-fold higher odds of being a VPA responder (P = .006). BD patients who were stratified into the high-MCN group had a higher VPA response rate, better WCST performance, lower CRP level, and less MetS. CONCLUSIONS: The study suggests a link between the peripheral MCN and cognitive function in BD patients. As an inflammatory state, MetS might modulate this association.
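The cutoff-based summary reported above (sensitivity, specificity, and odds of response at an MCN threshold) can be illustrated with a minimal Python sketch. The values below are invented toy numbers, not study data, and the function name `threshold_metrics` is hypothetical. The odds ratio computed here is the unadjusted cross-product of the 2x2 table, which may differ from the model the authors actually used.

```python
def threshold_metrics(values, is_responder, cutoff):
    """Treat value >= cutoff as 'predicted responder' and summarize the 2x2 table."""
    tp = fp = tn = fn = 0
    for value, responder in zip(values, is_responder):
        predicted = value >= cutoff
        if predicted and responder:
            tp += 1   # above cutoff, true responder
        elif predicted:
            fp += 1   # above cutoff, nonresponder
        elif responder:
            fn += 1   # below cutoff, true responder
        else:
            tn += 1   # below cutoff, nonresponder
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Unadjusted odds ratio; infinite when an off-diagonal cell is empty.
        "odds_ratio": (tp * tn) / (fp * fn) if fp and fn else float("inf"),
    }


# Toy values only (NOT data from the study).
mcn = [2.4, 2.1, 1.8, 2.6, 1.5, 1.9, 2.2, 1.2, 1.7]
responder = [True, True, True, True, False, False, False, False, False]
metrics = threshold_metrics(mcn, responder, cutoff=2.05)
print(metrics)
```

Sweeping `cutoff` over the observed values and plotting sensitivity against 1 - specificity would trace the ROC curve from which a single operating point such as 2.05 is chosen.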
Subject(s)
Bipolar Disorder, Metabolic Syndrome, Cognition, DNA Copy Number Variations, Mitochondrial DNA/genetics, Humans, Mitochondria/metabolism, Neuropsychological Tests, Valproic Acid/therapeutic use
ABSTRACT
Schizophrenia occurs in about one in four individuals with 22q11.2 deletion syndrome (22q11.2DS). The aim of this International Brain and Behavior 22q11.2DS Consortium (IBBC) study was to identify genetic factors that contribute to schizophrenia, in addition to the ~20-fold increased risk conveyed by the 22q11.2 deletion. Using whole-genome sequencing data from 519 unrelated individuals with 22q11.2DS, we conducted genome-wide comparisons of common and rare variants between those with schizophrenia and those with no psychotic disorder at age ≥25 years. Available microarray data enabled direct comparison of polygenic risk for schizophrenia between 22q11.2DS and independent population samples with no 22q11.2 deletion, with and without schizophrenia (total n = 35,182). Polygenic risk for schizophrenia within 22q11.2DS was significantly greater for those with schizophrenia (adjusted p = 6.73 × 10⁻⁶). Novel reciprocal case-control comparisons between the 22q11.2DS and population-based cohorts showed that polygenic risk score was significantly greater in individuals with psychotic illness, regardless of the presence of the 22q11.2 deletion. Within the 22q11.2DS cohort, results of gene-set analyses showed some support for rare variants affecting synaptic genes. No common or rare variants within the 22q11.2 deletion region were significantly associated with schizophrenia. These findings suggest that, in addition to the deletion conferring a greatly increased risk of schizophrenia, the risk is higher when the 22q11.2 deletion and common polygenic risk factors that contribute to schizophrenia in the general population are both present.
Subject(s)
DiGeorge Syndrome, Psychotic Disorders, Schizophrenia, Adult, Case-Control Studies, Cohort Studies, DiGeorge Syndrome/genetics, Humans, Schizophrenia/genetics
ABSTRACT
Enhancers, as specialized genomic cis-regulatory elements, activate transcription of their target genes and play an important role in the pathogenesis of many human complex diseases. Despite their recent systematic identification in the human genome, there is an urgent need for comprehensive annotation databases of human enhancers with a focus on their disease connections. In response, we built the Human Enhancer Disease Database (HEDD) to facilitate studies of enhancers and their potential roles in human complex diseases. HEDD currently provides comprehensive genomic information for ~2.8 million human enhancers identified by ENCODE, FANTOM5 and RoadMap, with disease association scores based on enhancer-gene and gene-disease connections. It also provides Web-based analytical tools to visualize enhancer networks and score enhancers given a set of selected genes in a specific gene network. HEDD is freely accessible at http://zdzlab.einstein.yu.edu/1/hedd.php.
Subject(s)
Nucleic Acid Databases, Genetic Enhancer Elements, Human Chromosome Pair 9/genetics, Disease/genetics, Gene Regulatory Networks, Human Genome, Genome-Wide Association Study, Humans, Internet, Molecular Sequence Annotation, Multifactorial Inheritance, Single Nucleotide Polymorphism
ABSTRACT
Rare variants of major effect play an important role in human complex diseases and can be discovered by sequencing-based genome-wide association studies. Here, we introduce an integrated approach that combines the rare variant association test with gene network and phenotype information to identify risk genes implicated by rare variants for human complex diseases. Our data integration method follows a 'discovery-driven' strategy without relying on prior knowledge about the disease and thus maintains the unbiased character of genome-wide association studies. Simulations reveal that our method can outperform a widely-used rare variant association test method by 2 to 3 times. In a case study of a small disease cohort, we uncovered putative risk genes and the corresponding rare variants that may act as genetic modifiers of congenital heart disease in 22q11.2 deletion syndrome patients. These variants were missed by a conventional approach that relied on the rare variant association test alone.
Subject(s)
Genetic Predisposition to Disease, Genetic Variation, Genome-Wide Association Study/methods, DNA Sequence Analysis/methods, Case-Control Studies, Computer Simulation, Statistical Data Interpretation, DiGeorge Syndrome/genetics, Humans, Phenotype, Risk Factors, DNA Sequence Analysis/statistics & numerical data
ABSTRACT
Summary: Although the genome-wide association study (GWAS) is a powerful method to identify disease-associated variants, it does not directly address the biological mechanisms underlying such genetic association signals. Here, we present PGA, a Perl- and Java-based program for post-GWAS analysis that predicts likely disease genes given a list of GWAS-reported variants. Designed with a command line interface, PGA incorporates genomic and eQTL data in identifying disease gene candidates and uses gene network and ontology data to score them based upon the strength of their relationship to the disease in question. Availability and implementation: http://zdzlab.einstein.yu.edu/1/pga.html. Contact: zhengdong.zhang@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Genetic Predisposition to Disease, Software, Gene Regulatory Networks, Genome-Wide Association Study, Genomics/methods, Humans, Prostaglandins A, Quantitative Trait Loci
ABSTRACT
Although studies over the last decades have firmly connected a number of genes and molecular pathways to aging, the aging process as a whole still remains poorly understood. To gain novel insights into the mechanisms underlying aging, instead of considering aging genes individually, we studied their characteristics at the systems level in the context of biological networks. We calculated a comprehensive set of network characteristics for human aging-related genes from the GenAge database. By comparing them with other functional groups of genes, we identified a robust group of aging-specific network characteristics. To find the structural basis and the molecular mechanisms underlying this aging-related network specificity, we also analyzed protein domain interactions and gene expression patterns across different tissues. Our study revealed that aging genes not only tend to be network hubs, playing important roles in communication among different functional modules or pathways, but also are more likely to physically interact and be co-expressed with essential genes. The high expression of aging genes across a large number of tissue types also points to a high level of connectivity among aging genes. Unexpectedly, contrary to the depletion of interactions among hub genes in biological networks, we observed close interactions among aging hubs, which render the aging subnetworks vulnerable to random attacks and thus may contribute to the aging process. Comparison across species reveals the evolutionary process of the aging subnetwork. As organisms become more complex, the complexity of their aging mechanisms increases and their aging hub genes become more functionally connected.
Subject(s)
Aging/genetics, Gene Expression Regulation/genetics, Gene Regulatory Networks/genetics, Systems Biology, Computational Biology, Genetic Databases, Essential Genes/genetics, Humans, Signal Transduction/genetics
ABSTRACT
BACKGROUND: Malignant breast cancer, with its complex molecular mechanisms of progression and metastasis, remains a leading cause of death in women. To improve diagnosis and drug development, it is critical to identify panels of genes and molecular pathways involved in tumor progression and malignant transition. Using the PyMT mouse, a genetically engineered mouse model that has been widely used to study human breast cancer, we profiled and analyzed gene expression from four distinct stages of tumor progression (hyperplasia, adenoma/MIN, early carcinoma and late carcinoma) during which malignant transition occurs. RESULTS: We found remarkable expression similarity among the four stages, meaning that genes altered in the later stages already showed traces of change at the beginning of tumor progression. We identified a large number of differentially expressed genes in PyMT samples of all stages compared with normal mammary glands, enriched in cancer-related pathways. Using co-expression networks, we found panels of genes that form signature modules, with some hub genes that predict metastatic risk. Time-course analysis revealed genes whose expression shifts during the transition to malignant stages. These may provide additional insight into the molecular mechanisms beyond pathways. CONCLUSIONS: Our analyses of the PyMT mouse model shed new light on transcriptomic dynamics during malignant progression of breast cancer.
Subject(s)
Polyomavirus Transforming Antigens/genetics, Disease Progression, Gene Expression Profiling, Experimental Mammary Neoplasms/genetics, Experimental Mammary Neoplasms/virology, Mouse Mammary Tumor Virus/genetics, Mouse Mammary Tumor Virus/physiology, Animals, Animal Disease Models, Female, Gene Expression, Gene Regulatory Networks, Humans, Experimental Mammary Neoplasms/pathology, Mice, Neoplasm Metastasis, Neoplasm Staging
ABSTRACT
BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNA molecules of about 22 nucleotides which function to silence the expression of their target genes. Numerous studies have shown that miRNAs are not only key regulators in important cellular processes but are also drivers in the development of many diseases, especially cancer. Estrogen receptor positive luminal B is the second most common but the least studied subtype of breast cancer. Only a few studies have examined the expression profiles of miRNAs in luminal B breast cancer, and their regulatory roles in cancer progression have yet to be investigated. METHODS: In this study, using polyoma middle T antigen (PyMT) mice, a widely used luminal B breast cancer model, we profiled microRNA (miRNA) expression at four time points that represent different key developmental stages of cancer progression. We considered the expression of both miRNAs and messenger RNAs (mRNAs) at these time points to improve the identification of regulatory targets of miRNAs. By combining gene functional and pathway annotation with miRNA-mRNA interactions, we created a PyMT-specific tripartite miRNA-mRNA-pathway network and identified novel functional regulatory programs (FRPs). RESULTS: We identified 151 differentially expressed miRNAs with a strict dual nature of either upregulation or downregulation during the whole course of disease progression. Among 82 newly discovered breast-cancer-related miRNAs, 35 can potentially regulate 271 protein-coding genes based on their sequence complementarity and expression profiles. We also identified miRNA-mRNA regulatory modules driving specific cancer-related biological processes. CONCLUSIONS: In this study we profiled the expression of miRNAs during breast cancer progression in the PyMT mouse model. By integrating miRNA and mRNA expression profiles, we identified differentially expressed miRNAs and their target genes involved in several hallmarks of cancer. 
We applied a novel clustering method to an annotated miRNA-mRNA regulatory network and identified network modules involved in specific cancer-related biological processes.
Subject(s)
Breast Neoplasms/genetics, Breast Neoplasms/pathology, Neoplastic Gene Expression Regulation, MicroRNAs/genetics, Animals, Breast Neoplasms/metabolism, Cluster Analysis, Computational Biology/methods, Animal Disease Models, Disease Progression, Fatty Acids/metabolism, Female, Gene Expression Profiling, Gene Ontology, Gene Regulatory Networks, High-Throughput Nucleotide Sequencing, Humans, Male, Mice, Transgenic Mice, Neoplasm Metastasis, Messenger RNA/genetics, Transcriptome
ABSTRACT
The binding affinity between a nuclear localization signal (NLS) and its import receptor is closely related to the corresponding nuclear import activity. Modulation of the NLS binding affinity to the import receptor by post-translational modification (PTM) is one of the best understood mechanisms for regulating nuclear import of proteins. However, identification of such regulation mechanisms is challenging due to the difficulty of assessing the impact of PTM on the corresponding nuclear import activities. In this study we propose NIpredict, an effective algorithm to predict the nuclear import activity of a protein given its NLS, in which molecular interaction energy components (MIECs) are used to characterize the NLS-import receptor interaction, and support vector regression (SVR) is used to learn the relationship between the characterized NLS-import receptor interaction and the corresponding nuclear import activity. Our experiments showed that changes in nuclear import activity due to NLS changes could be accurately predicted by the NIpredict algorithm. Based on NIpredict, we developed a systematic framework to identify potential PTM-based nuclear import regulation for human and yeast nuclear proteins. Application of this approach identified potential mechanisms of nuclear import regulation by phosphorylation for two nuclear proteins, SF1 and ORC6.
Subject(s)
Cell Nucleus Active Transport, DNA-Binding Proteins/metabolism, Biological Models, Nuclear Localization Signals/metabolism, Origin Recognition Complex/metabolism, Post-Translational Protein Processing, Saccharomyces cerevisiae Proteins/metabolism, Transcription Factors/metabolism, alpha Karyopherins/metabolism, Algorithms, Artificial Intelligence, Computational Biology, DNA-Binding Proteins/chemistry, Protein Databases, Humans, Internet, Kinetics, Nuclear Localization Signals/chemistry, Origin Recognition Complex/chemistry, Phosphorylation, Protein Conformation, Protein Interaction Domains and Motifs, Protein Isoforms, RNA Splicing Factors, Saccharomyces cerevisiae Proteins/chemistry, Serine/metabolism, Software Validation, Transcription Factors/chemistry, alpha Karyopherins/chemistry
ABSTRACT
Background: Approximately 40% of people aged 65 or older experience memory loss, particularly in episodic memory. Identifying the genetic basis of episodic memory decline is crucial for uncovering its underlying causes. Methods: We investigated common and rare genetic variants associated with episodic memory decline in 742 (632 for rare variants) Ashkenazi Jewish individuals (mean age 75) from the LonGenity study. All-atom molecular dynamics (MD) simulations were performed to uncover mechanistic insights underlying rare variants associated with episodic memory decline. Results: In addition to the common polygenic risk of Alzheimer's Disease (AD), we identified and replicated rare variant associations in ITSN1 and CRHR2. Structural analyses revealed distinct memory pathologies mediated by interfacial rare coding variants, such as impaired receptor activation of corticotropin releasing hormone and dysregulated L-serine synthesis. Discussion: Our study uncovers novel risk loci for episodic memory decline. The identified underlying mechanisms point toward heterogeneous memory pathologies mediated by rare coding variants.
ABSTRACT
The highly polygenic nature of human longevity renders pleiotropy an indispensable feature of its genetic architecture. Leveraging the genetic correlation between aging-related traits (ARTs), we aimed to model the additive variance in lifespan as a function of the cumulative liability from pleiotropic segregating variants. We tracked allele frequency changes as a function of viability across different age bins and prioritized 34 variants with an immediate implication for lipid metabolism, body mass index (BMI), and cognitive performance, among other traits, revealed by PheWAS analysis in the UK Biobank. Given the highly complex and non-linear interactions between the genetic determinants of longevity, we reasoned that a composite polygenic score would approximate a substantial portion of the variance in lifespan and developed the integrated longevity genetic scores (iLGSs) for distinguishing exceptional survival. We showed that coefficients derived from our ensemble model could potentially reveal an interesting pattern of genomic pleiotropy specific to lifespan. We assessed the predictive performance of our model for distinguishing the enrichment of exceptional longevity among long-lived individuals in two replication cohorts (the Scripps Wellderly cohort and the Medical Genome Reference Bank (MGRB)) and showed that the median lifespan in the highest decile of our composite prognostic index is up to 4.8 years longer. Finally, using the proteomic correlates of iLGS, we identified protein markers associated with exceptional longevity irrespective of chronological age and prioritized drugs with repurposing potential for gerotherapeutics. Together, our approach demonstrates a promising framework for polygenic modeling of additive liability conferred by ARTs in defining exceptional longevity and assisting the identification of individuals at a higher risk of mortality for targeted lifestyle modifications earlier in life. 
Furthermore, the proteomic signature associated with iLGS highlights the functional pathway upstream of the PI3K-Akt that can be effectively targeted to slow down aging and extend lifespan.
Subject(s)
Genetic Pleiotropy, Longevity, Multifactorial Inheritance, Humans, Longevity/genetics, Multifactorial Inheritance/genetics, Female, Male, Aging/genetics, Aged, Aged 80 and over, Single Nucleotide Polymorphism, Middle Aged, Genome-Wide Association Study, Gene Frequency
ABSTRACT
BACKGROUND: Hematoxylin and Eosin (H&E)-based frozen section (FS) pathology is presently the global standard for intraoperative tumor assessment (ITA). Preparation of frozen sections is labor intensive, can take up to 30 minutes, and is susceptible to freezing artifacts. An alternative to FS is thus needed that is sectioning-free, artifact-free, fast, accurate, and reliably deployable without machine learning and/or additional interpretation training. METHODS: We develop a training-free true-H&E Rapid Fresh digital-Pathology (the-RFP) technique which is 4 times faster than the conventional preparation of frozen sections. The-RFP is assisted by a mesoscale Nonlinear Optical Gigascope (mNLOG) platform with a streamlined rapid artifact-compensated 2D large-field mosaic-stitching (rac2D-LMS) approach. A sub-6-minute True-H&E Rapid whole-mount-Soft-Tissue Staining (the-RSTS) protocol is introduced for soft/frangible fresh brain specimens. The mNLOG platform utilizes third harmonic generation (THG) and two-photon excitation fluorescence (TPEF) signals from H and E dyes, respectively, to yield the-RFP images. RESULTS: We demonstrate the-RFP technique on freshly excised human brain specimens. The-RFP enables optically sectioned high-resolution 2D scanning and digital display of a 1 cm² area in <120 seconds with 3.6 gigapixels at a sustained effective throughput of >700 Mbit/s, with zero post-acquisition data/image processing. Training-free blind tests on 50 normal and tumor-specific brain specimens obtained from 8 participants show a 100% match to the respective formalin-fixed paraffin-embedded (FFPE)-biopsy outcomes. CONCLUSIONS: We provide a digital ITA solution, the-RFP, which is potentially a fast and reliable alternative to FS-pathology. 
With H&E-compatibility, the-RFP eliminates color- and morphology-specific additional interpretation training for a pathologist, and the-RFP-assessed specimen can reliably undergo FFPE-biopsy confirmation.
Brain tumors can be fatal and surgery is often required to remove them. During surgery, clinicians need to look for any leftover tumor tissue so that recurrence of the disease can be avoided. This requires sectioning of frozen tissue samples, staining them, and visualizing structural details under a microscope in the lab. This process should be fast to make the operation shorter and safer for the patient. Here, we provide an alternative approach to staining and imaging tumor samples, which is much faster than the current process. We show that our approach works with fresh tumor samples, avoiding the need to freeze and physically section them. We can distinguish normal versus tumor tissues, and pathologists do not require special training to use our approach. Our approach might ultimately help to improve the speed, safety, and outcomes of brain tumor surgery.
ABSTRACT
The highly polygenic nature of human longevity renders cross-trait pleiotropy an indispensable feature of its genetic architecture. Leveraging the genetic correlation between the aging-related traits (ARTs), we sought to model the additive variance in lifespan as a function of cumulative liability from pleiotropic segregating variants. We tracked allele frequency changes as a function of viability across different age bins and prioritized 34 variants with an immediate implication on lipid metabolism, body mass index (BMI), and cognitive performance, among other traits, revealed by PheWAS analysis in the UK Biobank. Given the highly complex and non-linear interactions between the genetic determinants of longevity, we reasoned that a composite polygenic score would approximate a substantial portion of the variance in lifespan and developed the integrated longevity genetic scores (iLGSs) for distinguishing exceptional survival. We showed that coefficients derived from our ensemble model could potentially reveal an interesting pattern of genomic pleiotropy specific to lifespan. We assessed the predictive performance of our model for distinguishing the enrichment of exceptional longevity among long-lived individuals in two replication cohorts and showed that the median lifespan in the highest decile of our composite prognostic index is up to 4.8 years longer. Finally, using the proteomic correlates of iLGS, we identified protein markers associated with exceptional longevity irrespective of chronological age and prioritized drugs with repurposing potentials for gerotherapeutics. Together, our approach demonstrates a promising framework for polygenic modeling of additive liability conferred by ARTs in defining exceptional longevity and assisting the identification of individuals at higher risk of mortality for targeted lifestyle modifications earlier in life. 
Furthermore, the proteomic signature associated with iLGS highlights the functional pathway upstream of the PI3K-Akt that can be effectively targeted to slow down aging and extend lifespan.
ABSTRACT
Congenital heart disease (CHD) affecting the conotruncal region of the heart occurs in 40-50% of patients with 22q11.2 deletion syndrome (22q11.2DS). This syndrome is a rare disorder with relative genetic homogeneity that can facilitate identification of genetic modifiers. Haploinsufficiency of TBX1, which encodes a T-box transcription factor, is one of the main contributors to the etiology of the syndrome. We suggest that genetic modifiers of conotruncal defects in patients with 22q11.2DS may be in the TBX1 gene network. To identify genetic modifiers, we analyzed rare, predicted damaging variants in the whole genome sequence of 456 cases with conotruncal defects and 537 controls, all with 22q11.2DS. We then applied gene set approaches and identified chromatin regulatory genes as modifiers. Chromatin genes with recurrent damaging variants include EP400, KAT6A, KMT2C, KMT2D, NSD1, CHD7 and PHF21A. In total, we identified 37 chromatin regulatory genes that may increase risk for conotruncal heart defects in 8.5% of 22q11.2DS cases. Many of these genes were identified as risk factors for sporadic CHD in the general population. These genes are co-expressed with TBX1 in cardiac progenitor cells, suggesting that they may be in the same genetic network. The genes KAT6A, KMT2C, CHD7 and EZH2 have previously been shown to genetically interact with TBX1 in mouse models. Our findings indicate that disturbance of chromatin regulatory genes impacts the TBX1 gene network and that these genes serve as genetic modifiers of 22q11.2DS and sporadic CHD, suggesting that there are some shared mechanisms involving the TBX1 gene network in the etiology of CHD.
ABSTRACT
BACKGROUND: Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. RESULTS: This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
CONCLUSIONS: We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.
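The general idea of a filter step followed by a logistic regression ensemble can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the contribution scores, so the filter below simply ranks each predictor by its individual AUC, which is an assumed stand-in. The function names (`auc`, `select_predictors`, `fit_logistic`, `make_toy_data`) and the synthetic data are hypothetical and serve only as illustration.

```python
import math
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_predictors(X, y, k):
    """Filter step (assumed stand-in): keep the k columns with highest individual AUC."""
    cols = range(len(X[0]))
    ranked = sorted(cols, key=lambda j: auc([r[j] for r in X], y), reverse=True)
    return sorted(ranked[:k])

def project(X, cols):
    """Restrict every row to the selected predictor columns."""
    return [[row[j] for j in cols] for row in X]

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Logistic regression by batch gradient descent; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = predict(w, row) - target
            grad[0] += err
            for j, xi in enumerate(row):
                grad[j + 1] += err * xi
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w

def predict(w, row):
    """Probability of the positive class for one row of predictor scores."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def make_toy_data(n=600, seed=0):
    """Synthetic scores: columns 0-2 are noisy but informative, 3-5 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [label + rng.gauss(0, 0.6) for _ in range(3)]
        noise = [rng.random() for _ in range(3)]
        X.append(informative + noise)
        y.append(label)
    return X, y

X, y = make_toy_data()
train_X, train_y, test_X, test_y = X[:400], y[:400], X[400:], y[400:]

kept = select_predictors(train_X, train_y, k=3)           # minimalist subset
w = fit_logistic(project(train_X, kept), train_y)         # LR ensemble on the subset
ensemble_auc = auc([predict(w, r) for r in project(test_X, kept)], test_y)
best_single = max(auc([r[j] for r in test_X], test_y) for j in range(len(X[0])))

print("kept predictors:", kept)
print("best single AUC:", round(best_single, 3), "ensemble AUC:", round(ensemble_auc, 3))
```

On this toy data the ensemble uses half of the available predictors, mirroring the minimalist design described above; the AUC values it prints are properties of the synthetic data and say nothing about the Yeast or Human benchmarks.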
Subject(s)
Algorithms, Computational Biology/methods, Protein Interaction Mapping/methods, Proteins/genetics, Area Under Curve, Fungal Genome, Human Genome, Humans, Internet, Logistic Models, Saccharomyces cerevisiae/genetics, Software
ABSTRACT
Hereditary transthyretin (ATTRv) amyloidosis is a systemic disease with amyloid deposition in the peripheral and autonomic nervous systems caused by mutations in the transthyretin (TTR) gene. The TTR S77Y mutant is the second most prevalent mutation in many countries. In Taiwan, the A97S mutant accounts for more than 90% of cases. Although distinct clinical manifestations such as dysphagia, carpal tunnel syndrome, and sudden cardiac death occur, the underlying pathology has not been elucidated. Here, we report the first autopsy cases of ATTRv S77Y and A97S and comprehensively compare the pathology underlying the unique clinical manifestations. This study demonstrated the following: (1) distinct spatial patterns of amyloid deposits in peripheral nerves, with a tendency toward more amyloid deposition in the large peripheral nerves, particularly the median nerves, and scarce deposition in the sural nerves, and different amyloid distribution in different genotypes; (2) amyloid deposits in the conduction system of the heart in addition to those surrounding cardiomyocytes; (3) extensive amyloid deposits in the larynx and gastrointestinal tract, contributing to the unique clinical symptom of dysphagia; and (4) characteristic TTR intracytoplasmic inclusions in the hepatocytes of A97S. The first autopsied cases of ATTRv S77Y and A97S reveal the pathology and mechanisms underlying these unique clinical manifestations.
Subject(s)
Familial Amyloid Neuropathies, Deglutition Disorders, Familial Amyloid Neuropathies/genetics, Familial Amyloid Neuropathies/pathology, Autopsy, Humans, Amyloid Plaque, Prealbumin/genetics
ABSTRACT
Alzheimer's disease (AD) is a genetically complex, multifactorial neurodegenerative disease. It affects more than 45 million people worldwide and currently remains untreatable. Although genome-wide association studies (GWAS) have identified many AD-associated common variants, only about 25 genes are currently known to affect the risk of developing AD, despite its highly polygenic nature. Moreover, the risk variants underlying GWAS AD-association signals remain unknown. Here, we describe a deep post-GWAS analysis of AD-associated variants, using an integrated computational framework for predicting both disease genes and their risk variants. We identified 342 putative AD risk genes in 203 risk regions spanning 502 AD-associated common variants. Of these, 246 have not been identified as AD risk genes by previous GWAS collected in GWAS catalogs, and 115 of the 342 are outside the risk regions, likely under the regulation of transcriptional regulatory elements contained therein. Even more significantly, for 109 AD risk genes, we predicted 150 risk variants of both coding and regulatory (in promoters or enhancers) types, and 85 (57%) of them are supported by functional annotation. In-depth functional analyses showed that AD risk genes were overrepresented in AD-related pathways or GO terms (e.g., the complement and coagulation cascade and phosphorylation and activation of immune response) and that their expression was relatively enriched in microglia, endothelia, and pericytes of the human brain. We found nine AD risk genes (e.g., IL1RAP, PMAIP1, and LAMTOR4) as predictors for the prognosis of AD survival, as well as genes such as ARL6IP5, with altered network connectivity between AD patients and normal individuals, that are involved in AD progression. Our findings open new strategies for developing therapeutics targeting AD risk genes or risk variants to influence AD pathogenesis.
Subject(s)
Alzheimer Disease/genetics, Brain/metabolism, Alzheimer Disease/metabolism, Gene Regulatory Networks, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Single Nucleotide Polymorphism
ABSTRACT
The North American beaver is an exceptionally long-lived and cancer-resistant rodent species. Here, we report the evolutionary changes in its gene coding sequences, copy numbers, and expression. We identify changes that likely increase its ability to detoxify aldehydes, enhance tumor suppression and DNA repair, and alter lipid metabolism, potentially contributing to its longevity and cancer resistance. Hpgd, a tumor suppressor gene, is uniquely duplicated in beavers among rodents, and several genes associated with tumor suppression and longevity are under positive selection in beavers. Lipid metabolism genes show positive selection signals, changes in copy numbers, or altered gene expression in beavers. Aldh1a1, encoding an enzyme for aldehyde detoxification, is particularly notable due to its massive expansion in beavers, which enhances their cellular resistance to ethanol and their capacity to metabolize diverse aldehyde substrates from lipid oxidation and their woody diet. We hypothesize that the amplification of Aldh1a1 may contribute to the longevity of beavers.