ABSTRACT
One of the major challenges in the post-genomic era is elucidating the genetic basis of human diseases. In recent years, studies have shown that polygenic risk scores (PRS), based on aggregated information from millions of variants across the human genome, can estimate individual risk for common diseases. However, current medical practice still relies predominantly on physiological and clinical indicators to assess personal disease risk. For example, caregivers mark individuals with high body mass index (BMI) as having an increased risk of developing type 2 diabetes (T2D). An important question is whether combining PRS with clinical metrics can increase the power of disease prediction, in particular from early life. In this work we examined this question, focusing on T2D. We present here a sex-specific integrated approach that combines PRS with additional measurements and age to define a new risk score. We show that such an approach combining adult BMI and PRS achieves considerably better prediction than either measure alone among unrelated Caucasians in the UK Biobank (UKB, n = 290,584). Likewise, integrating PRS with self-reports on birth weight (n = 172,239) and comparative body size at age ten (n = 287,203) also substantially enhances prediction compared to each component alone. While the integration of PRS with BMI achieved better results than the other measurements, the latter are early-life measurements that can be integrated already in childhood, to allow preemptive intervention for those at high risk of developing T2D. Our integrated approach can be easily generalized to other diseases, with the relevant early-life measurements.
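As a rough illustration of what a sex-specific combined score could look like, the sketch below standardizes PRS and BMI within each sex and takes a weighted sum. This is not the authors' method: the abstract does not describe how the score is built, so the z-score standardization, the equal weights, the omission of age, and all field names and values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize values with zero variance")
    return [(v - mu) / sd for v in values]


def integrated_scores(records, prs_weight=0.5, bmi_weight=0.5):
    """Return one combined score per record, in input order.

    PRS and BMI are standardized separately within each sex, then combined
    as a weighted sum. The weights are hypothetical, not fitted values.
    """
    scores = [None] * len(records)
    for sex in {r["sex"] for r in records}:
        idx = [i for i, r in enumerate(records) if r["sex"] == sex]
        prs_z = zscores([records[i]["prs"] for i in idx])
        bmi_z = zscores([records[i]["bmi"] for i in idx])
        for i, pz, bz in zip(idx, prs_z, bmi_z):
            scores[i] = prs_weight * pz + bmi_weight * bz
    return scores


# Made-up individuals for illustration only.
people = [
    {"sex": "F", "prs": 0.0, "bmi": 20.0},
    {"sex": "F", "prs": 2.0, "bmi": 30.0},
    {"sex": "M", "prs": 1.0, "bmi": 25.0},
    {"sex": "M", "prs": 3.0, "bmi": 27.0},
]
combined = integrated_scores(people)
```

In a real analysis the weights would more plausibly come from a fitted model (for example, a logistic regression per sex) rather than being fixed by hand.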
ABSTRACT
INTRODUCTION: The GBA-N370S mutation is one of the most frequent risk factors for dementia with Lewy bodies (DLB) and Parkinson's disease (PD). We looked for genetic variations that contribute to the outcome in N370S-carriers, whether PD or DLB. METHODS: Whole-genome sequencing of 95 Ashkenazi-N370S-carriers affected with either DLB (n = 19) or PD (n = 76) was performed, and 564 genes related to dementia and PD were analyzed. RESULTS: We identified enrichment of linked alleles in the PINK1 locus in DLB patients (false discovery rate P = .0412). Haplotype analysis delineated a 1.8 Mb interval encompassing 29 genes and 87 unique variants, of which KIF17-R869C received the highest functional prediction score (Combined Annotation Dependent Depletion = 34). Its frequency was significantly higher in 26 DLB-N370S-carriers compared to 140 PD-N370S-carriers (odds ratio [OR] = 33.4, P = .001, and OR = 70.2 when only heterozygotes were included). DISCUSSION: Because KIF17 was shown to be important for learning and memory in mice, our data further suggest, for the first time, its involvement in DLB, and possibly in human dementia.
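The odds ratios quoted above compare how often a variant is carried in one group versus another. As a minimal sketch of that calculation, the function below computes an odds ratio from a 2x2 table of carrier counts. The counts in the example are hypothetical placeholders: the abstract reports the group sizes but not the per-group carrier counts, so the study's OR values are not reproduced here.

```python
def odds_ratio(group1_carriers, group1_noncarriers,
               group2_carriers, group2_noncarriers):
    """Odds ratio for a 2x2 table: (a * d) / (b * c).

    a, b = carriers and non-carriers in group 1
    c, d = carriers and non-carriers in group 2
    """
    if group1_noncarriers == 0 or group2_carriers == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (group1_carriers * group2_noncarriers) / (
        group1_noncarriers * group2_carriers
    )


# Hypothetical counts for illustration only (not taken from the study).
example = odds_ratio(group1_carriers=5, group1_noncarriers=20,
                     group2_carriers=2, group2_noncarriers=120)
```

With small carrier counts, an exact test (such as Fisher's exact test) is usually preferred over a large-sample approximation for the accompanying P value; which test the study used is not stated in the abstract.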
Subject(s)
DNA Methylation/genetics , Genetic Predisposition to Disease , Parkinson Disease/genetics , Amidohydrolases/genetics , Cation Transport Proteins/genetics , Humans , Monosaccharide Transport Proteins/genetics , Nuclear Proteins/genetics , Parkinson Disease/pathology , Phosphoproteins/genetics , RNA/genetics , Risk Factors , rab GTP-Binding Proteins/genetics
ABSTRACT
Cochin Jews form a small and unique community on the Malabar coast in southwest India. While the arrival time of any putative Jewish ancestors of the community has been speculated to have taken place as far back as biblical times (King Solomon's era), a Jewish community in the Malabar coast has been documented only since the 9th century CE. Here, we explore the genetic history of Cochin Jews by collecting and genotyping 21 community members and combining the data with that of 707 individuals from 72 other Indian, Jewish, and Pakistani populations, together with additional individuals from worldwide populations. We applied comprehensive genome-wide analyses based on principal component analysis, FST, ADMIXTURE, identity-by-descent sharing, admixture linkage disequilibrium decay, haplotype sharing, allele sharing autocorrelation decay and contrasting the X chromosome with the autosomes. We find that, as reported by several previous studies, the genetics of Cochin Jews resembles that of local Indian populations. However, we also identify considerable Jewish genetic ancestry that is not present in any other Indian or Pakistani populations (with the exception of the Jewish Bene Israel, which we characterized previously). Combined, Cochin Jews have both Jewish and Indian ancestry. Specifically, we detect a significant recent Jewish gene flow into this community 13-22 generations (~470-730 years) ago, with contributions from Yemenite, Sephardi, and Middle-Eastern Jews, in accordance with historical records. Genetic analyses also point to high endogamy and a recent population bottleneck in this population, which might explain the increased prevalence of some recessive diseases in Cochin Jews.
Subject(s)
Population Genetics , Jews/genetics , Linkage Disequilibrium , Alleles , Asian People/genetics , Human Genome , Genotype , Haplotypes , Humans , India , Israel
ABSTRACT
The Bene Israel Jewish community from West India is a unique population whose history before the 18th century remains largely unknown. Bene Israel members consider themselves descendants of Jews, yet the identity of their Jewish ancestors and their time of arrival in India are unknown, with speculations on arrival time varying between the 8th century BCE and the 6th century CE. Here, we characterize the genetic history of Bene Israel by collecting and genotyping 18 Bene Israel individuals. Combining these with 486 individuals from 41 other Jewish, Indian and Pakistani populations, and additional individuals from worldwide populations, we conducted comprehensive genome-wide analyses based on FST, principal component analysis, ADMIXTURE, identity-by-descent sharing, admixture linkage disequilibrium decay, haplotype sharing and allele sharing autocorrelation decay, and contrasted patterns between the X chromosome and the autosomes. The genetics of Bene Israel individuals resemble those of local Indian populations, while at the same time constituting a clearly separated and unique population in India. They are unique among the Indian and Pakistani populations we analyzed in sharing considerable genetic ancestry with other Jewish populations. Taken together, the results from all analyses point to Bene Israel being an admixed population with both Jewish and Indian ancestry, with the genetic contribution of each of these ancestral populations being substantial. The admixture took place in the last millennium, about 19-33 generations ago. It involved Middle-Eastern Jews and was sex-biased, with more male Jewish and local female contribution. It was followed by a population bottleneck and high endogamy, which can lead to increased prevalence of recessive diseases in this population. This study provides an example of how genetic analysis advances our knowledge of human history in cases where other disciplines lack the relevant data to do so.
Subject(s)
Asian People/genetics , Population Genetics , Jews/genetics , Female , Genome-Wide Association Study , Genotype , Haplotypes , Humans , India , Israel , Linkage Disequilibrium , Male , Pakistan
ABSTRACT
Many complex human diseases are highly sexually dimorphic, suggesting a potential contribution of the X chromosome to disease risk. However, the X chromosome has been neglected or incorrectly analyzed in most genome-wide association studies (GWAS). We present tailored analytical methods and software that facilitate X-wide association studies (XWAS), which we further applied to reanalyze data from 16 GWAS of different autoimmune and related diseases (AID). We associated several X-linked genes with disease risk, among which (1) ARHGEF6 is associated with Crohn's disease and replicated in a study of ulcerative colitis, another inflammatory bowel disease (IBD). Indeed, ARHGEF6 interacts with a gastric bacterium that has been implicated in IBD. (2) CENPI is associated with three different AID, which is compelling in light of known associations with AID of autosomal genes encoding centromere proteins, as well as established autosomal evidence of pleiotropy between autoimmune diseases. (3) We replicated a previous association of FOXP3, a transcription factor that regulates T-cell development and function, with vitiligo; and (4) we discovered that C1GALT1C1 exhibits a sex-specific effect on disease risk in both IBDs. These and other X-linked genes that we associated with AID tend to be highly expressed in tissues related to immune response, participate in major immune pathways, and display differential gene expression between males and females. Combined, the results demonstrate the importance of the X chromosome in autoimmunity, reveal the potential of extensive XWAS, even based on existing data, and provide the tools and incentive to properly include the X chromosome in future studies.
Subject(s)
Human X Chromosomes/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Sex Characteristics , Ulcerative Colitis/genetics , Crohn Disease/genetics , DNA-Binding Proteins/genetics , Female , Humans , Male , Molecular Chaperones/genetics , Rho Guanine Nucleotide Exchange Factors/genetics , Software
ABSTRACT
Utilizing molecular data to derive functional physiological models tailored for specific cancer cells can facilitate the use of individually tailored therapies. To this end we present an approach termed PRIME for generating cell-specific genome-scale metabolic models (GSMMs) based on molecular and phenotypic data. We build >280 models of normal and cancer cell lines that successfully predict metabolic phenotypes in an individual manner. We utilize this set of cell-specific models to predict drug targets that selectively inhibit cancerous but not normal cell proliferation. The top predicted target, MLYCD, is experimentally validated and the metabolic effects of MLYCD depletion are investigated. Furthermore, we tested cell-specific predicted responses to the inhibition of metabolic enzymes, and successfully inferred the prognosis of cancer patients based on their PRIME-derived individual GSMMs. These results lay a computational basis and a counterpart experimental proof of concept for future personalized metabolic modeling applications, enhancing the search for novel selective anticancer therapies.
Subject(s)
Biological Models , Neoplasms/metabolism , Neoplasms/pathology , Algorithms , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Tumor Biomarkers/metabolism , Carboxy-Lyases/metabolism , Tumor Cell Line , Cell Proliferation/drug effects , Citric Acid Cycle/drug effects , Fatty Acids/biosynthesis , Gene Knockdown Techniques , Human Genome , Humans , Lymphocytes/drug effects , Lymphocytes/metabolism , Neoplasms/drug therapy , Oxidation-Reduction/drug effects , Phenotype , Precision Medicine
ABSTRACT
Synthetic lethality occurs when the inhibition of two genes is lethal while the inhibition of each single gene is not. It can be harnessed to selectively treat cancer by identifying inactive genes in a given cancer and targeting their synthetic lethal (SL) partners. We present a data-driven computational pipeline for the genome-wide identification of SL interactions in cancer by analyzing large volumes of cancer genomic data. First, we show that the approach successfully captures known SL partners of tumor suppressors and oncogenes. We then validate SL predictions obtained for the tumor suppressor VHL. Next, we construct a genome-wide network of SL interactions in cancer and demonstrate its value in predicting gene essentiality and clinical prognosis. Finally, we identify synthetic lethality arising from gene overactivation and use it to predict drug efficacy. These results form a computational basis for exploiting synthetic lethality to uncover cancer-specific susceptibilities.
Subject(s)
Computational Biology/methods , Data Mining/methods , Neoplasms/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Tumor Cell Line , Tumor Suppressor Genes , Humans , Neoplasms/drug therapy , Neoplasms/pathology , Oncogenes , Small Interfering RNA/metabolism , Workflow
ABSTRACT
Understanding cell proliferation mechanisms has been a long-lasting goal of the scientific community and specifically of cancer researchers. Previous genome-scale studies of cancer proliferation determinants have mainly relied on knockdown screens aimed to gauge their effects on cancer growth. This powerful approach has several limitations such as off-target effects, partial knockdown, and masking effects due to functional backups. Here we employ a complementary approach and assign each gene a cancer Proliferation Index (cPI) that quantifies the association between its expression levels and growth rate measurements across 60 cancer cell lines. Reassuringly, genes found essential in cancer gene knockdown screens exhibit significant positive cPI values, while tumor suppressors exhibit significant negative cPI values. Cell cycle, DNA replication, splicing and protein production related processes are positively associated with cancer proliferation, while cellular migration is negatively associated with it - in accordance with the well known "go or grow" dichotomy. A parallel analysis of genes' non-cancerous proliferation indices (nPI) across 224 lymphoblastoid cell lines reveals surprisingly marked differences between cancerous and non-cancerous proliferation. These differences highlight genes in the translation and spliceosome machineries as selective cancer proliferation-associated proteins. A cross species comparison reveals that cancer proliferation resembles that of microorganisms while non-cancerous proliferation does not. Furthermore, combining cancerous and non-cancerous proliferation signatures leads to enhanced prediction of patient outcome and gene essentiality in cancer. Overall, these results point to an inherent difference between cancerous and non-cancerous proliferation determinants, whose understanding may contribute to the future development of novel cancer-specific anti-proliferative drugs.
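As a rough illustration of a per-gene index of this kind, the sketch below scores each gene by the correlation between its expression levels and the growth rates measured across cell lines. The abstract does not say which association measure defines the cPI, so the use of Pearson correlation is an assumption, and the gene names and values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def proliferation_index(expression, growth_rates):
    """Map each gene to the correlation of its expression with growth rate.

    expression: dict of gene name -> expression level per cell line
    growth_rates: growth rate per cell line, in the same order
    """
    return {gene: pearson(levels, growth_rates)
            for gene, levels in expression.items()}


# Made-up data: four cell lines, two hypothetical genes.
growth = [1.0, 2.0, 3.0, 4.0]
expr = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],  # rises with growth rate
    "GENE_B": [8.0, 6.0, 4.0, 2.0],  # falls with growth rate
}
index = proliferation_index(expr, growth)
```

Under this reading, a positive index marks a gene whose expression tracks faster growth and a negative index marks the opposite, matching the sign convention described for essential genes and tumor suppressors.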
Subject(s)
Cell Cycle/genetics , Cell Movement/genetics , Cell Proliferation , Neoplasms/genetics , Transcriptome , Antineoplastic Agents/therapeutic use , Cell Division/genetics , Tumor Cell Line , Neoplastic Gene Expression Regulation , Human Genome , Humans , Neoplasms/pathology
ABSTRACT
The prioritization of candidate disease-causing genes is a fundamental challenge in the post-genomic era. Current state of the art methods exploit a protein-protein interaction (PPI) network for this task. They are based on the observation that genes causing phenotypically-similar diseases tend to lie close to one another in a PPI network. However, to date, these methods have used a static picture of human PPIs, while diseases impact specific tissues in which the PPI networks may be dramatically different. Here, for the first time, we perform a large-scale assessment of the contribution of tissue-specific information to gene prioritization. By integrating tissue-specific gene expression data with PPI information, we construct tissue-specific PPI networks for 60 tissues and investigate their prioritization power. We find that tissue-specific PPI networks considerably improve the prioritization results compared to those obtained using a generic PPI network. Furthermore, they allow predicting novel disease-tissue associations, pointing to sub-clinical tissue effects that may escape early detection.
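One simple way to picture a tissue-specific network is to keep only those interactions whose two proteins are both expressed in the tissue. The sketch below does exactly that. The abstract does not specify the construction used, so this filtering rule, the expression threshold, and all protein names and values are assumptions for illustration.

```python
def tissue_specific_network(edges, expression, threshold):
    """Keep interactions whose two proteins are both expressed in the tissue.

    edges: list of (protein_a, protein_b) pairs from a generic PPI network
    expression: dict of protein -> expression level in the tissue
    threshold: minimum level at which a protein counts as expressed
    """
    expressed = {protein for protein, level in expression.items()
                 if level >= threshold}
    return [(a, b) for a, b in edges if a in expressed and b in expressed]


# Hypothetical generic network and made-up expression values.
generic_ppi = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
liver_expression = {"P1": 12.0, "P2": 8.5, "P3": 0.3, "P4": 9.1}
liver_ppi = tissue_specific_network(generic_ppi, liver_expression, threshold=1.0)
```

A softer alternative would keep every edge but down-weight those involving weakly expressed proteins; which variant performs better for prioritization is an empirical question the sketch does not address.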
Subject(s)
Genetic Predisposition to Disease/genetics , Biological Models , Protein Interaction Mapping/methods , Proteome/genetics , Proteome/metabolism , Signal Transduction/genetics , Computer Simulation , Humans , Tissue Distribution
ABSTRACT
Numerous metabolic alterations are associated with the impairment of brain cells in Alzheimer's disease (AD). Here we use gene expression microarrays of both whole hippocampus tissue and hippocampal neurons of AD patients to investigate the ability of metabolic gene expression to predict AD progression and the associated cognitive decline. We find that the prediction accuracy of different AD stages is markedly higher when using neuronal expression data (0.9) than when using whole tissue expression (0.76). Furthermore, the metabolic genes' expression is shown to be as effective in predicting AD severity as the entire gene list. Remarkably, a regression model from hippocampal metabolic gene expression leads to a marked correlation of 0.57 with the Mini-Mental State Examination cognitive score. Notably, the expression of top predictive neuronal genes in AD is significantly higher than that of other metabolic genes in the brains of healthy subjects. Altogether, the analyses point to a subset of metabolic genes that is strongly associated with normal brain functioning and whose disruption plays a major role in AD.
Subject(s)
Alzheimer Disease/pathology , Gene Expression Regulation/physiology , Hippocampus/pathology , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Alzheimer Disease/diagnosis , Alzheimer Disease/genetics , Animals , Disease Progression , Genome-Wide Association Study , Humans , Nerve Tissue Proteins/genetics , Predictive Value of Tests
ABSTRACT
Synonymous mutations are considered to be "silent" as they do not affect protein sequence. However, different silent codons have different translation efficiency (TE), which raises the question to what extent such mutations are really neutral. We perform the first genome-wide study of natural selection operating on TE in recent human evolution, surveying 13,798 synonymous single nucleotide polymorphisms (SNPs) in 1,198 unrelated individuals from 11 populations. We find evidence for both negative and positive selection on TE, as measured based on differentiation in allele frequencies between populations. Notably, the likelihood of an SNP to be targeted by positive or negative selection is correlated with the magnitude of its effect on the TE of the corresponding protein. Furthermore, negative selection acting against changes in TE is more marked in highly expressed genes, highly interacting proteins, complex members, and regulatory genes. It is also more common in functional regions and in the initial segments of highly expressed genes. Positive selection targeting sites with a large effect on TE is stronger in lowly interacting proteins and in regulatory genes. Similarly, essential genes are enriched for negative TE selection while underrepresented for positive TE selection. Taken together, these results point to the significant role of TE as a selective force operating in humans and hence underscore the importance of considering silent SNPs in interpreting associations with complex human diseases. Testifying to this potential, we describe two synonymous SNPs that may have clinical implications in phenylketonuria and in Best's macular dystrophy due to TE differences between alleles.
Subject(s)
Molecular Evolution , Medical Genetics , Single Nucleotide Polymorphism , Protein Biosynthesis , Racial Groups/genetics , Genetic Selection , Codon , Human Genome , Humans , Genetic Models , Proteins/genetics
ABSTRACT
Synonymous mutations do not alter the protein produced yet can have a significant effect on protein levels. The mechanisms by which this effect is achieved are controversial; although some previous studies have suggested that codon bias is the most important determinant of translation efficiency, a recent study suggested that mRNA folding at the beginning of genes is the dominant factor via its effect on translation initiation. Using the Escherichia coli and Saccharomyces cerevisiae transcriptomes, we conducted a genome-scale study aiming at dissecting the determinants of translation efficiency. There is a significant association between codon bias and translation efficiency across all endogenous genes in E. coli and S. cerevisiae but no association between folding energy and translation efficiency, demonstrating the role of codon bias as an important determinant of translation efficiency. However, folding energy does modulate the strength of association between codon bias and translation efficiency, which is maximized at very weak mRNA folding (i.e., high folding energy) levels. We find a strong correlation between the genomic profiles of ribosomal density and genomic profiles of folding energy across mRNA, suggesting that lower folding energies slow down the ribosomes and decrease translation efficiency. Accordingly, we find that selection forces act near uniformly to decrease the folding energy at the beginning of genes. In summary, these findings testify that in endogenous genes, folding energy affects translation efficiency in a global manner that is not related to the expression levels of individual genes, and thus cannot be detected by correlation with their expression levels.
Subject(s)
Codon/genetics , Nucleic Acid Conformation , Translational Peptide Chain Initiation/genetics , Messenger RNA/chemistry , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression Profiling , Messenger RNA/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
ABSTRACT
Various studies in unicellular and multicellular organisms have shown that codon bias plays a significant role in translation efficiency (TE) by co-adaptation to the tRNA pool. Yet, in humans and other mammals the role of codon bias is still an open question, with contradictory results from different studies. Here we address this question, performing a large-scale tissue-specific analysis of TE in humans, using the tRNA Adaptation Index (tAI) as a direct measure for TE. We find tAI to significantly correlate with expression levels both in tissue-specific and in global expression measures, testifying to the TE of human tissues. Interestingly, we find significantly higher correlations in adult tissues as opposed to fetal tissues, suggesting that the tRNA pool is more adjusted to the adult period. Optimization-based analysis suggests that the tRNA pool-codon bias co-adaptation is globally (and not tissue-specifically) driven. Additionally, we find that tAI correlates with several measures related to the protein's functional importance, including gene essentiality. Using inferred tissue-specific tRNA pools leads to similar results and shows that tissue-specific genes are more adapted to their tRNA pool than other genes and that related sets of functional gene groups are translated efficiently in each tissue. Similar results are obtained for other mammals. Taken together, these results demonstrate the role of codon bias in TE in humans, and pave the way for future studies of tissue-specific TE in multicellular organisms.
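For readers unfamiliar with the tAI, it is commonly defined as the geometric mean of per-codon adaptiveness weights over a gene's codons. The sketch below computes that geometric mean. The weights are made-up placeholders: in practice they are derived from tRNA gene copy numbers and codon-anticodon pairing rules, which are not reproduced here. Skipping codons that have no weight (such as stop codons) is also an assumption of this sketch.

```python
from math import exp, log


def tai(coding_sequence, weights):
    """Geometric mean of codon weights over a coding sequence.

    coding_sequence: nucleotide string whose length is a multiple of 3
    weights: dict of codon -> adaptiveness weight in (0, 1]
    Codons without a weight (for example stop codons) are skipped.
    """
    if len(coding_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [coding_sequence[i:i + 3]
              for i in range(0, len(coding_sequence), 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no codon in the sequence has a weight")
    # Averaging logs avoids underflow on long genes.
    return exp(sum(log(w) for w in scored) / len(scored))


# Hypothetical weights for three codons; TAA (stop) has no weight.
example_weights = {"GCT": 1.0, "GCC": 0.25, "AAA": 0.5}
score = tai("GCTGCCAAATAA", example_weights)
```

Because it is a geometric mean, a single poorly adapted codon pulls the score down more than it would in an arithmetic mean, which is why the index is sensitive to rare codons.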
Subject(s)
Codon , Protein Biosynthesis , Animals , Fetus/metabolism , Humans , Transfer RNA/analysis , Tissue Distribution
ABSTRACT
The tumor suppressor gene TP53 is known to be a key regulator in cancer, and more than half of human cancers exhibit mutations in this gene. Recent evidence shows that point mutations in TP53 not only disrupt its function but also possess gain-of-function and dominant-negative effects on wild-type copies, thus making the mutated gene an oncogene. This raises the possibility that TP53 mutations may be under selection for increasing the overall translation efficiency (TE) of defective TP53 in cancerous cells. Here, we perform the first large-scale analysis of TE in human cancer mutated TP53 variants, identifying a significant increase in TE that is correlated with the frequency of TP53 mutations. Furthermore, mutations with a known oncogenic effect significantly increase their TE compared with the other TP53 mutations. Further analysis shows that TE may influence both the location of the mutation and its outcome: codons with lower TE show stronger selection toward nonsynonymous mutations and, for each codon, frequent mutations show a stronger increase in TE compared with less frequent mutations. Additionally, we find that TP53 mutations have a significantly higher TE increase in progressive versus primary tumors. Finally, an analysis of TP53 NCI-60 cell lines points to a coadaptation between the mutations and the tRNA pool, increasing the overall TP53 TE. Taken together, these results show that TE plays an important role in the selection of TP53 cancerous mutations.