Results 1 - 20 of 30,526
2.
BMC Bioinformatics ; 22(1): 180, 2021 Apr 07.
Article in English | MEDLINE | ID: mdl-33827420

ABSTRACT

BACKGROUND: Permutation testing is often considered the "gold standard" for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form, in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. RESULTS: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP-SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based on information theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10³ for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples, making it suitable for datasets with a large number of samples. CONCLUSIONS: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genotype , Humans , Phenotype
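
The naive permutation test that the abstract in entry 2 uses as its baseline can be sketched as follows. This is a minimal illustration, not the authors' code: it rebuilds the count table on every permutation (the step the paper reports as the bottleneck) and uses mutual information as one example of a measure computable from count tables. The function names `mutual_information` and `permutation_p_value` are hypothetical and do not come from the permute-counts repository.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Mutual information (in bits) between the two coordinates of `pairs`,
    computed from a frequency count table."""
    n = len(pairs)
    joint = Counter(pairs)                # count table over (genotype, phenotype)
    left = Counter(a for a, _ in pairs)   # marginal counts of the genotype cell
    right = Counter(b for _, b in pairs)  # marginal counts of the phenotype
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi


def permutation_p_value(snp1, snp2, phenotype, n_perm=1000, seed=0):
    """Naive permutation test: shuffle the phenotype labels and rebuild
    the count table from scratch on every iteration."""
    rng = random.Random(seed)
    genotype_pairs = list(zip(snp1, snp2))  # joint two-locus genotype per sample
    observed = mutual_information(list(zip(genotype_pairs, phenotype)))
    labels = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if mutual_information(list(zip(genotype_pairs, labels))) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Each permutation here costs time proportional to the number of samples, which is why a method that transforms the count table directly, as the abstract describes, would not share this dependence on sample size.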
3.
Transl Psychiatry ; 11(1): 160, 2021 03 15.
Article in English | MEDLINE | ID: mdl-33723208

ABSTRACT

Psychiatric symptoms are seen in some COVID-19 patients, as direct or indirect sequelae, but it is unclear whether SARS-CoV-2 infection interacts with underlying neuronal or psychiatric susceptibilities. Such interactions might arise from COVID-19 immune responses, from infection of neurons themselves or may reflect social-psychological causes. To clarify this we sought the key gene expression pathways altered in COVID-19 also affected in bipolar disorder, post-traumatic stress disorder (PTSD) and schizophrenia, since this may identify pathways of interaction that could be treatment targets. We performed large scale comparisons of whole transcriptome data and immune factor transcript data in peripheral blood mononuclear cells (PBMC) from COVID-19 patients and patients with psychiatric disorders. We also analysed genome-wide association study (GWAS) data for symptomatic COVID-19 patients, comparing GWAS and whole-genome sequence data from patients with bipolar disorder, PTSD and schizophrenia patients. These studies revealed altered signalling and ontology pathways shared by COVID-19 patients and the three psychiatric disorders. Finally, co-expression and network analyses identified gene clusters common to the conditions. COVID-19 patients had peripheral blood immune system profiles that overlapped with those of patients with psychiatric conditions. From the pathways identified, PTSD profiles were the most highly correlated with COVID-19, perhaps consistent with stress-immune system interactions seen in PTSD. We also revealed common inflammatory pathways that may exacerbate psychiatric disorders, which may support the usage of anti-inflammatory medications in these patients. It also highlights the potential clinical application of multi-level dataset studies in difficult-to-treat psychiatric disorders in this COVID-19 pandemic.


Subject(s)
Bipolar Disorder/genetics , Schizophrenia/genetics , Stress Disorders, Post-Traumatic/genetics , Bipolar Disorder/immunology , Comorbidity , Gene Expression Profiling , Gene Ontology , Gene Regulatory Networks , Genome-Wide Association Study , Genomics , Humans , Immunity/genetics , Inflammation/genetics , Mental Disorders/genetics , Mental Disorders/immunology , Schizophrenia/immunology , Signal Transduction/genetics , Stress Disorders, Post-Traumatic/immunology , Whole Genome Sequencing
4.
Medicine (Baltimore) ; 100(11): e24769, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33725943

ABSTRACT

ABSTRACT: Several genetic loci have been reported to be significantly associated with coronary artery disease (CAD) by multiple genome-wide association studies (GWAS). Nevertheless, the biological and functional effects of these genetic variants on CAD remain largely equivocal. In the current study, we performed an integrative genomics analysis by integrating large-scale GWAS data (N = 459,534) and 2 independent expression quantitative trait loci (eQTL) datasets (N = 1890) to determine whether CAD-associated risk single nucleotide polymorphisms (SNPs) exert regulatory effects on gene expression. By using Sherlock Bayesian, MAGMA gene-based, multidimensional scaling (MDS), functional enrichment, and in silico permutation analyses for independent technical and biological replications, we highlighted 4 susceptible genes (CHCHD1, TUBG1, LY6G6C, and MRPS17) associated with CAD risk. Based on the protein-protein interaction (PPI) network analysis, these 4 genes were found to interact with each other. We detected a remarkably altered co-expression pattern among these 4 genes between CAD patients and controls. In addition, 3 genes of CHCHD1 (P = .0013), TUBG1 (P = .004), and LY6G6C (P = .038) showed significantly different expressions between CAD patients and controls. Together, we provide evidence to support that these identified genes such as CHCHD1 and TUBG1 are indicative factors of CAD.


Subject(s)
Coronary Artery Disease/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/statistics & numerical data , Quantitative Trait Loci/genetics , Adult , Antigens, Ly/genetics , Bayes Theorem , Female , Gene Regulatory Networks/genetics , Genetic Markers/genetics , Genomics , Humans , Male , Mitochondrial Proteins/genetics , Nuclear Proteins/genetics , Polymorphism, Single Nucleotide/genetics , Protein Interaction Maps/genetics , Ribosomal Proteins/genetics , Tubulin/genetics
5.
Nat Commun ; 12(1): 1610, 2021 03 12.
Article in English | MEDLINE | ID: mdl-33712570

ABSTRACT

Genome-wide association studies (GWAS) have identified more than 40 loci associated with Alzheimer's disease (AD), but the causal variants, regulatory elements, genes and pathways remain largely unknown, impeding a mechanistic understanding of AD pathogenesis. Previously, we showed that AD risk alleles are enriched in myeloid-specific epigenomic annotations. Here, we show that they are specifically enriched in active enhancers of monocytes, macrophages and microglia. We integrated AD GWAS with myeloid epigenomic and transcriptomic datasets using analytical approaches to link myeloid enhancer activity to target gene expression regulation and AD risk modification. We identify AD risk enhancers and nominate candidate causal genes among their likely targets (including AP4E1, AP4M1, APBB3, BIN1, MS4A4A, MS4A6A, PILRA, RABEP1, SPI1, TP53INP1, and ZYX) in twenty loci. Fine-mapping of these enhancers nominates candidate functional variants that likely modify AD risk by regulating gene expression in myeloid cells. In the MS4A locus we identified a single candidate functional variant and validated it in human induced pluripotent stem cell (hiPSC)-derived microglia and brain. Taken together, this study integrates AD GWAS with multiple myeloid genomic datasets to investigate the mechanisms of AD risk alleles and nominates candidate functional variants, regulatory elements and genes that likely modulate disease susceptibility.


Subject(s)
Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , Genomics , Myeloid Cells , Regulatory Sequences, Nucleic Acid/genetics , Alleles , Alzheimer Disease/metabolism , Gene Expression Regulation , Genome-Wide Association Study , Humans , Induced Pluripotent Stem Cells/metabolism , Macrophages , Microglia/metabolism , Transcriptome
6.
Nat Commun ; 12(1): 1611, 2021 03 12.
Article in English | MEDLINE | ID: mdl-33712590

ABSTRACT

Genome-wide association studies of Systemic Lupus Erythematosus (SLE) nominate 3073 genetic variants at 91 risk loci. To systematically screen these variants for allelic transcriptional enhancer activity, we construct a massively parallel reporter assay (MPRA) library comprising 12,396 DNA oligonucleotides containing the genomic context around every allele of each SLE variant. Transfection into the Epstein-Barr virus-transformed B cell line GM12878 reveals 482 variants with enhancer activity, with 51 variants showing genotype-dependent (allelic) enhancer activity at 27 risk loci. Comparison of MPRA results in GM12878 and Jurkat T cell lines highlights shared and unique allelic transcriptional regulatory mechanisms at SLE risk loci. In-depth analysis of allelic transcription factor (TF) binding at and around allelic variants identifies one class of TFs whose DNA-binding motif tends to be directly altered by the risk variant and a second class of TFs that bind allelically without direct alteration of their motif by the variant. Collectively, our approach provides a blueprint for the discovery of allelic gene regulation at risk loci for any disease and offers insight into the transcriptional regulatory mechanisms underlying SLE.


Subject(s)
Alleles , Genetic Predisposition to Disease/genetics , Lupus Erythematosus, Systemic/genetics , B-Lymphocytes , Cell Line , Chromatin , Gene Expression Regulation , Genome-Wide Association Study , Genotype , Herpesvirus 4, Human , Humans , Quantitative Trait Loci , Synaptogyrins/genetics , T-Lymphocytes
7.
Nat Commun ; 12(1): 1639, 2021 03 12.
Article in English | MEDLINE | ID: mdl-33712626

ABSTRACT

Conventional human leukocyte antigen (HLA) imputation methods perform poorly for infrequent alleles, which is one of the factors that reduce the reliability of trans-ethnic major histocompatibility complex (MHC) fine-mapping due to inter-ethnic heterogeneity in allele frequency spectra. We develop DEEP*HLA, a deep learning method for imputing HLA genotypes. Through validation using the Japanese and European HLA reference panels (n = 1,118 and 5,122), DEEP*HLA achieves the highest accuracies with significant superiority for low-frequency and rare alleles. DEEP*HLA is less dependent on distance-dependent linkage disequilibrium decay of the target alleles and might capture the complicated region-wide information. We apply DEEP*HLA to type 1 diabetes GWAS data from BioBank Japan (n = 62,387) and UK Biobank (n = 354,459), and successfully disentangle independently associated class I and II HLA variants with shared risk among diverse populations (the top signal at amino acid position 71 of HLA-DRβ1; P = 7.5 × 10⁻¹²⁰). Our study illustrates the value of deep learning in genotype imputation and trans-ethnic MHC fine-mapping.


Subject(s)
Deep Learning , Diabetes Mellitus, Type 1/genetics , Genetic Predisposition to Disease/genetics , HLA Antigens/genetics , Major Histocompatibility Complex/genetics , Alleles , Continental Population Groups , Ethnic Groups/genetics , Genome-Wide Association Study , Genotype , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class II/genetics , Humans , Linkage Disequilibrium
8.
Kidney Int ; 99(4): 805-808, 2021 04.
Article in English | MEDLINE | ID: mdl-33745545

ABSTRACT

Gorski et al. report a meta-GWAS of rapid kidney function decline in 42 longitudinal studies from the CKDGen Consortium and UK Biobank, amounting to more than 270,000 individuals with two eGFRcrea measurements. They identified genome-wide significant variants associated with two indexes of rapid kidney function decline, involving genes with a high potential for causality. These data increase our understanding of kidney function and risk of disease.


Subject(s)
Genome-Wide Association Study , Kidney , Glomerular Filtration Rate , Humans , Longitudinal Studies
9.
BMC Med ; 19(1): 72, 2021 03 24.
Article in English | MEDLINE | ID: mdl-33757497

ABSTRACT

BACKGROUND: Observational studies suggest that poorer glycemic traits and type 2 diabetes are associated with coronavirus disease 2019 (COVID-19) risk, although these findings could be confounded by socioeconomic position. We conducted a two-sample Mendelian randomization study to clarify their role in COVID-19 risk and specific COVID-19 phenotypes (hospitalized and severe cases). METHOD: We identified genetic instruments for fasting glucose (n = 133,010), 2 h glucose (n = 42,854), glycated hemoglobin (n = 123,665), and type 2 diabetes (74,124 cases and 824,006 controls) from genome-wide association studies and applied them to COVID-19 Host Genetics Initiative summary statistics (17,965 COVID-19 cases and 1,370,547 population controls). We used inverse variance weighting to obtain the causal estimates of glycemic traits and genetic predisposition to type 2 diabetes in COVID-19 risk. Sensitivity analyses included MR-Egger and the weighted median method. RESULTS: We found that genetic predisposition to type 2 diabetes was not associated with any COVID-19 phenotype (OR: 1.00 per unit increase in log odds of having diabetes, 95% CI 0.97 to 1.04 for overall COVID-19; OR: 1.02, 95% CI 0.95 to 1.09 for hospitalized COVID-19; and OR: 1.00, 95% CI 0.93 to 1.08 for severe COVID-19). There was no strong evidence for an association of glycemic traits with COVID-19 phenotypes, apart from a potential inverse association for fasting glucose, albeit with a wide confidence interval. CONCLUSION: We provide some genetic evidence that poorer glycemic traits and predisposition to type 2 diabetes are unlikely to increase the risk of COVID-19. Although our study did not indicate that glycemic traits increase the severity of COVID-19, additional studies are needed to verify our findings.


Subject(s)
Blood Glucose/genetics , Diabetes Mellitus, Type 2/genetics , Glycated Hemoglobin A/genetics , Mendelian Randomization Analysis , Adult , Blood Glucose/metabolism , /epidemiology , Case-Control Studies , Critical Illness/epidemiology , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/epidemiology , Fasting/blood , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Glycated Hemoglobin A/metabolism , Humans , Male , Phenotype , Polymorphism, Single Nucleotide , Risk Factors , Severity of Illness Index
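
The inverse variance weighting step named in the METHOD section of entry 9 can be illustrated with a short sketch. It assumes the standard fixed-effect form, in which each variant's ratio estimate is weighted by the inverse variance of its outcome association; the abstract does not say whether a fixed-effect or random-effects model was used. The function names and any inputs are hypothetical, and no study data are reproduced here.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted causal estimate (log-odds scale)
    and its standard error, from per-variant summary statistics."""
    # Weight for each variant: beta_x^2 / se_y^2
    total_weight = sum(bx * bx / (se * se)
                       for bx, se in zip(beta_exposure, se_outcome))
    numerator = sum(bx * by / (se * se)
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    beta = numerator / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio with a 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

An estimate of zero on the log-odds scale maps to an odds ratio of 1.00, which is how the null results for type 2 diabetes in the abstract should be read.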
10.
Medicine (Baltimore) ; 100(11): e25113, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33725991

ABSTRACT

BACKGROUND: Recent studies have reported that the long noncoding RNA (lncRNA) antisense non-coding RNA in the INK4 locus (ANRIL) plays important roles in the development of atherosclerosis through regulating cell apoptosis, proliferation, and adhesion. Genome-wide association studies (GWAS) identified common genetic variants within ANRIL that could confer risk of ischemic stroke (IS) in southern Sweden. METHODS: We performed a case-control study, including 567 IS patients and 552 healthy controls from an unrelated northern Chinese Han population, aiming to explore the association between the lncRNA ANRIL rs2383207 and rs4977574 polymorphisms and the risk of IS. Subsequently we implemented a meta-analysis to further assess the relationship between these variants and the disease. RESULTS: In our case-control study, no significant associations were observed in any model between the above 2 polymorphisms and IS. Next, in our subgroup analysis, we detected a significant association between the GA genotype of rs4977574 and an increased risk of LAA-IS (large-artery atherosclerotic ischemic stroke); a similarly elevated risk also appeared for the GG + GA genotype under the dominant model (P = .048, OR = 1.385, 95% CIs 1.002-1.914; P = .040, OR = 1.378, 95% CIs 1.015-1.872, respectively). As for rs2383207, negative results were obtained under all models and subgroups. Our meta-analysis showed a significant association between the rs4977574 polymorphism and IS risk in the allele model (G vs A, P = .002, OR = 1.137, 95% CIs 1.048-1.234); with respect to the rs2383207 polymorphism, no significant association with the risk of IS was detected under the dominant model (GA + AA vs GG, P = .061, OR = 0.923, 95% CIs 0.849-1.004), the recessive model (AA vs GA + GG, P = .656, OR = 0.972, 95% CIs 0.858-1.101), or the allele model (A vs G, P = .326, OR = 0.952, 95% CIs 0.863-1.050). Likewise, no significant association between rs2383207 and IS was found in different stroke subtypes (P > .05). CONCLUSIONS: Our findings indicated that the G allele of lncRNA ANRIL rs4977574 could increase the risk of IS, and the variant may be associated with susceptibility to LAA-IS in the Chinese Han population.


Subject(s)
Genetic Predisposition to Disease/genetics , Polymorphism, Genetic/genetics , RNA, Long Noncoding/blood , Aged , Alleles , Asian Continental Ancestry Group/ethnology , Asian Continental Ancestry Group/genetics , Atherosclerosis/epidemiology , Atherosclerosis/ethnology , Atherosclerosis/genetics , Case-Control Studies , China/epidemiology , China/ethnology , Coronary Artery Disease/epidemiology , Coronary Artery Disease/ethnology , Coronary Artery Disease/genetics , Female , Genetic Predisposition to Disease/ethnology , Genome-Wide Association Study , Genotype , Humans , /ethnology , Male , Middle Aged , Risk Factors
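
The odds ratios and 95% confidence intervals reported in entry 10 are the usual summary of a case-control genotype comparison. As a rough sketch of how such a figure arises from a 2×2 table, the following uses the standard Woolf (log-based) interval. The abstract does not state which interval method the authors used, and the function name `odds_ratio_2x2` and its parameter names are hypothetical.

```python
import math


def odds_ratio_2x2(carrier_cases, noncarrier_cases,
                   carrier_controls, noncarrier_controls, z=1.96):
    """Odds ratio and Woolf 95% confidence interval from a 2x2 table.
    All four cell counts must be positive."""
    odds_ratio = (carrier_cases * noncarrier_controls) / (
        noncarrier_cases * carrier_controls)
    # Standard error of log(OR): square root of the summed reciprocal counts
    se_log_or = math.sqrt(1 / carrier_cases + 1 / noncarrier_cases
                          + 1 / carrier_controls + 1 / noncarrier_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Under a dominant model such as GG + GA versus AA, "carrier" would mean carrying at least one G allele. An interval whose lower bound sits just above 1, like the 1.002-1.914 reported for LAA-IS, corresponds to a P value close to .05.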
11.
Science ; 371(6536), 2021 03 26.
Article in English | MEDLINE | ID: mdl-33766855

ABSTRACT

The phenotypic measures used by Ganna et al (Research Articles, 30 August 2019, p. 882) lump together predominantly heterosexual, bisexual, and homosexual individuals, including those who have experimented with a same-sex partner only once. This may have resulted in misleading associations to personality traits unrelated to understood categories of human sexuality. Scientific studies of human sexuality should use validated and reliable measures of sexual behaviors, attractions, and identities that capture the full spectrum of complexity.


Subject(s)
Genome-Wide Association Study , Sexual Behavior , Bisexuality , Heterosexuality , Homosexuality , Humans
12.
Science ; 371(6536), 2021 03 26.
Article in English | MEDLINE | ID: mdl-33766859

ABSTRACT

Hamer et al argue that the variable "ever versus never had a same-sex partner" does not capture the complexity of human sexuality. We agree and said so in our paper. But Hamer et al neglect to mention that we also reported follow-up analyses showing substantial overlap of the genetic influences on our main variable and on more nuanced measures of sexual behavior, attraction, and identity.


Subject(s)
Genome-Wide Association Study , Sexual Behavior , Humans , Problem Solving
13.
EBioMedicine ; 65: 103277, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33714028

ABSTRACT

BACKGROUND: Idiopathic pulmonary fibrosis (IPF) is a complex lung disease, characterized by progressive lung scarring. Severe COVID-19 is associated with substantial pneumonitis and has a number of shared major risk factors with IPF. This study aimed to determine the genetic correlation between IPF and severe COVID-19 and assess a potential causal role of genetically increased risk of IPF on COVID-19 severity. METHODS: The genetic correlation between IPF and COVID-19 severity was estimated with linkage disequilibrium (LD) score regression. We performed a Mendelian randomization (MR) study for IPF causality in COVID-19. Genetic variants associated with IPF susceptibility (P < 5 × 10⁻⁸) in previous genome-wide association studies (GWAS) were used as instrumental variables (IVs). Effect estimates of those IVs on COVID-19 severity were gathered from the GWAS meta-analysis by the COVID-19 Host Genetics Initiative (4,336 cases & 623,902 controls). FINDINGS: We detected a positive genetic correlation of IPF with COVID-19 severity (rg = 0·31 [95% CI 0·04-0·57], P = 0·023). The MR estimates for severe COVID-19 did not reveal any genetic association (OR 1·05, [95% CI 0·92-1·20], P = 0·43). However, outlier analysis revealed that the IPF risk allele rs35705950 at MUC5B had a different effect compared with the other variants. When rs35705950 was excluded, MR results provided evidence that genetically increased risk of IPF has a causal effect on COVID-19 severity (OR 1·21, [95% CI 1·06-1·38], P = 4·24 × 10⁻³). Furthermore, the IPF risk allele at MUC5B showed an apparent protective effect against COVID-19 hospitalization only in older adults (OR 0·86, [95% CI 0·73-1·00], P = 2·99 × 10⁻²). INTERPRETATION: The strongest genetic determinant of IPF, rs35705950 at MUC5B, seems to confer protection against COVID-19, whereas the combined effect of all other IPF risk loci seems to confer risk of COVID-19 severity. The observed effect of rs35705950 could either be due to protective effects of mucin over-production on the airways or a consequence of selection bias due to (1) a patient group that is heavily enriched for the rs35705950 T undertaking strict self-isolation and/or (2) survival bias of the rs35705950 non-IPF risk allele carriers. Due to the diverse impact of IPF causal variants on SARS-CoV-2 infection, with a possible selection bias as an explanation, further investigation is needed to address this apparent paradox between variance at MUC5B and other IPF genetic risk factors. FUNDING: Novo Nordisk Foundation and Oak Foundation.


Subject(s)
/pathology , Genetic Predisposition to Disease/genetics , Idiopathic Pulmonary Fibrosis/pathology , /genetics , Genome-Wide Association Study , Humans , Idiopathic Pulmonary Fibrosis/genetics , Lung/pathology , Mucin-5B/genetics , Polymorphism, Single Nucleotide/genetics , Risk , Severity of Illness Index
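
The FINDINGS in entry 13 pair each odds ratio with a 95% confidence interval and a P value. As a worked check on how these quantities relate, the sketch below back-calculates an approximate two-sided P value from an odds ratio and its interval, assuming a normal approximation on the log-odds scale. This is a generic consistency check, not the authors' procedure, and the function name `p_value_from_or_ci` is hypothetical. Because the published figures are rounded, the recomputed value can only approximate the reported one.

```python
import math


def p_value_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate z statistic and two-sided P value from an odds ratio
    and its 95% confidence interval (normal approximation on the log scale)."""
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale is 2 * z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

Calling `p_value_from_or_ci(1.21, 1.06, 1.38)` on the estimate obtained after excluding rs35705950 gives a P value of the same order of magnitude as the 4·24 × 10⁻³ quoted in the abstract.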
14.
Plant Mol Biol ; 105(6): 585-599, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33651261

ABSTRACT

KEY MESSAGE: A total of 14 SNPs associated with overwintering-related traits and 75 selective regions were detected. Important candidate genes were identified and a possible network of cold-stress responses in woody plants was proposed. Local adaptation to low temperature is essential for woody plants to withstand a changeable climate and safely survive the winter. To uncover the specific molecular mechanism of low temperature adaptation in woody plants, we sequenced 134 core individuals selected from 494 paper mulberry (Broussonetia papyrifera) individuals, which are naturally distributed in different climate zones and latitudes. The population structure analysis, PCA and neighbor-joining tree analysis indicated that the individuals were classified into three clusters, which showed strong geographic distribution patterns because of adaptation to local climate. Using data for two overwintering phenotypes collected at a high latitude of 40°N and one bioclimatic variable, genome-phenotype and genome-environment association analyses and genome-wide scans were performed. We detected 75 selective regions that have possibly undergone temperature selection and identified 14 trait-associated SNPs that corresponded to 16 candidate genes (including LRR-RLK, PP2A, BCS1, etc.). Meanwhile, low temperature adaptation was also supported by three other trait-associated SNPs that exhibited significant differences in overwintering traits between alleles within three geographic groups. To sum up, a possible network of cold signal perception and responses in woody plants was proposed, including important genes that have been confirmed in previous studies, while others could be key potential candidates in woody plants. Overall, our results highlight the specific and complex molecular mechanism of low temperature adaptation and overwintering of woody plants.


Subject(s)
Adaptation, Physiological/genetics , Cold Temperature , Plant Physiological Phenomena , Plants/genetics , Alleles , Base Sequence , Climate , Genome-Wide Association Study , Morus/genetics , Morus/physiology , Phenotype , Polymorphism, Single Nucleotide , Temperature
15.
Methods Mol Biol ; 2212: 93-103, 2021.
Article in English | MEDLINE | ID: mdl-33733352

ABSTRACT

Transcriptome-wide association studies (TWASs) integrate expression quantitative trait loci (eQTLs) studies with genome-wide association studies (GWASs) to prioritize candidate target genes for complex traits. TWASs have become increasingly popular. They have been used to analyze many complex traits with expression profiles from different tissues, successfully enhancing the discovery of genetic risk loci for complex traits. Though conceptually straightforward, some steps are required to perform the TWAS properly. Here we provide a step-by-step guide to integrate eQTL data with both GWAS individual-level data and GWAS summary statistics from complex traits.


Subject(s)
Epistasis, Genetic , Genetic Testing/methods , Models, Genetic , Multifactorial Inheritance , Software , Transcriptome , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Uncertainty
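
The two conceptual steps of a TWAS with individual-level GWAS data, as summarized in entry 15, can be sketched as follows: predict each person's gene expression from genotypes using eQTL-derived weights, then test the predicted expression for association with the trait. This is a simplified illustration under stated assumptions, not the chapter's protocol. The weights are taken as given (in practice they would be trained on a reference eQTL panel), a plain correlation test stands in for the regression model, and all function names are hypothetical.

```python
import math


def predict_expression(genotypes, weights):
    """Step 1: genetically predicted expression for each individual,
    a weighted sum of allele dosages (0, 1 or 2) over the gene's eQTL SNPs."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def twas_statistic(genotypes, weights, trait):
    """Step 2: association between predicted expression and the trait,
    returned as the t statistic of the correlation (requires |r| < 1)."""
    predicted = predict_expression(genotypes, weights)
    r = pearson_r(predicted, trait)
    n = len(trait)
    return r * math.sqrt((n - 2) / (1 - r * r))
```

The summary-statistics variant mentioned in the abstract replaces individual genotypes with GWAS z-scores and an LD reference, which this sketch does not cover.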
16.
Methods Mol Biol ; 2212: 169-179, 2021.
Article in English | MEDLINE | ID: mdl-33733356

ABSTRACT

In biology, the term "epistasis" indicates the effect of the interaction of a gene with another gene. A gene can interact with an independently sorted gene, located far away on the chromosome or on an entirely different chromosome, and this interaction can have a strong effect on the function of the two genes. These changes then can alter the consequences of the biological processes, influencing the organism's phenotype. Machine learning is an area of computer science that develops statistical methods able to recognize patterns from data. A typical machine learning algorithm consists of a training phase, where the model learns to recognize specific trends in the data, and a test phase, where the trained model applies its learned intelligence to recognize trends in external data. Scientists have applied machine learning to epistasis problems multiple times, especially to identify gene-gene interactions from genome-wide association study (GWAS) data. In this brief survey, we report and describe the main scientific articles published in data mining and epistasis. Our article confirms the effectiveness of machine learning in this genetics subfield.


Subject(s)
Computational Biology/methods , Data Mining/methods , Epistasis, Genetic , Machine Learning , Quantitative Trait, Heritable , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Crohn Disease/genetics , Crohn Disease/metabolism , Crohn Disease/pathology , Genome, Human , Genome-Wide Association Study , Humans , Inheritance Patterns , Macular Degeneration/genetics , Macular Degeneration/metabolism , Macular Degeneration/pathology , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/pathology , Phenotype , Plants/genetics , Polymorphism, Single Nucleotide
17.
Methods Mol Biol ; 2212: 225-243, 2021.
Article in English | MEDLINE | ID: mdl-33733359

ABSTRACT

Unraveling the complex biological mechanisms underlying human health and disease is a great challenge. With genomic data, many aspects can be investigated in great detail, such as interactions between different genetic variants as well as their effects on one or multiple traits. Modeling epistasis and pleiotropy jointly necessitates appropriate statistical methods. A suitable tool for this is C-JAMP, which is a recently proposed method based on copula functions. In this chapter, we outline C-JAMP and how it can be applied to investigate epistatic effects on multiple traits to advance our understanding of biological processes. We further discuss important aspects of this area of research, such as polygenic risk scores and ancestry-specific modeling, which we propose to include in future extensions of the software.


Subject(s)
Epistasis, Genetic , Genetic Pleiotropy , HLA Antigens/genetics , Models, Genetic , Polymorphism, Single Nucleotide , Software , Biological Specimen Banks , Body Mass Index , Genetic Variation , Genome-Wide Association Study , Humans , Phenotype , Prospective Studies , Quantitative Trait Loci , Quantitative Trait, Heritable , United Kingdom , Waist-Hip Ratio
18.
Methods Mol Biol ; 2212: 191-223, 2021.
Article in English | MEDLINE | ID: mdl-33733358

ABSTRACT

Gene-environment interactions have important implications for elucidating the genetic basis of complex diseases beyond the joint function of multiple genetic factors and their interactions (or epistasis). In the past, studies of G × E interactions have been mainly conducted within the framework of genetic association studies. The high dimensionality of G × E interactions, due to the complicated form of environmental effects and the presence of a large number of genetic factors including gene expressions and SNPs, has motivated the recent development of penalized variable selection methods for dissecting G × E interactions, a development that has been ignored in the majority of published reviews on genetic interaction studies. In this article, we first survey existing studies on both gene-environment and gene-gene interactions. Then, after a brief introduction to the variable selection methods, we review penalization and relevant variable selection methods in marginal and joint paradigms, respectively, under a variety of conceptual models. Discussions on strengths and limitations, as well as computational aspects of the variable selection methods tailored for G × E studies, are also provided.


Subject(s)
Epistasis, Genetic , Gene-Environment Interaction , Linear Models , Models, Genetic , Nonlinear Dynamics , Algorithms , Bayes Theorem , Computer Simulation , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide
19.
Methods Mol Biol ; 2212: 291-305, 2021.
Article in English | MEDLINE | ID: mdl-33733363

ABSTRACT

To develop medical treatments and prevention, the association between disease and genetic variants needs to be identified. The main goal of genome-wide association study (GWAS) is to discover the underlying reason for vulnerability to disease and utilize this knowledge for the development of prevention and treatment against these diseases. Given the methods available to address the scientific problems involved in the search for epistasis, there is not any standard for detecting epistasis, and this remains a problem due to limited statistical power. The GenEpi package is a Python package that uses a two-level workflow machine learning model to detect within-gene and cross-gene epistasis. This protocol chapter shows the usage of GenEpi with example data. The package uses a three-step procedure to reduce dimensionality, select the within-gene epistasis, and select the cross-gene epistasis. The package also provides a medium to build prediction models with the combination of genetic features and environmental influences.


Subject(s)
Computational Biology/methods , Epistasis, Genetic , Genetic Association Studies , Machine Learning , Software , Databases, Genetic , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide
20.
Nat Commun ; 12(1): 1781, 2021 03 19.
Article in English | MEDLINE | ID: mdl-33741908

ABSTRACT

Prostate cancer (PCa) risk-associated SNPs are enriched in noncoding cis-regulatory elements (rCREs), yet their modi operandi and clinical impact remain elusive. Here, we perform CRISPRi screens of 260 rCREs in PCa cell lines. We find that rCREs harboring high risk SNPs are more essential for cell proliferation and H3K27ac occupancy is a strong indicator of essentiality. We also show that cell-line-specific essential rCREs are enriched in the 8q24.21 region, with the rs11986220-containing rCRE regulating MYC and PVT1 expression, cell proliferation and tumorigenesis in a cell-line-specific manner, depending on DNA methylation-orchestrated occupancy of a CTCF binding site in between this rCRE and the MYC promoter. We demonstrate that CTCF deposition at this site as measured by DNA methylation level is highly variable in prostate specimens, and observe the MYC eQTL in the 8q24.21 locus in individuals with low CTCF binding. Together our findings highlight a causal mechanism synergistically driven by a risk SNP and DNA methylation-mediated 3D genome architecture, advocating for the integration of genetics and epigenetics in assessing risks conferred by genetic predispositions.


Subject(s)
CRISPR-Cas Systems , DNA Methylation , Gene Editing/methods , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Prostatic Neoplasms/genetics , Animals , CCCTC-Binding Factor/genetics , CCCTC-Binding Factor/metabolism , Carcinogenesis/genetics , Cell Line, Tumor , Humans , Male , Mice, Inbred NOD , Mice, SCID , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Proto-Oncogene Proteins c-myc/genetics , Quantitative Trait Loci/genetics , Regulatory Elements, Transcriptional/genetics , Risk Factors