Results 1 - 20 of 145
1.
Am J Hum Genet ; 110(4): 575-591, 2023 04 06.
Article in English | MEDLINE | ID: mdl-37028392

ABSTRACT

Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta's D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.


Subject(s)
Genetic Epistasis, Genome-Wide Association Study, Linkage Disequilibrium/genetics, Genotype, Biological Specimen Banks, United Kingdom, Single Nucleotide Polymorphism/genetics
2.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36738256

ABSTRACT

Many problems in life sciences can be brought back to a comparison of graphs. Even though a multitude of such techniques exist, often, these assume prior knowledge about the partitioning or the number of clusters and fail to provide statistical significance of observed between-network heterogeneity. Addressing these issues, we developed an unsupervised workflow to identify groups of graphs from reliable network-based statistics. In particular, we first compute the similarity between networks via appropriate distance measures between graphs and use them in an unsupervised hierarchical algorithm to identify classes of similar networks. Then, to determine the optimal number of clusters, we recursively test for distances between two groups of networks. The test itself finds its inspiration in distance-wise ANOVA algorithms. Finally, we assess significance via the permutation of between-object distance matrices. Notably, the approach, which we will call netANOVA, is flexible since users can choose multiple options to adapt to specific contexts and network types. We demonstrate the benefits and pitfalls of our approach via extensive simulations and an application to two real-life datasets. NetANOVA achieved high performance in many simulation scenarios while controlling type I error. On non-synthetic data, comparison against state-of-the-art methods showed that netANOVA is often among the top performers. There are many application fields, including precision medicine, for which identifying disease subtypes via individual-level biological networks improves prevention programs, diagnosis and disease monitoring.
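The permutation step described in this abstract can be illustrated with a minimal Python sketch. It is not the netANOVA implementation. It assumes a precomputed symmetric distance matrix, a fixed two-group partition, and a PERMANOVA-style pseudo-F statistic as the distance-based ANOVA analogue. Shuffling group labels here plays the role of permuting the distance matrix, and the toy "networks" are hypothetical, each reduced to a single invented summary value.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """PERMANOVA-style pseudo-F computed from pairwise distances only."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """Share of label permutations whose pseudo-F reaches the observed one."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: each "network" is reduced to one invented summary value.
summaries = [0.10, 0.12, 0.11, 0.13, 0.50, 0.52, 0.51, 0.53]
labels = ["A"] * 4 + ["B"] * 4
dist = [[abs(a - b) for b in summaries] for a in summaries]

observed_f = pseudo_f(dist, labels)
p_value = permutation_p_value(dist, labels)
print(f"pseudo-F = {observed_f:.1f}, permutation p = {p_value:.3f}")
```

In a recursive workflow of the kind the abstract outlines, a test like this would be applied at each candidate split of the hierarchical clustering tree, and splitting would stop once the two candidate groups no longer differ significantly.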


Subject(s)
Algorithms, Cluster Analysis, Computer Simulation, Workflow, Analysis of Variance
3.
Am J Med Genet A ; 194(7): e63584, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38450933

ABSTRACT

Debates about the prospective clinical use of polygenic risk scores (PRS) have grown considerably in the last years. The potential benefits of PRS to improve patient care at individual and population levels have been extensively underlined. Nonetheless, the use of PRS in clinical contexts presents a number of unresolved ethical challenges and consequent normative gaps that hinder their optimal implementation. Here, we conducted a systematic review of reasons of the normative literature discussing ethical issues and moral arguments related to the use of PRS for the prevention and treatment of common complex diseases. In total, we have included and analyzed 34 records, spanning from 2013 to 2023. The findings have been organized in three major themes: in the first theme, we consider the potential harms of PRS to individuals and their kin. In the theme "Threats to health equity," we consider ethical concerns of social relevance, with a focus on justice issues. Finally, the theme "Towards best practices" collects a series of research priorities and provisional recommendations to be considered for an optimal clinical translation of PRS. We conclude that the use of PRS in clinical care reinvigorates old debates in matters of health justice; however, open questions, regarding best practices in clinical counseling, suggest that the ethical considerations applicable in monogenic settings will not be sufficient to face PRS emerging challenges.


Subject(s)
Genetic Predisposition to Disease, Multifactorial Inheritance, Humans, Multifactorial Inheritance/genetics, Morals, Genetic Testing/ethics, Risk Assessment, Genetic Counseling/ethics, Risk Factors, Genetic Risk Score
4.
PLoS Genet ; 17(6): e1009534, 2021 06.
Article in English | MEDLINE | ID: mdl-34086673

ABSTRACT

Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)-rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). 
We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
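The difference between the fixed encodings and a data-driven heterozygous value can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the EDGE implementation: the abstract does not specify how the heterozygous value is derived, so the estimator below (the position of the heterozygote trait mean between the two homozygote means, for a quantitative trait) is a hypothetical stand-in, and the genotypes and trait values are invented.

```python
from statistics import mean

# Fixed encodings: values assigned to 0, 1 or 2 copies of the minor allele.
TRADITIONAL = {
    "additive": (0.0, 1.0, 2.0),
    "dominant": (0.0, 1.0, 1.0),
    "recessive": (0.0, 0.0, 1.0),
}


def encode(genotypes, values):
    """Map minor-allele counts (0, 1, 2) to the values of one encoding."""
    return [values[g] for g in genotypes]


def estimate_het_value(genotypes, trait):
    """Hypothetical estimator of the heterozygous value.

    Returns where the heterozygote mean lies between the two homozygote
    means: about 0 looks recessive, 0.5 additive, 1 dominant.
    Assumes all three genotype classes are present and the homozygote
    means differ (otherwise the ratio is undefined).
    """
    means = [mean(t for g, t in zip(genotypes, trait) if g == k) for k in (0, 1, 2)]
    return (means[1] - means[0]) / (means[2] - means[0])


def flexible_encode(genotypes, het_value):
    """Encode with homozygotes fixed at 0 and 1 and a free heterozygous value."""
    return encode(genotypes, (0.0, het_value, 1.0))


# Invented toy data in which heterozygotes resemble minor-allele homozygotes.
genotypes = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 1.9, 2.1, 2.0, 2.2]

het_value = estimate_het_value(genotypes, trait)
print(round(het_value, 3))
print(flexible_encode(genotypes, het_value))
```

The point of the sketch is that a single per-SNP value replaces the choice among three fixed encodings, which is how a flexible encoding can avoid testing every combination of encodings for each SNP pair.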


Subject(s)
Genetic Models, Cataract/genetics, Datasets as Topic, Type 2 Diabetes Mellitus/genetics, Gene Frequency, Genome-Wide Association Study, Glaucoma/genetics, Humans, Hypertension/genetics, Macular Degeneration/genetics, Phenotype, Single Nucleotide Polymorphism
5.
BMC Bioinformatics ; 23(1): 57, 2022 Feb 01.
Article in English | MEDLINE | ID: mdl-35105309

ABSTRACT

Genes and gene products do not function in isolation but as components of complex networks of macromolecules through physical or biochemical interactions. Dependencies of gene mutations on genetic background (i.e., epistasis) are believed to play a role in understanding molecular underpinnings of complex diseases such as inflammatory bowel disease (IBD). However, the process of identifying such interactions is complex due to for instance the curse of high dimensionality, dependencies in the data and non-linearity. Here, we propose a novel approach for robust and computationally efficient epistasis detection. We do so by first reducing dimensionality, per gene via diffusion kernel principal components (kpc). Subsequently, kpc gene summaries are used for downstream analysis including the construction of a gene-based epistasis network. We show that our approach is not only able to recover known IBD associated genes but also additional genes of interest linked to this difficult gastrointestinal disease.


Subject(s)
Genetic Epistasis, Genome-Wide Association Study, Diffusion, Gene Regulatory Networks, Single Nucleotide Polymorphism
6.
BMC Med Res Methodol ; 22(1): 62, 2022 03 06.
Article in English | MEDLINE | ID: mdl-35249534

ABSTRACT

BACKGROUND: Recent advances in biotechnology enable the acquisition of high-dimensional data on individuals, posing challenges for prediction models which traditionally use covariates such as clinical patient characteristics. Alternative forms of covariate representations for the features derived from these modern data modalities should be considered that can utilize their intrinsic interconnection. The connectivity information between these features can be represented as an individual-specific network defined by a set of nodes and edges, the strength of which can vary from individual to individual. Global or local graph-theoretical features describing the network may constitute potential prognostic biomarkers instead of or in addition to traditional covariates and may replace the often unsuccessful search for individual biomarkers in a high-dimensional predictor space. METHODS: We conducted a scoping review to identify, collate and critically appraise the state-of-art in the use of individual-specific networks for prediction modelling in medicine and applied health research, published during 2000-2020 in the electronic databases PubMed, Scopus and Embase. RESULTS: Our scoping review revealed the main application areas namely neurology and pathopsychology, followed by cancer research, cardiology and pathology (N = 148). Network construction was mainly based on Pearson correlation coefficients of repeated measurements, but also alternative approaches (e.g. partial correlation, visibility graphs) were found. For covariates measured only once per individual, network construction was mostly based on quantifying an individual's contribution to the overall group-level structure. 
Despite the multitude of identified methodological approaches for individual-specific network inference, the number of studies that were intended to enable the prediction of clinical outcomes for future individuals was quite limited, and most of the models served as proof of concept that network characteristics can in principle be useful for prediction. CONCLUSION: The current body of research clearly demonstrates the value of individual-specific network analysis for prediction modelling, but it has not yet been considered as a general tool outside the current areas of application. More methodological research is still needed on well-founded strategies for network inference, especially on adequate network sparsification and outcome-guided graph-theoretical feature extraction and selection, and on how networks can be exploited efficiently for prediction modelling.

7.
Int J Mol Sci ; 23(24)2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36555213

ABSTRACT

A reoccurring issue in neuroepigenomic studies, especially in the context of neurodegenerative disease, is the use of (heterogeneous) bulk tissue, which generates noise during epigenetic profiling. A workable solution to this issue is to quantify epigenetic patterns in individually isolated neuronal cells using laser capture microdissection (LCM). For this purpose, we established a novel approach for targeted DNA methylation profiling of individual genes that relies on a combination of LCM and limiting dilution bisulfite pyrosequencing (LDBSP). Using this approach, we determined cytosine-phosphate-guanine (CpG) methylation rates of single alleles derived from 50 neurons that were isolated from unfixed post-mortem brain tissue. In the present manuscript, we describe the general workflow and, as a showcase, demonstrate how targeted methylation analysis of various genes, in this case, RHBDF2, OXT, TNXB, DNAJB13, PGLYRP1, C3, and LMX1B, can be performed simultaneously. By doing so, we describe an adapted data analysis pipeline for LDBSP, allowing one to include and correct CpG methylation rates derived from multi-allele reactions. In addition, we show that the efficiency of LDBSP on DNA derived from LCM neurons is similar to the efficiency obtained in previously published studies using this technique on other cell types. Overall, the method described here provides the user with a more accurate estimation of the DNA methylation status of each target gene in the analyzed cell pools, thereby adding further validity to this approach.


Subject(s)
Neurodegenerative Diseases, Humans, DNA Sequence Analysis/methods, DNA Methylation, Brain, High-Throughput Nucleotide Sequencing, Lasers, Molecular Chaperones, Apoptosis Regulatory Proteins
8.
Brief Bioinform ; 20(6): 2200-2216, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30219892

ABSTRACT

Principal components (PCs) are widely used in statistics and refer to a relatively small number of uncorrelated variables derived from an initial pool of variables, while explaining as much of the total variance as possible. Also in statistical genetics, principal component analysis (PCA) is a popular technique. To achieve optimal results, a thorough understanding about the different implementations of PCA is required and their impact on study results, compared to alternative approaches. In this review, we focus on the possibilities, limitations and role of PCs in ancestry prediction, genome-wide association studies, rare variants analyses, imputation strategies, meta-analysis and epistasis detection. We also describe several variations of classic PCA that deserve increased attention in statistical genetics applications.


Subject(s)
Statistical Models, Principal Component Analysis, Animals, Humans
9.
Semin Cancer Biol ; 55: 53-60, 2019 04.
Article in English | MEDLINE | ID: mdl-29727703

ABSTRACT

Genome-wide association studies (GWAS) detect common genetic variants associated with complex disorders. With their comprehensive coverage of common single nucleotide polymorphisms and comparatively low cost, GWAS are an attractive tool in the clinical and commercial genetic testing. This review introduces the pipeline of statistical methods used in GWAS analysis, from data quality control, association tests, population structure control, interaction effects and results visualization, through to post-GWAS validation methods and related issues.


Subject(s)
Genetic Testing/statistics & numerical data, Genome-Wide Association Study/statistics & numerical data, Single Nucleotide Polymorphism/genetics, Genotype, Humans, Phenotype
10.
Hum Genet ; 139(1): 45-59, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31630246

ABSTRACT

Due to their long genetic evolutionary history, Africans exhibit more genetic variation than any other population in the world. Their genetic diversity further lends itself to subdivisions of Africans into groups of individuals with a genetic similarity of varying degrees of granularity. It remains challenging to detect fine-scale structure in a computationally efficient and meaningful way. In this paper, we present a proof-of-concept of a novel fine-scale population structure detection tool with Western African samples. These samples consist of 1396 individuals from 25 ethnic groups (two groups are African American descendants). The strategy is based on a recently developed tool called IPCAPS. IPCAPS, or Iterative Pruning to CApture Population Structure, is a genetic divisive clustering strategy that enhances iterative pruning PCA, is robust to outliers and does not require a priori computation of haplotypes. Our strategy identified 12 groups in total, 6 of which were revealed as fine-scale structure detected in the samples from Cameroon, Gambia, Mali, Southwest USA, and Barbados. Our findings helped to explain evolutionary processes in the analyzed West African samples and raise awareness of fine-scale structure resolution when conducting genome-wide association and interaction studies.


Subject(s)
Black People/genetics, Ethnicity/genetics, Genetic Variation, Population Genetics, Genome-Wide Association Study, Haplotypes, Software, Western Africa/ethnology, Humans
11.
J Clin Gastroenterol ; 54(9): 819-825, 2020 10.
Article in English | MEDLINE | ID: mdl-31789759

ABSTRACT

BACKGROUND AND GOALS: Active inflammatory bowel diseases (IBD) represent an independent risk factor for venous thromboembolism. The authors investigated the hemostatic profile of IBD patients before and after induction treatment with infliximab, vedolizumab, and methylprednisolone. STUDY: This prospective study included 62 patients with active IBD starting infliximab, vedolizumab, and/or methylprednisolone, and 22 healthy controls (HC). Plasma was collected before (w0) and after induction therapy (w14). Using a clot lysis assay, amplitude (marker for clot intensity), time to peak (Tmax; marker for clot formation rate), area under the curve (AUC; global marker for coagulation/fibrinolysis), and 50% clot lysis time (50%CLT; marker for fibrinolytic capacity) were determined. Plasminogen activator inhibitor-1 (PAI-1) and fibronectin were measured by ELISA. Clinical remission was evaluated at w14. RESULTS: At baseline, AUC, amplitude, and 50%CLT were significantly higher in IBD patients as compared with HC. In 34 remitters, AUC [165 (103-229)% vs. 97 (78-147)%, P=0.001], amplitude [119 (99-163)% vs. 95 (82-117)%, P=0.002], and 50%CLT [122 (94-146)% vs. 100 (87-129)%, P=0.001] decreased significantly and even normalized to the HC level. Vedolizumab trough concentration correlated inversely to fibronectin concentration (r, -0.732; P=0.002). The increase in Tmax for infliximab-treated remitters was significantly different from the decrease in Tmax for vedolizumab-treated remitters (P=0.028). The 50%CLT increased (P=0.038) when remitters were concomitantly treated with methylprednisolone. CONCLUSIONS: Control of inflammation using infliximab most strongly reduced those parameters that are associated with a higher risk of venous thromboembolism.


Subject(s)
Inflammatory Bowel Diseases, Thrombosis, Fibrinolysis, Humans, Inflammatory Bowel Diseases/drug therapy, Infliximab/adverse effects, Prospective Studies
12.
Am J Respir Crit Care Med ; 200(4): 444-453, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30973757

ABSTRACT

Rationale: Analysis of exhaled breath for asthma phenotyping using endogenously generated volatile organic compounds (VOCs) offers the possibility of noninvasive diagnosis and therapeutic monitoring. Induced sputum is indeed not widely available and markers of neutrophilic asthma are still lacking. Objectives: To determine whether analysis of exhaled breath using endogenously generated VOCs can be a surrogate marker for recognition of sputum inflammatory phenotypes. Methods: We conducted a prospective study on 521 patients with asthma recruited from the University Asthma Clinic of Liege. Patients underwent VOC measurement, fraction of exhaled nitric oxide (FeNO), spirometry, sputum induction, and gave a blood sample. Subjects with asthma were classified in three inflammatory phenotypes according to their sputum granulocytic cell count. Measurements and Main Results: In the discovery study, seven potential biomarkers were highlighted by gas chromatography-mass spectrometry in a training cohort of 276 patients with asthma. In the replication study (n = 245), we confirmed four VOCs of interest to discriminate among asthma inflammatory phenotypes using comprehensive two-dimensional gas chromatography coupled to high-resolution time-of-flight mass spectrometry. Hexane and 2-hexanone were identified as compounds with the highest classification performance in eosinophilic asthma with accuracy comparable to that of blood eosinophils and FeNO. Moreover, the combination of FeNO, blood eosinophils, and VOCs gave a very good prediction of eosinophilic asthma (area under the receiver operating characteristic curve, 0.9). For neutrophilic asthma, the combination of nonanal, 1-propanol, and hexane had a classification performance similar to FeNO or blood eosinophils in eosinophilic asthma. Those compounds were found in higher levels in neutrophilic asthma. Conclusions: Our study is the first attempt to characterize VOCs according to sputum granulocytic profile in a large population of patients with asthma and provide surrogate markers for neutrophilic asthma.


Subject(s)
Asthma/immunology, Eosinophilia/immunology, Eosinophils, Neutrophils, Sputum/cytology, Adult, Aged, Asthma/classification, Asthma/diagnosis, Asthma/metabolism, Breath Tests, Eosinophilia/metabolism, Female, Humans, Male, Middle Aged, Nitric Oxide/metabolism, Prospective Studies, Spirometry, Volatile Organic Compounds
14.
Qual Life Res ; 28(5): 1315-1325, 2019 May.
Article in English | MEDLINE | ID: mdl-30659449

ABSTRACT

PURPOSE: The inclusion of patient-reported outcome (PRO) questionnaires in prognostic factor analyses in oncology has substantially increased in recent years. We performed a simulation study to compare the performances of four different modeling strategies in estimating the prognostic impact of multiple collinear scales from PRO questionnaires. METHODS: We generated multiple scenarios describing survival data with different sample sizes, event rates and degrees of multicollinearity among five PRO scales. We used the Cox proportional hazards (PH) model to estimate the hazard ratios (HR) using automatic selection procedures, which were based on either the likelihood ratio-test (Cox-PV) or the Akaike Information Criterion (Cox-AIC). We also used Cox PH models which included all variables and were either penalized using the Ridge regression (Cox-R) or were estimated as usual (Cox-Full). For each scenario, we simulated 1000 independent datasets and compared the average outcomes of all methods. RESULTS: The Cox-R showed similar or better performances with respect to the other methods, particularly in scenarios with medium-high multicollinearity (ρ = 0.4 to ρ = 0.8) and small sample sizes (n = 100). Overall, the Cox-PV and Cox-AIC performed worse, for example they did not select one or more prognostic collinear PRO scales in some scenarios. Compared with the Cox-Full, the Cox-R provided HR estimates with similar bias patterns but smaller root-mean-squared errors, particularly in higher multicollinearity scenarios. CONCLUSIONS: Our findings suggest that the Cox-R is the best approach when performing prognostic factor analyses with multiple and collinear PRO scales, particularly in situations of high multicollinearity, small sample sizes and low event rates.


Subject(s)
Neoplasms/psychology, Neoplasms/therapy, Patient Reported Outcome Measures, Quality of Life/psychology, Adult, Aged, Aged 80 and over, Factor Analysis, Female, Humans, Male, Middle Aged, Prognosis, Proportional Hazards Models, Sample Size, Young Adult
15.
Genet Epidemiol ; 41(6): 567-573, 2017 09.
Article in English | MEDLINE | ID: mdl-28643332

ABSTRACT

Integrative analyses of several omics data are emerging. The data are usually generated from the same source material (i.e., tumor sample) representing one level of regulation. However, integrating different regulatory levels (i.e., blood) with those from tumor may also reveal important knowledge about the human genetic architecture. To model this multilevel structure, an integrative-expression quantitative trait loci (eQTL) analysis applying two-stage regression (2SR) was proposed. This approach first regressed tumor gene expression levels with tumor markers and the adjusted residuals from the previous model were then regressed with the germline genotypes measured in blood. Previously, we demonstrated that penalized regression methods in combination with a permutation-based MaxT method (Global-LASSO) is a promising tool to fix some of the challenges that high-throughput omics data analysis imposes. Here, we assessed whether Global-LASSO can also be applied when tumor and blood omics data are integrated. We further compared our strategy with two 2SR-approaches, one using multiple linear regression (2SR-MLR) and the other using LASSO (2SR-LASSO). We applied the three models to integrate genomic, epigenomic, and transcriptomic data from tumor tissue with blood germline genotypes from 181 individuals with bladder cancer included in the TCGA Consortium. Global-LASSO provided a larger list of eQTLs than the 2SR methods, identified previously reported eQTLs in prostate stem cell antigen (PSCA), and provided further clues on the complexity of APBEC3B loci, with a minimal false-positive rate not achieved by 2SR-MLR. It also represents an important contribution for omics integrative analysis because it is easy to apply and adaptable to any type of data.


Subject(s)
Genomics, Quantitative Trait Loci/genetics, Urinary Bladder Neoplasms/genetics, Human Chromosomes/genetics, Computer Simulation, Humans, Linear Models, Genetic Models, Multivariate Analysis, Single Nucleotide Polymorphism/genetics
16.
Genet Epidemiol ; 41(2): 136-144, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28019039

ABSTRACT

The vast amount of heterogeneous omics data, encompassing a broad range of biomolecular information, requires novel methods of analysis, including those that integrate the available levels of information. In this work, we describe Regression2Net, a computational approach that is able to integrate gene expression and genomic or methylation data in two steps. First, penalized regressions are used to build Expression-Expression (EEnet) and Expression-Genomic or Expression-Methylation (EMnet) networks. Second, network theory is used to highlight important communities of genes. When applying our approach, Regression2Net, to gene expression and methylation profiles for individuals with glioblastoma multiforme, we identified, respectively, 284 and 447 potentially interesting genes in relation to glioblastoma pathology. These genes showed at least one connection in the integrated networks ANDnet and XORnet derived from aforementioned EEnet and EMnet networks. Although the edges in ANDnet occur in both EEnet and EMnet, the edges in XORnet occur in EMnet but not in EEnet. In-depth biological analysis of connected genes in ANDnet and XORnet revealed genes that are related to energy metabolism, cell cycle control (AATF), immune system response, and several cancer types. Importantly, we observed significant overrepresentation of cancer-related pathways including glioma, especially in the XORnet network, suggesting a nonignorable role of methylation in glioblastoma multiforme. In the ANDnet, we furthermore identified potential glioma suppressor genes ACCN3 and ACCN4 linked to the NBPF1 neuroblastoma breakpoint family, as well as numerous ABC transporter genes (ABCA1, ABCB1) suggesting drug resistance of glioblastoma tumors.
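The ANDnet and XORnet definitions given in this abstract amount to two set operations on edge lists. The Python sketch below shows only that step, under simplifying assumptions: edges are treated as undirected gene pairs, the networks are tiny, and the gene labels G1 to G5 are hypothetical placeholders rather than genes from the study.

```python
def undirected(edges):
    """Store each edge as a frozenset so (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}


# Hypothetical toy networks; in the abstract these come from penalized regressions.
ee_net = undirected([("G1", "G2"), ("G2", "G3"), ("G3", "G4")])
em_net = undirected([("G2", "G1"), ("G3", "G4"), ("G4", "G5")])

# ANDnet: edges present in both EEnet and EMnet.
and_net = ee_net & em_net

# XORnet as defined in the abstract: edges in EMnet but not in EEnet.
# Despite the name, this is a one-sided set difference, not a symmetric one.
xor_net = em_net - ee_net

print(sorted(tuple(sorted(e)) for e in and_net))
print(sorted(tuple(sorted(e)) for e in xor_net))
```

Genes "with at least one connection" in either integrated network would then be the union of the endpoints of the corresponding edge set.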


Subject(s)
DNA Methylation, Neoplastic Gene Expression Regulation, Gene Regulatory Networks, Genomics/methods, Glioblastoma/genetics, Neoplasm Proteins/genetics, Computational Biology/methods, Glioblastoma/pathology, Humans
17.
Brief Bioinform ; 17(2): 293-308, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26108231

ABSTRACT

Complex diseases are defined to be determined by multiple genetic and environmental factors alone as well as in interactions. To analyze interactions in genetic data, many statistical methods have been suggested, with most of them relying on statistical regression models. Given the known limitations of classical methods, approaches from the machine-learning community have also become attractive. From this latter family, a fast-growing collection of methods emerged that are based on the Multifactor Dimensionality Reduction (MDR) approach. Since its first introduction, MDR has enjoyed great popularity in applications and has been extended and modified multiple times. Based on a literature search, we here provide a systematic and comprehensive overview of these suggested methods. The methods are described in detail, and the availability of implementations is listed. Most recent approaches offer to deal with large-scale data sets and rare variants, which is why we expect these methods to even gain in popularity.


Subject(s)
Algorithms, Statistical Models, Multifactor Dimensionality Reduction/methods, Automated Pattern Recognition/methods, Protein Interaction Mapping/methods, Computer Simulation
18.
PLoS Genet ; 11(12): e1005689, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26646822

ABSTRACT

Omics data integration is becoming necessary to investigate the genomic mechanisms involved in complex diseases. During the integration process, many challenges arise such as data heterogeneity, the smaller number of individuals in comparison to the number of parameters, multicollinearity, and interpretation and validation of results due to their complexity and lack of knowledge about biological processes. To overcome some of these issues, innovative statistical approaches are being developed. In this work, we propose a permutation-based method to concomitantly assess significance and correct by multiple testing with the MaxT algorithm. This was applied with penalized regression methods (LASSO and ENET) when exploring relationships between common genetic variants, DNA methylation and gene expression measured in bladder tumor samples. The overall analysis flow consisted of three steps: (1) SNPs/CpGs were selected per each gene probe within 1Mb window upstream and downstream the gene; (2) LASSO and ENET were applied to assess the association between each expression probe and the selected SNPs/CpGs in three multivariable models (SNP, CPG, and Global models, the latter integrating SNPs and CPGs); and (3) the significance of each model was assessed using the permutation-based MaxT method. We identified 48 genes whose expression levels were significantly associated with both SNPs and CPGs. Importantly, 36 (75%) of them were replicated in an independent data set (TCGA) and the performance of the proposed method was checked with a simulation study. We further support our results with a biological interpretation based on an enrichment analysis. The approach we propose allows reducing computational time and is flexible and easy to implement when analyzing several types of omics data. 
Our results highlight the importance of integrating omics data by applying appropriate statistical strategies to discover new insights into the complex genetic mechanisms involved in disease conditions.
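
The permutation-based MaxT step described in this abstract can be illustrated with a minimal sketch of single-step MaxT adjustment. This is not the authors' implementation: the abstract pairs MaxT with LASSO/ENET model fits, whereas the sketch below swaps in a much simpler statistic (absolute Pearson correlation per feature) so that it runs on the standard library alone. The function names `abs_corr` and `maxt_adjusted_pvalues`, the toy data, and the choice of 200 permutations are all assumptions made for illustration.

```python
import random
import statistics


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return abs(sxy) / (sxx * syy) ** 0.5


def maxt_adjusted_pvalues(features, y, n_perm=200, seed=0):
    """Single-step MaxT: compare each observed statistic with the
    permutation distribution of the maximum statistic over all features."""
    rng = random.Random(seed)
    observed = [abs_corr(col, y) for col in features]
    exceed = [0] * len(features)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any feature-outcome association
        max_stat = max(abs_corr(col, y_perm) for col in features)
        for j, t_obs in enumerate(observed):
            if max_stat >= t_obs:
                exceed[j] += 1
    # The +1 terms count the observed data as one permutation,
    # so an adjusted p-value can never be exactly zero.
    return [(1 + e) / (1 + n_perm) for e in exceed]


# Toy data (hypothetical): 5 features, 30 samples; only feature 0 drives y.
_rng = random.Random(1)
features = [[_rng.gauss(0, 1) for _ in range(30)] for _ in range(5)]
y = [2.0 * x + _rng.gauss(0, 0.1) for x in features[0]]
adj_p = maxt_adjusted_pvalues(features, y)
print([round(p, 3) for p in adj_p])
```

Because every observed statistic is compared with the maximum over all features in each permutation, a single pass yields p-values that are already adjusted for the number of features tested, which is the sense in which significance assessment and multiple-testing correction happen together.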


Subject(s)
DNA Methylation/genetics , Genetic Predisposition to Disease , Neoplasm Proteins/biosynthesis , Urinary Bladder Neoplasms/genetics , Algorithms , CpG Islands/genetics , Gene Expression Regulation, Neoplastic , Genomics , Humans , Neoplasm Proteins/genetics , Polymorphism, Single Nucleotide/genetics , Software , Urinary Bladder Neoplasms/pathology
19.
Gut ; 66(1): 79-88, 2017 01.
Article in English | MEDLINE | ID: mdl-26423113

ABSTRACT

OBJECTIVE: Pouchitis is the most common complication after colectomy with ileal pouch-anal anastomosis (IPAA) for UC and the risk is the highest within the 1st year after surgery. The pathogenesis is not completely understood but clinical response to antibiotics suggests a role for gut microbiota. We hypothesised that the risk for pouchitis can be predicted based on the faecal microbial composition before colectomy. DESIGN: Faecal samples from 21 patients with UC undergoing IPAA were prospectively collected before colectomy and at predefined clinical visits at 1 month, 3 months, 6 months and 12 months after IPAA. The predominant microbiota was analysed using community profiling with denaturing gradient gel electrophoresis followed by quantitative real-time PCR validation. RESULTS: Cluster analysis before colectomy distinguished patients with pouchitis from those with normal pouch during the 1st year of follow-up. In patients developing pouchitis, an increase of Ruminococcus gnavus (p<0.001), Bacteroides vulgatus (p=0.043), Clostridium perfringens (p=0.011) and a reduction of two Lachnospiraceae genera (Blautia (p=0.04), Roseburia (p=0.008)) were observed. A score combining these five bacterial risk factors was calculated and presence of at least two risk factors showed a sensitivity and specificity of 100% and 63.6%, respectively. CONCLUSIONS: Presence of R. gnavus, B. vulgatus and C. perfringens and absence of Blautia and Roseburia in faecal samples of patients with UC before surgery is associated with a higher risk of pouchitis after IPAA. Our findings suggest new predictive and therapeutic strategies in patients undergoing colectomy with IPAA.
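
The sensitivity and specificity reported for the "at least two risk factors" rule follow from a simple count-and-threshold classification, sketched below. The patient values are made-up toy data, not the study's measurements, and the function name `sensitivity_specificity` is hypothetical; the abstract does not describe how the score was computed beyond counting risk factors.

```python
def sensitivity_specificity(risk_counts, outcomes, threshold=2):
    """Classify a patient as high risk when the number of bacterial risk
    factors present is >= threshold, then score against true outcomes."""
    tp = fn = tn = fp = 0
    for count, has_pouchitis in zip(risk_counts, outcomes):
        predicted = count >= threshold
        if has_pouchitis:
            if predicted:
                tp += 1
            else:
                fn += 1
        else:
            if predicted:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): risk factors present per patient (0 to 5)
# and whether that patient developed pouchitis during follow-up.
risk_counts = [3, 2, 4, 0, 1, 2, 1]
outcomes = [True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(risk_counts, outcomes)
print(sens, spec)  # 1.0 0.75
```

In this toy set every pouchitis case meets the threshold (sensitivity 1.0), while one of the four unaffected patients is also flagged (specificity 0.75), the same pattern of high sensitivity and moderate specificity that the abstract reports.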


Subject(s)
Colitis, Ulcerative/microbiology , Colitis, Ulcerative/surgery , DNA, Bacterial/analysis , Feces/microbiology , Pouchitis/microbiology , Adult , Bacteroidetes/genetics , Bacteroidetes/isolation & purification , Clostridium perfringens/genetics , Clostridium perfringens/isolation & purification , Cluster Analysis , Colonic Pouches/adverse effects , Fatty Acids, Volatile/analysis , Feces/chemistry , Female , Gastrointestinal Microbiome/genetics , Humans , Leukocyte L1 Antigen Complex/analysis , Male , Middle Aged , Predictive Value of Tests , Preoperative Period , Proctocolectomy, Restorative/adverse effects , Prospective Studies , Ruminococcus/genetics , Ruminococcus/isolation & purification , Time Factors
20.
Genet Epidemiol ; 40(8): 767-778, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27870152

ABSTRACT

Gene regulatory network (GRN) inference is an active area of research that facilitates understanding the complex interplays between biological molecules. We propose a novel framework to create such GRNs, based on Conditional Inference Forests (CIFs) as proposed by Strobl et al. Our framework consists of using ensembles of Conditional Inference Trees (CITs) and selecting an appropriate aggregation scheme for variant selection prior to network construction. We show on synthetic microarray data that taking the original implementation of CIFs with conditional permutation scheme (CIFcond) may lead to improved performance compared to Breiman's implementation of Random Forests (RF). Among all newly introduced CIF-based methods and five network scenarios obtained from the DREAM4 challenge, CIFcond performed best. Networks derived from well-tuned CIFs, obtained by simply averaging P-values over tree ensembles (CIFmean), are particularly attractive, because they combine adequate performance with computational efficiency. Moreover, thresholds for variable selection are based on significance levels for P-values and, hence, do not need to be tuned. From a practical point of view, our extensive simulations show the potential advantages of CIFmean-based methods. Although more work is needed to improve on speed, especially when fully exploiting the advantages of CITs in the context of heterogeneous and correlated data, we have shown that CIF methodology can be flexibly inserted in a framework to infer biological interactions. Notably, we confirmed a biologically relevant interaction between IL2RA and FOXP1, linked to the IL-2 signaling pathway and to type 1 diabetes.
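
The CIFmean aggregation described in this abstract (averaging P-values over a tree ensemble, then selecting variables at a fixed significance level) can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes each tree has already produced one P-value per candidate regulator-target pair, and the function name `select_edges_by_mean_pvalue`, the gene labels, and the P-values are hypothetical.

```python
from statistics import fmean


def select_edges_by_mean_pvalue(tree_pvalues, alpha=0.05):
    """Average per-tree P-values for each candidate edge and keep the
    edges whose mean P-value falls below the significance level alpha."""
    selected = {}
    for edge, pvals in tree_pvalues.items():
        if not pvals:
            continue  # no tree evaluated this edge
        mean_p = fmean(pvals)
        if mean_p < alpha:
            selected[edge] = mean_p
    return selected


# Toy input (hypothetical): three candidate edges, three trees each.
tree_pvalues = {
    ("G1", "G2"): [0.01, 0.03, 0.02],
    ("G1", "G3"): [0.40, 0.10, 0.70],
    ("G2", "G3"): [0.04, 0.08, 0.06],
}
network_edges = select_edges_by_mean_pvalue(tree_pvalues)
print(sorted(network_edges))  # [('G1', 'G2')]
```

The cutoff is an ordinary significance level rather than a tuned hyperparameter, which matches the abstract's point that thresholds for variable selection do not need to be tuned.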


Subject(s)
Biomarkers/analysis , Diabetes Mellitus, Type 1/genetics , Forkhead Transcription Factors/genetics , Gene Expression Profiling , Gene Regulatory Networks , Interleukin-2 Receptor alpha Subunit/genetics , Models, Genetic , Repressor Proteins/genetics , Humans