Results 1-20 of 31
1.
Bioinformatics ; 34(12): 2142-2143, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29420690

ABSTRACT

Summary: OmicsPrint is a versatile method for the detection of data linkage errors in multiple omics studies encompassing genetic, transcriptome and/or methylome data. OmicsPrint evaluates data linkage within and between omics data types using genotype calls from SNP arrays, DNA- or RNA-sequencing data and includes an algorithm to infer genotypes from Illumina DNA methylation array data. The method uses classification to verify assumed relationships and detect any data linkage errors, e.g. arising from sample mix-ups and mislabeling. Graphical and text output is provided to inspect and resolve putative data linkage errors. If sufficient genotype calls are available, first degree family relations also are revealed which can be used to check parent-offspring relations or zygosity in twin studies. Availability and implementation: omicsPrint is available from BioConductor; http://bioconductor.org/packages/omicsPrint. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Methylation, Genomics/methods, Information Storage and Retrieval/methods, Single Nucleotide Polymorphism, Software, Transcriptome, Algorithms, Female, Gene Expression Profiling/methods, Humans, Male, Oligonucleotide Array Sequence Analysis/methods, DNA Sequence Analysis/methods, RNA Sequence Analysis/methods
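The abstract above describes verifying assumed sample relationships from genotype calls. A minimal sketch of one way such a check could work, assuming genotypes coded as allele counts (0, 1, 2) and an identity-by-state (IBS) score of 2 - |x - y| per SNP. The function names and thresholds are hypothetical illustrations, not the omicsPrint API.

```python
from statistics import mean, pvariance

def ibs_summary(x, y):
    """Mean and variance of identity-by-state (IBS) scores for two samples.

    Assumption: genotypes are coded as allele counts 0, 1 or 2.
    """
    if len(x) != len(y) or not x:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    ibs = [2 - abs(a - b) for a, b in zip(x, y)]  # 2 = same genotype, 0 = opposite homozygotes
    return mean(ibs), pvariance(ibs)

def looks_identical(x, y, min_mean=1.9, max_var=0.1):
    """Flag a pair as 'same individual'. Thresholds are hypothetical."""
    m, v = ibs_summary(x, y)
    return m >= min_mean and v <= max_var

sample_a = [0, 1, 2, 1, 0, 2, 1, 1]
sample_b = [2, 1, 0, 1, 2, 0, 1, 1]
```

Two measurements of the same individual should give a mean IBS near 2 with variance near 0, while unrelated samples drift toward a lower mean and higher variance. A pair labeled as the same person that fails this check is a candidate mix-up.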
2.
Nature ; 501(7468): 506-11, 2013 Sep 26.
Article in English | MEDLINE | ID: mdl-24037378

ABSTRACT

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.


Subject(s)
Genetic Variation/genetics, Human Genome/genetics, High-Throughput Nucleotide Sequencing, RNA Sequence Analysis, Transcriptome/genetics, Alleles, Transformed Cell Line, Exons/genetics, Gene Expression Profiling, Humans, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics, Messenger RNA/analysis, Messenger RNA/genetics
3.
BMC Genomics ; 19(1): 90, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29370748

ABSTRACT

BACKGROUND: SNP panels that uniquely identify an individual are useful for genetic and forensic research. Previously recommended SNP panels are based on DNA profiles and mostly contain intragenic SNPs. With the increasing interest in RNA expression profiles, we aimed to establish a SNP panel for both DNA- and RNA-based genotyping. RESULTS: To determine a small set of SNPs with maximally discriminative power, genotype calls were obtained from DNA and blood-derived RNA sequencing data belonging to healthy, geographically dispersed, Dutch individuals. SNPs were selected based on criteria such as genotype call rate, minor allele frequency, Hardy-Weinberg equilibrium and linkage disequilibrium. A panel of 50 SNPs was sufficient to identify an individual uniquely: the probability of identity was 6.9 × 10^-20 when assuming no family relations and 1.2 × 10^-10 when accounting for the presence of full sibs. The ability of the SNP panel to uniquely identify individuals at the DNA and RNA level was validated in an independent population dataset. The panel is applicable to individuals of European descent, with slightly lower power in non-Europeans. Whereas most of the genes containing the 50 SNPs are expressed in various tissues, our SNP panel needs optimization for tissues other than blood. CONCLUSIONS: This first DNA/RNA SNP panel will be useful to identify sample mix-ups in biomedical research and for assigning DNA and RNA stains at crime scenes to unique individuals.


Subject(s)
DNA/analysis, Ethnicity/genetics, Population Genetics, Patient Identification Systems/methods, Single Nucleotide Polymorphism, RNA/analysis, DNA/genetics, DNA Fingerprinting, Gene Frequency, Genetic Testing, Genotype, High-Throughput Nucleotide Sequencing, Humans, Individuality, Linkage Disequilibrium, RNA/genetics
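The probability-of-identity figures in the abstract above can be illustrated with the standard population-genetics formulas for a biallelic SNP. This is a sketch under stated assumptions: Hardy-Weinberg equilibrium, independent SNPs, and hypothetical allele frequencies of 0.5. The authors' actual frequencies and computation are not given in the abstract.

```python
import math

def pi_unrelated(p):
    """Probability that two unrelated individuals share a genotype at one
    biallelic SNP with allele frequency p (Hardy-Weinberg assumed)."""
    q = 1.0 - p
    return p**4 + (2 * p * q) ** 2 + q**4

def pi_sibs(p):
    """Same probability for full sibs, using the commonly cited
    sib formula based on sums of squared and fourth-power frequencies."""
    q = 1.0 - p
    s2 = p**2 + q**2
    s4 = p**4 + q**4
    return 0.25 + 0.5 * s2 + 0.5 * s2**2 - 0.25 * s4

def panel_probability(freqs, per_snp):
    """Multiply per-SNP probabilities, assuming the SNPs are independent."""
    return math.prod(per_snp(p) for p in freqs)

# Hypothetical panel: 50 SNPs, each with allele frequency 0.5.
panel = [0.5] * 50
pi_panel = panel_probability(panel, pi_unrelated)
pi_panel_sibs = panel_probability(panel, pi_sibs)
```

With every SNP at the most informative frequency of 0.5, the per-SNP values are 0.375 (unrelated) and 0.59375 (sibs), so the panel values fall below the figures reported in the abstract. Real panels with lower minor allele frequencies give somewhat larger probabilities.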
4.
Biochim Biophys Acta Gen Subj ; 1862(3): 637-648, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29055820

ABSTRACT

BACKGROUND: Glycosylation is one of the most common post-translation modifications with large influences on protein structure and function. The effector function of immunoglobulin G (IgG) alters between pro- and anti-inflammatory, based on its glycosylation. IgG glycan synthesis is highly complex and dynamic. METHODS: With the use of two different analytical methods for assessing IgG glycosylation, we aim to elucidate the link between DNA methylation and glycosylation of IgG by means of epigenome-wide association studies. In total, 3000 individuals from 4 cohorts were analyzed. RESULTS: The overlap of the results from the two glycan measurement panels yielded DNA methylation of 7 CpG-sites on 5 genomic locations to be associated with IgG glycosylation: cg25189904 (chr.1, GNG12); cg05951221, cg21566642 and cg01940273 (chr.2, ALPPL2); cg05575921 (chr.5, AHRR); cg06126421 (6p21.33); and cg03636183 (chr.19, F2RL3). Mediation analyses with respect to smoking revealed that the effect of smoking on IgG glycosylation may be at least partially mediated via DNA methylation levels at these 7 CpG-sites. CONCLUSION: Our results suggest the presence of an indirect link between DNA methylation and IgG glycosylation that may in part capture environmental exposures. GENERAL SIGNIFICANCE: An epigenome-wide analysis conducted in four population-based cohorts revealed an association between DNA methylation and IgG glycosylation patterns. Presumably, DNA methylation mediates the effect of smoking on IgG glycosylation.


Subject(s)
DNA Methylation, Immunoglobulin G/chemistry, Post-Translational Protein Processing, Smoking/adverse effects, Chromosome Mapping, Cohort Studies, CpG Islands, Epigenomics/methods, Europe, Glycosylation, Humans, Immunoglobulin G/metabolism, Multicenter Studies as Topic, Polysaccharides/analysis, Twin Studies as Topic
5.
Bioinformatics ; 30(23): 3435-7, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25147358

ABSTRACT

UNLABELLED: The Illumina 450k array is a frequently used platform for large-scale genome-wide DNA methylation studies, i.e. epigenome-wide association studies. Currently, quality control of 450k data can be performed with Illumina's GenomeStudio and is part of a limited number of 450k analysis pipelines. However, GenomeStudio cannot handle large-scale studies, and existing pipelines provide limited options for quality control and do not support interactive exploration by the user. To make the detection of bad-quality samples in large-scale genome-wide DNA methylation studies as flexible and transparent as possible, we have developed MethylAid, a visual and interactive Web application using RStudio's shiny package. Bad-quality samples are detected using sample-dependent and sample-independent quality control probes present on the array and user-adjustable thresholds. In-depth exploration of bad-quality samples can be performed using several interactive diagnostic plots. Furthermore, plots can be annotated with user-provided metadata, for example, to identify outlying batches. Our new tool makes quality assessment of 450k array data interactive, flexible and efficient and is, therefore, expected to be useful for both data analysts and core facilities. AVAILABILITY AND IMPLEMENTATION: MethylAid is implemented as an R/Bioconductor package (www.bioconductor.org/packages/3.0/bioc/html/MethylAid.html). A demo application is available from shiny.bioexp.nl/MethylAid.


Subject(s)
DNA Methylation, Oligonucleotide Array Sequence Analysis/methods, Software, Genomics/methods, Humans, Internet, Oligonucleotide Array Sequence Analysis/standards, Quality Control
6.
Nucleic Acids Res ; 41(15): e146, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23771142

ABSTRACT

Current microRNA target predictions are based on sequence information and empirically derived rules but do not make use of the expression of microRNAs and their targets. This study aimed to improve microRNA target predictions in a given biological context, using in silico predictions, microRNA and mRNA expression. We used target prediction tools to produce lists of predicted targets and used a gene set test designed to detect consistent effects of microRNAs on the joint expression of multiple targets. In a single test, association between microRNA expression and target gene set expression as well as the contribution of the individual target genes on the association are determined. The strongest negatively associated mRNAs as measured by the test were prioritized. We applied our integration method to a well-defined muscle differentiation model. Validation of our predictions in C2C12 cells confirmed predicted targets of known as well as novel muscle-related microRNAs. We further studied associations between microRNA-mRNA pairs in human prostate cancer, finding some pairs that have been recently experimentally validated by others. Using the same study, we showed the advantages of the global test over Pearson correlation and lasso. We conclude that our integrated approach successfully identifies regulated microRNAs and their targets.


Subject(s)
Neoplastic Gene Expression Regulation, MicroRNAs/analysis, Skeletal Myoblasts/metabolism, Messenger RNA/analysis, Software, 3' Untranslated Regions, Algorithms, Animals, Cell Differentiation, Humans, Male, Mice, MicroRNAs/genetics, Skeletal Myoblasts/cytology, Prostatic Neoplasms/metabolism, Prostatic Neoplasms/pathology, Messenger RNA/genetics, Transcriptome
7.
Stat Appl Genet Mol Biol ; 12(4): 449-67, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23934609

ABSTRACT

In the design of microarray or next-generation sequencing experiments it is crucial to choose the appropriate number of biological replicates. Often the number of differentially expressed genes and their effect sizes are small, and too few replicates will lead to insufficient power to detect them. On the other hand, too many replicates lead to unnecessarily high experimental costs. Power and sample size analysis can guide experimentalists in choosing the appropriate number of biological replicates. Several methods for power and sample size analysis have recently been proposed for microarray data. However, most of these are restricted to two-group comparisons and require user-defined effect sizes. Here we propose a pilot-data-based method for power and sample size analysis which can handle more general experimental designs and uses pilot data to obtain estimates of the effect sizes. The method can also handle χ²-distributed test statistics, which enables power and sample size calculations for a much wider class of models, including high-dimensional generalized linear models which are used, e.g., for RNA-seq data analysis. The performance of the method is evaluated using simulated and experimental data from several microarray and next-generation sequencing experiments. Furthermore, we compare our proposed method for estimation of the density of effect sizes from pilot data with a recently proposed method specific to two-group comparisons.


Subject(s)
Gene Expression Profiling/methods, Algorithms, Animals, Statistical Data Interpretation, Genomics, High-Throughput Nucleotide Sequencing, Humans, Linear Models, Genetic Models, Oligonucleotide Array Sequence Analysis, Sample Size, RNA Sequence Analysis
8.
Stat Appl Genet Mol Biol ; 11(4), 2012 Jul 12.
Article in English | MEDLINE | ID: mdl-22850064

ABSTRACT

BACKGROUND: Among the most commonly applied microarray normalization methods are intensity-dependent normalization methods such as lowess or loess algorithms. Their computational complexity makes them slow and thus less suitable for normalization of large datasets. Current implementations try to circumvent this problem by using a random subset of the data for normalization, but the impact of this modification has not been previously assessed. We developed a novel intensity-dependent normalization method for microarrays that is fast, simple and can include weighting of observations. RESULTS: Our normalization method is based on the P-spline scatterplot smoother using all data points for normalization. We show that using a random subset of the data for normalization should be avoided as unstable results can be produced. However, in certain cases normalization based on an invariant subset is desirable, for example, when groups of samples before and after intervention are compared. We show in the context of DNA methylation arrays that a constant weighted P-spline normalization yields a more reliable normalization curve than the one obtained by normalization on the invariant subset only. CONCLUSIONS: Our novel intensity-dependent normalization method is simpler and faster than current loess algorithms, and can be applied to one- and two-colour array data, similar to normalization based on loess. AVAILABILITY: An implementation of the method is currently available as an R package called TurboNorm from www.bioconductor.org.


Subject(s)
High-Throughput Screening Assays/standards, Microarray Analysis/methods, Microarray Analysis/standards, Computational Biology/methods, Computational Biology/standards, High-Throughput Screening Assays/methods, Humans, Random Allocation, Reference Standards, Software, Time Factors, Validation Studies as Topic
9.
Nat Commun ; 14(1): 544, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36725846

ABSTRACT

Immune cell function can be altered by lipids in circulation, a process potentially relevant to lipid-associated inflammatory diseases including atherosclerosis and rheumatoid arthritis. To gain further insight in the molecular changes involved, we here perform a transcriptome-wide association analysis of blood triglycerides, HDL cholesterol, and LDL cholesterol in 3229 individuals, followed by a systematic bidirectional Mendelian randomization analysis to assess the direction of effects and control for pleiotropy. Triglycerides are found to induce transcriptional changes in 55 genes and HDL cholesterol in 5 genes. The function and cell-specific expression pattern of these genes implies that triglycerides downregulate both cellular lipid metabolism and, unexpectedly, allergic response. Indeed, a Mendelian randomization approach based on GWAS summary statistics indicates that several of these genes, including interleukin-4 (IL4) and IgE receptors (FCER1A, MS4A2), affect the incidence of allergic diseases. Our findings highlight the interplay between triglycerides and immune cells in allergic disease.


Subject(s)
Lipid Metabolism, Transcriptome, Humans, HDL Cholesterol, Lipid Metabolism/genetics, Triglycerides, LDL Cholesterol, Mendelian Randomization Analysis, Genome-Wide Association Study, Single Nucleotide Polymorphism, Risk Factors
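The Mendelian randomization step mentioned in the abstract above can be illustrated with the simplest single-instrument estimator, the Wald ratio. This is a sketch only: the abstract does not state which estimator the authors used, and the function name and input values are hypothetical.

```python
def wald_ratio(beta_zx, beta_zy, se_zy):
    """Single-instrument Mendelian randomization estimate.

    beta_zx: effect of the genetic instrument on the exposure (e.g. triglycerides)
    beta_zy: effect of the same instrument on the outcome (e.g. gene expression)
    se_zy:   standard error of beta_zy
    Returns the causal effect estimate and a first-order standard error that
    ignores uncertainty in beta_zx.
    """
    if beta_zx == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_zy / beta_zx
    se = se_zy / abs(beta_zx)
    return estimate, se

# Hypothetical summary statistics.
estimate, se = wald_ratio(beta_zx=0.5, beta_zy=0.1, se_zy=0.02)
```

Running the same calculation with exposure and outcome swapped, using instruments for the other trait, is what makes an analysis bidirectional.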
10.
Proteomics ; 12(4-5): 543-9, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22246801

ABSTRACT

Bioinformatics is the field where computational methods from various domains have come together for analysis of biological data. Each domain has introduced its own specific jargon. However, in closely related domains, e.g. machine learning and statistics, both concordant and discordant terminology occurs; the latter can lead to confusion. This article aims to help resolve the confusion of tongues arising from these two closely related domains, which are frequently used in bioinformatics. We provide a short summary of the most commonly applied machine learning and statistical approaches to data analysis in bioinformatics, i.e. classification and statistical hypothesis testing. We explain differences and similarities in common terminology used in various domains, such as precision, recall, sensitivity and true positive rate. This primer can serve as a guide to the terminology used in these fields.


Subject(s)
Artificial Intelligence, Computational Biology/methods, Proteomics/methods, Statistics as Topic/methods, Statistical Models
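The terminology overlap discussed in the abstract above is easy to see when the measures are computed from one confusion matrix. A minimal sketch with hypothetical counts; the function name is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Common machine-learning and statistics measures from confusion-matrix counts."""
    recall = tp / (tp + fn)          # machine-learning term
    precision = tp / (tp + fp)       # also called positive predictive value
    return {
        "recall": recall,
        "sensitivity": recall,           # statistics term for the same quantity
        "true_positive_rate": recall,    # ROC-analysis term for the same quantity
        "precision": precision,
        "false_discovery_rate": 1.0 - precision,
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Recall, sensitivity and true positive rate are three names for TP / (TP + FN), while precision is the complement of the false discovery rate used in multiple testing.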
11.
Clin Chem ; 58(4): 699-706, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22278607

ABSTRACT

BACKGROUND: Noninvasive fetal aneuploidy detection by use of free DNA from maternal plasma has recently been shown to be achievable by whole genome shotgun sequencing. The high-throughput next-generation sequencing platforms previously tested use a PCR step during sample preparation, which results in amplification bias in GC-rich areas of the human genome. To eliminate this bias, and thereby experimental noise, we have used single molecule sequencing as an alternative method. METHODS: For noninvasive trisomy 21 detection, we performed single molecule sequencing on the Helicos platform using free DNA isolated from maternal plasma from 9 weeks of gestation onwards. Relative sequence tag density ratios were calculated and results were directly compared to the previously described Illumina GAII platform. RESULTS: Sequence data generated without an amplification step show no GC bias. Therefore, with the use of single molecule sequencing all trisomy 21 fetuses could be distinguished more clearly from euploid fetuses. CONCLUSIONS: This study shows for the first time that single molecule sequencing is an attractive and easy to use alternative for reliable noninvasive fetal aneuploidy detection in diagnostics. With this approach, previously described experimental noise associated with PCR amplification, such as GC bias, can be overcome.


Subject(s)
DNA/genetics, Down Syndrome/diagnosis, DNA/blood, Female, Fetus, Humans, Male, Polymerase Chain Reaction, Pregnancy, First Pregnancy Trimester, Retrospective Studies, DNA Sequence Analysis/methods
12.
BMC Bioinformatics ; 11: 450, 2010 Sep 07.
Article in English | MEDLINE | ID: mdl-20822518

ABSTRACT

BACKGROUND: In high-dimensional data analysis such as differential gene expression analysis, people often use filtering methods like fold-change or variance filters in an attempt to reduce the multiple testing penalty and improve power. However, filtering may introduce a bias on the multiple testing correction. The precise amount of bias depends on many quantities, such as fraction of probes filtered out, filter statistic and test statistic used. RESULTS: We show that a biased multiple testing correction results if non-differentially expressed probes are not filtered out with equal probability from the entire range of p-values. We illustrate our results using both a simulation study and an experimental dataset, where the FDR is shown to be biased mostly by filters that are associated with the hypothesis being tested, such as the fold change. Filters that induce little bias on the FDR yield less additional power of detecting differentially expressed genes. Finally, we propose a statistical test that can be used in practice to determine whether any chosen filter introduces bias on the FDR estimate used, given a general experimental setup. CONCLUSIONS: Filtering out of probes must be used with care as it may bias the multiple testing correction. Researchers can use our test for FDR bias to guide their choice of filter and amount of filtering in practice.


Subject(s)
Computational Biology/methods, Gene Expression Profiling/methods, Algorithms, Humans, Leukemia/genetics
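To see why filtering interacts with the multiple testing correction described in the abstract above, it helps to look at how the number of tests enters an FDR procedure. The sketch below implements the Benjamini-Hochberg adjustment as a generic illustration. It is not the bias test proposed by the authors, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four probes.
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvals)
```

Each p-value is scaled by m / rank, so removing probes before testing lowers m and makes the surviving p-values look more significant. That gain is only legitimate when the filter removes null probes evenly across the p-value range, which is the condition the abstract describes.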
13.
Eur J Hum Genet ; 28(2): 253-263, 2020 02.
Article in English | MEDLINE | ID: mdl-31558840

ABSTRACT

Insights into individual differences in gene expression and its heritability (h²) can help in understanding pathways from DNA to phenotype. We estimated the heritability of gene expression of 52,844 genes measured in whole blood in the largest twin RNA-Seq sample to date (1497 individuals including 459 monozygotic twin pairs and 150 dizygotic twin pairs) from classical twin modeling and identity-by-state-based approaches. We estimated for each gene h²_total, composed of cis-heritability (h²_cis, the variance explained by single nucleotide polymorphisms in the cis-window of the gene), and trans-heritability (h²_res, the residual variance explained by all other genome-wide variants). Mean h²_total was 0.26, which was significantly higher than heritability estimates earlier found in a microarray-based study using largely overlapping (>60%) RNA samples (mean h² = 0.14, p = 6.15 × 10^-258). Mean h²_cis was 0.06 and strongly correlated with beta of the top cis expression quantitative trait loci (eQTL, ρ = 0.76, p < 10^-308) and with estimates from earlier RNA-Seq-based studies. Mean h²_res was 0.20 and correlated with the beta of the corresponding trans-eQTL (ρ = 0.04, p < 1.89 × 10^-3) and was significantly higher for genes involved in cytokine-cytokine interactions (p = 4.22 × 10^-15), many other immune system pathways, and genes identified in genome-wide association studies for various traits including behavioral disorders and cancer. This study provides a thorough characterization of cis- and trans-h² estimates of gene expression, which is of value for interpretation of GWAS and gene expression studies.


Subject(s)
Gene-Environment Interaction, Single Nucleotide Polymorphism, Heritable Quantitative Trait, Adolescent, Adult, Aged, Female, Genome-Wide Association Study/methods, Genotype, Humans, Male, Middle Aged, Quantitative Trait Loci, RNA-Seq/methods, Dizygotic Twins/genetics, Monozygotic Twins/genetics
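As a rough intuition for twin-based heritability in the abstract above, Falconer's classical approximation compares trait correlations in monozygotic and dizygotic pairs. This is a sketch only: the study's twin modeling is more elaborate than this formula, and the correlations below are hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Falconer's approximations from twin-pair correlations.

    r_mz: trait correlation within monozygotic pairs
    r_dz: trait correlation within dizygotic pairs
    Returns (h2, c2): heritability and shared-environment components.
    """
    h2 = 2.0 * (r_mz - r_dz)   # MZ pairs share about twice the segregating variation of DZ pairs
    c2 = 2.0 * r_dz - r_mz     # similarity not explained by additive genetics
    return h2, c2

# Hypothetical expression correlations for one gene.
h2, c2 = falconer_estimates(r_mz=0.50, r_dz=0.37)
```

With these made-up inputs the heritability estimate is 0.26. Any resemblance to the mean reported in the abstract is by construction of the example, not a reproduction of the analysis.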
14.
Genome Biol ; 21(1): 220, 2020 08 28.
Article in English | MEDLINE | ID: mdl-32859263

ABSTRACT

BACKGROUND: DNA methylation is a key epigenetic modification in human development and disease, yet there is limited understanding of its highly coordinated regulation. Here, we identify 818 genes that affect DNA methylation patterns in blood using large-scale population genomics data. RESULTS: By employing genetic instruments as causal anchors, we establish directed associations between gene expression and distant DNA methylation levels, while ensuring specificity of the associations by correcting for linkage disequilibrium and pleiotropy among neighboring genes. The identified genes are enriched for transcription factors, many of which consistently increased or decreased DNA methylation levels at multiple CpG sites. In addition, we show that a substantial number of transcription factors affected DNA methylation at their experimentally determined binding sites. We also observe genes encoding proteins with heterogeneous functions that have widespread effects on DNA methylation, e.g., NFKBIE, CDCA7(L), and NLRC5, and for several examples, we suggest plausible mechanisms underlying their effect on DNA methylation. CONCLUSION: We report hundreds of genes that affect DNA methylation and provide key insights into the principles underlying epigenetic regulation.


Subject(s)
DNA Methylation, Genetic Epigenesis, Genome-Wide Association Study, Endopeptidases/genetics, Gene Expression, Genetic Pleiotropy, Genomics, Humans, I-kappa B Proteins/genetics, Intracellular Signaling Peptides and Proteins/genetics, Nuclear Proteins/genetics, Proto-Oncogene Proteins/genetics, Repressor Proteins/genetics, Transcription Factors/genetics
15.
Carcinogenesis ; 30(10): 1805-12, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19696161

ABSTRACT

The carcinogenic potential of chemicals and pharmaceuticals is traditionally tested in the chronic, 2 year rodent bioassay. This assay is not only time consuming, expensive and often with a limited sensitivity and specificity but it also causes major distress to the experimental animals. A major improvement in carcinogenicity testing, especially regarding reduction and refinement of animal experimentation, could be the application of toxicogenomics. The ultimate aim of this study is to demonstrate a proof-of-principle for transcriptomics biomarkers in various tissues for identification of (subclasses of) carcinogenic compounds after short-term in vivo exposure studies. Both wild-type and DNA repair-deficient Xpa(-/-)/p53(+/-) (Xpa/p53) mice were exposed up to 14 days to compounds of three distinct classes: genotoxic carcinogens (GTXC), non-genotoxic carcinogens (NGTXC) and non-carcinogens. Subsequently, extensive transcriptomics analyses were performed on several tissues, and transcriptomics data were screened for potential biomarkers using advanced statistical learning techniques. For all tissues analyzed, we identified multigene gene-expression signatures that are, with a high confidence, predictive for GTXC and NGTXC exposures in both mouse genotypes. Xpa/p53 mice did not perform better in the short-term bioassay. We were able to achieve a proof-of-principle for the identification and use of transcriptomics biomarkers for GTXC or NGTXC. This supports the view that toxicogenomics with short-term in vivo exposure provides a viable tool for classifying (geno)toxic compounds.


Subject(s)
Carcinogens/toxicity, Gene Expression Profiling, Mutagens/toxicity, Tumor Suppressor Protein p53/genetics, Xeroderma Pigmentosum Group A Protein/genetics, Animals, DNA Repair/genetics, Neoplastic Gene Expression Regulation, Genetic Markers, Genotype, Mice, Inbred C57BL Mice, Knockout Mice, RNA/genetics, RNA/isolation & purification, Genetic Transcription, Tumor Suppressor Protein p53/deficiency
16.
Hum Mutat ; 29(1): 39-44, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17924574

ABSTRACT

The superfamily of human G protein-coupled receptors (GPCRs) is large and regulates a plethora of important physiological processes by transducing extracellular signals over cell membranes. A diversity of natural variants occurs in these receptors, including rare mutations and common polymorphisms. These variants differ in their impact on DNA, ranging from single nucleotide polymorphisms (SNPs) to copy number variants, and in their impact on protein function. Natural variants furthermore vary in their effects on human phenotypes from neutral to disease-associated. As mutation data are highly dispersed over numerous sources, a single resource for variants would aid investigators of GPCRs. The GPCR NaVa database therefore integrates data on natural variants in human GPCRs from online databases, the scientific literature, and patents. Where available, variants contain information on their location in the DNA (and protein sequence), the involved nucleotides (and amino acids), the average frequency of each allele, reported disease associations, and references to public databases and the scientific literature. The GPCR NaVa database aims to facilitate studies into pharmacogenetics, genotype-phenotype, and structure-function relationships of GPCRs. The GPCR NaVa database is interlinked with the family-specific GPCRDB resource and is accessible as a stand-alone database through a user-friendly website at http://nava.liacs.nl (last accessed 28 August 2007).


Subject(s)
Genetic Databases, Genetic Variation, G-Protein-Coupled Receptors/genetics, Protein Databases, Human Genome, Humans, Single Nucleotide Polymorphism, Software
17.
NPJ Sci Learn ; 3: 7, 2018.
Article in English | MEDLINE | ID: mdl-30631468

ABSTRACT

Educational attainment is a key behavioural measure in studies of cognitive and physical health, and socioeconomic status. We measured DNA methylation at 410,746 CpGs (N = 4152) and identified 58 CpGs associated with educational attainment at loci characterized by pleiotropic functions shared with neuronal, immune and developmental processes. Associations overlapped with those for smoking behaviour, but remained after accounting for smoking at many CpGs: Effect sizes were on average 28% smaller and genome-wide significant at 11 CpGs after adjusting for smoking and were 62% smaller in never smokers. We examined sources and biological implications of education-related methylation differences, demonstrating correlations with maternal prenatal folate, smoking and air pollution signatures, and associations with gene expression in cis, dynamic methylation in foetal brain, and correlations between blood and brain. Our findings show that the methylome of lower-educated people resembles that of smokers beyond effects of their own smoking behaviour and shows traces of various other exposures.

18.
Nat Commun ; 9(1): 3097, 2018 08 06.
Article in English | MEDLINE | ID: mdl-30082726

ABSTRACT

Identification of causal drivers behind regulatory gene networks is crucial in understanding gene function. Here, we develop a method for the large-scale inference of gene-gene interactions in observational population genomics data that are both directed (using local genetic instruments as causal anchors, akin to Mendelian Randomization) and specific (by controlling for linkage disequilibrium and pleiotropy). Analysis of genotype and whole-blood RNA-sequencing data from 3072 individuals identified 49 genes as drivers of downstream transcriptional changes (Wald P < 7 × 10^-10), among which transcription factors were overrepresented (Fisher's P = 3.3 × 10^-7). Our analysis suggests new gene functions and targets, including for SENP7 (zinc-finger genes involved in retroviral repression) and BCL2A1 (target genes possibly involved in auditory dysfunction). Our work highlights the utility of population genomics data in deriving directed gene expression networks. A resource of trans-effects for all 6600 genes with a genetic instrument can be explored individually using a web-based browser.


Subject(s)
Gene Regulatory Networks, Population Genetics, Metagenomics, Cohort Studies, Endopeptidases/genetics, Genetic Epistasis, Gene Expression, Gene Expression Profiling, Gene Expression Regulation, Genotype, Humans, Linkage Disequilibrium, Minor Histocompatibility Antigens/genetics, Phenotype, Proto-Oncogene Proteins c-bcl-2/genetics, RNA Sequence Analysis, Transcription Factors/genetics, Genetic Transcription, Transcriptome, Zinc Fingers
19.
Circ Genom Precis Med ; 11(9): e002030, 2018 09.
Article in English | MEDLINE | ID: mdl-30354327

ABSTRACT

BACKGROUND: Tobacco smoking is a major risk factor for atherosclerotic disease and has been associated with DNA methylation (DNAm) changes in blood cells. However, whether smoking influences DNAm in the diseased vascular wall is unknown but may prove crucial in understanding the pathophysiology of atherosclerosis. In this study, we associated current tobacco smoking with epigenome-wide DNAm in atherosclerotic plaques from patients undergoing carotid endarterectomy. METHODS: DNAm at commonly methylated sites (cytosine-guanine nucleotide pairs separated by a phospho-group [CpGs]) was assessed in atherosclerotic plaque samples and peripheral blood samples from 485 carotid endarterectomy patients. We tested the association of current tobacco smoking with DNAm corrected for age and sex. To control for bias and inflation because of cellular heterogeneity, we applied a Bayesian method to estimate an empirical null distribution as implemented by the R package bacon. Replication of the smoking-associated methylated CpGs in atherosclerotic plaques was performed in a second sample of 190 carotid endarterectomy patients, and results were meta-analyzed using a fixed-effects model. RESULTS: Tobacco smoking was significantly associated with differential DNAm at 4 CpGs in atherosclerotic lesions (false discovery rate <0.05), mapped to 2 different genes (AHRR, ITPK1), and at 17 CpGs in blood, mapped to 8 genes and RNAs. The strongest associations were found for CpGs mapped to the gene AHRR, a repressor of the aryl hydrocarbon receptor transcription factor involved in xenobiotic detoxification. One of these methylated CpGs was found to be regulated by local genetic variation. CONCLUSIONS: The risk factor tobacco smoking associates with DNAm at multiple loci in carotid atherosclerotic lesions. These observations support further investigation of the relationship between risk factors and epigenetic regulation in atherosclerotic disease.


Subject(s)
Atherosclerosis/genetics, Carotid Artery Diseases/genetics, DNA Methylation, Epigenomics/methods, Genome-Wide Association Study/methods, Smoking/adverse effects, Aged, Atherosclerosis/etiology, Carotid Artery Diseases/etiology, CpG Islands/genetics, Carotid Endarterectomy/methods, Carotid Endarterectomy/statistics & numerical data, Genetic Epigenesis, Female, Gene Expression Regulation, Humans, Male, Middle Aged, Atherosclerotic Plaque/etiology, Atherosclerotic Plaque/genetics
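The fixed-effects meta-analysis mentioned in the abstract above is commonly done by inverse-variance weighting. A minimal sketch of that standard approach, with hypothetical effect sizes for a discovery and a replication cohort. The abstract does not specify the exact software or weighting the authors used.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects estimate and its standard error."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se**2 for se in ses]       # more precise studies get more weight
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se

# Hypothetical smoking effects on methylation at one CpG in two cohorts.
beta, se = fixed_effects_meta(betas=[0.2, 0.4], ses=[0.1, 0.1])
z = beta / se
```

Because the two hypothetical studies are equally precise, the pooled effect is their simple average, and the pooled standard error shrinks by a factor of the square root of two.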
20.
Genome Biol ; 18(1): 19, 2017 01 27.
Article in English | MEDLINE | ID: mdl-28129774

ABSTRACT

We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.


Subject(s)
Bias, Genetic Epigenesis, Epigenomics/methods, Genome-Wide Association Study/methods, Transcriptome, Age Factors, Bayes Theorem, Epigenomics/standards, Humans, Meta-Analysis as Topic, Smoking
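The idea of an empirical null distribution in the abstract above can be illustrated with a much simpler stand-in than the Bayesian method the authors propose. The sketch below estimates bias as the median of the test statistics and inflation as their scaled median absolute deviation, then rescales. It is a rough robust approximation for intuition only, not the published method, and the z-scores are hypothetical.

```python
from statistics import median

MAD_TO_SD = 1.4826  # scales the MAD to a standard deviation under normality

def empirical_null(z_scores):
    """Robust estimates of bias (location) and inflation (scale) of test statistics."""
    bias = median(z_scores)
    mad = median(abs(z - bias) for z in z_scores)
    return bias, MAD_TO_SD * mad

def correct(z_scores):
    """Rescale statistics so the bulk follows a standard normal distribution."""
    bias, inflation = empirical_null(z_scores)
    return [(z - bias) / inflation for z in z_scores]

# Hypothetical z-scores that are shifted and too widely spread.
z = [1.0, 2.0, 3.0, 4.0, 5.0]
bias, inflation = empirical_null(z)
corrected = correct(z)
```

Under a well-calibrated analysis the bias would be near 0 and the inflation near 1. Values far from that suggest that uncorrected statistics would produce spurious findings.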