1.
BMC Bioinformatics ; 25(1): 192, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750431

ABSTRACT

BACKGROUND: Researchers have long studied the regulatory processes of genes to uncover their functions. Gene regulatory network analysis is one of the popular approaches for understanding these processes, requiring accurate identification of interactions among the genes to establish the gene regulatory network. Advances in genome-wide association studies and expression quantitative trait loci studies have led to a wealth of genomic data, facilitating more accurate inference of gene-gene interactions. However, unknown confounding factors may influence these interactions, making their interpretation complicated. Mendelian randomization (MR) has emerged as a valuable tool for causal inference in genetics, addressing confounding effects by estimating causal relationships using instrumental variables. In this paper, we propose a new statistical method, MR-GGI, for accurately inferring gene-gene interactions using Mendelian randomization. RESULTS: MR-GGI treats one gene as the exposure and another as the outcome, using causal cis-single-nucleotide polymorphisms as instrumental variables in the inverse-variance weighted MR model. Through simulations, we have demonstrated MR-GGI's ability to control type I error and maintain statistical power despite confounding effects. MR-GGI performed the best when compared to other methods using the F1 score on the DREAM5 dataset. Additionally, when applied to yeast genomic data, MR-GGI successfully identified six clusters. Through gene ontology analysis, we have confirmed that each cluster groups genes with specific functions and thus plays a distinct functional role. CONCLUSION: These findings demonstrate that MR-GGI accurately infers gene-gene interactions despite the confounding effects in real biological environments.


Subject(s)
Mendelian Randomization Analysis; Polymorphism, Single Nucleotide; Genome-Wide Association Study/methods; Gene Regulatory Networks/genetics; Epistasis, Genetic/genetics; Quantitative Trait Loci; Humans; Saccharomyces cerevisiae/genetics
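
The inverse-variance weighted (IVW) estimator named in the abstract above can be sketched as follows. This is the textbook fixed-effect IVW formula, assuming independent instruments; it is not the MR-GGI implementation, and the function name and the example numbers are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-instrument summary statistics.

    Each instrument j contributes a Wald ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(
        bx * by / se**2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("instruments carry no information about the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical cis-SNP effects on the exposure gene and on the outcome gene.
bx = [0.30, 0.50, 0.40]
by = [0.15, 0.25, 0.20]
se = [0.05, 0.05, 0.05]
estimate, std_error = ivw_estimate(bx, by, se)
print(estimate, std_error)
```

In this toy input every instrument has the same Wald ratio, so the pooled estimate equals that ratio; with real data the weights decide how much each instrument counts.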
2.
Int J Mol Sci ; 22(6), 2021 Mar 22.
Article in English | MEDLINE | ID: mdl-33809961

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative neuromuscular disease. Although genome-wide association studies (GWAS) have successfully identified many variants significantly associated with ALS, it is still difficult to characterize the underlying biological mechanisms inducing ALS. In this study, we performed a transcriptome-wide association study (TWAS) to identify disease-specific genes in ALS. Using the largest ALS GWAS summary statistic (n = 80,610), we identified seven novel genes using 19 tissue reference panels. We conducted a conditional analysis to verify the genes' independence and to confirm that they are driven by genetically regulated expressions. Furthermore, we performed a TWAS-based enrichment analysis to highlight the association of important biological pathways, one in each of the four tissue reference panels. Finally, utilizing a connectivity map, a database of human cell expression profiles cultured with bioactive small molecules, we discovered functional associations between genes and drugs to identify 15 bioactive small molecules as potential drug candidates for ALS. We believe that, by integrating the largest ALS GWAS summary statistic with gene expression to identify new risk loci and causal genes, our study provides strong candidates for molecular basis experiments in ALS.


Subject(s)
Amyotrophic Lateral Sclerosis/genetics; Genetic Markers; Genetic Predisposition to Disease; Transcriptome; Amyotrophic Lateral Sclerosis/diagnosis; Amyotrophic Lateral Sclerosis/drug therapy; Biomarkers; Computational Biology/methods; Drug Development; Drug Repositioning; Gene Expression Profiling; Humans; Molecular Sequence Annotation; Molecular Targeted Therapy; Risk Assessment; Risk Factors; Workflow
3.
BMC Genomics ; 21(Suppl 10): 616, 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33208108

ABSTRACT

BACKGROUND: Regulatory hotspots are genetic variations that may regulate the expression levels of many genes. It has been of great interest to find those hotspots utilizing expression quantitative trait locus (eQTL) analysis. However, it has been reported that many of the findings are spurious hotspots induced by various unknown confounding factors. Recently, methods utilizing complicated statistical models have been developed that successfully identify genuine hotspots. Next-generation Intersample Correlation Emended (NICE) is one of the methods that show high sensitivity and low false-discovery rate in finding regulatory hotspots. Even though the methods successfully find genuine hotspots, they have not been widely used due to their non-user-friendly interfaces and complex running processes. Furthermore, most of the methods are impractical due to their prohibitively high computational complexity. RESULTS: To overcome the limitations of existing methods, we developed a fully automated web-based tool, referred to as NICER (NICE Renew), which is based on the NICE program. First, we dramatically reduced the burden of installing and running NICE. Second, we significantly reduced running time by incorporating multi-processing. Third, besides our web-based NICER, users can use NICER on Google Compute Engine and can readily install and run the NICER web service on their local computers. Finally, we provide different input formats and visualization tools to show results. Utilizing a yeast dataset, we show that NICER can be successfully used in an eQTL analysis to identify many genuine regulatory hotspots, more than half of which were previously reported elsewhere. CONCLUSIONS: Even though many hotspot analysis tools have been proposed, they have not been widely used for many practical reasons. NICER is a fully automated web-based solution for eQTL mapping and regulatory hotspot analysis. NICER provides a user-friendly interface and has made hotspot analysis more viable by reducing the running time significantly. We believe that NICER will become the method of choice for increasing the power of eQTL hotspot analysis.


Subject(s)
Quantitative Trait Loci; Saccharomyces cerevisiae; Chromosome Mapping; Internet; Models, Statistical; Saccharomyces cerevisiae/genetics
4.
Am J Hum Genet ; 100(5): 789-802, 2017 May 04.
Article in English | MEDLINE | ID: mdl-28475861

ABSTRACT

Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWASs and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4,152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4%-23% in eQTLs, 35% in GWASs of high-density lipoprotein (HDL), and 23% in GWASs of schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R² = 0.85, p = 2.2 × 10⁻¹⁶), indicating that statistical power prevents identification of AH in other loci. Understanding the extent of AH may guide the development of new methods for fine mapping and association mapping of complex traits.


Subject(s)
Alleles; Gene Frequency; Quantitative Trait Loci; Databases, Genetic; Genetic Association Studies; Humans; Linkage Disequilibrium; Models, Molecular; Phenotype
5.
Am J Hum Genet ; 99(6): 1245-1260, 2016 Dec 01.
Article in English | MEDLINE | ID: mdl-27866706

ABSTRACT

The vast majority of genome-wide association study (GWAS) risk loci fall in non-coding regions of the genome. One possible hypothesis is that these GWAS risk loci alter the individual's disease risk through their effect on gene expression in different tissues. In order to understand the mechanisms driving a GWAS risk locus, it is helpful to determine which gene is affected in specific tissue types. For example, the relevant gene and tissue could play a role in the disease mechanism if the same variant responsible for a GWAS locus also affects gene expression. Identifying whether or not the same variant is causal in both GWASs and expression quantitative trait locus (eQTL) studies is challenging because of the uncertainty induced by linkage disequilibrium and the fact that some loci harbor multiple causal variants. However, current methods that address this problem assume that each locus contains a single causal variant. In this paper, we present eCAVIAR, a probabilistic method that has several key advantages over existing methods. First, our method can account for more than one causal variant in any given locus. Second, it can leverage summary statistics without accessing the individual genotype data. We use both simulated and real datasets to demonstrate the utility of our method. Using publicly available eQTL data on 45 different tissues, we demonstrate that eCAVIAR can prioritize likely relevant tissues and target genes for a set of glucose- and insulin-related trait loci.


Subject(s)
Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Models, Genetic; Models, Statistical; Quantitative Trait Loci/genetics; Datasets as Topic; Gene Expression Regulation/genetics; Genotype; Glucose/metabolism; Humans; Insulin/metabolism; Linkage Disequilibrium; Organ Specificity; Probability; Sample Size
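
To make the idea of "the same variant being causal in both studies" concrete, the sketch below multiplies per-variant causal posterior probabilities from a GWAS and an eQTL study, which is one simple way to score colocalization if the two studies are assumed independent. The abstract does not state how eCAVIAR combines the two studies, so treat this as an illustration only; the posteriors, variant IDs, and function name are hypothetical.

```python
def colocalization_scores(post_gwas, post_eqtl):
    """Per-variant product of causal posteriors from two independent studies.

    Inputs map variant IDs to marginal posterior probabilities of being causal.
    When a locus may hold several causal variants, these marginals need not
    sum to 1 across variants.
    """
    if post_gwas.keys() != post_eqtl.keys():
        raise ValueError("both studies must cover the same variants")
    return {variant: post_gwas[variant] * post_eqtl[variant] for variant in post_gwas}


# Hypothetical posteriors for three variants at one locus.
post_gwas = {"rs1": 0.70, "rs2": 0.25, "rs3": 0.05}
post_eqtl = {"rs1": 0.60, "rs2": 0.10, "rs3": 0.30}
scores = colocalization_scores(post_gwas, post_eqtl)
best_variant = max(scores, key=scores.get)
print(best_variant, scores[best_variant])
```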
6.
Genome Res ; 25(10): 1558-69, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26260972

ABSTRACT

Genetics provides a potentially powerful approach to dissect host-gut microbiota interactions. Toward this end, we profiled gut microbiota using 16S rRNA gene sequencing in a panel of 110 diverse inbred strains of mice. This panel has previously been studied for a wide range of metabolic traits and can be used for high-resolution association mapping. Using a SNP-based approach with a linear mixed model, we estimated the heritability of microbiota composition. We conclude that, in a controlled environment, the genetic background accounts for a substantial fraction of the abundance of most common microbiota. The mice were previously studied for response to a high-fat, high-sucrose diet, and we hypothesized that the dietary response was determined in part by gut microbiota composition. We tested this using a cross-fostering strategy in which a strain showing a modest response, SWR, was seeded with microbiota from a strain showing a strong response, A×B19. Consistent with a role of microbiota in dietary response, the cross-fostered SWR pups exhibited a significantly increased response in weight gain. To examine specific microbiota contributing to the response, we identified various genera whose abundance correlated with dietary response. Among these, we chose Akkermansia muciniphila, a common anaerobe previously associated with metabolic effects. When A. muciniphila was administered to strain A×B19 by gavage, the dietary response was significantly blunted for obesity, plasma lipids, and insulin resistance. In an effort to further understand host-microbiota interactions, we mapped loci controlling microbiota composition and prioritized candidate genes. Our publicly available data provide a resource for future studies.


Subject(s)
Gastrointestinal Microbiome/genetics; Animals; Diet; Diet, High-Fat; Environment; Female; Genome-Wide Association Study; Heredity; Male; Mice; Mice, Inbred Strains; Obesity/microbiology; RNA, Ribosomal, 16S; Sucrose/metabolism
7.
Genome Res ; 24(4): 664-72, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24614977

ABSTRACT

The development of high-throughput genomic technologies has impacted many areas of genetic research. While many applications of these technologies focus on the discovery of genes involved in disease from population samples, applications of genomic technologies to an individual's genome or personal genomics have recently gained much interest. One such application is the identification of relatives from genetic data. In this application, genetic information from a set of individuals is collected in a database, and each pair of individuals is compared in order to identify genetic relatives. An inherent issue that arises in the identification of relatives is privacy. In this article, we propose a method for identifying genetic relatives without compromising privacy by taking advantage of novel cryptographic techniques customized for secure and private comparison of genetic information. We demonstrate the utility of these techniques by allowing a pair of individuals to discover whether or not they are related without compromising their genetic information or revealing it to a third party. The idea is that individuals only share enough special-purpose cryptographically protected information with each other to identify whether or not they are relatives, but not enough to expose any information about their genomes. We show in HapMap and 1000 Genomes data that our method can recover first- and second-order genetic relationships and, through simulations, show that our method can identify relationships as distant as third cousins while preserving privacy.


Subject(s)
Genetic Privacy; Genetic Research; Genome, Human; Family; Genomics; HapMap Project; Human Genome Project; Humans
8.
PLoS Genet ; 10(1): e1004022, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24415945

ABSTRACT

Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in high-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta-analysis approach is that our combined study has significantly higher power and improved resolution compared to any single study, which explains the large number of loci discovered in the combined study.


Subject(s)
Cholesterol, HDL/genetics; Gene-Environment Interaction; Quantitative Trait Loci/genetics; Animals; Environment; Genome; Mice; Models, Theoretical
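
The abstract's point that gene-by-environment interaction shows up as heterogeneity of effect sizes across studies can be illustrated with Cochran's Q, a standard heterogeneity statistic. This is a stand-in for illustration, not the authors' random-effects test; the function name and the per-study numbers are hypothetical.

```python
def cochran_q(effects, std_errors):
    """Return (pooled fixed-effect estimate, Cochran's Q, degrees of freedom).

    Under the null of no heterogeneity, Q is approximately chi-squared
    distributed with len(effects) - 1 degrees of freedom.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching inputs from at least two studies")
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, q, len(effects) - 1


# Same locus measured in three studies run under different environments.
std_errors = [0.1, 0.1, 0.1]
pooled_same, q_same, df = cochran_q([0.20, 0.20, 0.20], std_errors)
pooled_diff, q_diff, _ = cochran_q([0.50, 0.00, -0.50], std_errors)
print(q_same, q_diff, df)
```

The second locus has a pooled effect near zero, so a main-effect test would miss it, yet its large Q flags the environment-dependent effect.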
9.
Bioinformatics ; 30(12): i204-11, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24931985

ABSTRACT

MOTIVATION: High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach for the identification of genetic relatives collects the genomic data of all individuals and stores it in a database. Then, each pair of individuals is compared to detect the set of genetic relatives, and the matched individuals are informed. The main drawback of this approach is that individuals must share their genetic data with a trusted third party to perform the relatedness test. RESULTS: In this work, we propose a secure protocol to detect the genetic relatives from sequencing data while not exposing any information about their genomes. We assume that individuals have access to their genome sequences but do not want to share their genomes with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which provides the ability to detect much more distant relationships securely. We use simulated data generated from the 1000 Genomes data and illustrate that we can easily detect relatives as distant as fifth-degree cousins, which was not possible using the existing methods. We also show in the 1000 Genomes data with cryptic relationships that our method can detect these individuals. AVAILABILITY: The software is freely available for download at http://genetics.cs.ucla.edu/crypto/.


Subject(s)
Genetic Privacy; Genetic Variation; Genome, Human; Genomics/methods; Pedigree; Haplotypes; High-Throughput Nucleotide Sequencing; Humans
10.
bioRxiv ; 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38293199

ABSTRACT

Accurate identification of human leukocyte antigen (HLA) alleles is essential for various clinical and research applications, such as transplant matching and drug sensitivities. Recent advances in RNA-seq technology have made it possible to impute HLA types from sequencing data, spurring the development of a large number of computational HLA typing tools. However, the relative performance of these tools is unknown, limiting the ability for clinical and biomedical research to make informed choices regarding which tools to use. Here we report the study design of a comprehensive benchmarking of the performance of 12 HLA callers across 682 RNA-seq samples from 8 datasets with molecularly defined gold standard at 5 loci, HLA-A, -B, -C, -DRB1, and -DQB1. For each HLA typing tool, we will comprehensively assess their accuracy, compare default with optimized parameters, and examine for discrepancies in accuracy at the allele and loci levels. We will also evaluate the computational expense of each HLA caller measured in terms of CPU time and RAM. We also plan to evaluate the influence of read length over the HLA region on accuracy for each tool. Most notably, we will examine the performance of HLA callers across European and African groups, to determine discrepancies in accuracy associated with ancestry. We hypothesize that RNA-Seq HLA callers are capable of returning high-quality results, but the tools that offer a good balance between accuracy and computational expensiveness for all ancestry groups are yet to be developed. We believe that our study will provide clinicians and researchers with clear guidance to inform their selection of an appropriate HLA caller.
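
A minimal sketch of how allele-level accuracy against a gold standard might be scored for one locus. The abstract does not define the exact metric, so the choices here (comparison at two-field resolution, order-insensitive matching of the two alleles per sample) are assumptions, and the sample IDs, calls, and function names are hypothetical.

```python
from collections import Counter


def truncate(allele, fields=2):
    """Keep the first `fields` colon-separated fields, e.g. A*02:01:01 -> A*02:01."""
    return ":".join(allele.split(":")[:fields])


def allele_accuracy(predicted, truth, fields=2):
    """Fraction of gold-standard alleles recovered, ignoring allele order."""
    correct = total = 0
    for sample, true_alleles in truth.items():
        true_counts = Counter(truncate(a, fields) for a in true_alleles)
        pred_counts = Counter(truncate(a, fields) for a in predicted.get(sample, ()))
        # Multiset intersection counts each matched allele copy once.
        correct += sum((true_counts & pred_counts).values())
        total += sum(true_counts.values())
    return correct / total if total else float("nan")


# Hypothetical HLA-A calls for two samples.
truth = {"s1": ["A*02:01", "A*01:01"], "s2": ["A*03:01", "A*03:01"]}
predicted = {"s1": ["A*01:01:01", "A*02:01:01"], "s2": ["A*03:01", "A*24:02"]}
accuracy = allele_accuracy(predicted, truth)
print(accuracy)
```

Using a multiset rather than a set matters for homozygous samples such as `s2`, where a caller that reports only one correct copy should get credit for one allele, not two.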

11.
Sci Rep ; 14(1): 10105, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698020

ABSTRACT

Colorectal cancer (CRC) is one of the top five most common and life-threatening malignancies worldwide. Most CRC develops from advanced colorectal adenoma (ACA), a precancerous stage, through the adenoma-carcinoma sequence. However, its underlying mechanisms, including how the tumor microenvironment changes, remain elusive. Therefore, we conducted an integrative analysis comparing RNA-seq data collected from 40 ACA patients who visited Dongguk University Ilsan Hospital with normal adjacent colons and tumor samples from 18 CRC patients collected from a public database. Differential expression analysis identified 21 and 79 sequentially up- or down-regulated genes across the continuum, respectively. The functional centrality of the continuum genes was assessed through network analysis, identifying 11 up- and 13 down-regulated hub-genes. Subsequently, we validated the prognostic effects of hub-genes using the Kaplan-Meier survival analysis. To estimate the immunological transition of the adenoma-carcinoma sequence, single-cell deconvolution and immune repertoire analyses were conducted. Significant composition changes for innate immunity cells and decreased plasma B-cells with immunoglobulin diversity were observed, along with distinctive immunoglobulin recombination patterns. Taken together, we believe our findings suggest underlying transcriptional and immunological changes during the adenoma-carcinoma sequence, contributing to the further development of pre-diagnostic markers for CRC.


Subject(s)
Adenoma; Colorectal Neoplasms; Computational Biology; Gene Expression Regulation, Neoplastic; Humans; Colorectal Neoplasms/genetics; Colorectal Neoplasms/immunology; Colorectal Neoplasms/pathology; Adenoma/genetics; Adenoma/immunology; Adenoma/pathology; Republic of Korea; Computational Biology/methods; Male; Female; Tumor Microenvironment/genetics; Tumor Microenvironment/immunology; Prognosis; Middle Aged; Aged; Biomarkers, Tumor/genetics; Kaplan-Meier Estimate; Gene Expression Profiling
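
The Kaplan-Meier survival analysis mentioned in the abstract above rests on the product-limit estimator, sketched here in plain Python. The follow-up times and event flags are hypothetical and unrelated to the study's data.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per subject
    events: True if the event was observed at that time, False if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects that share this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            observed += int(data[i][1])
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


times = [5, 8, 8, 12, 15]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
print(curve)
```

Censored subjects never lower the curve directly; they only shrink the risk set for later event times.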
12.
Mamm Genome ; 23(9-10): 680-92, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22892838

ABSTRACT

We have developed an association-based approach using classical inbred strains of mice in which we correct for population structure, which is very extensive in mice, using an efficient mixed-model algorithm. Our approach includes inbred parental strains as well as recombinant inbred strains in order to capture loci with effect sizes typical of complex traits in mice (in the range of 5% of total trait variance). Over the last few years, we have typed the hybrid mouse diversity panel (HMDP) strains for a variety of clinical traits as well as intermediate phenotypes and have shown that the HMDP has sufficient power to map genes for highly complex traits with resolution that is in most cases less than a megabase. In this essay, we review our experience with the HMDP, describe various ongoing projects, and discuss how the HMDP may fit into the larger picture of common diseases and different approaches.


Subject(s)
Mice, Inbred Strains/genetics; Animals; Databases, Genetic; Mice
13.
Commun Biol ; 5(1): 615, 2022 Jun 22.
Article in English | MEDLINE | ID: mdl-35729261

ABSTRACT

Atopic dermatitis (AD) is one of the most common inflammatory skin diseases and significantly impacts quality of life. A transcriptome-wide association study (TWAS) was conducted to estimate both transcriptomic and genomic features of AD and detected significant associations between 31 expression quantitative trait loci and 25 genes. Our results replicated well-known genetic markers for AD, as well as 4 novel associated genes. Next, transcriptome meta-analysis was conducted with 5 studies retrieved from public databases and identified 5 additional novel susceptibility genes for AD. Applying the connectivity map to the results from TWAS and meta-analysis, robustly enriched perturbations were identified and their chemical or functional properties were analyzed. Here, we report the first research on integrative approaches for AD, combining TWAS and transcriptome meta-analysis. Together, our findings could provide a comprehensive understanding of the pathophysiologic mechanisms of AD and suggest potential drug candidates as alternative treatment options.


Subject(s)
Dermatitis, Atopic; Transcriptome; Dermatitis, Atopic/drug therapy; Dermatitis, Atopic/genetics; Dermatitis, Atopic/metabolism; Drug Repositioning; Genome-Wide Association Study/methods; Humans; Quality of Life
14.
PLoS One ; 17(9): e0274879, 2022.
Article in English | MEDLINE | ID: mdl-36174000

ABSTRACT

Uterine fibroid is one of the most prevalent benign tumors in women, with high socioeconomic costs. Although genome-wide association studies (GWAS) have identified several loci associated with uterine fibroid risks, they could not successfully interpret the biological effects of genomic variants at the gene expression levels. To prioritize uterine fibroid susceptibility genes that are biologically interpretable, we conducted a transcriptome-wide association study (TWAS) by integrating GWAS data of uterine fibroid and expression quantitative trait loci data. We identified nine significant TWAS genes including two novel genes, RP11-282O18.3 and KBTBD7, which may be causal genes for uterine fibroid. We conducted functional enrichment network analyses using the TWAS results to investigate the biological pathways in which the overall TWAS genes were involved. The results demonstrated the immune system process to be a key pathway in uterine fibroid pathogenesis. Finally, we carried out chemical-gene interaction analyses using the TWAS results and the comparative toxicogenomics database to determine the potential risk chemicals for uterine fibroid. We identified five toxic chemicals that were significantly associated with uterine fibroid TWAS genes, suggesting that they may be implicated in the pathogenesis of uterine fibroid. In this study, we performed an integrative analysis covering the broad application of bioinformatics approaches. Our study may provide a deeper understanding of uterine fibroid etiologies and useful information about potential risk chemicals for uterine fibroid.


Subject(s)
Leiomyoma; Transcriptome; Female; Genetic Markers; Genome-Wide Association Study; Humans; Leiomyoma/genetics; Toxicogenetics
15.
IEEE J Biomed Health Inform ; 26(12): 6150-6160, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36070258

ABSTRACT

Ion channels, which can be modulated by peptides, are promising drug targets for neurological, metabolic, and cardiovascular disorders. Because it is expensive and labor-intensive to experimentally screen ion channel-modulating peptides (IMPs), in-silico approaches can serve as excellent alternatives. In this study, we present PrIMP, prediction models for screening IMPs that can target sodium, potassium, and calcium ion channels, as well as nicotinic acetylcholine receptors (nAChRs). To overcome the data insufficiency of the IMPs, we utilized two types of knowledge transfer approaches: multi-task learning (MTL) and transfer learning (TL). MTL enabled model training for four target tasks simultaneously with hard parameter sharing, thereby increasing model generalization. TL transferred knowledge of pre-trained model weights from antimicrobial peptide data, which was a much larger, naturally-occurring functional peptide dataset that could potentially improve the model performance. MTL and TL successfully improved the prediction performance of the models. In addition, a hybrid approach implementing deep learning along with traditional machine learning was utilized, with additional performance improvements. PrIMP achieved F1 scores of 0.924 (sodium ion channel), 0.937 (potassium ion channel), 0.898 (calcium ion channel), and 0.931 (nAChRs). The pre-processed dataset and proposed model are available at https://github.com/bzlee-bio/PrIMP.


Subject(s)
Ion Channels; Machine Learning; Humans; Peptides
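
The F1 scores reported in the abstract above are the harmonic mean of precision and recall for a binary classifier. A small sketch with hypothetical labels (1 = peptide modulates the channel, 0 = it does not); this is the generic definition and says nothing about how PrIMP was evaluated beyond the metric name.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(score)
```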
16.
Genome Biol ; 22(1): 128, 2021 Apr 30.
Article in English | MEDLINE | ID: mdl-33931127

ABSTRACT

In standard genome-wide association studies (GWAS), the standard association test is underpowered to detect associations between loci with multiple causal variants with small effect sizes. We propose a statistical method, Model-based Association test Reflecting causal Status (MARS), that finds associations between variants in risk loci and a phenotype, considering the causal status of variants, only requiring the existing summary statistics to detect associated risk loci. Utilizing extensive simulated data and real data, we show that MARS increases the power of detecting true associated risk loci compared to previous approaches that consider multiple variants, while controlling the type I error.


Subject(s)
Alleles; Genetic Association Studies/methods; Genetic Heterogeneity; Models, Genetic; Models, Statistical; Algorithms; Genome-Wide Association Study/methods; Humans
17.
J Proteome Res ; 9(2): 1150-6, 2010 Feb 05.
Article in English | MEDLINE | ID: mdl-19908919

ABSTRACT

Shotgun proteomics using mass spectrometry (MS) has become the choice for large-scale peptide and protein identification. The recent development of high-resolution mass spectrometers such as FT-ICR or Orbitrap makes it possible to identify peptides within only a few parts per million (ppm), and it is expected to dramatically improve performance of peptide identification, as compared to low-resolution instruments. To fully exploit such significantly higher mass accuracy, however, appropriate data analysis methods are required. Here, we present a new target-decoy strategy, called Target-Decoy with Mass Binning, utilizing high mass accuracy for peptide identification validation, which remains a challenging problem in MS-based proteomics. When tested on various high-resolution MS data, our method was very effective and yet simple and showed comparable or better performance when compared with other validation methods.


Subject(s)
Mass Spectrometry/methods; Proteomics
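
Two quantities behind the abstract above can be sketched briefly: the mass error in parts per million, and the basic target-decoy estimate of the false discovery rate. The mass-binning refinement the authors propose is not reproduced here because the abstract does not describe it; the function names, scores, and masses are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def target_decoy_fdr(psms, score_threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    psms: list of (score, is_decoy) peptide-spectrum matches.
    """
    targets = sum(1 for score, is_decoy in psms if score >= score_threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= score_threshold and is_decoy)
    return decoys / targets if targets else 0.0


psms = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]
fdr = target_decoy_fdr(psms, score_threshold=7.0)
error = ppm_error(1000.005, 1000.0)
print(fdr, error)
```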
18.
J Comput Biol ; 26(11): 1203-1213, 2019 Nov.
Article in English | MEDLINE | ID: mdl-30272994

ABSTRACT

Genotype imputation has been widely utilized for two reasons in the analysis of genome-wide association studies (GWAS). One reason is to increase the power for association studies when causal single nucleotide polymorphisms are not collected in the GWAS. The second reason is to aid the interpretation of a GWAS result by predicting the association statistics at untyped variants. In this article, we show that prediction of association statistics at untyped variants that influence the trait is overly conservative. Current imputation methods assume that none of the variants in a region (a locus consists of multiple variants) affect the trait, which is often inconsistent with the observed data. In this article, we propose a new method, CAUSAL-Imp, which can impute the association statistics at untyped variants while taking into account variants in the region that may affect the trait. Our method builds on recent methods that impute the marginal statistics for GWAS by utilizing the fact that marginal statistics follow a multivariate normal distribution. We utilize both simulated and real data sets to assess the performance of our method. We show that traditional imputation approaches underestimate the association statistics for variants involved in the trait, and our results demonstrate that our approach provides less biased estimates of these association statistics.


Subject(s)
Genome-Wide Association Study/statistics & numerical data; Genome/genetics; Software; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics
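
The baseline that the abstract above builds on, imputing a marginal statistic from its multivariate normal relationship to typed variants, amounts to a conditional mean. The sketch below shows that baseline under a zero-mean assumption, which is the "no variant affects the trait" assumption the abstract criticizes; it is not CAUSAL-Imp. The LD values, z-scores, and function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("LD matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def impute_z(z_typed, ld_typed, ld_untyped_typed):
    """Conditional mean E[z_u | z_t] = Sigma_ut Sigma_tt^{-1} z_t for a zero-mean MVN."""
    weights = solve(ld_typed, z_typed)  # Sigma_tt^{-1} z_t
    return sum(r * w for r, w in zip(ld_untyped_typed, weights))


z_typed = [4.0, 2.0]                   # z-scores at two typed variants
ld_typed = [[1.0, 0.5], [0.5, 1.0]]    # LD (correlation) among typed variants
ld_untyped_typed = [0.8, 0.4]          # LD between the untyped and typed variants
z_imputed = impute_z(z_typed, ld_typed, ld_untyped_typed)
print(z_imputed)
```

The imputed value is shrunk toward zero relative to the strongest typed statistic, which is consistent with the abstract's remark that such estimates are conservative for variants that do affect the trait.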
19.
BMC Med Genomics ; 12(Suppl 5): 98, 2019 Jul 11.
Article in English | MEDLINE | ID: mdl-31296227

ABSTRACT

BACKGROUND: Dupuytren's disease (DD) is a fibroproliferative disorder characterized by thickening and contracting palmar fascia. The exact pathogenesis of DD remains unknown. RESULTS: In this study, we identified a co-expressed gene set (DD signature) consisting of 753 genes via weighted gene co-expression network analysis. To confirm the robustness of the DD signature, module enrichment analysis and meta-analysis were performed. Moreover, this signature effectively classified DD disease samples. The DD signature was significantly enriched in the unfolded protein response (UPR) related to endoplasmic reticulum (ER) stress. Next, we conducted multiple-phenotype regression analysis to identify trans-regulatory hotspots regulating expression levels of the DD signature using Genotype-Tissue Expression data. Finally, 10 trans-regulatory hotspots and 16 eGenes (genes that are significantly associated with at least one cis-eQTL) were identified. CONCLUSIONS: Among these eGenes, major histocompatibility complex class II genes and ZFP57 zinc finger protein were closely related to ER stress and UPR, suggesting that these genetic markers might be potential therapeutic targets for DD.


Subject(s)
Dupuytren Contracture/genetics; Gene Expression Profiling; Genetic Markers/genetics; Genomics; Animals; Gene Regulatory Networks; Humans
20.
Sci Rep ; 9(1): 3176, 2019 Feb 28.
Article in English | MEDLINE | ID: mdl-30816214

ABSTRACT

Characterization of protein structural changes in response to protein modifications, ligand or chemical binding, or protein-protein interactions is essential for understanding protein function and its regulation. Amide hydrogen/deuterium exchange (HDX) coupled with mass spectrometry (MS) is one of the most favorable tools for characterizing protein dynamics and changes of protein conformation. However, the analysis of HDX-MS data currently falls short of its full power because it still requires manual validation by mass spectrometry experts. In particular, with the advent of high-throughput technologies, the data size grows every day and an automated tool is essential for the analysis. Here, we introduce a fully automated software tool, referred to as 'deMix', for HDX-MS data analysis. deMix deals directly with the deuterated isotopic distributions rather than their centroid masses and is designed to be robust to random noise. In addition, unlike the existing approaches that can only determine a single state from an isotopic distribution, deMix can also detect a bimodal deuterated distribution, arising from EX1 behavior or heterogeneous peptides in conformational isomer proteins. Furthermore, deMix comes with visualization software to facilitate validation and representation of the analysis results.


Subject(s)
Hydrogen Deuterium Exchange-Mass Spectrometry/methods; Proteins/ultrastructure; Software; Protein Conformation; Proteins/chemistry