Search | VHL Search Portal

1.

MR-GGI: accurate inference of gene-gene interactions using Mendelian randomization.

Oh, Wonseok; Jung, Junghyun; Joo, Jong Wha J.

BMC Bioinformatics ; 25(1): 192, 2024 May 15.

Article in English | MEDLINE | ID: mdl-38750431

ABSTRACT

BACKGROUND: Researchers have long studied the regulatory processes of genes to uncover their functions. Gene regulatory network analysis is one of the popular approaches for understanding these processes, requiring accurate identification of interactions among the genes to establish the gene regulatory network. Advances in genome-wide association studies and expression quantitative trait loci studies have led to a wealth of genomic data, facilitating more accurate inference of gene-gene interactions. However, unknown confounding factors may influence these interactions, making their interpretation complicated. Mendelian randomization (MR) has emerged as a valuable tool for causal inference in genetics, addressing confounding effects by estimating causal relationships using instrumental variables. In this paper, we propose a new statistical method, MR-GGI, for accurately inferring gene-gene interactions using Mendelian randomization. RESULTS: MR-GGI applies one gene as the exposure and another as the outcome, using causal cis-single-nucleotide polymorphisms as instrumental variables in the inverse-variance weighted MR model. Through simulations, we have demonstrated MR-GGI's ability to control type I error and maintain statistical power despite confounding effects. MR-GGI performed the best when compared to other methods using the F1 score on the DREAM5 dataset. Additionally, when applied to yeast genomic data, MR-GGI successfully identified six clusters. Through gene ontology analysis, we have confirmed that each cluster in our study performs distinct functional roles by gathering genes with specific functions. CONCLUSION: These findings demonstrate that MR-GGI accurately infers gene-gene interactions despite the confounding effects in real biological environments.

Subjects

Mendelian Randomization Analysis; Polymorphism, Single Nucleotide; Genome-Wide Association Study/methods; Gene Regulatory Networks/genetics; Epistasis, Genetic/genetics; Quantitative Trait Loci; Humans; Saccharomyces cerevisiae/genetics
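
Illustrative note: the inverse-variance weighted (IVW) MR model named in the abstract above pools per-instrument effect estimates into a single causal estimate. The Python sketch below shows the standard fixed-effect IVW formula computed from summary statistics. It is not the MR-GGI implementation; the function name `ivw_estimate` and the example numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-instrument summary statistics.

    beta_exposure: instrument effects on the exposure (here, one gene)
    beta_outcome:  instrument effects on the outcome (here, another gene)
    se_outcome:    standard errors of the outcome effects
    """
    # Numerator: sum of bx * by / se_y^2 over instruments.
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    # Denominator: sum of bx^2 / se_y^2 (the total IVW weight).
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    std_error = (1.0 / denominator) ** 0.5
    return estimate, std_error


# Hypothetical summary statistics for three cis-SNP instruments.
estimate, std_error = ivw_estimate(
    beta_exposure=[0.5, 0.4, 0.6],
    beta_outcome=[0.10, 0.08, 0.12],
    se_outcome=[0.02, 0.02, 0.03],
)
print(estimate, std_error)
```

Each instrument contributes its ratio estimate (outcome effect divided by exposure effect) weighted by the precision of that ratio, so instruments with strong exposure effects and small outcome standard errors dominate the pooled value.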

2.

An Integrative Transcriptome-Wide Analysis of Amyotrophic Lateral Sclerosis for the Identification of Potential Genetic Markers and Drug Candidates.

Park, Sungmin; Kim, Daeun; Song, Jaeseung; Joo, Jong Wha J.

Int J Mol Sci ; 22(6)2021 Mar 22.

Article in English | MEDLINE | ID: mdl-33809961

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative neuromuscular disease. Although genome-wide association studies (GWAS) have successfully identified many variants significantly associated with ALS, it is still difficult to characterize the underlying biological mechanisms inducing ALS. In this study, we performed a transcriptome-wide association study (TWAS) to identify disease-specific genes in ALS. Using the largest ALS GWAS summary statistic (n = 80,610), we identified seven novel genes using 19 tissue reference panels. We conducted a conditional analysis to verify the genes' independence and to confirm that they are driven by genetically regulated expressions. Furthermore, we performed a TWAS-based enrichment analysis to highlight the association of important biological pathways, one in each of the four tissue reference panels. Finally, utilizing a connectivity map, a database of human cell expression profiles cultured with bioactive small molecules, we discovered functional associations between genes and drugs to identify 15 bioactive small molecules as potential drug candidates for ALS. We believe that, by integrating the largest ALS GWAS summary statistic with gene expression to identify new risk loci and causal genes, our study provides strong candidates for molecular basis experiments in ALS.

Subjects

Amyotrophic Lateral Sclerosis/genetics; Genetic Markers; Genetic Predisposition to Disease; Transcriptome; Amyotrophic Lateral Sclerosis/diagnosis; Amyotrophic Lateral Sclerosis/drug therapy; Biomarkers; Computational Biology/methods; Drug Development; Drug Repositioning; Gene Expression Profiling; Humans; Molecular Sequence Annotation; Molecular Targeted Therapy; Risk Assessment; Risk Factors; Workflow

3.

Fully automated web-based tool for identifying regulatory hotspots.

Choi, Ju Hun; Kim, Taegun; Jung, Junghyun; Joo, Jong Wha J.

BMC Genomics ; 21(Suppl 10): 616, 2020 Nov 18.

Article in English | MEDLINE | ID: mdl-33208108

ABSTRACT

BACKGROUND: Regulatory hotspots are genetic variations that may regulate the expression levels of many genes. It has been of great interest to find those hotspots utilizing expression quantitative trait locus (eQTL) analysis. However, it has been reported that many of the findings are spurious hotspots induced by various unknown confounding factors. Recently, methods utilizing complicated statistical models have been developed that successfully identify genuine hotspots. Next-generation Intersample Correlation Emended (NICE) is one of the methods that show high sensitivity and low false-discovery rate in finding regulatory hotspots. Even though the methods successfully find genuine hotspots, they have not been widely used due to their non-user-friendly interfaces and complex running processes. Furthermore, most of the methods are impractical due to their prohibitively high computational complexity. RESULTS: To overcome the limitations of existing methods, we developed a fully automated web-based tool, referred to as NICER (NICE Renew), which is based on the NICE program. First, we dramatically reduced the burden of installing and running NICE. Second, we significantly reduced running time by incorporating multi-processing. Third, besides our web-based NICER, users can use NICER on Google Compute Engine and can readily install and run the NICER web service on their local computers. Finally, we provide different input formats and visualization tools to show results. Utilizing a yeast dataset, we show that NICER can be successfully used in an eQTL analysis to identify many genuine regulatory hotspots, for which more than half of the hotspots were previously reported elsewhere. CONCLUSIONS: Even though many hotspot analysis tools have been proposed, they have not been widely used for many practical reasons. NICER is a fully automated web-based solution for eQTL mapping and regulatory hotspot analysis. NICER provides a user-friendly interface and has made hotspot analysis more viable by reducing the running time significantly. We believe that NICER will become the method of choice for increasing the power of eQTL hotspot analysis.

Subjects

Quantitative Trait Loci; Saccharomyces cerevisiae; Chromosome Mapping; Internet; Models, Statistical; Saccharomyces cerevisiae/genetics

4.

Widespread Allelic Heterogeneity in Complex Traits.

Hormozdiari, Farhad; Zhu, Anthony; Kichaev, Gleb; Ju, Chelsea J-T; Segrè, Ayellet V; Joo, Jong Wha J; Won, Hyejung; Sankararaman, Sriram; Pasaniuc, Bogdan; Shifman, Sagiv; Eskin, Eleazar.

Am J Hum Genet ; 100(5): 789-802, 2017 May 04.

Article in English | MEDLINE | ID: mdl-28475861

ABSTRACT

Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWASs and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4,152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4%-23% in eQTLs, 35% in GWASs of high-density lipoprotein (HDL), and 23% in GWASs of schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R² = 0.85, p = 2.2 × 10⁻¹⁶), indicating that statistical power prevents identification of AH in other loci. Understanding the extent of AH may guide the development of new methods for fine mapping and association mapping of complex traits.

Subjects

Alleles; Gene Frequency; Quantitative Trait Loci; Databases, Genetic; Genetic Association Studies; Humans; Linkage Disequilibrium; Models, Molecular; Phenotype

5.

Colocalization of GWAS and eQTL Signals Detects Target Genes.

Hormozdiari, Farhad; van de Bunt, Martijn; Segrè, Ayellet V; Li, Xiao; Joo, Jong Wha J; Bilow, Michael; Sul, Jae Hoon; Sankararaman, Sriram; Pasaniuc, Bogdan; Eskin, Eleazar.

Am J Hum Genet ; 99(6): 1245-1260, 2016 Dec 01.

Article in English | MEDLINE | ID: mdl-27866706

ABSTRACT

The vast majority of genome-wide association study (GWAS) risk loci fall in non-coding regions of the genome. One possible hypothesis is that these GWAS risk loci alter the individual's disease risk through their effect on gene expression in different tissues. In order to understand the mechanisms driving a GWAS risk locus, it is helpful to determine which gene is affected in specific tissue types. For example, the relevant gene and tissue could play a role in the disease mechanism if the same variant responsible for a GWAS locus also affects gene expression. Identifying whether or not the same variant is causal in both GWASs and expression quantitative trait locus (eQTL) studies is challenging because of the uncertainty induced by linkage disequilibrium and the fact that some loci harbor multiple causal variants. However, current methods that address this problem assume that each locus contains a single causal variant. In this paper, we present eCAVIAR, a probabilistic method that has several key advantages over existing methods. First, our method can account for more than one causal variant in any given locus. Second, it can leverage summary statistics without accessing the individual genotype data. We use both simulated and real datasets to demonstrate the utility of our method. Using publicly available eQTL data on 45 different tissues, we demonstrate that eCAVIAR can prioritize likely relevant tissues and target genes for a set of glucose- and insulin-related trait loci.

Subjects

Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Models, Genetic; Models, Statistical; Quantitative Trait Loci/genetics; Datasets as Topic; Gene Expression Regulation/genetics; Genotype; Glucose/metabolism; Humans; Insulin/metabolism; Linkage Disequilibrium; Organ Specificity; Probability; Sample Size

6.

Genetic and environmental control of host-gut microbiota interactions.

Org, Elin; Parks, Brian W; Joo, Jong Wha J; Emert, Benjamin; Schwartzman, William; Kang, Eun Yong; Mehrabian, Margarete; Pan, Calvin; Knight, Rob; Gunsalus, Robert; Drake, Thomas A; Eskin, Eleazar; Lusis, Aldons J.

Genome Res ; 25(10): 1558-69, 2015 Oct.

Article in English | MEDLINE | ID: mdl-26260972

ABSTRACT

Genetics provides a potentially powerful approach to dissect host-gut microbiota interactions. Toward this end, we profiled gut microbiota using 16S rRNA gene sequencing in a panel of 110 diverse inbred strains of mice. This panel has previously been studied for a wide range of metabolic traits and can be used for high-resolution association mapping. Using a SNP-based approach with a linear mixed model, we estimated the heritability of microbiota composition. We conclude that, in a controlled environment, the genetic background accounts for a substantial fraction of abundance of most common microbiota. The mice were previously studied for response to a high-fat, high-sucrose diet, and we hypothesized that the dietary response was determined in part by gut microbiota composition. We tested this using a cross-fostering strategy in which a strain showing a modest response, SWR, was seeded with microbiota from a strain showing a strong response, A×B19. Consistent with a role of microbiota in dietary response, the cross-fostered SWR pups exhibited a significantly increased response in weight gain. To examine specific microbiota contributing to the response, we identified various genera whose abundance correlated with dietary response. Among these, we chose Akkermansia muciniphila, a common anaerobe previously associated with metabolic effects. When administered to strain A×B19 by gavage, the dietary response was significantly blunted for obesity, plasma lipids, and insulin resistance. In an effort to further understand host-microbiota interactions, we mapped loci controlling microbiota composition and prioritized candidate genes. Our publicly available data provide a resource for future studies.

Subjects

Gastrointestinal Microbiome/genetics; Animals; Diet; Diet, High-Fat; Environment; Female; Genome-Wide Association Study; Heredity; Male; Mice; Mice, Inbred Strains; Obesity/microbiology; RNA, Ribosomal, 16S; Sucrose/metabolism

7.

Identifying genetic relatives without compromising privacy.

He, Dan; Furlotte, Nicholas A; Hormozdiari, Farhad; Joo, Jong Wha J; Wadia, Akshay; Ostrovsky, Rafail; Sahai, Amit; Eskin, Eleazar.

Genome Res ; 24(4): 664-72, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24614977

ABSTRACT

The development of high-throughput genomic technologies has impacted many areas of genetic research. While many applications of these technologies focus on the discovery of genes involved in disease from population samples, applications of genomic technologies to an individual's genome or personal genomics have recently gained much interest. One such application is the identification of relatives from genetic data. In this application, genetic information from a set of individuals is collected in a database, and each pair of individuals is compared in order to identify genetic relatives. An inherent issue that arises in the identification of relatives is privacy. In this article, we propose a method for identifying genetic relatives without compromising privacy by taking advantage of novel cryptographic techniques customized for secure and private comparison of genetic information. We demonstrate the utility of these techniques by allowing a pair of individuals to discover whether or not they are related without compromising their genetic information or revealing it to a third party. The idea is that individuals only share enough special-purpose cryptographically protected information with each other to identify whether or not they are relatives, but not enough to expose any information about their genomes. We show in HapMap and 1000 Genomes data that our method can recover first- and second-order genetic relationships and, through simulations, show that our method can identify relationships as distant as third cousins while preserving privacy.

Subjects

Genetic Privacy; Genetic Research; Genome, Human; Family; Genomics; HapMap Project; Human Genome Project; Humans

8.

Meta-analysis identifies gene-by-environment interactions as demonstrated in a study of 4,965 mice.

Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha J; Shih, Diana; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar.

PLoS Genet ; 10(1): e1004022, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24415945

ABSTRACT

Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in high-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta-analysis approach is that our combined study has significantly higher power and improved resolution compared to any single study, thus explaining the large number of loci discovered in the combined study.

Subjects

Cholesterol, HDL/genetics; Gene-Environment Interaction; Quantitative Trait Loci/genetics; Animals; Environment; Genome; Mice; Models, Theoretical
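
Illustrative note: the abstract above interprets gene-by-environment interactions as heterogeneity of effect sizes across studies. The Python sketch below computes two textbook heterogeneity quantities, Cochran's Q and the DerSimonian-Laird between-study variance, for one locus measured in several studies. It is a generic illustration and not the authors' test statistic; the function name `cochran_q` and the example inputs are hypothetical.

```python
def cochran_q(effects, std_errors):
    """Return (pooled_effect, Q, tau2) for per-study effects and standard errors."""
    # Inverse-variance weights, one per study.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    # Fixed-effect pooled estimate.
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    # DerSimonian-Laird estimate of between-study variance, floored at zero.
    k = len(effects)
    scale = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - (k - 1)) / scale) if k > 1 and scale > 0 else 0.0
    return pooled, q, tau2


# Hypothetical effect of one locus in two studies run under different diets.
pooled, q, tau2 = cochran_q(effects=[0.0, 2.0], std_errors=[1.0, 1.0])
print(pooled, q, tau2)
```

A large Q relative to its degrees of freedom (number of studies minus one) suggests that the locus has different effects in different environments, which is the signal the abstract describes.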

9.

Privacy preserving protocol for detecting genetic relatives using rare variants.

Hormozdiari, Farhad; Joo, Jong Wha J; Wadia, Akshay; Guan, Feng; Ostrosky, Rafail; Sahai, Amit; Eskin, Eleazar.

Bioinformatics ; 30(12): i204-11, 2014 Jun 15.

Article in English | MEDLINE | ID: mdl-24931985

ABSTRACT

MOTIVATION: High-throughput sequencing technologies have impacted many areas of genetic research. One such area is the identification of relatives from genetic data. The standard approach for the identification of genetic relatives collects the genomic data of all individuals and stores it in a database. Then, each pair of individuals is compared to detect the set of genetic relatives, and the matched individuals are informed. The main drawback of this approach is the requirement of sharing your genetic data with a trusted third party to perform the relatedness test. RESULTS: In this work, we propose a secure protocol to detect the genetic relatives from sequencing data while not exposing any information about their genomes. We assume that individuals have access to their genome sequences but do not want to share their genomes with anyone else. Unlike previous approaches, our approach uses both common and rare variants, which provides the ability to detect much more distant relationships securely. We use simulated data generated from the 1000 Genomes data and illustrate that we can easily detect relatives as distant as fifth-degree cousins, which was not possible using existing methods. We also show in the 1000 Genomes data with cryptic relationships that our method can detect these individuals. AVAILABILITY: The software is freely available for download at http://genetics.cs.ucla.edu/crypto/.

Subjects

Genetic Privacy; Genetic Variation; Genome, Human; Genomics/methods; Pedigree; Haplotypes; High-Throughput Nucleotide Sequencing; Humans

10.

A rigorous benchmarking of alignment-based HLA typing algorithms for RNA-seq data.

Yu, Dottie; Ayyala, Ram; Sadek, Sarah Hany; Chittampalli, Likhitha; Farooq, Hafsa; Jung, Junghyun; Nahid, Abdullah Al; Boldirev, Grigore; Jung, Mina; Park, Sungmin; Nguyen, Austin; Zelikovsky, Alex; Mancuso, Nicholas; Joo, Jong Wha J; Thompson, Reid F; Alachkar, Houda; Mangul, Serghei.

bioRxiv ; 2024 Jan 16.

Article in English | MEDLINE | ID: mdl-38293199

ABSTRACT

Accurate identification of human leukocyte antigen (HLA) alleles is essential for various clinical and research applications, such as transplant matching and drug sensitivities. Recent advances in RNA-seq technology have made it possible to impute HLA types from sequencing data, spurring the development of a large number of computational HLA typing tools. However, the relative performance of these tools is unknown, limiting the ability of clinical and biomedical researchers to make informed choices regarding which tools to use. Here we report the study design of a comprehensive benchmarking of the performance of 12 HLA callers across 682 RNA-seq samples from 8 datasets with molecularly defined gold standard at 5 loci, HLA-A, -B, -C, -DRB1, and -DQB1. For each HLA typing tool, we will comprehensively assess its accuracy, compare default with optimized parameters, and examine discrepancies in accuracy at the allele and locus levels. We will also evaluate the computational expense of each HLA caller measured in terms of CPU time and RAM. We also plan to evaluate the influence of read length over the HLA region on accuracy for each tool. Most notably, we will examine the performance of HLA callers across European and African groups, to determine discrepancies in accuracy associated with ancestry. We hypothesize that RNA-seq HLA callers are capable of returning high-quality results, but the tools that offer a good balance between accuracy and computational expense for all ancestry groups are yet to be developed. We believe that our study will provide clinicians and researchers with clear guidance to inform their selection of an appropriate HLA caller.

11.

Elucidating immunological characteristics of the adenoma-carcinoma sequence in colorectal cancer patients in South Korea using a bioinformatics approach.

Song, Jaeseung; Kim, Daeun; Jung, Junghyun; Choi, Eunyoung; Lee, Yubin; Jeong, Yeonbin; Lee, Byungjo; Lee, Sora; Shim, Yujeong; Won, Youngtae; Cho, Hyeki; Jang, Dong Kee; Kang, Hyoun Woo; Joo, Jong Wha J; Jang, Wonhee.

Sci Rep ; 14(1): 10105, 2024 05 02.

Article in English | MEDLINE | ID: mdl-38698020

ABSTRACT

Colorectal cancer (CRC) is one of the top five most common and life-threatening malignancies worldwide. Most CRC develops from advanced colorectal adenoma (ACA), a precancerous stage, through the adenoma-carcinoma sequence. However, its underlying mechanisms, including how the tumor microenvironment changes, remain elusive. Therefore, we conducted an integrative analysis comparing RNA-seq data collected from 40 ACA patients who visited Dongguk University Ilsan Hospital with normal adjacent colons and tumor samples from 18 CRC patients collected from a public database. Differential expression analysis identified 21 and 79 sequentially up- or down-regulated genes across the continuum, respectively. The functional centrality of the continuum genes was assessed through network analysis, identifying 11 up- and 13 down-regulated hub-genes. Subsequently, we validated the prognostic effects of hub-genes using the Kaplan-Meier survival analysis. To estimate the immunological transition of the adenoma-carcinoma sequence, single-cell deconvolution and immune repertoire analyses were conducted. Significant composition changes for innate immunity cells and decreased plasma B-cells with immunoglobulin diversity were observed, along with distinctive immunoglobulin recombination patterns. Taken together, we believe our findings suggest underlying transcriptional and immunological changes during the adenoma-carcinoma sequence, contributing to the further development of pre-diagnostic markers for CRC.

Subjects

Adenoma; Colorectal Neoplasms; Computational Biology; Gene Expression Regulation, Neoplastic; Humans; Colorectal Neoplasms/genetics; Colorectal Neoplasms/immunology; Colorectal Neoplasms/pathology; Adenoma/genetics; Adenoma/immunology; Adenoma/pathology; Republic of Korea; Computational Biology/methods; Male; Female; Tumor Microenvironment/genetics; Tumor Microenvironment/immunology; Prognosis; Middle Aged; Aged; Biomarkers, Tumor/genetics; Kaplan-Meier Estimate; Gene Expression Profiling

12.

Integrative transcriptome-wide analysis of atopic dermatitis for drug repositioning.

Song, Jaeseung; Kim, Daeun; Lee, Sora; Jung, Junghyun; Joo, Jong Wha J; Jang, Wonhee.

Commun Biol ; 5(1): 615, 2022 06 22.

Article in English | MEDLINE | ID: mdl-35729261

ABSTRACT

Atopic dermatitis (AD) is one of the most common inflammatory skin diseases and significantly impacts quality of life. A transcriptome-wide association study (TWAS) was conducted to estimate both transcriptomic and genomic features of AD and detected significant associations between 31 expression quantitative trait loci and 25 genes. Our results replicated well-known genetic markers for AD, as well as 4 novel associated genes. Next, transcriptome meta-analysis was conducted with 5 studies retrieved from public databases and identified 5 additional novel susceptibility genes for AD. Applying the connectivity map to the results from TWAS and meta-analysis, robustly enriched perturbations were identified and their chemical or functional properties were analyzed. Here, we report the first research on integrative approaches for AD, combining TWAS and transcriptome meta-analysis. Together, our findings could provide a comprehensive understanding of the pathophysiologic mechanisms of AD and suggest potential drug candidates as alternative treatment options.

Subjects

Dermatitis, Atopic; Transcriptome; Dermatitis, Atopic/drug therapy; Dermatitis, Atopic/genetics; Dermatitis, Atopic/metabolism; Drug Repositioning; Genome-Wide Association Study/methods; Humans; Quality of Life

13.

A transcriptome-wide association study of uterine fibroids to identify potential genetic markers and toxic chemicals.

Kim, Gayeon; Jang, Gyuyeon; Song, Jaeseung; Kim, Daeun; Lee, Sora; Joo, Jong Wha J; Jang, Wonhee.

PLoS One ; 17(9): e0274879, 2022.

Article in English | MEDLINE | ID: mdl-36174000

ABSTRACT

Uterine fibroid is one of the most prevalent benign tumors in women, with high socioeconomic costs. Although genome-wide association studies (GWAS) have identified several loci associated with uterine fibroid risks, they could not successfully interpret the biological effects of genomic variants at the gene expression levels. To prioritize uterine fibroid susceptibility genes that are biologically interpretable, we conducted a transcriptome-wide association study (TWAS) by integrating GWAS data of uterine fibroid and expression quantitative trait loci data. We identified nine significant TWAS genes including two novel genes, RP11-282O18.3 and KBTBD7, which may be causal genes for uterine fibroid. We conducted functional enrichment network analyses using the TWAS results to investigate the biological pathways in which the overall TWAS genes were involved. The results demonstrated the immune system process to be a key pathway in uterine fibroid pathogenesis. Finally, we carried out chemical-gene interaction analyses using the TWAS results and the Comparative Toxicogenomics Database to determine the potential risk chemicals for uterine fibroid. We identified five toxic chemicals that were significantly associated with uterine fibroid TWAS genes, suggesting that they may be implicated in the pathogenesis of uterine fibroid. In this study, we performed an integrative analysis covering the broad application of bioinformatics approaches. Our study may provide a deeper understanding of uterine fibroid etiologies and useful information about potential risk chemicals for uterine fibroid.

Subjects

Leiomyoma; Transcriptome; Female; Genetic Markers; Genome-Wide Association Study; Humans; Leiomyoma/genetics; Toxicogenetics

14.

Prediction Models for Identifying Ion Channel-Modulating Peptides via Knowledge Transfer Approaches.

Lee, Byungjo; Shin, Min Kyoung; Kim, Taegun; Shim, Yu Jeong; Joo, Jong Wha J; Sung, Jung-Suk; Jang, Wonhee.

IEEE J Biomed Health Inform ; 26(12): 6150-6160, 2022 12.

Article in English | MEDLINE | ID: mdl-36070258

ABSTRACT

Ion channels, which can be modulated by peptides, are promising drug targets for neurological, metabolic, and cardiovascular disorders. Because it is expensive and labor-intensive to experimentally screen ion channel-modulating peptides (IMPs), in-silico approaches can serve as excellent alternatives. In this study, we present PrIMP, prediction models for screening IMPs that can target sodium, potassium, and calcium ion channels, as well as nicotinic acetylcholine receptors (nAChRs). To overcome the data insufficiency of the IMPs, we utilized two types of knowledge transfer approaches: multi-task learning (MTL) and transfer learning (TL). MTL enabled model training for four target tasks simultaneously with hard parameter sharing, thereby increasing model generalization. TL transferred knowledge of pre-trained model weights from antimicrobial peptide data, which was a much larger, naturally-occurring functional peptide dataset that could potentially improve the model performance. MTL and TL successfully improved the prediction performance of the models. In addition, a hybrid approach combining deep learning with traditional machine learning was utilized, yielding additional performance improvements. PrIMP achieved F1 scores of 0.924 (sodium ion channel), 0.937 (potassium ion channel), 0.898 (calcium ion channel), and 0.931 (nAChRs). The pre-processed dataset and proposed model are available at https://github.com/bzlee-bio/PrIMP.

Subjects

Ion Channels; Machine Learning; Humans; Peptides

15.

Target-Decoy with Mass Binning: a simple and effective validation method for shotgun proteomics using high resolution mass spectrometry.

Joo, Jong Wha J; Na, Seungjin; Baek, Je-Hyun; Lee, Cheolju; Paek, Eunok.

J Proteome Res ; 9(2): 1150-6, 2010 Feb 05.

Article in English | MEDLINE | ID: mdl-19908919

ABSTRACT

Shotgun proteomics using mass spectrometry (MS) has become the choice for large-scale peptide and protein identification. The recent development of high-resolution mass spectrometers such as FT-ICR or Orbitrap makes it possible to identify peptides within only a few parts per million (ppm), and it is expected to dramatically improve performance of peptide identification, as compared to low-resolution instruments. To fully exploit such significantly higher mass accuracy, however, appropriate data analysis methods are required. Here, we present a new target-decoy strategy, called Target-Decoy with Mass Binning, utilizing high mass accuracy for peptide identification validation, which remains a challenging problem in MS-based proteomics. When tested on various high-resolution MS data, our method was very effective and yet simple and showed comparable or better performance when compared with other validation methods.

Subjects

Mass Spectrometry/methods; Proteomics
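
Illustrative note: the target-decoy strategy mentioned in the abstract above estimates the false discovery rate (FDR) of peptide identifications by searching spectra against both real (target) and reversed or shuffled (decoy) sequences. The Python sketch below shows only the basic estimate, decoy hits divided by target hits above a score threshold. It does not reproduce the mass-binning step of the published method; the function name `target_decoy_fdr` and the example scores are hypothetical.

```python
def target_decoy_fdr(psms, threshold):
    """Estimate FDR among target matches scoring at or above the threshold.

    psms: iterable of (score, is_decoy) pairs, one per peptide-spectrum match.
    """
    psms = list(psms)  # allow any iterable, since we scan it twice
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    # Decoy hits approximate the number of false target hits.
    return decoys / targets if targets else 0.0


# Hypothetical peptide-spectrum matches: (score, is_decoy).
matches = [(10, False), (9, False), (8, True), (7, False), (6, False)]
print(target_decoy_fdr(matches, threshold=6))
```

Raising the threshold trades fewer accepted identifications for a lower estimated FDR, so in practice the threshold is chosen as the lowest score whose estimate stays under the desired rate.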

16.

Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies.

Wu, Yue; Hormozdiari, Farhad; Joo, Jong Wha J; Eskin, Eleazar.

J Comput Biol ; 26(11): 1203-1213, 2019 11.

Article in English | MEDLINE | ID: mdl-30272994

ABSTRACT

Genotype imputation has been widely utilized for two reasons in the analysis of genome-wide association studies (GWAS). One reason is to increase the power for association studies when causal single nucleotide polymorphisms are not collected in the GWAS. The second reason is to aid the interpretation of a GWAS result by predicting the association statistics at untyped variants. In this article, we show that prediction of association statistics at untyped variants that have an influence on the trait is overly conservative. Current imputation methods assume that none of the variants in a region (a locus consists of multiple variants) affect the trait, which is often inconsistent with the observed data. In this article, we propose a new method, CAUSAL-Imp, which can impute the association statistics at untyped variants while taking into account variants in the region that may affect the trait. Our method builds on recent methods that impute the marginal statistics for GWAS by utilizing the fact that marginal statistics follow a multivariate normal distribution. We utilize both simulated and real data sets to assess the performance of our method. We show that traditional imputation approaches underestimate the association statistics for variants involved in the trait, and our results demonstrate that our approach provides less biased estimates of these association statistics.

Subjects

Genome-Wide Association Study/statistics & numerical data , Genome/genetics , Software , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics

17.

Integrative genomic and transcriptomic analysis of genetic markers in Dupuytren's disease.

Jung, Junghyun; Kim, Go Woon; Lee, Byungjo; Joo, Jong Wha J; Jang, Wonhee.

BMC Med Genomics ; 12(Suppl 5): 98, 2019 07 11.

Article in English | MEDLINE | ID: mdl-31296227

ABSTRACT

BACKGROUND: Dupuytren's disease (DD) is a fibroproliferative disorder characterized by thickening and contraction of the palmar fascia. The exact pathogenesis of DD remains unknown. RESULTS: In this study, we identified a co-expressed gene set (the DD signature) consisting of 753 genes via weighted gene co-expression network analysis. To confirm the robustness of the DD signature, module enrichment analysis and meta-analysis were performed. Moreover, this signature effectively classified DD disease samples. The DD signature was significantly enriched in the unfolded protein response (UPR) related to endoplasmic reticulum (ER) stress. Next, we conducted multiple-phenotype regression analysis to identify trans-regulatory hotspots regulating the expression levels of the DD signature using Genotype-Tissue Expression data. Finally, 10 trans-regulatory hotspots and 16 eGenes (genes that are significantly associated with at least one cis-eQTL) were identified. CONCLUSIONS: Among these eGenes, major histocompatibility complex class II genes and ZFP57 zinc finger protein were closely related to ER stress and UPR, suggesting that these genetic markers might be potential therapeutic targets for DD.

Subjects

Dupuytren Contracture/genetics , Gene Expression Profiling , Genetic Markers/genetics , Genomics , Animals , Gene Regulatory Networks , Humans

18.

deMix: Decoding Deuterated Distributions from Heterogeneous Protein States via HDX-MS.

Na, Seungjin; Lee, Jae-Jin; Joo, Jong Wha J; Lee, Kong-Joo; Paek, Eunok.

Sci Rep ; 9(1): 3176, 2019 02 28.

Article in English | MEDLINE | ID: mdl-30816214

ABSTRACT

Characterization of protein structural changes in response to protein modifications, ligand or chemical binding, or protein-protein interactions is essential for understanding protein function and its regulation. Amide hydrogen/deuterium exchange (HDX) coupled with mass spectrometry (MS) is one of the most favorable tools for characterizing protein dynamics and changes in protein conformation. However, the analysis of HDX-MS data currently falls short of its full potential because it still requires manual validation by mass spectrometry experts. In particular, with the advent of high-throughput technologies, data size grows every day and an automated tool is essential for the analysis. Here, we introduce fully automated software, referred to as 'deMix', for HDX-MS data analysis. deMix deals directly with the deuterated isotopic distributions rather than their centroid masses and is designed to be robust to random noise. In addition, unlike existing approaches that can only determine a single state from an isotopic distribution, deMix can also detect a bimodal deuterated distribution arising from EX1 behavior or heterogeneous peptides in conformational isomer proteins. Furthermore, deMix comes with visualization software to facilitate validation and representation of the analysis results.

Subjects

Hydrogen Deuterium Exchange-Mass Spectrometry/methods , Proteins/ultrastructure , Software , Protein Conformation , Proteins/chemistry

19.

Meta-Analysis of Polymyositis and Dermatomyositis Microarray Data Reveals Novel Genetic Biomarkers.

Song, Jaeseung; Kim, Daeun; Hong, Juyeon; Kim, Go Woon; Jung, Junghyun; Park, Sejin; Park, Hee Jung; Joo, Jong Wha J; Jang, Wonhee.

Genes (Basel) ; 10(11), 2019 10 30.

Article in English | MEDLINE | ID: mdl-31671645

ABSTRACT

Polymyositis (PM) and dermatomyositis (DM) are both classified as idiopathic inflammatory myopathies. They share a few common characteristics such as inflammation and muscle weakness. Previous studies have indicated that these diseases present aspects of an autoimmune disorder; however, their exact pathogenesis is still unclear. In this study, three gene expression datasets (PM: 7, DM: 50, Control: 13) available in public databases were used to conduct a meta-analysis. We then conducted expression quantitative trait loci analysis to detect the variant sites that may contribute to the pathogenesis of PM and DM. Six hundred differentially expressed genes were identified in the meta-analysis (false discovery rate (FDR) < 0.01), among which 317 genes were up-regulated and 283 were down-regulated in the disease group compared with the healthy control group. The up-regulated genes were significantly enriched in interferon-signaling pathways, in protein secretion, and/or in the unfolded-protein response. We detected 10 single nucleotide polymorphisms (SNPs) that could potentially play key roles in driving PM and DM. Along with previously reported genes, we identified 4 novel genes and 10 SNP-variant regions that could be used as candidates for potential drug targets or biomarkers for PM and DM.

Subjects

Dermatomyositis/genetics , Polymyositis/genetics , Biomarkers , Case-Control Studies , Databases, Genetic , Gene Expression/genetics , Gene Expression Profiling/methods , Genetic Markers/genetics , Genetic Predisposition to Disease/genetics , Humans , Interferons/genetics , Myositis/genetics , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Unfolded Protein Response/genetics

20.

An Association Mapping Framework To Account for Potential Sex Difference in Genetic Architectures.

Kang, Eun Yong; Lee, Cue Hyunkyu; Furlotte, Nicholas A; Joo, Jong Wha J; Kostem, Emrah; Zaitlen, Noah; Eskin, Eleazar; Han, Buhm.

Genetics ; 209(3): 685-698, 2018 07.

Article in English | MEDLINE | ID: mdl-29752291

ABSTRACT

Over the past few years, genome-wide association studies have identified many trait-associated loci that have different effects on females and males, which has increased attention to the differences in genetic architecture between the sexes. These between-sex differences can cause a variety of phenomena, such as differences in the effect sizes at trait-associated loci, differences in the magnitudes of polygenic background effects, and differences in the phenotypic variances. However, current association testing approaches for dealing with sex, such as including sex as a covariate, cannot fully account for these phenomena and can be suboptimal in statistical power. We present a novel association mapping framework, MetaSex, that can comprehensively account for the genetic architecture differences between the sexes. Through simulations and applications to real data, we show that our framework outperforms previous approaches in association mapping.
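To make the idea of sex-specific effect sizes concrete, the sketch below combines separate female and male effect estimates with a generic inverse-variance weighted (fixed-effects) meta-analysis and tests whether the two estimates differ. This is a textbook illustration under assumed inputs, not the MetaSex framework; the function names and the example numbers are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effects meta-analysis of (beta, standard_error) pairs.

    Returns the combined effect, its standard error, and the z-score.
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se, beta / se


def effect_difference_z(female, male):
    """z-score for the difference between two independent effect estimates."""
    (beta_f, se_f), (beta_m, se_m) = female, male
    return (beta_f - beta_m) / math.sqrt(se_f ** 2 + se_m ** 2)


# Hypothetical per-sex estimates at one locus: (beta, standard_error)
female_estimate = (0.4, 0.1)
male_estimate = (0.2, 0.1)

combined_beta, combined_se, combined_z = inverse_variance_meta(
    [female_estimate, male_estimate])
difference_z = effect_difference_z(female_estimate, male_estimate)
```

Stratifying by sex and then combining allows each sex its own effect size and variance, which a single pooled regression with sex as a covariate does not.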

Subjects

Chromosome Mapping/methods , Computational Biology/methods , Genome-Wide Association Study/methods , Sex Characteristics , Algorithms , Female , Humans , Male , Multifactorial Inheritance , Quantitative Trait Loci
