Search | VHL Regional Portal

1.

Cancer signature ensemble integrating cfDNA methylation, copy number, and fragmentation facilitates multi-cancer early detection.

Kim, Su Yeon; Jeong, Seongmun; Lee, Wookjae; Jeon, Yujin; Kim, Yong-Jin; Park, Seowoo; Lee, Dongin; Go, Dayoung; Song, Sang-Hyun; Lee, Sanghoo; Woo, Hyun Goo; Yoon, Jung-Ki; Park, Young Sik; Kim, Young Tae; Lee, Se-Hoon; Kim, Kwang Hyun; Lim, Yoojoo; Kim, Jin-Soo; Kim, Hwang-Phill; Bang, Duhee; Kim, Tae-You.

Exp Mol Med ; 55(11): 2445-2460, 2023 11.

Article in English | MEDLINE | ID: mdl-37907748

ABSTRACT

Cell-free DNA (cfDNA) sequencing has demonstrated great potential for early cancer detection. However, most large-scale studies have focused only on either targeted methylation sites or whole-genome sequencing, limiting comprehensive analysis that integrates both epigenetic and genetic signatures. In this study, we present a platform that enables simultaneous analysis of whole-genome methylation, copy number, and fragmentomic patterns of cfDNA in a single assay. Using a total of 950 plasma (361 healthy and 589 cancer) and 240 tissue samples, we demonstrate that a multifeature cancer signature ensemble (CSE) classifier integrating all features outperforms single-feature classifiers. At 95.2% specificity, the cancer detection sensitivity with methylation, copy number, and fragmentomic models was 77.2%, 61.4%, and 60.5%, respectively, but sensitivity was significantly increased to 88.9% with the CSE classifier (p value < 0.0001). For tissue of origin, the CSE classifier enhanced the accuracy beyond the methylation classifier, from 74.3% to 76.4%. Overall, this work proves the utility of a signature ensemble integrating epigenetic and genetic information for accurate cancer detection.

Subjects

Cell-Free Nucleic Acids , Neoplasms , Humans , Early Detection of Cancer , DNA Copy Number Variations , Neoplasms/diagnosis , Neoplasms/genetics , DNA Methylation , Biomarkers, Tumor/genetics
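
The abstract above reports sensitivity at a fixed specificity. As a minimal sketch of how such a figure can be computed from classifier scores, the Python below picks a cutoff from the healthy samples and then measures the fraction of cancer samples above it. The function names and the example scores are hypothetical and are not taken from the paper, which does not describe its thresholding procedure here.

```python
import math


def threshold_at_specificity(healthy_scores, specificity):
    """Return the smallest observed cutoff such that at least `specificity`
    of the healthy samples score at or below it (i.e. are called negative)."""
    ordered = sorted(healthy_scores)
    # Number of healthy samples that must be called negative.
    # The small epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(specificity * len(ordered) - 1e-9))
    return ordered[k - 1]


def sensitivity(cancer_scores, cutoff):
    """Fraction of cancer samples scoring strictly above the cutoff."""
    positives = sum(1 for s in cancer_scores if s > cutoff)
    return positives / len(cancer_scores)


# Hypothetical scores, for illustration only.
healthy = [0.1, 0.2, 0.3, 0.4, 0.9]
cancer = [0.35, 0.5, 0.7, 0.95]
cutoff = threshold_at_specificity(healthy, 0.8)
print(cutoff, sensitivity(cancer, cutoff))
```

Comparing classifiers at the same specificity, as the abstract does, means each model gets its own cutoff derived from the same healthy cohort before sensitivities are compared.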

2.

Retraction: Dissection of soybean populations according to selection signatures based on whole-genome sequences.

Kim, Jae-Yoon; Jeong, Seongmun; Kim, Kyoung Hyoun; Lim, Won-Jun; Lee, Ho-Yeon; Jeong, Namhee; Moon, Jung-Kyung; Kim, Namshin.

Gigascience ; 10(5)2021 May 11.

Article in English | MEDLINE | ID: mdl-33973002

3.

Gene Expression Profile in Similar Tissues Using Transcriptome Sequencing Data of Whole-Body Horse Skeletal Muscle.

Lee, Ho-Yeon; Kim, Jae-Yoon; Kim, Kyoung Hyoun; Jeong, Seongmun; Cho, Youngbum; Kim, Namshin.

Genes (Basel) ; 11(11)2020 11 17.

Article in English | MEDLINE | ID: mdl-33213000

ABSTRACT

Horses have been studied for exercise function rather than food production, unlike most livestock. Therefore, the role and characteristics of tissue landscapes are critically understudied, except for certain muscles used in exercise-related studies. In the present study, we compared RNA-Seq data from 18 Jeju horse skeletal muscles to identify differentially expressed genes (DEGs) between tissues that have similar functions and to characterize these differences. We identified DEGs between different muscles using pairwise differential expression (DE) analyses of tissue transcriptome expression data and classified the samples using the expression values of those genes. The tissues were largely classified into two groups and their subgroups by k-means clustering, and the DEGs identified in comparisons between groups were analyzed at the functional/pathway level using gene set enrichment analysis and at the gene level by confirming the expression of significant genes. The analysis detected differences between the groups in metabolic properties such as glycolysis, oxidative phosphorylation, and exercise adaptation. The results demonstrated that the biochemical and anatomical features of a wide range of muscle tissues in horses could be determined through transcriptome expression analysis, and provided proof-of-concept data demonstrating that RNA-Seq analysis can be used to classify and study in-depth differences between tissues with similar properties.

Subjects

Horses/genetics , Muscle, Skeletal/physiology , Transcriptome , Animals , Glycolysis/genetics , Oxidative Phosphorylation

4.

GMStool: GWAS-based marker selection tool for genomic prediction from genomic data.

Jeong, Seongmun; Kim, Jae-Yoon; Kim, Namshin.

Sci Rep ; 10(1): 19653, 2020 11 12.

Article in English | MEDLINE | ID: mdl-33184432

ABSTRACT

The increased accessibility of genomic data in recent years has laid the foundation for studies that predict various phenotypes of organisms based on the genome. Genomic prediction collectively refers to these studies, and it estimates an individual's phenotypes mainly using single nucleotide polymorphism markers. Typically, the accuracy of these genomic prediction studies is highly dependent on the markers used; however, in practice, choosing the markers that predict a given phenotype most accurately is a challenging task. Therefore, we present a new tool called GMStool for selecting optimal marker sets and predicting quantitative phenotypes. GMStool is based on a genome-wide association study (GWAS) and heuristically searches for optimal markers using statistical and machine-learning methods. GMStool performs the genomic prediction using statistical and machine/deep-learning models and presents the best prediction model with the optimal marker set. For the evaluation, GMStool was tested on real datasets with four phenotypes. The prediction results showed higher performance than using all markers or the top GWAS markers, which have been used frequently in prediction studies. Although GMStool has several limitations, it is expected to contribute to various studies for predicting quantitative phenotypes. GMStool, written in R, is available at www.github.com/JaeYoonKim72/GMStool.

Subjects

Genetic Markers , Genome, Plant , Genome-Wide Association Study/methods , Glycine max/genetics , Oryza/genetics , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Software , Databases, Genetic/statistics & numerical data , Genotype , Machine Learning , Models, Genetic , Phenotype , Quantitative Trait Loci , Selection, Genetic

5.

Genetically, Dietary Sodium Intake Is Causally Associated with Salt-Sensitive Hypertension Risk in a Community-Based Cohort Study: a Mendelian Randomization Approach.

Jeong, Seongmun; Kim, Jae-Yoon; Cho, Youngbum; Koh, Sang Baek; Kim, Namshin; Choi, Jung Ran.

Curr Hypertens Rep ; 22(7): 45, 2020 06 26.

Article in English | MEDLINE | ID: mdl-32591971

ABSTRACT

PURPOSE OF REVIEW: Excessive dietary salt intake is associated with an increased risk of hypertension. Salt sensitivity, i.e., an elevation in blood pressure in response to high dietary salt intake, has been associated with a high risk of cardiovascular disease and mortality. We investigated whether a causal association exists between dietary sodium intake and hypertension risk using Mendelian randomization (MR). RECENT FINDINGS: We performed an MR study using data from a large genome-wide association study comprising 15,034 Korean adults in a community-based cohort study. A total of 1282 candidate single nucleotide polymorphisms associated with dietary sodium intake, such as rs2960306, rs4343, and rs1937671, were selected as instrumental variables. The inverse variance weighted method was used to assess the evidence for causality. Higher dietary sodium intake was associated with salt-sensitive hypertension risk. The variants of SLC8E1 rs2241543 and ADD1 rs16843589 were strongly associated with increased blood pressure. In the logistic regression model, after adjusting for age, gender, smoking, drinking, exercise, and body mass index, the GRK4 rs2960306TT genotype was inversely associated with hypertension risk (OR, 0.356; 95% CI, 0.236-0.476). However, the 2350GG genotype (ACE rs4343) exhibited a 2.11-fold increased hypertension risk (OR, 2.114; 95% CI, 2.004-2.224) relative to carriers of the 2350AA genotype, after adjusting for confounders. MR analysis revealed that the odds ratio for hypertension per 1 mg/day increment of dietary sodium intake was 2.24 in participants with the PRKG1 rs12414562 AA genotype. Our findings suggest that dietary sodium intake may be causally associated with hypertension risk.

Subjects

Hypertension , Sodium, Dietary , Adult , Cohort Studies , G-Protein-Coupled Receptor Kinase 4 , Genome-Wide Association Study , Humans , Hypertension/genetics , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide , Sodium Chloride, Dietary/adverse effects , Sodium, Dietary/adverse effects
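
The abstract above names the inverse variance weighted (IVW) method for Mendelian randomization. A minimal sketch of the standard fixed-effect IVW estimator follows, assuming per-variant summary statistics are available: the SNP-exposure effects, the SNP-outcome effects, and the standard errors of the SNP-outcome effects. The function name and the numbers are hypothetical and do not come from the study.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted causal estimate.

    estimate = sum(bx * by / se_y^2) / sum(bx^2 / se_y^2)
    std_error = sqrt(1 / sum(bx^2 / se_y^2))
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = (1.0 / denominator) ** 0.5
    return estimate, std_error


# Hypothetical summary statistics for three instruments.
bx = [0.1, 0.2, 0.4]      # SNP effect on the exposure
by = [0.2, 0.4, 0.8]      # SNP effect on the outcome (e.g. log-odds)
se_y = [0.05, 0.05, 0.1]  # standard error of each outcome effect
print(ivw_estimate(bx, by, se_y))
```

When the outcome effects are on the log-odds scale, exponentiating the estimate gives an odds ratio per unit of exposure, which is the form in which the abstract reports its result.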

6.

Bioinformatics services for analyzing massive genomic datasets.

Ko, Gunhwan; Kim, Pan-Gyu; Cho, Youngbum; Jeong, Seongmun; Kim, Jae-Yoon; Kim, Kyoung Hyoun; Lee, Ho-Yeon; Han, Jiyeon; Yu, Namhee; Ham, Seokjin; Jang, Insoon; Kang, Byunghee; Shin, Sunguk; Kim, Lian; Lee, Seung-Won; Nam, Dougu; Kim, Jihyun F; Kim, Namshin; Kim, Seon-Young; Lee, Sanghyuk; Roh, Tae-Young; Lee, Byungwook.

Genomics Inform ; 18(1): e8, 2020 Mar.

Article in English | MEDLINE | ID: mdl-32224841

ABSTRACT

The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

7.

Expression of concern: Dissection of soybean populations according to selection signatures based on whole-genome sequences.

Kim, Jae-Yoon; Jeong, Seongmun; Kim, Kyoung Hyoun; Lim, Won-Jun; Lee, Ho-Yeon; Jeong, Namhee; Moon, Jung-Kyung; Kim, Namshin.

Gigascience ; 9(5)2020 05 01.

Article in English | MEDLINE | ID: mdl-32347930

8.

Genome-wide association and epistatic interactions of flowering time in soybean cultivar.

Kim, Kyoung Hyoun; Kim, Jae-Yoon; Lim, Won-Jun; Jeong, Seongmun; Lee, Ho-Yeon; Cho, Youngbum; Moon, Jung-Kyung; Kim, Namshin.

PLoS One ; 15(1): e0228114, 2020.

Article in English | MEDLINE | ID: mdl-31968016

ABSTRACT

Genome-wide association studies (GWAS) have enabled the discovery of candidate markers that play significant roles in various complex traits in plants. Recently, with increased interest in the search for candidate markers, studies on epistatic interactions between single nucleotide polymorphism (SNP) markers have also increased, thus enabling the identification of more candidate markers alongside GWAS of single-variant additive effects. Here, we focused on the identification of candidate markers associated with flowering time in soybean (Glycine max). A large population of 2,662 cultivated soybean accessions was genotyped using the 180k Axiom® SoyaSNP array, and the genomic architecture of these accessions was investigated to confirm the population structure. Then, GWAS was conducted to evaluate the association between SNP markers and flowering time. A total of 93 significant SNP markers were detected within 59 significant genes, including E1 and E3, which are the main determinants of flowering time. Based on the GWAS results, multilocus epistatic interactions were examined between the significant and non-significant SNP markers. Two significant and 16 non-significant SNP markers were discovered as candidate markers affecting flowering time via interactions with each other. These 18 candidate SNP markers mapped to 18 candidate genes including E1 and E3, and the 18 candidate genes were involved in six major flowering pathways. Although further biological validation is needed, our results provide additional information on the existing flowering time markers and present another option for marker-assisted breeding programs that regulate flowering time in soybean.

Subjects

Flowers/genetics , Genome, Plant/genetics , Genome-Wide Association Study/methods , Glycine max/genetics , Chromosome Mapping/methods , Genomics , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Quantitative Trait Loci

9.

Dissection of soybean populations according to selection signatures based on whole-genome sequences.

Kim, Jae-Yoon; Jeong, Seongmun; Kim, Kyoung Hyoun; Lim, Won-Jun; Lee, Ho-Yeon; Jeong, Namhee; Moon, Jung-Kyung; Kim, Namshin.

Gigascience ; 8(12)2019 12 01.

Article in English | MEDLINE | ID: mdl-31869408

ABSTRACT

BACKGROUND: Domestication and improvement processes, accompanied by selections and adaptations, have generated genome-wide divergence and stratification in soybean populations. Simultaneously, soybean populations, which comprise diverse subpopulations, have developed their own adaptive characteristics enhancing fitness, resistance, agronomic traits, and morphological features. The genetic traits underlying these characteristics play a fundamental role in improving other soybean populations. RESULTS: This study focused on identifying the selection signatures and adaptive characteristics in soybean populations. A core set of 245 accessions (112 wild-type, 79 landrace, and 54 improvement soybeans) selected from 4,234 soybean accessions was re-sequenced. Their genomic architectures were examined according to the domestication and improvement, and accessions were then classified into 3 wild-type, 2 landrace, and 2 improvement subgroups based on various population analyses. Selection and gene set enrichment analyses revealed that the landrace subgroups have selection signals for soybean-cyst nematode HG type 0 and seed development with germination, and that the improvement subgroups have selection signals for plant development with viability and seed development with embryo development, respectively. The adaptive characteristic for soybean-cyst nematode was partially underpinned by multiple resistance accessions, and the characteristics related to seed development were supported by our phenotypic findings for seed weights. Furthermore, their adaptive characteristics were also confirmed as genome-based evidence, and unique genomic regions that exhibit distinct selection and selective sweep patterns were revealed for 13 candidate genes. CONCLUSIONS: Although our findings require further biological validation, they provide valuable information about soybean breeding strategies and present new options for breeders seeking donor lines to improve soybean populations.

Subjects

Glycine max/classification , Quantitative Trait Loci , Whole Genome Sequencing/methods , Domestication , Genome, Plant , Plant Proteins/genetics , Seeds/classification , Seeds/genetics , Seeds/growth & development , Selection, Genetic , Glycine max/genetics , Glycine max/growth & development

10.

Korean soybean core collection: Genotypic and phenotypic diversity population structure and genome-wide association study.

Jeong, Namhee; Kim, Ki-Seung; Jeong, Seongmun; Kim, Jae-Yoon; Park, Soo-Kwon; Lee, Ju Seok; Jeong, Soon-Chun; Kang, Sung-Taeg; Ha, Bo-Keun; Kim, Dool-Yi; Kim, Namshin; Moon, Jung-Kyung; Choi, Man Soo.

PLoS One ; 14(10): e0224074, 2019.

Article in English | MEDLINE | ID: mdl-31639154

ABSTRACT

A core collection is a subset that represents the genetic diversity of the total collection. Soybean (Glycine max (L.) Merr.) is one of the major food and feed crops. It is the world's most cultivated annual herbaceous legume. Constructing a core collection for soybean could play a pivotal role in conserving and utilizing its genetic variability for research and breeding programs. To construct and evaluate a Korean soybean core collection, genotypic and phenotypic data as well as population structure were analyzed. The Korean soybean core collection consisted of 430 accessions selected from 2,872 collections based on Affymetrix Axiom® 180k SoyaSNP array data. The core collection represented 99% of the genotypic diversity of the total collection. Analysis of population structure clustered the core collection into five subpopulations. Accessions from South Korea and North Korea were distributed across the five subpopulations. Analysis of molecular variance indicated that only 2.01% of genetic variation could be explained by geographic origins while 16.18% of genetic variation was accounted for by subpopulations. Genome-wide association study (GWAS) for days to flowering, flower color, pubescent color, and growth habit confirmed that the core collection had the same genetic diversity for tested traits as the total collection. The Korean soybean core collection was constructed based on genotypic information of the 180k SNP data. Size and phenotypic diversity of the core collection accounted for approximately 14.9% and 18.1% of the total collection, respectively. GWAS of core and total collections successfully confirmed loci associated with tested traits. Consequently, the present study showed that the Korean soybean core collection could provide fundamental and practical material and information for both soybean genetic research and breeding programs.

Subjects

Genome, Plant , Genome-Wide Association Study/methods , Glycine max/classification , Glycine max/genetics , Plant Breeding , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Genotype , Humans , Phenotype , Republic of Korea

11.

Discovery of Genomic Characteristics and Selection Signatures in Korean Indigenous Goats Through Comparison of 10 Goat Breeds.

Kim, Jae-Yoon; Jeong, Seongmun; Kim, Kyoung Hyoun; Lim, Won-Jun; Lee, Ho-Yeon; Kim, Namshin.

Front Genet ; 10: 699, 2019.

Article in English | MEDLINE | ID: mdl-31440273

ABSTRACT

Indigenous breeds develop their own genomic characteristics by adapting to local environments or cultures over long periods of time. Most of them are not particularly productive in commercial terms, but they are able to survive in harsh environments or tolerate specific diseases. Their adaptive characteristics play an important role as genetic materials for improving commercial breeds. As a step toward this goal, we analyzed the genome of Korean indigenous goats within 10 goat breeds. We collected 136 goat individuals by sequencing 46 new goats and employing 90 publicly available goats. Our whole-genome data comprised three indigenous breeds (Korean indigenous goat, Iranian indigenous goat, and Moroccan indigenous goat; n = 29, 18, 20), six commercial breeds (Saanen, Boer, Anglo-Nubian, British Alpine, Alpine, and Korean crossbred; n = 16, 11, 5, 5, 2, 13), and their ancestral species (Capra aegagrus; n = 17). We identified that the Iranian indigenous goat and the Moroccan indigenous goat have relatively similar genomic characteristics within a large category of genomic diversity but found that the Korean indigenous goat has unique genomic characteristics distinguished from the other nine breeds. Through population analysis, we confirmed that these characteristics have resulted from a near-isolated environment with strong genetic drift. The Korean indigenous goat experienced a severe genetic bottleneck upon entering the Korean Peninsula about 2,000 years ago, and has subsequently rarely experienced genetic interactions with other goat breeds. From selection analysis and gene-set enrichment analysis, we revealed selection signals for Salmonella infection and cardiomyopathy in the genome of the Korean indigenous goat. These adaptive characteristics were further identified with genomic-based evidence. We uncovered genomic regions of selective sweeps in the LBP and BPI genes (Salmonella infection) and the TTN and ITGB6 genes (cardiomyopathy), among several candidate genes. Our research presents unique genomic characteristics and distinctive selection signals of the Korean indigenous goat based on the extensive comparison. Although the adaptive traits require further validation through biological experiments, our findings are expected to provide a direction for future biodiversity conservation strategies and to contribute another option to genomic-based breeding programmes for improving the viability of Capra hircus.

12.

Identification of DNA-Methylated CpG Islands Associated With Gene Silencing in the Adult Body Tissues of the Ogye Chicken Using RNA-Seq and Reduced Representation Bisulfite Sequencing.

Lim, Won-Jun; Kim, Kyoung Hyoun; Kim, Jae-Yoon; Jeong, Seongmun; Kim, Namshin.

Front Genet ; 10: 346, 2019.

Article in English | MEDLINE | ID: mdl-31040866

ABSTRACT

DNA methylation is an epigenetic mark that plays an essential role in regulating gene expression. CpG islands are DNA methylation regions in promoters known to regulate gene expression through transcriptional silencing of the corresponding gene. DNA methylation at CpG islands is crucial for gene expression and tissue-specific processes. To date, a limited number of studies have reported on gene expression associated with DNA methylation in diverse adult tissues at the genome-wide level. Expression levels are rarely affected by DNA methylation in normal adult tissues; however, statistical differences in gene expression level correlated with DNA methylation have recently been revealed. In this study, we examined 20 pairs of DNA methylomes and transcriptomes from RNA-seq and reduced representation bisulfite sequencing (RRBS) data using adult Ogye chicken tissues. A total of 3,133 CpG islands that could affect downstream genes were identified from data on 20 tissues of a single chicken sample. Analyzing these CpG island and gene pairs, 121 significant units were statistically correlated. Among them, six genes (CLDN3, DECR2, EVA1B, NME4, NTSR1, and XPNPEP2) were highly significantly changed by altered DNA methylation. Finally, our data demonstrated how DNA methylation correlates with gene expression in normal adult tissues. Our source code can be found at https://github.com/wjlim/correlation-between-rna-seq-and-RRBS.

13.

Association between Serum Urate and Risk of Hypertension in Menopausal Women with XDH Gene.

Lee, Jong-Han; Go, Tae Hwa; Lee, San-Hui; Kim, Juwon; Huh, Ji Hye; Kim, Jang Young; Kang, Dae Ryong; Jeong, Seongmun; Koh, Sang-Baek; Choi, Jung Ran.

J Clin Med ; 8(5)2019 May 23.

Article in English | MEDLINE | ID: mdl-31126092

ABSTRACT

Elevated serum urate (sUA) concentrations have been associated with an increased risk of hypertension. We aimed to examine the association of sUA concentration with the risk of hypertension in pre- and post-menopausal women and investigated the association between the polymorphism of the xanthine dehydrogenase gene and the risk of hypertension. Among 7294 women, 1415 premenopausal and 5879 postmenopausal women were recruited. Anthropometric parameters as risk factors of hypertension were identified by logistic regression models. In addition, we investigated an association between the xanthine dehydrogenase gene and sUA and their combined associations with the risk of hypertension. Body mass index (BMI) and waist circumference (WC) were significantly increased in accordance with the increase of sUA levels (p < 0.001). Multivariate logistic regression analysis showed postmenopausal women with a high sUA and high BMI were 3.18 times more likely to have hypertension than those with normal and lower sUA (odds ratio: 3.18, 95% confidence interval: 2.54-3.96). Postmenopausal women with a high WC were 1.62 times more likely to have hypertension than those with normal and lower sUA. Subjects with the AG genotype of rs206860 were found to be at lower risk of hypertension (odds ratio: 0.287, 95% confidence interval: 0.091-0.905, p = 0.033). This cross-sectional study indicated a high sUA is associated with a higher risk of hypertension in postmenopausal women. Further well-designed prospective studies in other populations are warranted to validate our results.

14.

SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin.

PLoS One ; 12(9): e0184087, 2017.

Article in English | MEDLINE | ID: mdl-28886064

ABSTRACT

Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.

Subjects

Exome , Genetic Markers , Genome , High-Throughput Nucleotide Sequencing , Sex Determination Processes/genetics , Animals , Chromosome Mapping , Computational Biology/methods , Datasets as Topic , Genomics/methods , Humans , Reproducibility of Results , Sequence Analysis, RNA
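
The abstract above describes counting hits to sex-specific marker sequences and relying on the fact that XX samples yield no Y hits. The sketch below illustrates that counting idea only and is not the SEXCMD implementation: it uses exact substring matching instead of read alignment, and the function names, marker strings, and the `min_ratio` cutoff are hypothetical. The real pipeline is described as training on a known dataset.

```python
def count_marker_hits(reads, markers):
    """Count reads that contain at least one marker as an exact substring."""
    return sum(1 for read in reads if any(m in read for m in markers))


def call_sex(x_hits, y_hits, min_ratio=0.1):
    """Call XY when Y-marker hits are a non-trivial fraction of X-marker hits.

    `min_ratio` is an assumed cutoff that tolerates a little noise;
    it is not a value taken from the paper.
    """
    if x_hits == 0:
        return "undetermined"
    return "XY" if y_hits / x_hits >= min_ratio else "XX"


# Toy reads and markers, for illustration only.
reads = ["ACGTTTGACA", "GGGCCCATAT", "TTTTACGTAA", "CCCCCCCCCC"]
x_markers = ["ACGT"]
y_markers = ["CCCATA"]
x_hits = count_marker_hits(reads, x_markers)
y_hits = count_marker_hits(reads, y_markers)
print(x_hits, y_hits, call_sex(x_hits, y_hits))
```

For a ZW system the same logic applies with Z markers as the denominator and W markers as the numerator.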

15.

GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets.

Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin.

PLoS One ; 12(7): e0181420, 2017.

Article in English | MEDLINE | ID: mdl-28727806

ABSTRACT

Selecting core subsets from plant genotype datasets is important for enhancing cost-effectiveness and shortening the time required for analyses such as genome-wide association studies (GWAS) and genomics-assisted breeding of crop species. Recently, a large number of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, there is no software available for picking out an efficient and consistent core subset from such a huge dataset. It is necessary to develop software that can extract genetically important samples in a population with coherence. We here present a new program, GenoCore, which can quickly and efficiently find the core subset representing the entire population. We introduce simple measures of coverage and diversity scores, which reflect genotype errors and genetic variations, and can help to select samples rapidly and accurately from crop genotype datasets. Our method is compared with other core collection software using example datasets to validate its performance according to genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, using less memory with more efficient scores, and shows greater genetic coverage compared to the other software tested. GenoCore was written in the R language, and can be accessed online with an example dataset and test results at https://github.com/lovemun/Genocore.

Subjects

Algorithms , Databases, Genetic , Datasets as Topic , Access to Information , Gene Frequency , Internet , Oryza/genetics , Phenotype , Polymorphism, Single Nucleotide , Principal Component Analysis , Reproducibility of Results , Software , Triticum/genetics
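
The abstract above describes selecting a core subset that covers the genotypes of the whole population. As a rough illustration of coverage-driven selection, the Python below uses a generic greedy set-cover heuristic: it repeatedly picks the sample that adds the most not-yet-seen (marker, genotype) pairs. This is an assumed stand-in, not GenoCore's actual coverage and diversity scoring (GenoCore itself is written in R), and all names and data are hypothetical.

```python
def alleles_of(calls):
    """Set of (marker index, genotype) pairs for one sample; None = missing."""
    return {(i, g) for i, g in enumerate(calls) if g is not None}


def greedy_core_subset(genotypes, target_coverage=1.0):
    """Greedily pick samples until the target share of all observed
    (marker, genotype) pairs is covered. Returns (selected, coverage)."""
    all_alleles = set()
    for calls in genotypes.values():
        all_alleles |= alleles_of(calls)

    covered, selected = set(), []
    remaining = dict(genotypes)
    while remaining and len(covered) < target_coverage * len(all_alleles):
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda name: len(alleles_of(remaining[name]) - covered))
        gain = alleles_of(remaining[best]) - covered
        if not gain:
            break  # no remaining sample adds anything new
        covered |= gain
        selected.append(best)
        del remaining[best]

    coverage = len(covered) / len(all_alleles) if all_alleles else 0.0
    return selected, coverage


# Toy genotype table: four samples, three markers.
genotypes = {
    "s1": ["AA", "AG", "GG"],
    "s2": ["AA", "AG", "GG"],
    "s3": ["AG", "AG", "GG"],
    "s4": ["GG", "GG", None],
}
print(greedy_core_subset(genotypes))
```

In this toy table the duplicate sample `s2` is never selected, which mirrors the goal of keeping the core collection small while preserving coverage.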

16.

Identification of candidate domestication regions in the radish genome based on high-depth resequencing analysis of 17 genotypes.

Kim, Namshin; Jeong, Young-Min; Jeong, Seongmun; Kim, Goon-Bo; Baek, Seunghoon; Kwon, Young-Eun; Cho, Ara; Choi, Sang-Bong; Kim, Jiwoong; Lim, Won-Jun; Kim, Kyoung Hyoun; Park, Won; Kim, Jae-Yoon; Kim, Jin-Hyun; Yim, Bomi; Lee, Young Joon; Chun, Byung-Moon; Lee, Young-Pyo; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan.

Theor Appl Genet ; 129(9): 1797-814, 2016 Sep.

Article in English | MEDLINE | ID: mdl-27377547

ABSTRACT

KEY MESSAGE: This study provides high-quality variation data of diverse radish genotypes. Genome-wide SNP comparison along with RNA-seq analysis identified candidate genes related to domestication that have potential as trait-related markers for genetics and breeding of radish. Radish (Raphanus sativus L.) is an annual root vegetable crop that also encompasses diverse wild species. Radish has a long history of domestication, but the origins and selective sweep of cultivated radishes remain controversial. Here, we present comprehensive whole-genome resequencing analysis of radish to explore genomic variation between the radish genotypes and to identify genetic bottlenecks due to domestication in Asian cultivars. High-depth resequencing and multi-sample genotyping analysis of ten cultivated and seven wild accessions obtained 4.0 million high-quality homozygous single-nucleotide polymorphisms (SNPs)/insertions or deletions. Variation analysis revealed that Asian cultivated radish types are closely related to wild Asian accessions, but are distinct from European/American cultivated radishes, supporting the notion that Asian cultivars were domesticated from wild Asian genotypes. SNP comparison between Asian genotypes identified 153 candidate domestication regions (CDRs) containing 512 genes. Network analysis of the genes in CDRs functioning in plant signaling pathways and biochemical processes identified groups of genes related to root architecture, cell wall, sugar metabolism, and glucosinolate biosynthesis. Expression profiling of the genes during root development suggested that domestication-related selective advantages included a main taproot with few branched lateral roots, reduced cell wall rigidity and favorable taste. Overall, this study provides evolutionary insights into domestication-related genetic selection in radish as well as identification of gene candidates with the potential to act as trait-related markers for background selection of elite lines in molecular breeding.

Assuntos

Domesticação , Genoma de Planta , Raphanus/genética , Evolução Molecular , Genótipo , Mutação INDEL , Polimorfismo de Nucleotídeo Único , RNA de Plantas/genética , Análise de Sequência de RNA

17.

Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan.

Theor Appl Genet ; 129(7): 1357-1372, 2016 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-27038817

RESUMO

KEY MESSAGE: This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop, and its origin and phylogenetic position in the tribe Brassiceae are controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

Assuntos

Genoma de Planta , Raphanus/genética , Brassica/genética , Mapeamento Cromossômico , Cromossomos de Plantas , Hibridização Genômica Comparativa , DNA de Plantas/genética , Sequenciamento de Nucleotídeos em Larga Escala , Filogenia , Análise de Sequência de DNA

18.

The Clinical Significance and Molecular Features of the Spatial Tumor Shapes in Breast Cancers.

Moon, Hyeong-Gon; Kim, Namshin; Jeong, Seongmun; Lee, Minju; Moon, HyunHye; Kim, Jongjin; Yoo, Tae-Kyung; Lee, Han-Byoel; Kim, Jisun; Noh, Dong-Young; Han, Wonshik.

PLoS One ; 10(12): e0143811, 2015.

Artigo em Inglês | MEDLINE | ID: mdl-26669540

RESUMO

Each breast cancer has its unique spatial shape, but the clinical importance and the underlying mechanism for the three-dimensional tumor shapes are mostly unknown. We collected three-dimensional tumor size and tumor volume data on invasive breast cancers from 2,250 patients who underwent surgery between Jan 2000 and Jul 2007. The degree of tumor eccentricity was estimated by using the difference between the spheroid tumor volume and ellipsoid tumor volume (spheroid-ellipsoid discrepancy, SED). In 41 patients, transcriptome and exome sequencing data were obtained. Estimation of more accurate tumor burden by calculating ellipsoid tumor volumes did not improve the outcome prediction when compared to the traditional longest diameter measurement. However, the spatial tumor eccentricity, which was measured by SED, showed significant variation between the molecular subtypes of breast cancer. Additionally, the degree of tumor eccentricity was associated with well-known prognostic factors of breast cancer such as tumor size and lymph node metastasis. Transcriptome data from 41 patients showed significant association between MMP13 and spatial tumor shapes. Network analysis and analysis of TCGA gene expression data suggest that MMP13 is regulated by ERBB2 and S100A7A. The present study validates the usefulness of the current tumor size method in determining tumor stages. Furthermore, we show that the tumors with high eccentricity are more likely to have aggressive tumor characteristics. Genes involved in the extracellular matrix remodeling can be candidate regulators of the spatial tumor shapes in breast cancer.
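
The abstract defines SED only as the difference between a spheroid and an ellipsoid tumor volume and does not give the formulas. The following is a minimal sketch under stated assumptions: the spheroid volume is taken to be that of a sphere whose diameter is the longest measured tumor diameter, and the ellipsoid volume uses all three measured diameters. The function names and the choice of formulas are illustrative and are not taken from the paper.

```python
import math


def sphere_volume(longest_diameter):
    """Volume of a sphere with the given diameter (assumed 'spheroid' volume)."""
    return math.pi / 6.0 * longest_diameter ** 3


def ellipsoid_volume(d1, d2, d3):
    """Volume of an ellipsoid with the three given full diameters."""
    return math.pi / 6.0 * d1 * d2 * d3


def spheroid_ellipsoid_discrepancy(d1, d2, d3):
    """Hypothetical SED: sphere volume from the longest diameter minus the
    ellipsoid volume. Zero for a perfectly spherical tumor, larger for
    more eccentric shapes. This definition is an assumption."""
    longest = max(d1, d2, d3)
    return sphere_volume(longest) - ellipsoid_volume(d1, d2, d3)


if __name__ == "__main__":
    # Diameters in cm; values are made up for illustration only.
    print(spheroid_ellipsoid_discrepancy(2.0, 2.0, 2.0))
    print(spheroid_ellipsoid_discrepancy(3.0, 2.0, 1.0))
```

Under this assumed definition the discrepancy is never negative, because no diameter exceeds the longest one.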

Assuntos

Neoplasias da Mama/patologia , Neoplasias da Mama/classificação , Neoplasias da Mama/genética , Células Clonais , Feminino , Perfilação da Expressão Gênica , Regulação Neoplásica da Expressão Gênica , Redes Reguladoras de Genes , Humanos , Imageamento Tridimensional , Estimativa de Kaplan-Meier , Metástase Linfática , Metaloproteinase 13 da Matriz/metabolismo , Mutação/genética , Carga Tumoral

19.

Prognostic and functional importance of the engraftment-associated genes in the patient-derived xenograft models of triple-negative breast cancers.

Moon, Hyeong-Gon; Oh, Keunhee; Lee, Jiwoo; Lee, Minju; Kim, Ju-Yeon; Yoo, Tae-Kyung; Seo, Myung Won; Park, Ae Kyung; Ryu, Han Suk; Jung, Eun-Jung; Kim, Namshin; Jeong, Seongmun; Han, Wonshik; Lee, Dong-Sup; Noh, Dong-Young.

Breast Cancer Res Treat ; 154(1): 13-22, 2015 Nov.

Artigo em Inglês | MEDLINE | ID: mdl-26438141

RESUMO

We aimed to identify the factors affecting the successful tumor engraftment in breast cancer patient-derived xenograft (PDX) models. Further, we investigated the prognostic significance and the functional importance of the PDX engraftment-related genes in triple-negative breast cancers (TNBC). The clinico-pathologic features of 81 breast cancer patients whose tissues were used for PDX transplantation were analyzed to identify the factors affecting the PDX engraftment. A gene signature associated with the PDX engraftment was discovered and its clinical importance was tested in a publicly available dataset and in vitro assays. Nineteen out of 81 (23.4 %) transplanted tumors were successfully engrafted into the PDX models. The engraftment rate was highest in TNBC when compared to other subtypes (p = 0.001) and in recurrent or chemotherapy-resistant tumors compared to newly diagnosed primary tumors (p = 0.024). PDX tumors originating from the TNBC cases showed more rapid tumor growth in mice. Gene expression profiling showed that down-regulation of genes involved in the tumor-immune interaction was significantly associated with the successful PDX engraftment. The engraftment gene signature was associated with worse survival outcome when tested in publicly available mRNA datasets of TNBC cases. Among the engraftment-related genes, PHLDA2, TKT, and P4HA2 showed high expression in triple-negative breast cancer cell lines, and siRNA-based gene silencing resulted in reduced cell invasion and proliferation in vitro. Our results show that the PDX engraftment may reflect the aggressive phenotype in breast cancer. Genes associated with the PDX engraftment may provide a novel prognostic biomarker and therapeutic targets in TNBC.

Assuntos

Proliferação de Células/genética , Prognóstico , Neoplasias de Mama Triplo Negativas/genética , Ensaios Antitumorais Modelo de Xenoenxerto , Animais , Linhagem Celular Tumoral , Feminino , Regulação Neoplásica da Expressão Gênica , Humanos , Camundongos , Proteínas Nucleares/biossíntese , Proteínas Nucleares/genética , Prolil Hidroxilases/biossíntese , Prolil Hidroxilases/genética , Neoplasias de Mama Triplo Negativas/patologia , Carga Tumoral/genética

20.

Grouped False-Discovery Rate for Removing the Gene-Set-Level Bias of RNA-seq.

Yang, Tae Young; Jeong, Seongmun.

Evol Bioinform Online ; 9: 467-78, 2013.

Artigo em Inglês | MEDLINE | ID: mdl-24277981

RESUMO

In recent years, RNA-seq has become a very competitive alternative to microarrays. In RNA-seq experiments, the expected read count for a gene is proportional to its expression level multiplied by its transcript length. Even when two genes are expressed at the same level, differences in length will yield differing numbers of total reads. The characteristics of these RNA-seq experiments create a gene-level bias such that the proportion of significantly differentially expressed genes increases with the transcript length, whereas such bias is not present in microarray data. Gene-set analysis seeks to identify the gene sets that are enriched in the list of the identified significant genes. In the gene-set analysis of RNA-seq, the gene-level bias subsequently yields a gene-set-level bias: a gene set with long genes will be more likely to show up as enriched than a gene set with shorter genes. Because gene expression is not related to transcript length, a gene set containing long genes is not of biologically greater interest than gene sets with shorter genes. Accordingly, the gene-set-level bias should be removed to accurately calculate the statistical significance of each gene-set enrichment in RNA-seq. We present a new gene-set analysis method for RNA-seq, called FDRseq, which can accurately calculate the statistical significance of a gene-set enrichment score by the grouped false-discovery rate. Numerical examples indicated that FDRseq is appropriate for controlling the transcript length bias in the gene-set analysis of RNA-seq data. To implement FDRseq, we developed an R program, which can be downloaded at no cost from http://home.mju.ac.kr/home/index.action?siteId=tyang.
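
The abstract does not describe the FDRseq algorithm in detail, so the following is not the authors' method. It is a minimal sketch of the general grouped false-discovery-rate idea, assuming genes are split into equal-sized bins by transcript length and a standard Benjamini-Hochberg adjustment is applied within each bin. The function names, the binning rule, and the number of groups are all illustrative assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def grouped_bh(pvalues, lengths, n_groups=2):
    """Hypothetical length-stratified adjustment: sort genes by transcript
    length, cut them into n_groups equal-sized bins, and apply
    Benjamini-Hochberg separately inside each bin."""
    m = len(pvalues)
    by_length = sorted(range(m), key=lambda i: lengths[i])
    adjusted = [None] * m
    for g in range(n_groups):
        idx = by_length[g * m // n_groups:(g + 1) * m // n_groups]
        if not idx:
            continue
        for i, q in zip(idx, benjamini_hochberg([pvalues[i] for i in idx])):
            adjusted[i] = q
    return adjusted


if __name__ == "__main__":
    # Made-up p-values and transcript lengths, for illustration only.
    p = [0.01, 0.04, 0.03, 0.5]
    length = [100, 200, 3000, 4000]
    print(grouped_bh(p, length, n_groups=2))
```

Adjusting within length bins means long genes are compared only with other long genes, which is one simple way to keep transcript length from dominating the list of significant results.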
