Search | BVS (Virtual Health Library) Search Portal

1.

The Analyses of Cetacean Virus-Responsive Genes Reveal Evolutionary Marks in Mucosal Immunity-Associated Genes.

Chung, Oksung; Jung, Ye-Eun; Lee, Kyeong Won; An, Young Jun; Kim, Jungeun; Roh, Yoo-Rim; Bhak, Jong; Park, Kiejung; Weber, Jessica A; Cheong, Jaehun; Cha, Sun-Shin; Lee, Jung-Hyun; Yim, Hyung-Soon.

Biochem Genet ; 60(6): 2299-2312, 2022 Dec.

Article in English | MEDLINE | ID: mdl-35334059

ABSTRACT

Viruses are the most common and abundant organisms in the marine environment. To better understand how cetaceans have adapted to this virus-rich environment, we compared cetacean virus-responsive genes to those from terrestrial mammals. We identified virus-responsive gene sequences in seven species of cetaceans, which we compared with orthologous sequences in seven terrestrial mammals. Evolutionary analysis using the branch model and the branch-site model selected 21 genes under at least one model. IFN-ε, an antiviral cytokine expressed at mucous membranes, and its receptor IFNAR1 contain cetacean-specific amino acid substitutions that might change the interaction between the two proteins and lead to regulation of the immune system against viruses. Cetacean-specific amino acid substitutions in IL-6, IL-27, and the signal transducer and activator of transcription (STAT)1 are also predicted to alter the mucosal immune response of cetaceans. Since mucosal membranes are the first line of defense against the external environment and are involved in immune tolerance, our analysis of cetacean virus-responsive genes suggests that mucosal immunity-related genes with cetacean-specific mutations play an important role in the protection and/or regulation of immune responses against viruses.

Subject(s)

Cetacea , Immunity, Mucosal , Animals , Immunity, Mucosal/genetics , Phylogeny , Cetacea/genetics , Mammals , Adaptation, Physiological

2.

Global transcription network incorporating distal regulator binding reveals selective cooperation of cancer drivers and risk genes.

Kim, Kwoneel; Yang, Woojin; Lee, Kang Seon; Bang, Hyoeun; Jang, Kiwon; Kim, Sang Cheol; Yang, Jin Ok; Park, Seongjin; Park, Kiejung; Choi, Jung Kyoon.

Nucleic Acids Res ; 43(12): 5716-29, 2015 Jul 13.

Article in English | MEDLINE | ID: mdl-26001967

ABSTRACT

Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes.

Subject(s)

Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genes, Neoplasm , Transcription, Genetic , Bayes Theorem , Cell Line, Tumor , Enhancer Elements, Genetic , Gene Expression Regulation, Neoplastic/drug effects , Genetic Variation , Genomics/methods , Humans , MCF-7 Cells , Risk , Transcription Factors/metabolism

3.

KGVDB: a population-based genomic map of CNVs tagged by SNPs in Koreans.

Moon, Sanghoon; Jung, Kwang Su; Kim, Young Jin; Hwang, Mi Yeong; Han, Kyungsook; Lee, Jong-Young; Park, Kiejung; Kim, Bong-Jo.

Bioinformatics ; 29(11): 1481-3, 2013 Jun 01.

Article in English | MEDLINE | ID: mdl-23626002

ABSTRACT

SUMMARY: Despite a growing interest in a correlation between copy number variations (CNVs) and flanking single nucleotide polymorphisms, few databases provide such information. In particular, most information on CNV available so far was obtained in Caucasian and Yoruba populations, and little is known about CNV in Asian populations. This article presents a database that provides CNV regions tagged by single nucleotide polymorphisms in about 4700 Koreans, which were detected under strict quality control, manually curated and experimentally validated. AVAILABILITY: KGVDB is freely available for non-commercial use at http://biomi.cdc.go.kr/KGVDB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Asian People/genetics , DNA Copy Number Variations , Databases, Nucleic Acid , Polymorphism, Single Nucleotide , Chromosome Mapping , Genome, Human , Genome-Wide Association Study , Genomics/methods , Humans , Korea

4.

Molecular characterization of positively selected genes contributing aquatic adaptation in marine mammals.

Roh, Yoo-Rim; Yim, Hyung-Soon; Park, Kiejung; Lee, Jung-Hyun.

Genes Genomics ; 46(7): 775-783, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38733518

ABSTRACT

BACKGROUND: Marine mammals, which have evolved independently into three distinct lineages, share common physiological features that contribute to their adaptation to the marine environment. OBJECTIVE: To identify positively selected genes (PSGs) for adaptation to the marine environment using available genomic data from three taxonomic orders: cetaceans, pinnipeds, and sirenians. METHODS: Based on the genomes within each group of Artiodactyla, Carnivora and Afrotheria, we performed selection analysis using the branch-site model in CODEML. RESULTS: Based on the branch-site model, 460, 614, and 359 PSGs were predicted for the cetaceans, pinnipeds, and sirenians, respectively. Functional enrichment analysis indicated that genes associated with hemostasis were positively selected across all lineages of marine mammals. We observed positive selection signals for the hemostasis and coagulation-related genes plasminogen activator, urokinase (PLAU), multimerin 1 (MMRN1), gamma-glutamyl carboxylase (GGCX), and platelet endothelial aggregation receptor 1 (PEAR1). Additionally, we found that the sodium voltage-gated channel alpha subunit 9 (SCN9A), serine/arginine repetitive matrix 4 (SRRM4), and Ki-ras-induced actin-interacting protein (KRAP) are under positive selection pressure and are associated with cognition, neurite outgrowth, and IP3-mediated Ca2+ release, respectively. CONCLUSION: This study will contribute to our understanding of the adaptive evolution of marine mammals by providing information on a group of candidate genes that are predicted to influence adaptation to aquatic environments, as well as their functional characteristics.

Subject(s)

Adaptation, Physiological , Cetacea , Selection, Genetic , Animals , Adaptation, Physiological/genetics , Cetacea/genetics , Mammals/genetics , Aquatic Organisms/genetics , Phylogeny , Evolution, Molecular , Carnivora/genetics , Artiodactyla/genetics , Artiodactyla/physiology , Caniformia/genetics

5.

An ensemble correlation-based gene selection algorithm for cancer classification with gene expression data.

Piao, Yongjun; Piao, Minghao; Park, Kiejung; Ryu, Keun Ho.

Bioinformatics ; 28(24): 3306-15, 2012 Dec 15.

Article in English | MEDLINE | ID: mdl-23060613

ABSTRACT

MOTIVATION: Gene selection for cancer classification is one of the most important topics in the biomedical field. However, microarray data pose a severe challenge for computational techniques. We need dimension reduction techniques that identify a small set of genes to achieve better learning performance. From the perspective of machine learning, the selection of genes can be considered to be a feature selection problem that aims to find a small subset of features that has the most discriminative information for the target. RESULTS: In this article, we proposed an Ensemble Correlation-Based Gene Selection algorithm based on symmetrical uncertainty and Support Vector Machine. In our method, symmetrical uncertainty was used to analyze the relevance of the genes, the different starting points of the relevant subset were used to generate the gene subsets and the Support Vector Machine was used as an evaluation criterion of the wrapper. The efficiency and effectiveness of our method were demonstrated through comparisons with other feature selection techniques, and the results show that our method outperformed other methods published in the literature.

Subject(s)

Algorithms , Gene Expression Profiling , Neoplasms/classification , Neoplasms/genetics , Artificial Intelligence , Gene Expression , Humans , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis , Support Vector Machine
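
The abstract of entry 5 names symmetrical uncertainty as its relevance measure but does not give a formula. The following is a minimal sketch of the standard definition, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), for discrete values. The function names are hypothetical, and discretizing expression values before calling it is an assumption; this is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Standard SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1.

    x and y are equal-length sequences of discrete values, for example a
    discretized gene expression profile and the class labels (assumption).
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    joint = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)
```

A gene whose discretized profile perfectly predicts the class label scores 1, and a gene that is independent of the label scores 0, which is what makes the measure usable for ranking genes by relevance.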

6.

Genovar: a detection and visualization tool for genomic variants.

Jung, Kwang Su; Moon, Sanghoon; Kim, Young Jin; Kim, Bong-Jo; Park, Kiejung.

BMC Bioinformatics ; 13 Suppl 7: S12, 2012 May 08.

Article in English | MEDLINE | ID: mdl-22594998

ABSTRACT

BACKGROUND: Along with single nucleotide polymorphisms (SNPs), copy number variation (CNV) is considered an important source of genetic variation associated with disease susceptibility. Despite the importance of CNV, the tools currently available for its analysis often produce false positive results due to limitations such as low resolution of array platforms, platform specificity, and the type of CNV. To resolve this problem, spurious signals must be separated from true signals by visual inspection. None of the previously reported CNV analysis tools support this function and the simultaneous visualization of comparative genomic hybridization arrays (aCGH) and sequence alignment. The purpose of the present study was to develop a useful program for the efficient detection and visualization of CNV regions that enables the manual exclusion of erroneous signals. RESULTS: A JAVA-based stand-alone program called Genovar was developed. To ascertain whether a detected CNV region is a novel variant, Genovar compares the detected CNV regions with previously reported CNV regions using the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation) and the Single Nucleotide Polymorphism Database (dbSNP). The current version of Genovar is capable of visualizing genomic data from sources such as the aCGH data file and sequence alignment format files. CONCLUSIONS: Genovar is freely accessible and provides a user-friendly graphic user interface (GUI) to facilitate the detection of CNV regions. The program also provides comprehensive information to help in the elimination of spurious signals by visual inspection, making Genovar a valuable tool for reducing false positive CNV results. AVAILABILITY: http://genovar.sourceforge.net/.

Subject(s)

DNA Copy Number Variations , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Software , Chromosomes, Human , Comparative Genomic Hybridization , Genomics , Humans

7.

Exome sequencing and subsequent association studies identify five amino acid-altering variants influencing human height.

Kim, Jae-Jung; Park, Young-Mi; Baik, Kyu-Heum; Choi, Hye-Yeon; Yang, Gap-Seok; Koh, InSong; Hwang, Jung-Ah; Lee, Jieun; Lee, Yeon-Su; Rhee, Hwanseok; Kwon, Tae Soo; Han, Bok-Ghee; Heath, Karen E; Inoue, Hiroshi; Yoo, Han-Wook; Park, Kiejung; Lee, Jong-Keuk.

Hum Genet ; 131(3): 471-8, 2012 Mar.

Article in English | MEDLINE | ID: mdl-21959382

ABSTRACT

Height is a highly heritable trait that involves multiple genetic loci. To identify causal variants that influence stature, we sequenced whole exomes of four children with idiopathic short stature. Ninety-five nonsynonymous single-nucleotide polymorphisms (nsSNPs) were selected as potential candidate variants. We performed association analysis in 740 cohort individuals and identified 11 nsSNPs in 10 loci (DIS3L2, ZBTB38, FAM154A, PTCH1, TSSC4, KIF18A, GPR133, ACAN, FAM59A, and NINL) associated with adult height (P < 0.05), including five novel loci. Of these, two nsSNPs (TSSC4 and KIF18A loci) were significant at P < 0.05 in the replication study (n = 1,000) and five (ZBTB38, FAM154A, TSSC4, KIF18A, and FAM59A loci) were significant at P < 0.01 in the combined analysis (n = 1,740). Together, the five nsSNPs accounted for approximately 2.5% of the height variation. This study demonstrated the utility of next-generation sequencing in identifying genetic variants and loci associated with complex traits.

Subject(s)

Body Height/genetics , Exome , Polymorphism, Single Nucleotide , Female , Gene Expression Profiling , Genome, Human , Growth Disorders/genetics , Humans , Korea , Male , Sequence Analysis, DNA

8.

Identification of methylation-dependent regulatory elements for intergenic miRNAs in human H4 cells.

Lee, Kwang Hee; Kim, Hyunyoung; Lee, Byeong Jae; Park, Kiejung.

Biochem Biophys Res Commun ; 420(2): 391-6, 2012 Apr 06.

Article in English | MEDLINE | ID: mdl-22425776

ABSTRACT

MicroRNAs (miRNAs) are important post-transcriptional regulators of various biological processes. Although our knowledge of miRNA expression and regulation has increased considerably in recent years, the regulatory elements for miRNA gene expression (especially for intergenic miRNAs) are not fully understood. In this study, we identified differentially methylated regions (DMRs) within 1000 bp upstream from the start site of intergenic miRNAs in human neuroglioma cells using microarrays. Then we identified a unique sequence pattern, C[N](6)CT, within the DMRs using motif analysis. Interestingly, treatment of cells with a methyltransferase inhibitor (5-aza-2'-deoxycytidine, DAC) significantly increased expression of miRNA genes with a high frequency of the C[N](6)CT motif in DMRs. Statistical analysis showed that the frequency of the C[N](6)CT motif in DMRs is highly correlated with intergenic miRNA gene expression, suggesting that C[N](6)CT motifs associated with DNA methylation regions play a role as regulatory elements for intergenic miRNA gene expression.

Subject(s)

DNA Methylation , Gene Expression Regulation , MicroRNAs/genetics , Regulatory Elements, Transcriptional , Azacitidine/analogs & derivatives , Azacitidine/pharmacology , Cell Line, Tumor , DNA Modification Methylases/antagonists & inhibitors , DNA, Intergenic , Decitabine , Humans , Oligonucleotide Array Sequence Analysis , Sequence Analysis, DNA
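
The C[N](6)CT pattern in entry 8 can be illustrated with a short counting sketch. It assumes the notation means a C, followed by any six nucleotides, followed by CT. That reading, the function name, and the choice to count overlapping occurrences are assumptions rather than details given in the abstract.

```python
import re

# Assumed reading of C[N](6)CT: 'C', then any six bases, then 'CT'.
# The lookahead makes overlapping occurrences count separately.
MOTIF = re.compile(r"(?=C[ACGTN]{6}CT)")


def motif_frequency(seq):
    """Return (count, count per base) of the assumed motif in a DNA string."""
    seq = seq.upper()
    count = sum(1 for _ in MOTIF.finditer(seq))
    return count, (count / len(seq) if seq else 0.0)
```

For example, `motif_frequency("CACAAAACTCT")` finds two overlapping occurrences, starting at offsets 0 and 2. Per-region frequencies of this kind are the sort of quantity one could correlate with expression, as the abstract describes.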

9.

WeGAS: a web-based microbial genome annotation system.

Lee, Daesang; Seo, Hwajung; Park, Chankyu; Park, Kiejung.

Biosci Biotechnol Biochem ; 73(1): 213-6, 2009 Jan.

Article in English | MEDLINE | ID: mdl-19129632

ABSTRACT

We have developed WeGAS, a Web-based microbial Genome Annotation System, which provides features that include gene prediction, homology search, promoter/motif analysis, genome browsing, gene ontology analysis based on the COGs and GO, and metabolic pathway analysis with web-based interfaces. Most raw data and intermediate data from genome projects can be managed with the WeGAS database system, and analysis results, including information on each gene and final genome maps, are provided by its visualization modules. In particular, a pie-view browser displaying circular maps of contigs and a COG-GO combination browser are very helpful for an overview of projects. Major public microbial genome databases can be imported, searched, and browsed through the WeGAS modules. WeGAS is freely accessible at http://ns.smallsoft.co.kr:8051.

Subject(s)

Algorithms , Databases, Genetic , Genome, Bacterial , Computer Graphics , Genes, Bacterial , Information Storage and Retrieval , Internet

10.

MapsiDB: an integrated web database for type I polyketide synthases.

Tae, Hongseok; Sohng, Jae Kyung; Park, Kiejung.

Bioprocess Biosyst Eng ; 32(6): 723-7, 2009 Oct.

Article in English | MEDLINE | ID: mdl-19205748

ABSTRACT

Polyketides have diverse biological activities, including pharmacological functions such as antibiotic, antitumor and agrochemical properties. They are biosynthesized from short carboxylic acid precursors by polyketide synthases (PKSs). As natural polyketide products include many clinically important drugs and the volume of data on polyketides is rapidly increasing, the development of a database system to manage polyketide data is essential. MapsiDB is an integrated web database formulated to contain data on type I polyketides and their PKSs, including domain and module composition and related genome information. Data on polyketides were collected from journals and online resources and processed with analysis programs. Web interfaces were utilized to construct and to access this database, allowing polyketide researchers to add their data to this database and to use it easily.

Subject(s)

Databases, Protein , Polyketide Synthases/chemistry , Database Management Systems , Databases, Genetic , Internet , Macrolides/chemistry , Macrolides/classification , Macrolides/metabolism , Polyketide Synthases/genetics , Polyketide Synthases/metabolism , User-Computer Interface

11.

Development of an analysis program of type I polyketide synthase gene clusters using homology search and profile hidden Markov model.

Tae, Hongseok; Sohng, Jae Kyung; Park, Kiejung.

J Microbiol Biotechnol ; 19(2): 140-6, 2009 Feb.

Article in English | MEDLINE | ID: mdl-19307762

ABSTRACT

MAPSI (Management and Analysis for Polyketide Synthase Type I) has been developed to offer computational analysis methods to detect type I PKS (polyketide synthase) gene clusters in genome sequences. MAPSI provides a genome analysis component, which detects PKS gene clusters by identifying domains in proteins of a genome. MAPSI also contains databases on polyketides and genome annotation data, as well as analytic components such as new PKS assembly and domain analysis. The polyketide data and analysis component are accessible through Web interfaces and are displayed with diverse information. MAPSI, which was developed to aid researchers studying type I polyketides, provides diverse components to access and analyze polyketide information and should become a very powerful computational tool for polyketide research. The system can be extended through further studies of factors related to the biological activities of polyketides.

Subject(s)

Bacteremia/genetics , Multigene Family , Polyketide Synthases/genetics , Software , Algorithms , Computational Biology , Databases, Protein , Genome, Bacterial , Markov Chains , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid , User-Computer Interface

12.

Analysis of unmapped regions associated with long deletions in Korean whole genome sequences based on short read data.

Lee, Yuna; Park, Kiejung; Koh, Insong.

Genomics Inform ; 17(4): e40, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31896240

ABSTRACT

While studies aimed at detecting and analyzing indels or single nucleotide polymorphisms within human genomic sequences have been actively conducted, studies on detecting long insertions/deletions are not easy to orchestrate. For the last 10 years, the availability of long read data of human genomes from PacBio or Nanopore platforms has increased, which makes it easier to detect long insertions/deletions. However, because long read data have a critical disadvantage due to their relatively high cost, many next generation sequencing data are produced mainly by short read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, where no reads are mapped on the reference genome), scanned 40 Korean genomes to select UMR long deletion candidates, and compared the candidates with the long deletion break points within the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs were found in the 40 Korean genomes tested, 284 UMRs were common across the 40 genomes, and a total of 37,943 UMRs were found. Compared with the 74,045 break points provided by the 1KGP, 30,698 UMRs overlapped. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased. This eventually reached a peak of 80.9% of the total UMRs found in this study. As the total number of overlapped UMRs could probably grow to encompass 74,045 break points with the inclusion of more Korean genomes, this approach could be practically useful for studies on long deletions utilizing short read data.
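
The unmapped-region idea in entry 12 (stretches of the reference where no reads map) can be sketched as a scan for zero-depth runs. The input format (a per-base depth list), the function names, and the `min_length` filter are illustrative assumptions. The authors' programs are not described in enough detail to reproduce here.

```python
def unmapped_regions(depth, min_length=1):
    """Return half-open (start, end) intervals where read depth is zero.

    depth is a list of per-base read depths along one reference sequence
    (hypothetical input format); min_length drops short zero-depth runs.
    """
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos  # a zero-depth run begins here
        elif d != 0 and start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


def overlaps_breakpoint(region, breakpoints):
    """True if any known deletion break point falls inside the region."""
    start, end = region
    return any(start <= bp < end for bp in breakpoints)
```

Comparing candidate regions against a list of known break points in this way yields the kind of overlap count the abstract reports. The 80.9% figure is consistent with 30,698 overlapping UMRs out of 37,943 in total.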

13.

A scaffold analysis tool using mate-pair information in genome sequencing.

Kim, Pan-Gyu; Cho, Hwan-Gue; Park, Kiejung.

J Biomed Biotechnol ; 2008: 675741, 2008.

Article in English | MEDLINE | ID: mdl-18414585

ABSTRACT

We have developed a Windows-based program, ConPath, as a scaffold analyzer. ConPath constructs scaffolds by ordering and orienting separate sequence contigs by exploiting the mate-pair information between contig-pairs. Our algorithm builds directed graphs from link information and traverses them to find the longest acyclic graphs. Using end read pairs of fixed-sized mate-pair libraries, ConPath determines relative orientations of all contigs, estimates the gap size of each adjacent contig pair, and reports wrong assembly information by validating orientations and gap sizes. We have utilized ConPath in more than 10 microbial genome projects, including Mannheimia succiniciproducens and Vibrio vulnificus, where we verified contig assembly and identified several erroneous contigs using the four types of error defined in ConPath. Also, ConPath supports some convenient features and viewers that permit investigation of each contig in detail; these include contig viewer, scaffold viewer, edge information list, mate-pair list, and the printing of complex scaffold structures.

Subject(s)

Algorithms , Contig Mapping/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Base Pair Mismatch , Base Sequence , Molecular Sequence Data
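
Entry 13 describes building directed graphs from mate-pair links and traversing them to find the longest acyclic structures. As a rough illustration only, the sketch below finds one longest path in a directed acyclic link graph by dynamic programming over a topological order. This is a generic textbook technique standing in for ConPath's traversal, which the abstract does not specify. The contig names and function name are hypothetical.

```python
from collections import defaultdict, deque


def longest_path(edges):
    """Return one longest path (list of nodes) in a directed acyclic graph.

    edges is an iterable of (u, v) pairs, read here as "contig u is linked
    to a following contig v" (an assumption for illustration).
    Raises ValueError if the links contain a cycle.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    if not nodes:
        return []

    # Kahn's algorithm: process nodes in topological order.
    queue = deque(sorted(n for n in nodes if indeg[n] == 0))
    best_len = {n: 1 for n in nodes}   # longest path length ending at n
    prev = {n: None for n in nodes}    # predecessor on that path
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            if best_len[u] + 1 > best_len[v]:
                best_len[v] = best_len[u] + 1
                prev[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("link graph contains a cycle")

    # Walk back from the node where the longest path ends.
    end = max(sorted(nodes), key=lambda n: best_len[n])
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    return path[::-1]
```

A real scaffolder would also have to track contig orientation and gap-size estimates on each edge, as the abstract notes. This sketch ignores both and only orders contigs.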

14.

ASMPKS: an analysis system for modular polyketide synthases.

Tae, Hongseok; Kong, Eun-Bae; Park, Kiejung.

BMC Bioinformatics ; 8: 327, 2007 Sep 03.

Article in English | MEDLINE | ID: mdl-17764579

ABSTRACT

BACKGROUND: Polyketides are secondary metabolites of microorganisms with diverse biological activities, including pharmacological functions such as antibiotic, antitumor and agrochemical properties. Polyketides are synthesized by serialized reactions of a set of enzymes called polyketide synthases (PKSs), which coordinate the elongation of carbon skeletons by the stepwise condensation of short carbon precursors. Due to their importance as drugs, the volume of data on polyketides is rapidly increasing and creating a need for computational analysis methods for efficient polyketide research. Moreover, the increasing use of genetic engineering to research new kinds of polyketides requires genome-wide analysis. RESULTS: We describe a system named ASMPKS (Analysis System for Modular Polyketide Synthesis) for computational analysis of PKSs against genome sequences. It also provides overall management of information on modular PKS, including polyketide database construction, new PKS assembly, and chain visualization. ASMPKS operates on a web interface to construct the database and to analyze PKSs, allowing polyketide researchers to add their data to this database and to use it easily. In addition, ASMPKS can predict functional modules for a protein sequence submitted by users, estimate the chemical composition of a polyketide synthesized from the modules, and display the carbon chain structure on the web interface. CONCLUSION: ASMPKS has powerful computation features to aid modular PKS research. As various factors, such as starter units and post-processing, are related to polyketide biosynthesis, ASMPKS will be improved through further development to study these factors.

Subject(s)

Computational Biology/methods , Polyketide Synthases/chemistry , Polyketide Synthases/genetics , Algorithms , Carbon/chemistry , Catalytic Domain , Computers , Genetic Engineering , Genome, Bacterial , Genomics/methods , Models, Biological , Models, Theoretical , Multienzyme Complexes/chemistry , Software

15.

HIA: a genome mapper using hybrid index-based sequence alignment.

Choi, Jongpill; Park, Kiejung; Cho, Seong Beom; Chung, Myungguen.

Algorithms Mol Biol ; 10: 30, 2015.

Article in English | MEDLINE | ID: mdl-26702294

ABSTRACT

BACKGROUND: A number of alignment tools have been developed to align sequencing reads to the human reference genome. The scale of information from next-generation sequencing (NGS) experiments, however, is increasing rapidly. Recent studies based on NGS technology have routinely produced exome or whole-genome sequences from several hundreds or thousands of samples. To accommodate the increasing need of analyzing very large NGS data sets, it is necessary to develop faster, more sensitive and accurate mapping tools. RESULTS: HIA uses two indices, a hash table index and a suffix array index. The hash table performs direct lookup of a q-gram, and the suffix array performs very fast lookup of variable-length strings by exploiting binary search. We observed that combining hash table and suffix array (hybrid index) is much faster than the suffix array method for finding a substring in the reference sequence. We define a matching region (MR) as a longest common substring between a reference and a read, and a candidate alignment region (CAR) as a list of MRs that are close to each other. The hybrid index is used to find CARs between a reference and a read. We found that aligning only the unmatched regions in the CAR is much faster than aligning the whole CAR. In benchmark analysis, HIA outperformed the other aligners in mapping speed, without significant loss of mapping accuracy. CONCLUSIONS: Our experiments show that the hybrid of hash table and suffix array is useful in terms of speed for mapping NGS sequencing reads to the human reference genome sequence. In conclusion, our tool is appropriate for aligning massive data sets generated by NGS sequencing.
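
The two lookups described in entry 15 (direct hash lookup of a q-gram, and binary search over a suffix array) can be sketched on a toy reference. How HIA actually combines the two indices is not stated in the abstract. The `find_hybrid` function below, which uses the hash table to seed candidates and then verifies them, is an illustrative assumption, and all function names are hypothetical.

```python
from collections import defaultdict


def build_qgram_index(ref, q):
    """Hash table mapping every q-gram of ref to its start positions."""
    index = defaultdict(list)
    for i in range(len(ref) - q + 1):
        index[ref[i:i + q]].append(i)
    return index


def build_suffix_array(ref):
    """Naive suffix array construction; adequate for a toy example only."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def find_with_suffix_array(ref, sa, pattern):
    """All start positions of pattern, located by binary search on sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound: first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if ref[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    while lo < len(sa) and ref[sa[lo]:sa[lo] + m] == pattern:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)


def find_hybrid(ref, qindex, q, pattern):
    """Seed with a hash lookup of the first q bases, then verify the rest."""
    if len(pattern) < q:
        raise ValueError("pattern shorter than q")
    return [i for i in qindex.get(pattern[:q], []) if ref.startswith(pattern, i)]
```

On the toy reference `"ACGTACGTTACG"`, both lookups report `"ACG"` at positions 0, 4 and 9. The hash lookup handles fixed-length seeds in one step, while the suffix array handles patterns of any length.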

16.

Increased expression of interferon signaling genes in the bone marrow microenvironment of myelodysplastic syndromes.

Kim, Miyoung; Hwang, Seungwoo; Park, Kiejung; Kim, Seon Young; Lee, Young Kyung; Lee, Dong Soon.

PLoS One ; 10(3): e0120602, 2015.

Article in English | MEDLINE | ID: mdl-25803272

ABSTRACT

INTRODUCTION: The bone marrow (BM) microenvironment plays an important role in the pathogenesis of myelodysplastic syndromes (MDS) through a reciprocal interaction with resident BM hematopoietic cells. We investigated the differences between BM mesenchymal stromal cells (MSCs) in MDS and normal individuals and identified genes involved in such differences. MATERIALS AND METHODS: BM-derived MSCs from 7 MDS patients (3 RCMD, 3 RAEB-1, and 1 RAEB-2) and 7 controls were cultured. Global gene expression was analyzed using a microarray. RESULT: We found 314 differentially expressed genes (DEGs) in RCMD vs. control, 68 in RAEB vs. control, and 51 in RAEB vs. RCMD. All comparisons were clearly separated from one another by hierarchical clustering. The overall similarity between differential expression signatures from the RCMD vs. control comparison and the RAEB vs. control comparison was highly significant (p = 0), which indicates a common transcriptomic response in these two MDS subtypes. RCMD and RAEB simultaneously showed an up-regulation of interferon alpha/beta signaling and the ISG15 antiviral mechanism, and a significant fraction of the RAEB vs. control DEGs were also putative targets of transcription factors IRF and ICSBP. Pathways that involved RNA polymerases I and III and mitochondrial transcription were down-regulated in RAEB compared to RCMD. CONCLUSION: Gene expression in the MDS BM microenvironment was different from that in normal BM and exhibited altered expression according to disease progression. The present study provides genetic evidence that inflammation and immune dysregulation responses that involve the interferon signaling pathway in the BM microenvironment are associated with MDS pathogenesis, which suggests BM MSCs as a possible therapeutic target in MDS.

Subject(s)

Bone Marrow Cells/pathology , Cellular Microenvironment/genetics , Interferons/metabolism , Myelodysplastic Syndromes/genetics , Myelodysplastic Syndromes/pathology , Signal Transduction/genetics , Transcriptome , Adult , Aged , Bone Marrow Cells/immunology , Female , Humans , Immunophenotyping , Male , Mesenchymal Stem Cells/immunology , Mesenchymal Stem Cells/pathology , Middle Aged , Myelodysplastic Syndromes/immunology , Transcription Factors/metabolism

17.

A method for identifying splice sites and translation start sites in human genomic sequences.

Kim, Ki-Bong; Park, Kiejung; Kong, Eun Bae.

J Biochem Mol Biol ; 35(5): 513-7, 2002 Sep 30.

Article in English | MEDLINE | ID: mdl-12359095

ABSTRACT

We describe a new method for identifying the sequences that signal the start of translation, and the boundaries between exons and introns (donor and acceptor sites) in human mRNA. According to the mandatory keyword, ORGANISM, and feature key, CDS, a large set of standard data for each signal site was extracted from the ASCII flat file, gbpri.seq, in the GenBank release 108.0. This was used to generate the scoring matrices, which summarize the sequence information for each signal site. The scoring matrices take into account the independent nucleotide frequencies between adjacent bases in each position within the signal site regions, and the relative weight on each nucleotide in proportion to their probabilities in the known signal sites. Using a scoring scheme that is based on the nucleotide scoring matrices, the method has great sensitivity and specificity when used to locate signals in uncharacterized human genomic DNA. These matrices are especially effective at distinguishing true and false sites.

Subject(s)

Genetic Techniques , Genome, Human , Protein Biosynthesis , RNA Splice Sites , Consensus Sequence , Humans , Sequence Analysis, DNA
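
Entry 17 scores candidate signal sites with nucleotide scoring matrices built from known sites. The sketch below shows a generic position weight matrix with log-odds scores, which is one common form of such a matrix. The pseudocount, the uniform background frequency, the function names, and the toy donor-like sequences are all assumptions. The authors' matrices also model adjacent-base effects, which this sketch does not.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds scores for a window of matching length."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_site(pwm, seq):
    """Start offset of the highest-scoring window in seq (A/C/G/T only)."""
    width = len(pwm)
    return max(range(len(seq) - width + 1),
               key=lambda i: score(pwm, seq[i:i + width]))
```

Scanning a sequence with `best_site` and applying a score threshold is the usual way such a matrix is used to separate likely true sites from false ones.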

18.

Identification of ethnically specific genetic variations in pan-asian ethnos.

Yang, Jin Ok; Hwang, Sohyun; Kim, Woo-Yeon; Park, Seong-Jin; Kim, Sang Cheol; Park, Kiejung; Lee, Byungwook.

Genomics Inform ; 12(1): 42-7, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24748860

ABSTRACT

Asian populations contain a variety of ethnic groups that have ethnically specific genetic differences. Ethnic variants may be highly relevant in disease and human differentiation studies. Here, we identified ethnically specific variants and then investigated their distribution across Asian ethnic groups. We obtained 58,960 Pan-Asian single nucleotide polymorphisms of 1,953 individuals from 72 ethnic groups of 11 Asian countries. We selected 9,306 ethnic variant single nucleotide polymorphisms (ESNPs) and 5,167 ethnic variant copy number polymorphisms (ECNPs) using the nearest shrunken centroid method. We analyzed ESNPs and ECNPs in 3 hierarchical levels: superpopulation, subpopulation, and ethnic population. We also identified ESNP- and ECNP-related genes and their features. This study represents the first attempt to identify Asian ESNP and ECNP markers, which can be used to identify genetic differences and predict disease susceptibility and drug effectiveness in Asian ethnic populations.

19.

Genetic factors underlying discordance in chromatin accessibility between monozygotic twins.

Kim, Kwoneel; Ban, Hyo-Jeong; Seo, Jungmin; Lee, Kibaick; Yavartanoo, Maryam; Kim, Sang Cheol; Park, Kiejung; Cho, Seong Beom; Choi, Jung Kyoon.

Genome Biol ; 15(5): R72, 2014 May 29.

Article in English | MEDLINE | ID: mdl-24887574

ABSTRACT

BACKGROUND: Open chromatin is implicated in regulatory processes; thus, variations in chromatin structure may contribute to variations in gene expression and other phenotypes. In this work, we perform targeted deep sequencing for open chromatin, and array-based genotyping across the genomes of 72 monozygotic twins to identify genetic factors regulating co-twin discordance in chromatin accessibility. RESULTS: We show that somatic mutations cause chromatin discordance mainly via the disruption of transcription factor binding sites. Structural changes in DNA due to C:G to A:T transversions are under purifying selection due to a strong impact on chromatin accessibility. We show that CpGs whose methylation is specifically regulated during cellular differentiation appear to be protected from high mutation rates of 5'-methylcytosines, suggesting that the spectrum of CpG variations may be shaped fully at the developmental level but not through natural selection. Based on the association mapping of within-pair chromatin differences, we search for cases in which twin siblings with a particular genotype had chromatin discordance at the relevant locus. We identify 1,325 chromatin sites that are differentially accessible, depending on the genotype of a nearby locus, suggesting that epigenetic differences can control regulatory variations via interactions with genetic factors. Poised promoters present high levels of chromatin discordance in association with either somatic mutations or genetic-epigenetic interactions. CONCLUSION: Our observations illustrate how somatic mutations and genetic polymorphisms may contribute to regulatory, and ultimately phenotypic, discordance.

Subject(s)

Chromatin/metabolism; Oligonucleotide Array Sequence Analysis/methods; Transcription Factors/metabolism; Twins, Monozygotic/genetics; Adult; Base Sequence; Binding Sites; Chromatin/chemistry; Chromatin/genetics; Computational Biology/methods; DNA Methylation; Epigenesis, Genetic; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Humans; Middle Aged; Molecular Sequence Data; Mutation; Polymorphism, Single Nucleotide; Quantitative Trait Loci

20.

Expression signature defined by FOXM1-CCNB1 activation predicts disease recurrence in non-muscle-invasive bladder cancer.

Kim, Seon-Kyu; Roh, Yun-Gil; Park, Kiejung; Kang, Tae-Hong; Kim, Wun-Jae; Lee, Ju-Seog; Leem, Sun-Hee; Chu, In-Sun.

Clin Cancer Res ; 20(12): 3233-43, 2014 Jun 15.

Article in English | MEDLINE | ID: mdl-24714775

ABSTRACT

PURPOSE: Although standard treatment with transurethral resection and intravesical therapy (IVT) is known to be effective to address the clinical behavior of non-muscle-invasive bladder cancer (NMIBC), many patients fail to respond to the treatment and frequently experience disease recurrence. Here, we aim to identify a prognostic molecular signature that predicts the NMIBC heterogeneity and response to IVT. EXPERIMENTAL DESIGN: We analyzed the genomic profiles of 102 patients with NMIBC to identify a signature associated with disease recurrence. The validity of the signature was verified in three independent patient cohorts (n = 658). Various statistical methods, including a leave-one-out cross-validation and multivariate Cox regression analyses, were applied to identify a signature. We confirmed an association between the signature and tumor aggressiveness with experimental assays using bladder cancer cell lines. RESULTS: Gene expression profiling in 102 patients with NMIBC identified a CCNB1 signature associated with disease recurrence, which was validated in another three independent cohorts of 658 patients. The CCNB1 signature was shown to be an independent risk factor by a multivariate analysis and subset stratification according to stage and grade [HR, 2.93; 95% confidence intervals (CI), 1.302-6.594; P = 0.009]. The subset analysis also revealed that the signature could identify patients who would benefit from IVT. Finally, gene network analyses and experimental assays indicated that NMIBC recurrence could be mediated by FOXM1-CCNB1-Fanconi anemia pathways. CONCLUSIONS: The CCNB1 signature represents a promising diagnostic tool to identify patients with NMIBC who have a high risk of recurrence and to predict response to IVT.

Subject(s)

Biomarkers, Tumor/genetics; Cyclin B1/genetics; Forkhead Transcription Factors/genetics; Gene Expression Profiling; Neoplasm Recurrence, Local/genetics; Urinary Bladder Neoplasms/genetics; Adult; Aged; Aged, 80 and over; Cell Differentiation; Cell Proliferation; Cohort Studies; Female; Follow-Up Studies; Forkhead Box Protein M1; Humans; Male; Middle Aged; Neoplasm Grading; Neoplasm Invasiveness; Neoplasm Recurrence, Local/mortality; Neoplasm Recurrence, Local/pathology; Neoplasm Staging; Oligonucleotide Array Sequence Analysis; Prognosis; RNA, Messenger/genetics; Real-Time Polymerase Chain Reaction; Reverse Transcriptase Polymerase Chain Reaction; Survival Rate; Tumor Cells, Cultured; Urinary Bladder Neoplasms/mortality; Urinary Bladder Neoplasms/pathology; Young Adult
