ABSTRACT
The endangered whale shark (Rhincodon typus) is the largest fish on Earth and a long-lived member of the ancient Elasmobranchii clade. To characterize the relationship between genome features and biological traits, we sequenced and assembled the genome of the whale shark and compared its genomic and physiological features to those of 83 animals and yeast. We examined the scaling relationships between body size, temperature, metabolic rates, and genomic features and found both general correlations across the animal kingdom and features specific to the whale shark genome. Among animals, increased lifespan is positively correlated to body size and metabolic rate. Several genomic traits also significantly correlated with body size, including intron and gene length. Our large-scale comparative genomic analysis uncovered general features of metazoan genome architecture: Guanine and cytosine (GC) content and codon adaptation index are negatively correlated, and neural connectivity genes are longer than average genes in most genomes. Focusing on the whale shark genome, we identified multiple features that significantly correlate with lifespan. Among these were very long gene length, due to introns being highly enriched in repetitive elements such as CR1-like long interspersed nuclear elements, and considerably longer neural genes of several types, including connectivity, activity, and neurodegeneration genes. The whale shark genome also has the second slowest evolutionary rate observed in vertebrates to date. Our comparative genomics approach uncovered multiple genetic features associated with body size, metabolic rate, and lifespan and showed that the whale shark is a promising model for studies of neural architecture and lifespan.
Subject(s)
Adaptation, Physiological/genetics , Body Size/physiology , Sharks/genetics , Animals , Base Sequence/genetics , Body Size/genetics , Genome/genetics , Genomics/methods , Longevity/genetics , Sharks/metabolism , Temperature
ABSTRACT
BACKGROUND: Unique among cnidarians, jellyfish have remarkable morphological and biochemical innovations that allow them to actively hunt in the water column and were some of the first animals to become free-swimming. The class Scyphozoa, or true jellyfish, is characterized by a predominant medusa life-stage consisting of a bell and venomous tentacles used for hunting and defense, and by pulsed jet propulsion for mobility. Here, we present the genome of the giant Nomura's jellyfish (Nemopilema nomurai) to understand the genetic basis of these key innovations. RESULTS: We sequenced the genome and transcriptomes of the bell and tentacles of the giant Nomura's jellyfish as well as transcriptomes across tissues and developmental stages of the Sanderia malayensis jellyfish. Analyses of the Nemopilema and other cnidarian genomes revealed adaptations associated with swimming, marked by codon bias in muscle contraction and expansion of neurotransmitter genes, along with expanded Myosin type II family and venom domains, possibly contributing to jellyfish mobility and active predation. We also identified gene family expansions of Wnt and posterior Hox genes and discovered the important role of retinoic acid signaling in this ancient lineage of metazoans, which together may be related to the unique jellyfish body plan (medusa formation). CONCLUSIONS: Taken together, the Nemopilema jellyfish genome and transcriptomes genetically confirm their unique morphological and physiological traits, which may have contributed to the success of jellyfish as early multi-cellular predators.
Subject(s)
Evolution, Molecular , Genome/physiology , Predatory Behavior , Scyphozoa/physiology , Animals , Biological Evolution , Phylogeny , Scyphozoa/genetics
ABSTRACT
Microsatellite instability (MSI) is a critical mechanism that drives genetic aberrations in cancer. To identify the full set of microsatellite (MS) mutations, we performed the first comprehensive genome- and transcriptome-wide analyses of mutations associated with MSI in Korean gastric cancer cell lines and primary tissues. We identified 18,377 MS mutations of five or more repeat nucleotides in coding sequences and untranslated regions (UTRs) of genes, and discovered 139 individual genes whose expression was down-regulated in association with UTR MS mutation. In addition, we found that 90.5% of MS mutations with deletions in gene regions occurred in UTRs. This analysis emphasizes the genetic diversity of MSI-H gastric tumors and provides clues to the mechanistic basis of instability in microsatellite unstable gastric cancers.
Subject(s)
Asian People/genetics , Genome-Wide Association Study , Microsatellite Instability , Mutation , Stomach Neoplasms/genetics , Transcriptome , Cell Line, Tumor , Frameshift Mutation , Gene Expression Regulation, Neoplastic , Gene Frequency , High-Throughput Nucleotide Sequencing , Humans , Microsatellite Repeats , RNA Processing, Post-Transcriptional , RNA Stability , Republic of Korea , Sequence Deletion , Untranslated Regions
ABSTRACT
Homozygous deletion is a frequent mutational mechanism of silencing tumor suppressor genes in cancer. Therefore, homozygous deletions have been analyzed for identification of tumor suppressor genes that can be utilized as biomarkers or therapeutic targets for cancer treatment. In this study, to elucidate potential tumor suppressor genes involved in gastric cancer (GC), we analyzed the entire set of large homozygous deletions in six human GC cell lines through genome- and transcriptome-wide approaches. We identified 51 genes in homozygous deletion regions of chromosomes and confirmed the deletion frequency in tumor tissues of 219 GC patients from The Cancer Genome Atlas database. We evaluated the effect of homozygous deletions on the mRNA level and found significantly affected genes in chromosome bands 9p21, 3p22, 5p14, and 6q15. Among the genes in 9p21, we investigated the potential tumor suppressive effect of KLHL9. We demonstrated that ectopic expression of KLHL9 inhibited cell proliferation and tumor formation in the KLHL9-deficient SNU-16 cell line. In addition, we observed that homozygous focal deletions generated truncated transcripts of TGFBR2, CTNNA1, and STXBP5. Ectopic expression of two kinds of TGFBR2-reverse GADL1 fusion genes suppressed TGF-β signaling, which may lead to the loss of sensitivity to TGF-β tumor suppressive activity. In conclusion, our findings suggest that novel tumor suppressor genes that are aberrantly expressed through homozygous deletions may play important roles in gastric tumorigenesis.
Subject(s)
Chromosome Deletion , Gene Expression Regulation, Neoplastic , Genes, Tumor Suppressor , Stomach Neoplasms/genetics , Animals , Cell Line, Tumor , Chromosomes, Human, Pair 3 , Chromosomes, Human, Pair 5 , Chromosomes, Human, Pair 6 , Chromosomes, Human, Pair 9 , Female , Humans , Mice , Mice, Nude , Protein Serine-Threonine Kinases/genetics , RNA, Messenger/metabolism , Receptor, Transforming Growth Factor-beta Type II , Receptors, Transforming Growth Factor beta/genetics
ABSTRACT
BACKGROUND: Pakistan covers a key geographic area in human history, being both part of the Indus River region that acted as one of the cradles of civilization and a link between Western Eurasia and Eastern Asia. This region is inhabited by a number of distinct ethnic groups, the largest being the Punjabi, Pathan (Pakhtuns), Sindhi, and Baloch. RESULTS: We analyzed the first ethnic male Pathan genome by sequencing it to 29.7-fold coverage using the Illumina HiSeq2000 platform. A total of 3.8 million single nucleotide variations (SNVs) and 0.5 million small indels were identified by comparing with the human reference genome. Among the SNVs, 129,441 were novel, and 10,315 nonsynonymous SNVs were found in 5,344 genes. SNVs were annotated for health consequences and high risk diseases, as well as possible influences on drug efficacy. We confirmed that the Pathan genome presented here is representative of this ethnic group by comparing it to a panel of Central Asians from the HGDP-CEPH panels typed for ~650 k SNPs. The mtDNA (H2) and Y haplogroup (L1) of this individual were also typical of his geographic region of origin. Finally, we reconstructed the demographic history using PSMC, which highlights a recent increase in effective population size compatible with admixture between European and Asian lineages expected in this geographic region. CONCLUSIONS: We present a whole-genome sequence and analyses of an ethnic Pathan from the north-west province of Pakistan. It is a useful resource to understand genetic variation and human migration across the whole Asian continent.
Subject(s)
Genetic Variation , Genome, Human , Chromosomes, Human, Y , DNA, Mitochondrial/chemistry , Demography , Humans , Male , Pakistan/ethnology , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: The horse (Equus ferus caballus) is one of the earliest domesticated species and has played an important role in the development of human societies over the past 5,000 years. In this study, we characterized the genome of the Marwari horse, a rare breed with unique phenotypic characteristics, including inwardly turned ear tips. It is thought to have originated from the crossbreeding of local Indian ponies with Arabian horses beginning in the 12th century. RESULTS: We generated 101 Gb (~30× coverage) of whole genome sequences from a Marwari horse using the Illumina HiSeq2000 sequencer. The sequences were mapped to the horse reference genome at a mapping rate of ~98% and with ~95% of the genome having at least 10× coverage. A total of 5.9 million single nucleotide variations, 0.6 million small insertions or deletions, and 2,569 copy number variation blocks were identified. We confirmed a strong Arabian and Mongolian component in the Marwari genome. Novel variants from the Marwari sequences were annotated, and were found to be enriched in olfactory functions. Additionally, we suggest a potential functional genetic variant in the TSHZ1 gene (p.Ala344>Val) associated with the inward-turning ear tip shape of the Marwari horses. CONCLUSIONS: Here, we present an analysis of the Marwari horse genome. This is the first genomic data for an Asian breed, and is an invaluable resource for future studies of genetic variation associated with phenotypes and diseases in horses.
Subject(s)
Genome/genetics , Genomics , Horses/genetics , Sequence Analysis, DNA , Amino Acid Sequence , Animals , Evolution, Molecular , Genetic Variation , Genotype , Humans , Hybridization, Genetic , Male , Molecular Sequence Data , Phenotype , Selection, Genetic , Species Specificity
ABSTRACT
BACKGROUND: In contrast with wild species, cultivated crop genomes consist of reshuffled recombination blocks, which occurred by crossing and selection processes. Accordingly, recombination block-based genomics analysis can be an effective approach for the screening of target loci for agricultural traits. RESULTS: We propose the variation block method, which is a three-step process for recombination block detection and comparison. The first step is to detect variations by comparing the short-read DNA sequences of the cultivar to the reference genome of the target crop. Next, sequence blocks with variation patterns are examined and defined. The boundaries between the variation-containing sequence blocks are regarded as recombination sites. All the assumed recombination sites in the cultivar set are used to split the genomes, and the resulting sequence regions are termed variation blocks. Finally, the genomes are compared using the variation blocks. The variation block method identified recurring recombination blocks accurately and successfully represented block-level diversities in the publicly available genomes of 31 soybean and 23 rice accessions. The practicality of this approach was demonstrated by the identification of a putative locus determining soybean hilum color. CONCLUSIONS: We suggest that the variation block method is an efficient genomics method for the recombination block-level comparison of crop genomes. We expect that this method will facilitate the development of crop genomics by bringing genomics technologies to the field of crop breeding.
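The three steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how variation-containing blocks are defined, so the fixed window size, the `min_variants` threshold, and all function names below are assumptions made purely for illustration. Variant positions are taken to be 0-based and smaller than `genome_len`.

```python
from typing import Dict, List, Tuple


def window_states(positions: List[int], genome_len: int,
                  window: int, min_variants: int) -> List[bool]:
    """Step 2 (assumed rule): mark each fixed window as variant-dense or sparse."""
    n_windows = (genome_len + window - 1) // window
    counts = [0] * n_windows
    for pos in positions:
        counts[pos // window] += 1
    return [c >= min_variants for c in counts]


def boundaries(states: List[bool], window: int) -> List[int]:
    """Coordinates where the dense/sparse state flips: assumed recombination sites."""
    return [i * window for i in range(1, len(states)) if states[i] != states[i - 1]]


def variation_blocks(cultivars: Dict[str, List[int]], genome_len: int,
                     window: int = 1000, min_variants: int = 2
                     ) -> Tuple[List[Tuple[int, int]], Dict[str, List[bool]]]:
    """Split the genome at every boundary seen in any cultivar, then
    report each cultivar's state per block so cultivars can be compared."""
    states = {name: window_states(pos, genome_len, window, min_variants)
              for name, pos in cultivars.items()}
    # Pool the boundaries of all cultivars into one sorted set of cut points.
    cuts = sorted({b for s in states.values() for b in boundaries(s, window)})
    edges = [0] + cuts + [genome_len]
    blocks = list(zip(edges[:-1], edges[1:]))
    # Within a block no cultivar changes state, so the first window represents it.
    table = {name: [s[start // window] for start, _ in blocks]
             for name, s in states.items()}
    return blocks, table


# Step 1 is assumed done upstream: variant positions per cultivar vs. the reference.
example = {"A": [100, 200, 300, 2500], "B": [1100, 1200, 2100, 2200]}
blocks, table = variation_blocks(example, genome_len=4000)
```

In this toy input the two hypothetical cultivars differ in the first two blocks and agree in the last one, which is the kind of block-level comparison the abstract describes.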
Subject(s)
Crops, Agricultural/genetics , Genome, Plant , Glycine max/genetics , Base Sequence , Chromosome Mapping , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Sequence Analysis, DNA
ABSTRACT
The common data model (CDM) has found widespread application in healthcare studies, but its utilization in cancer research has been limited. This article describes the development and implementation strategy for Cancer Clinical Library Databases (CCLDs), which are standardized cancer-specific databases established under the Korea-Clinical Data Utilization Network for Research Excellence (K-CURE) project by the Korean Ministry of Health and Welfare. Fifteen leading hospitals and fourteen academic associations in Korea are engaged in constructing CCLDs for 10 primary cancer types. For each cancer type-specific CCLD, cancer data experts determine key clinical data items essential for cancer research, standardize these items across cancer types, and create a standardized schema. Comprehensive clinical records covering diagnosis, treatment, and outcomes, with annual updates, are collected for each cancer patient in the target population, and quality control is based on six-sigma standards. To protect patient privacy, CCLDs follow stringent data security guidelines by pseudonymizing personal identification information and operating within a closed analysis environment. Researchers can apply for access to CCLD data through the K-CURE portal, which is subject to Institutional Review Board and Data Review Board approval. The CCLD is considered a pioneering standardized cancer-specific database, significantly representing Korea's cancer data. It is expected to overcome limitations of previous CDMs and provide a valuable resource for multicenter cancer research in Korea.
ABSTRACT
H2 production via the water-gas shift (WGS) reaction is an important and widely applied process. Cobalt-modified CeO2 materials are promising catalysts for the WGS reaction. Herein, a series of Co/Nb-CeO2 catalysts were prepared by varying the rate of precipitant addition during coprecipitation and examined for hydrogen generation through the WGS reaction. The rates of precipitant addition were 1, 5, 15, and 25 mL/min. We obtained ceria-supported cobalt catalysts with different sizes and morphologies, such as 3 and 8 nm nanoclusters, 30 nm cubic nanoparticles, and 50 nm hexagonal nanoparticles. The well-dispersed small cobalt particles in the Co/Nb-CeO2 prepared at a 5 mL/min titration rate exhibit a strong interaction between cobalt oxide and CeO2 that retards the reduction of CoOx, producing Co-CoOx pairs. In contrast, 1-Co/Nb-CeO2 and 25-Co/Nb-CeO2 contain bigger, aggregated Co particles, resulting in fewer interfaces with CeO2. The Co0, Coδ+, Ce3+, and Ov species are responsible for improved reducibility in Co/Nb-CeO2 catalysts and were quantitatively measured using XPS, XAS, and Raman spectroscopy. The Co-CoOx interface assists dissociation of the H2O molecule; CO oxidation requires low activation energy and realizes a high turnover frequency of 9.8 s⁻¹. The 5-Co/Nb-CeO2 catalyst achieved thermodynamic equilibrium equivalent CO conversion with efficient H2 production during the WGS reaction at a gas hourly space velocity of 315,282 h⁻¹. Subsequently, the 5-Co/Nb-CeO2 catalyst exhibited stable performance for 168 h straight, attributed to stable CO-Coδ+ intermediate formation, which efficiently inhibits typical CO chemistry over the Co metal, making it suitable for hydrogen generation from waste-derived synthesis gas.
ABSTRACT
We assessed the performance of an ethnic-specific polygenic risk score (PRS) designed from a Korean population to predict aggressive prostate cancer (PCa) and early-onset disease (age < 60). A PRS comprising 22 SNPs was computed in 3695 patients, each recruited at one of 4 tertiary centers in Korea. Males with biopsy or radical prostatectomy-proven PCa were included for analysis, and additional clinical parameters such as age, BMI, PSA, Gleason Group (GG), and staging were collected. Patients were divided into 4 groups by PRS quartile. Intergroup differences were assessed, as well as risk ratio and predictive performance based on GG using logistic regression analysis and AUC. No significant intergroup differences were observed for BMI, PSA, and rate of ≥ T3a tumors on pathology. Rates of GG ≥ 2, GG ≥ 3, and GG ≥ 4 showed a significant pattern of increase by PRS quartile (p < 0.001, < 0.001, and 0.039, respectively). With the lowest PRS quartile as reference, higher PRS groups showed sequentially escalating risk for GG ≥ 2 and GG ≥ 3 pathology, with a 4.6-fold rise in GG ≥ 2 (p < 0.001) and 2.0-fold rise in GG ≥ 3 (p < 0.001) for the highest PRS quartile. Combining PRS with PSA improved prediction of early-onset clinically significant PCa (csPCa) (AUC 0.759) compared to PRS alone (AUC 0.627) and PSA alone (AUC 0.736). To conclude, an ethnic-specific PRS was found to predict susceptibility to aggressive PCa, in addition to improving detection of csPCa when combined with PSA in early-onset populations. PRS may have a role as a risk-stratification model in actual practice. Large-scale, multi-ethnic trials are required to validate our results.
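As a rough illustration of the scoring and quartile grouping described above, the following Python sketch computes a PRS as a weighted sum of risk-allele dosages and compares event rates across quartiles. The abstract does not state the scoring formula or the SNP weights, so the weighted-sum form, the weights, and every function name here are assumptions, and the example data are invented solely to exercise the code.

```python
import statistics
from typing import Dict, List, Sequence


def polygenic_risk_score(dosages: Sequence[int], weights: Sequence[float]) -> float:
    """Assumed scoring rule: sum of risk-allele dosage (0, 1, or 2) times a per-SNP weight."""
    return sum(d * w for d, w in zip(dosages, weights))


def quartile_labels(scores: Sequence[float]) -> List[int]:
    """Assign each score to quartile 1 (lowest) through 4 (highest)."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


def rate_by_quartile(labels: Sequence[int], outcomes: Sequence[int]) -> Dict[int, float]:
    """Fraction of patients with the outcome (e.g. GG >= 2) in each quartile."""
    rates = {}
    for q in (1, 2, 3, 4):
        group = [o for lab, o in zip(labels, outcomes) if lab == q]
        rates[q] = sum(group) / len(group) if group else float("nan")
    return rates


# Invented toy data: eight patients, outcome 1 = high-grade disease.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
labels = quartile_labels(scores)
rates = rate_by_quartile(labels, outcomes)
```

The study itself reports risk estimates from logistic regression; the per-quartile rates here are only a simpler descriptive stand-in.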
Subject(s)
Prostate-Specific Antigen , Prostatic Neoplasms , Humans , Male , Prostate/surgery , Prostate/pathology , Prostatectomy , Prostatic Neoplasms/pathology , Risk Factors , Asian People
ABSTRACT
PURPOSE: Germline mutations in DNA damage repair (DDR) genes such as BRCA2 have been associated with prostate cancer (PC) risk but have not been thoroughly evaluated for metastatic prostate cancer (mPC) in Asian men. This study evaluates the frequency of DDR mutations in the largest cohort of Koreans. MATERIALS AND METHODS: We recruited 340 patients with mPC unselected for family history of cancer and compared them to 495 controls. Whole genome sequencing was applied to assess germline pathogenic/likely pathogenic variants (PV/LPVs) in 26 DDR genes and HOXB13, including 7 genes (ATM, BRCA1/2, CHEK2, BRIP1, PALB2, and NBN) associated with hereditary PC. Comparisons to published Caucasian and Japanese cohorts were performed. RESULTS: A total of 28 PV/LPVs were identified in 30 (8.8%) patients; mutations were found in 13 genes, including BRCA2 (15 men [4.41%]), ATM (2 men [0.59%]), NBN (2 men [0.59%]), and BRIP1 (2 men [0.59%]). Only one patient had a HOXB13 mutation (0.29%). A lower overall germline variant frequency was observed in Korean mPC compared to Caucasians (8.8% vs. 11.8%), but individual variants notably differed from Caucasian and geographically similar Japanese cohorts. PV/LPVs in DDR genes tended to increase gradually with higher Gleason scores (GS 7, 7.1%; GS 8, 7.5%; GS 9-10, 9.9%). CONCLUSIONS: BRCA2 was the most frequently mutated gene common to different cohorts, supporting its importance, but differences in variant distribution in Korean mPC underscore the need for ethnic-specific genetic models. Future ethnic-specific analyses are warranted to verify our findings.
ABSTRACT
BACKGROUND: Thoroughbred horses are the most expensive domestic animals, and their running ability and knowledge about their muscle-related diseases are important in animal genetics. While the horse reference genome is available, there has been no large-scale functional annotation of the genome using expressed genes derived from transcriptomes. RESULTS: We present a large-scale analysis of whole transcriptome data. We sequenced the whole mRNA from the blood and muscle tissues of six thoroughbred horses before and after exercise. By comparison with current genome annotations, we identified 32,361 unigene clusters spanning 51.83 Mb that contained 11,933 (36.87%) annotated genes. More than 60% (20,428) of the unigene clusters did not match any current equine gene model. We also identified 189,973 single nucleotide variations (SNVs) from the sequences aligned against the horse reference genome. Most SNVs (171,558 SNVs; 90.31%) were novel when compared with over 1.1 million equine SNPs from two SNP databases. Using differential expression analysis, we further identified a number of exercise-regulated genes: 62 up-regulated and 80 down-regulated genes in the blood, and 878 up-regulated and 285 down-regulated genes in the muscle. Six of 28 previously-known exercise-related genes were over-expressed in the muscle after exercise. Among the differentially expressed genes, there were 91 transcription factor-encoding genes, which included 56 functionally unknown transcription factor candidates that are probably associated with an early regulatory exercise mechanism. In addition, we found interesting RNA expression patterns where different alternative splicing forms of the same gene showed reversed expression before and after exercise. CONCLUSION: This study provides the first sequencing-based horse transcriptome data, extensive analysis results, genes differentially expressed before and after exercise, and candidate genes related to exercise.
Subject(s)
Gene Expression Profiling/methods , Horses/genetics , Horses/physiology , Physical Conditioning, Animal/physiology , RNA/genetics , Animals
ABSTRACT
Various agents, including ethylenediaminetetraacetic acid, oxalic acid, citric acid, and HCl, were applied to remove heavy metals from raw paper incineration ash and render the ash recyclable. Among these prepared agent solutions, ethylenediaminetetraacetic acid showed the highest efficiency for Pb removal, while oxalic acid showed the highest efficiencies for Cu, Cd, and As removal. Additionally, three modes of an advanced removal method, which involved the use of both ethylenediaminetetraacetic acid and oxalic acid, were considered for use at the end of the rendering process. Among these three modes of the advanced removal method, that which involved the simultaneous use of ethylenediaminetetraacetic acid and oxalic acid, i.e., a mixture of both solutions, showed the best heavy metal removal efficiencies. In detail, 11.9% of Cd, 10% of Hg, 28.42% of As, 31.29% of Cu, and 49.19% of Pb were removed when this method was used. Furthermore, the application of these three modes of the advanced removal method resulted in a decrease in the amounts of heavy metals eluted and brought about an increase in the CaO content of the treated incineration ash, while decreasing its Cl content. These combined results enhanced the solidification effect of the treated incineration ash. Thus, it was confirmed that the advanced removal method is a promising strategy by which recyclable paper incineration ash can be obtained.
ABSTRACT
In this study, hydrogen production using food waste was optimized by investigating the effect of agitator types in anaerobic digestion reactors and catalysts for biogas reforming. The applied agitators were pitched blade and hydrofoil, and their effect on homogeneity was estimated using computational fluid dynamics. Reactors with different agitators were operated for 60 days for biogas production. Increased biogas production was observed in the reactor equipped with a hydrofoil agitator owing to its high homogeneity. In addition, Ni-CeZrO2 catalysts promoted with La2O3, CaO, or MgO were investigated for stable hydrogen production during the biogas reforming reaction, using simulated gas based on biogas from the anaerobic digestion reactor equipped with the hydrofoil agitator. Among the promoted catalysts, the MgO-promoted Ni-CeZrO2 catalyst displayed the best results for hydrogen production without significant deactivation. The stable catalytic performance of the MgO-promoted catalyst resulted from the close interaction between Ni and MgO, and its high oxygen storage capacity. Thus, 1216 L hydrogen and 646 L carbon monoxide were produced per kilogram volatile solid via the hydrogen production system that included anaerobic digestion and biogas reforming.
Subject(s)
Biofuels , Refuse Disposal , Anaerobiosis , Bioreactors , Food , Hydrogen , Magnesium Oxide , Methane
ABSTRACT
Cymbidium goeringii, commonly known as the spring orchid, has long been favoured for horticultural purposes in Asian countries. It is a popular orchid, and there is strong demand for improving and developing its valuable varieties. Until now, its reference genome has not been published despite its popularity and conservation efforts. Here, we report the de novo assembly of the C. goeringii genome, which is the largest among the orchids published to date, using a strategy that combines short- and long-read sequencing and chromosome conformation capture (Hi-C) information. The total length of all scaffolds is 3.99 Gb, with an N50 scaffold size of 178.2 Mb. A total of 29,556 protein-coding genes were annotated and 3.55 Gb (88.87% of genome) repetitive sequences were identified. We constructed pseudomolecular chromosomes using Hi-C, incorporating 89.4% of the scaffolds in 20 chromosomes. We identified 220 expanded and 106 contracted gene families in C. goeringii after divergence from its close relative. We also identified new gene families, resistance gene analogues and changes within the MADS-box genes, which control a diverse set of developmental processes during orchid evolution. Our high-quality chromosome-level assembly of C. goeringii can provide a platform for elucidating the genomic evolution of orchids, mining functional genes for agronomic traits, and developing molecular markers for accelerated breeding, as well as accelerating conservation efforts.
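For readers unfamiliar with the N50 statistic quoted in this abstract, the short Python sketch below shows the conventional definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly length. The function name and the toy lengths are illustrative assumptions; no actual scaffold lengths from the assembly are used.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold where the cumulative length covers half the assembly.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example with made-up lengths: total is 54, and 10 + 9 + 8 = 27 reaches half.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

A large N50 relative to the assembly size, as reported here, indicates that most of the sequence sits in a small number of long scaffolds.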
Subject(s)
Orchidaceae , Plant Breeding , Chromosomes , Genome , Humans , Molecular Sequence Annotation , Orchidaceae/genetics
ABSTRACT
A total of 39 water samples from 23 different groundwater wells in Korea were collected and analyzed in order to monitor the occurrence of norovirus (NoV) and other indicator microbes as the first part of a national survey of groundwater. More than 500 L of untreated groundwater were filtered through 1MDS filters. Following elution and concentration by organic flocculation, PCR and sequence analysis were employed to detect and identify NoV, enterovirus, rotavirus, hepatitis A virus and adenovirus (Adv). Somatic and F-specific phages, heterotrophic bacteria, total coliforms and Escherichia coli were also analyzed to infer possible fecal contamination. NoVs were detected in 18% of the 39 samples. Five out of seven NoV-positive samples (71%) were identified as GI while the other two (29%) were GII. Enteroviruses and Advs were detected in two and three samples, respectively. Rotavirus and hepatitis A virus were not detected. Total coliforms, E. coli and coliphages were detected in 49, 15 and 13% of the samples, respectively, but did not appear to be suitable indicators of enteric virus contamination in groundwater. These results suggest that additional treatment may be needed for a significant number of groundwaters prior to use as drinking water.
Subject(s)
Enterovirus/isolation & purification , Groundwater/virology , Norovirus/isolation & purification , Water Microbiology , Water Pollutants/analysis , Coliphages/isolation & purification , DNA Primers , Enterovirus/genetics , Feces/microbiology , Humans , Norovirus/genetics , Republic of Korea , Reverse Transcriptase Polymerase Chain Reaction , Water Quality
ABSTRACT
BACKGROUND: DNBSEQ-T7 is a new whole-genome sequencer developed by Complete Genomics and MGI using DNA nanoball and combinatorial probe anchor synthesis technologies to generate short reads at a very large scale, up to 60 human genomes per day. However, it has not been objectively and systematically compared against Illumina short-read sequencers. FINDINGS: By using the same KOREF sample, the Korean Reference Genome, we have compared 7 sequencing platforms including BGISEQ-500, DNBSEQ-T7, HiSeq2000, HiSeq2500, HiSeq4000, HiSeqX10, and NovaSeq6000. We measured sequencing quality by comparing sequencing statistics (base quality, duplication rate, and random error rate), mapping statistics (mapping rate, depth distribution, and percent GC coverage), and variant statistics (transition/transversion ratio, dbSNP annotation rate, and concordance rate with single-nucleotide polymorphism [SNP] genotyping chip) across the 7 sequencing platforms. We found that MGI platforms showed a higher concordance rate for SNP genotyping than HiSeq2000 and HiSeq4000. The similarity matrix of variant calls confirmed that the 2 MGI platforms have the most similar characteristics to the HiSeq2500 platform. CONCLUSIONS: Overall, MGI and Illumina sequencing platforms showed comparable levels of sequencing quality, uniformity of coverage, percent GC coverage, and variant accuracy; thus we conclude that the MGI platforms can be used for a wide range of genomics research fields at a lower cost than the Illumina platforms.
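One of the variant statistics named above, the transition/transversion (Ti/Tv) ratio, is simple enough to sketch in Python. This is a generic illustration of the standard definition (transitions are A<->G and C<->T substitutions, all other single-base substitutions are transversions), not the pipeline used in the study; the function names and the input format of (reference, alternate) base pairs are assumptions.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Ti/Tv over (ref, alt) single-base pairs; identical ref/alt pairs are skipped."""
    ti = tv = 0
    for ref, alt in snvs:
        if ref.upper() == alt.upper():
            continue  # not a substitution
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv


# Made-up calls: two transitions (A>G, C>T) and two transversions (A>C, G>T).
ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")])
```

Comparing this ratio across platforms on the same sample is a common sanity check, since a markedly different value can point to platform-specific false-positive calls.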
Subject(s)
Benchmarking , High-Throughput Nucleotide Sequencing , Genome, Human , Humans , Republic of Korea , Sequence Analysis, DNA , Whole Genome Sequencing
ABSTRACT
The red-crowned crane (Grus japonensis) is an endangered, large-bodied crane native to East Asia. It is a traditional symbol of longevity and its long lifespan has been confirmed both in captivity and in the wild. Lifespan in birds is known to be positively correlated with body size and negatively correlated with metabolic rate, though the genetic mechanisms for the red-crowned crane's long lifespan have not previously been investigated. Using whole genome sequencing and comparative evolutionary analyses against the grey-crowned crane and other avian genomes, including the long-lived common ostrich, we identified red-crowned crane candidate genes with known associations with longevity. Among these are positively selected genes in metabolism and immunity pathways (NDUFA5, NDUFA8, NUDT12, SOD3, CTH, RPA1, PHAX, HNMT, HS2ST1, PPCDC, PSTK, CD8B, GP9, IL-9R, and PTPRC). Our analyses provide genetic evidence for low metabolic rate and longevity, accompanied by possible convergent adaptation signatures among distantly related large and long-lived birds. Finally, we identified low genetic diversity in the red-crowned crane, consistent with its listing as an endangered species, and this genome should provide a useful genetic resource for future conservation studies of this rare and iconic species.
Subject(s)
Avian Proteins/genetics , Birds/physiology , Animals , Endangered Species , Immunity/genetics , Longevity/genetics , Polymorphism, Genetic , Species Specificity , Transcriptome , Whole Genome Sequencing
ABSTRACT
We present the initial phase of the Korean Genome Project (Korea1K), including 1094 whole genomes (sequenced at an average depth of 31×), along with data of 79 quantitative clinical traits. We identified 39 million single-nucleotide variants and indels of which half were singleton or doubleton and detected Korean-specific patterns based on several types of genomic variations. A genome-wide association study illustrated the power of whole-genome sequences for analyzing clinical traits, identifying nine more significant candidate alleles than previously reported from the same linkage disequilibrium blocks. Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the 1KGP panel. As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets. Overall, this study shows that Korea1K can be a useful genotypic and phenotypic resource for clinical and ethnogenetic studies.
Subject(s)
Genome, Human , Genome-Wide Association Study , Asian People , Genotype , Humans , Polymorphism, Single Nucleotide , Republic of Korea
ABSTRACT
BACKGROUND: A disease-causing mutation refers to a heritable genetic change that is associated with a specific phenotype (disease). The detection of a mutation from a patient's sample is critical for the diagnosis, treatment, and prognosis of the disease. There are numerous databases and applications with which to archive mutation data. However, none of them have been implemented with any automated bioinformatics tools for mutation detection and analysis starting from raw data materials from patients. We present a Locus Specific mutation DB (LSDB) construction system that supports both mutation detection and deposition in one package. RESULTS: COMUS (Clinician-Oriented locus specific MUtation detection and deposition System) is a mutation detection and deposition system for developing specific LSDBs. COMUS contains 1) a DNA sequence mutation analysis method for clinicians' mutation data identification and deposition and 2) a curation system for variation detection from clinicians' input data. To embody the COMUS system and to validate its clinical utility, we have chosen the disease hemophilia as a test database. A set of data files from bench experiments and clinical information from hemophilia patients were tested on the LSDB KoHemGene (http://www.kohemgene.org), which has proven to be a clinician-friendly interface for mutation detection and deposition. CONCLUSION: COMUS is a bioinformatics system for detecting and depositing new mutations from patient DNA with a clinician-friendly interface. LSDBs made using COMUS will promote the clinical utility of LSDBs. COMUS is available at http://www.comus.info.