ABSTRACT
The genomes of many species are diploid, consisting of paternal and maternal haplotypes. The differences between these two haplotypes range from single nucleotide polymorphisms (SNPs) to large-scale structural variations (SVs). Existing genome assemblers for next-generation sequencing platforms attempt to reconstruct a single consensus sequence, which is a mosaic of the two parental haplotypes. Reconstructing the paternal and maternal haplotypes is an important task in linkage analysis and association studies. This study designed and implemented HapSVAssembler, which is based on a genetic algorithm (GA) and paired-end sequencing. The proposed method builds a consensus sequence, identifies various types of heterozygous variants, and reconstructs the paternal and maternal haplotypes by solving an optimization problem with the GA. Experimental results indicate that HapSVAssembler achieves high accuracy and contiguity across a range of sequencing coverages, error rates, and insert sizes. The program was tested on pilot sequencing of a highly heterozygous genome, in which 12,781 heterozygous SNPs and 602 hemizygous SVs were identified. We observe that, although SVs are far fewer than SNPs, the genomic regions they occupy are much larger, implying that heterozygosity computed from SNPs or the k-mer spectrum may be underestimated.
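The abstract does not state the objective function or the chromosome encoding that HapSVAssembler optimizes. As a rough illustration of how a GA can phase heterozygous sites, the sketch below substitutes the minimum error correction (MEC) objective, a common choice in haplotype assembly, and is not the authors' method. The names `mec_cost` and `ga_phase`, the read representation (a dict mapping site index to allele 0/1), and all parameter defaults are assumptions made for this example.

```python
import random


def mec_cost(hap, fragments):
    """MEC cost of a candidate haplotype (list of 0/1 over heterozygous sites).

    The second haplotype is taken to be the complement, so each fragment is
    charged its mismatches against whichever of the two it fits better.
    """
    cost = 0
    for frag in fragments:  # frag: {site_index: observed allele (0 or 1)}
        mism = sum(1 for site, allele in frag.items() if hap[site] != allele)
        cost += min(mism, len(frag) - mism)
    return cost


def ga_phase(fragments, n_sites, pop_size=40, generations=60,
             mut_rate=0.05, seed=0):
    """Toy GA: elitist selection, one-point crossover, bit-flip mutation.

    Assumes n_sites >= 2 and pop_size >= 4 (illustrative constraints).
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_sites)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged, so the best cost never increases.
        pop.sort(key=lambda h: mec_cost(h, fragments))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_sites)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda h: mec_cost(h, fragments))
    return best, mec_cost(best, fragments)


# Hypothetical reads covering six heterozygous sites.
reads = [
    {0: 0, 1: 1, 2: 1},
    {1: 0, 2: 0, 3: 1},   # sampled from the complementary haplotype
    {2: 1, 3: 0, 4: 1},
    {3: 1, 4: 0, 5: 1},   # complementary haplotype again
    {4: 1, 5: 0},
]
best_hap, best_cost = ga_phase(reads, n_sites=6)
```

A real assembler would also have to handle SVs, sequencing errors, and paired-end insert-size constraints, none of which this toy objective models.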
Subject(s)
Computational Biology/methods, Diploidy, Genome, Genomics/methods, Software, Algorithms, Computer Simulation, Molecular Evolution, Haplotypes, Heterozygote, High-Throughput Nucleotide Sequencing, Mutation, Single Nucleotide Polymorphism, Reproducibility of Results, DNA Sequence Analysis
ABSTRACT
While aberrant JAK/STAT signaling is crucial to the development of gastric cancer (GC), its effects on epigenetic alterations of its transcriptional targets remain unclear. In this study, using expression microarrays coupled with bioinformatic analyses, we identified a putative STAT3 target gene, NR4A3, that was downregulated in MKN28 GC daughter cells overexpressing a constitutively activated STAT3 mutant (S16), as compared with an empty vector control (C9). Bisulphite pyrosequencing and demethylation treatment showed that NR4A3 was epigenetically silenced by promoter DNA methylation in S16 and in other GC cell lines showing constitutive activation of STAT3, including AGS cells. Subsequent experiments revealed that binding of STAT3 to the NR4A3 promoter might repress its transcription. Long-term depletion of STAT3 derepressed NR4A3 expression in AGS GC cells through promoter demethylation. NR4A3 re-expression in GC cell lines sensitized the cells to cisplatin and inhibited tumor growth both in vitro and in vivo in an animal model. Clinically, GC patients with high NR4A3 methylation or lower NR4A3 protein expression had significantly shorter overall survival. Intriguingly, STAT3 activation was significantly associated with NR4A3 methylation only in low-stage patient samples. Taken together, aberrant JAK/STAT3 signaling epigenetically silences a potential tumor suppressor, NR4A3, in gastric cancer, which plausibly represents a reliable biomarker for gastric cancer prognosis.