Results 1 - 12 of 12
1.
Front Genet ; 14: 1225248, 2023.
Article in English | MEDLINE | ID: mdl-37636268

ABSTRACT

Whole genome sequencing has revolutionized infectious disease surveillance for tracking and monitoring the spread and evolution of pathogens. However, using a linear reference genome for genomic analyses may introduce biases, especially when studies are conducted on highly variable bacterial genomes of the same species. Pangenome graphs provide an efficient model for representing and analyzing multiple genomes and their variants as a graph structure that includes all types of variations. In this study, we present a practical bioinformatics pipeline that employs the PanGenome Graph Builder and the Variation Graph toolkit to build pangenomes from assembled genomes, align whole genome sequencing data and call variants against a graph reference. The pangenome graph enables the identification of structural variants, rearrangements, and small variants (e.g., single nucleotide polymorphisms and insertions/deletions) simultaneously. We demonstrate that using a pangenome graph, instead of a single linear reference genome, improves mapping rates and variant calling for both simulated and real datasets of the pathogen Neisseria meningitidis. Overall, pangenome graphs offer a promising approach for comparative genomics and comprehensive genetic variation analysis in infectious disease. Moreover, this innovative pipeline, leveraging pangenome graphs, can bridge variant analysis, genome assembly, population genetics, and evolutionary biology, expanding the reach of genomic understanding and applications.

2.
bioRxiv ; 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37066137

ABSTRACT

Pangenome graphs can represent all variation between multiple genomes, but existing methods for constructing them are biased due to reference-guided approaches. In response, we have developed PanGenome Graph Builder (PGGB), a reference-free pipeline for constructing unbiased pangenome graphs. PGGB uses all-to-all whole-genome alignments and learned graph embeddings to build and iteratively refine a model in which we can identify variation, measure conservation, detect recombination events, and infer phylogenetic relationships.

3.
Mol Biol Evol ; 39(1), 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850073

ABSTRACT

Spatial genetic and phenotypic diversity within solid tumors has been well documented. Nevertheless, how this heterogeneity affects temporal dynamics of tumorigenesis has not been rigorously examined because solid tumors do not evolve as the standard population genetic model due to the spatial constraint. We therefore propose a neutral spatial (NS) model whereby the mutation accumulation increases toward the periphery; the genealogical relationship is spatially determined and the selection efficacy is blunted (due to kin competition). In this model, neutral mutations are accrued and spatially distributed in manners different from those of advantageous mutations. Importantly, the distinctions could be blurred in the conventional model. To test the NS model, we performed a three-dimensional multiple microsampling of two hepatocellular carcinomas. Whole-genome sequencing (WGS) revealed a 2-fold increase in mutations going from the center to the periphery. The operation of natural selection can then be tested by examining the spatially determined clonal relationships and the clonal sizes. Due to limited migration, only the expansion of highly advantageous clones can sweep through a large part of the tumor to reveal the selective advantages. Hence, even multiregional sampling can only reveal a fraction of fitness differences in solid tumors. Our results suggest that the NS patterns are crucial for testing the influence of natural selection during tumorigenesis, especially for small solid tumors.


Subject(s)
Neoplasms , Carcinogenesis , Humans , Mutation , Neoplasms/genetics , Selection, Genetic
4.
Emerg Infect Dis ; 27(4): 1087-1097, 2021 04.
Article in English | MEDLINE | ID: mdl-33754994

ABSTRACT

Genomic surveillance is an essential part of effective disease control, enabling identification of emerging and expanding strains and monitoring of subsequent interventions. Whole-genome sequencing was used to analyze the genomic diversity of all Neisseria meningitidis isolates submitted to the New Zealand Meningococcal Reference Laboratory during 2013-2018. Of the 347 isolates submitted for whole-genome sequencing, we identified 68 sequence types belonging to 18 clonal complexes (CC). The predominant CC was CC41/44; next in predominance was CC11. Comparison of the 45 New Zealand group W CC11 isolates with worldwide representatives of group W CC11 isolates revealed that the original UK strain, the 2013 UK strain, and a distinctive variant (the 2015 strain) were causing invasive group W meningococcal disease in New Zealand. The 2015 strain also demonstrated increased resistance to penicillin and has been circulating in Canada and several countries in Europe, highlighting that close monitoring is needed to prevent future outbreaks around the world.


Subject(s)
Meningococcal Infections , Neisseria meningitidis , Canada , Europe , Genomics , Humans , New Zealand , Serogroup
5.
Genomics Proteomics Bioinformatics ; 17(6): 576-589, 2019 12.
Article in English | MEDLINE | ID: mdl-32205176

ABSTRACT

Uncovering the functionally essential variations related to tumorigenesis and tumor progression from cancer genomics data is still challenging due to the genetic diversity among patients, and extensive inter- and intra-tumoral heterogeneity at different levels of gene expression regulation, including but not limited to the genomic, epigenomic, and transcriptional levels. To minimize the impact of germline genetic heterogeneities, in this study, we establish multiple primary cultures from the primary and recurrent tumors of a single patient with hepatocellular carcinoma (HCC). Multi-omics sequencing was performed for these cultures that encompass the diversity of tumor cells from the same patient. Variations in the genome sequence, epigenetic modification, and gene expression are used to infer the phylogenetic relationships of these cell cultures. We find the discrepancy among the relationships revealed by single nucleotide variations (SNVs) and transcriptional/epigenomic profiles from the cell cultures. We fail to find overlap between sample-specific mutated genes and differentially expressed genes (DEGs), suggesting that most of the heterogeneous SNVs among tumor stages or lineages of the patient are functionally insignificant. Moreover, copy number alterations (CNAs) and DNA methylation variation within gene bodies, rather than promoters, are significantly correlated with gene expression variability among these cell cultures. Pathway analysis of CNA/DNA methylation-related genes indicates that a single cell clone from the recurrent tumor exhibits distinct cellular characteristics and tumorigenicity, and such an observation is further confirmed by cellular experiments both in vitro and in vivo. Our systematic analysis reveals that CNAs and epigenomic changes, rather than SNVs, are more likely to contribute to the phenotypic diversity among subpopulations in the tumor. 
These findings suggest that new therapeutic strategies targeting gene dosage and epigenetic modification should be considered in personalized cancer medicine. This culture model may be applied to the further identification of plausible determinants of cancer metastasis and relapse.


Subject(s)
Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/pathology , Epigenomics , Liver Neoplasms/genetics , Liver Neoplasms/pathology , Phenotype , Primary Cell Culture , DNA Copy Number Variations , DNA Methylation , Gene Dosage , Humans , Phylogeny
6.
PLoS One ; 12(11): e0187551, 2017.
Article in English | MEDLINE | ID: mdl-29117265

ABSTRACT

With the development of high-throughput genomic analysis, sequencing a mouse primary cancer model provides a new opportunity to understand fundamental mechanisms of tumorigenesis and progression. Here, we characterized the genomic variations in a hepatitis-related primary hepatocellular carcinoma (HCC) mouse model. A total of 12 tumor sections and four adjacent non-tumor tissues from four mice were used for whole exome and/or whole genome sequencing and validation of genotyping. The functions of the mutated genes in tumorigenesis were studied by analyzing their mutation frequency and expression in clinical HCC samples. A total of 46 single nucleotide variations (SNVs) were detected within coding regions. All SNVs were only validated in the sequencing samples, except the Hras mutation, which was shared by three tumors in the M1 mouse. However, the mutated allele frequency varied from high (0.4) to low (0.1), and low frequency (0.1-0.2) mutations existed in almost every tumor. Together with a diploid karyotype and an equal distribution pattern of these SNVs within the tumor, these results suggest the existence of subclones within tumors. A total of 26 mutated genes were mapped to 17 terms describing different molecular and cellular functions. All 41 human homologs of the mutated genes were mutated in the clinical samples, and some mutations were associated with clinical outcomes, suggesting a high probability of cancer driver genes in the spontaneous tumors of the mouse model. Genomic sequencing shows that a few mutations can drive the independent origin of primary liver tumors and reveals high heterogeneity among tumors in the early stage of hepatitis-related primary hepatocellular carcinoma.
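
The subclone inference in this abstract rests on simple arithmetic: in a diploid region, a heterozygous mutation carried by a fraction f of cells is expected at an allele frequency of about f/2. The following is a minimal sketch of that relationship, not the authors' method. It assumes a diploid locus with one mutated copy per carrying cell, and the `purity` parameter (fraction of sampled cells that are tumor cells) and the function name are illustrative additions that do not appear in the abstract.

```python
def cell_fraction_from_vaf(vaf: float, purity: float = 1.0) -> float:
    """Estimate the fraction of tumor cells carrying a heterozygous mutation.

    Assumptions (illustrative, not stated in the abstract): the locus is
    diploid in every cell, and each carrying cell has exactly one mutated copy.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    # Each carrying cell contributes one mutant allele out of two,
    # so vaf = purity * fraction / 2. Cap at 1.0 to absorb sampling noise.
    return min(1.0, 2.0 * vaf / purity)


# Allele frequencies in the range reported in the abstract.
for vaf in (0.4, 0.2, 0.1):
    fraction = cell_fraction_from_vaf(vaf)
    print(f"allele frequency {vaf:.1f} -> about {fraction:.0%} of cells")
```

Under these assumptions, allele frequencies of 0.4 and 0.1 correspond to roughly 80% and 20% of cells, which is why low-frequency mutations in a diploid tumor point to subclones rather than a single uniform population.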


Subject(s)
Genomics/methods , Hepatitis, Chronic/genetics , Liver Neoplasms/genetics , Mutation/genetics , Sequence Analysis, DNA , Alleles , Animals , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/pathology , DNA Copy Number Variations/genetics , Disease Models, Animal , Genetic Variation , Hepatitis B virus/physiology , Hepatitis, Chronic/pathology , Humans , Mice, Inbred C57BL , Mice, Transgenic , Molecular Sequence Annotation , Mutation Rate , Phylogeny , Ploidies , Reproducibility of Results , Treatment Outcome
7.
Mol Biol Evol ; 34(7): 1730-1742, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28369576

ABSTRACT

Although intratumor diversity driven by selection has been the prevailing view in cancer biology, recent population genetic analyses have been unable to reject the neutral interpretation. As the power to reject neutrality in tumors is often low, it will be desirable to have an alternative means to test selection directly. Here, we utilize gene expression data as a surrogate for functional significance in intra- and intertumor comparisons. The expression divergence between samples known to be driven by selection (e.g., between tumor and normal tissues) is always higher than the divergence between normal samples, which should be close to the neutral level of divergence. In contrast, the expression differentiation between regions of the same tumor, being lower than the neutral divergence, is incompatible with the hypothesis of selectively driven divergence. To further test the hypothesis of neutral evolution, we select a hepatocellular carcinoma tumor that has large intratumor SNV and CNV (single nucleotide variation and copy number variation, respectively) diversity. This tumor enables us to calibrate the level of expression divergence against that of genetic divergence. We observe that intratumor divergence in gene expression profile lags far behind genetic divergence, indicating insufficient phenotypic differences for selection to operate. All these expression analyses corroborate that natural selection does not operate effectively within tumors, supporting recent interpretations of within-tumor diversity. As the expected level of genetic diversity, hence the potential for drug resistance, would be much higher under neutrality than under selection, the issue is of both theoretical and clinical significance.


Subject(s)
Gene Expression Regulation, Neoplastic/genetics , Neoplasms/genetics , Transcriptome/genetics , DNA Copy Number Variations/genetics , Databases, Nucleic Acid , Evolution, Molecular , Gene Expression , Genetic Drift , Genetic Variation/genetics , Humans , Selection, Genetic/genetics , Sequence Analysis, DNA/methods
8.
Proc Natl Acad Sci U S A ; 112(47): E6496-505, 2015 Nov 24.
Article in English | MEDLINE | ID: mdl-26561581

ABSTRACT

The prevailing view that the evolution of cells in a tumor is driven by Darwinian selection has never been rigorously tested. Because selection greatly affects the level of intratumor genetic diversity, it is important to assess whether intratumor evolution follows the Darwinian or the non-Darwinian mode of evolution. To provide the statistical power, many regions in a single tumor need to be sampled and analyzed much more extensively than has been attempted in previous intratumor studies. Here, from a hepatocellular carcinoma (HCC) tumor, we evaluated multiregional samples from the tumor, using either whole-exome sequencing (WES) (n = 23 samples) or genotyping (n = 286) under both the infinite-site and infinite-allele models of population genetics. In addition to the many single-nucleotide variations (SNVs) present in all samples, there were 35 "polymorphic" SNVs among samples. High genetic diversity was evident as the 23 WES samples defined 20 unique cell clones. With all 286 samples genotyped, clonal diversity agreed well with the non-Darwinian model with no evidence of positive Darwinian selection. Under the non-Darwinian model, MALL (the number of coding region mutations in the entire tumor) was estimated to be greater than 100 million in this tumor. DNA sequences reveal local diversities in small patches of cells and validate the estimation. In contrast, the genetic diversity under a Darwinian model would generally be orders of magnitude smaller. Because the level of genetic diversity will have implications on therapeutic resistance, non-Darwinian evolution should be heeded in cancer treatments even for microscopic tumors.
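
For readers unfamiliar with the infinite-allele model mentioned above, a standard neutral expectation gives a feel for how many distinct clones a sample should contain. Under the Ewens sampling formula, the expected number of distinct alleles in a sample of n is the sum of θ/(θ + i) for i from 0 to n − 1, where θ is the scaled mutation parameter. The sketch below is illustrative only: the abstract does not say which estimator the authors used, and the θ values are arbitrary.

```python
def expected_distinct_alleles(n: int, theta: float) -> float:
    """Expected number of distinct alleles in a neutral sample of size n.

    Uses the standard result from the Ewens sampling formula:
    E[K_n] = sum_{i=0}^{n-1} theta / (theta + i).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if theta <= 0:
        raise ValueError("theta must be positive")
    return sum(theta / (theta + i) for i in range(n))


# n = 23 matches the number of WES samples in the abstract;
# the theta values are arbitrary and chosen only to show the trend.
for theta in (1.0, 10.0, 100.0):
    k = expected_distinct_alleles(23, theta)
    print(f"theta = {theta:>5}: expected distinct alleles = {k:.1f}")
```

The expectation rises toward n as θ grows, so a sample in which nearly every one of the 23 sequenced regions is a unique clone is what neutrality predicts only when the scaled mutation parameter is large.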


Subject(s)
Biological Evolution , Genetic Variation , Neoplasms/genetics , Neoplasms/pathology , Selection, Genetic , Aged , Base Sequence , Cell Count , Cell Line, Tumor , Clone Cells , Computer Simulation , Gene Library , Genes, Neoplasm , Genotype , Humans , Male , Microdissection , Models, Genetic , Molecular Sequence Data , Mutation , Mutation Rate , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results , Sequence Analysis, DNA
9.
Mol Phylogenet Evol ; 82 Pt A: 1-14, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25462996

ABSTRACT

Abies, the second largest genus of Pinaceae, consists of approximately 48 species occurring in the north temperate region. Previous molecular phylogenetic studies improved our understanding of relationships within the genus, but were limited by relying on only DNA sequence data from single genome and low taxonomic sampling. Here we use DNA data from three genomes (sequences of internal transcribed spacer of nrITS, three chloroplast DNA intergenic spacers, and two mitochondrial intergenic spacers) from 42 species to elucidate species relationships and construct the biogeographic history of Abies. We further estimated the divergence times of intercontinental disjunction using a relaxed molecular clock calibrated with three macro-fossils. Our phylogenetic analyses recovered six robust clades largely consistent with previous classifications of sections. A sister relationship between the eastern Asian and Europe-Mediterranean clades was highly supported. The monophyly of section Balsamea, disjunct in Far East and western North America, is supported by the nrITS data but not by the cpDNA data. Discordance on placement of section Balsamea between the paternally inherited cpDNA and maternally inherited mtDNA trees was also observed. The data suggested that ancient hybridization was likely involved in the origin of sect. Balsamea. Results from biogeographic analyses and divergence time estimation suggested an origin and early diversification of Abies in an area of high latitude around the Pacific during the Eocene. The present disjunction in eastern Asia and Europe-Mediterranean area of Abies was likely the result of southward migration and isolation by the Turgai Strait in the Late Eocene. An 'out-of-America' migration, for the origin of an eastern Asian and western North American disjunct species pairs in section Amabilis was supported. 
The results suggested a western North American origin of the section with subsequent dispersal across the Bering Land Bridge (BLB) to Japan during the Middle Miocene.


Subject(s)
Abies/classification , Biological Evolution , Genome, Chloroplast , Genome, Mitochondrial , Phylogeny , Abies/genetics , Asia , Bayes Theorem , DNA, Plant/genetics , DNA, Ribosomal Spacer/genetics , Europe , Fossils , Genome, Plant , Hybridization, Genetic , Likelihood Functions , Models, Genetic , North America , Sequence Analysis, DNA
10.
PLoS One ; 9(9): e107679, 2014.
Article in English | MEDLINE | ID: mdl-25222863

ABSTRACT

Phylogenetic reconstruction is fundamental to study evolutionary biology and historical biogeography. However, there was not a molecular phylogeny of gymnosperms represented by extensive sampling at the genus level, and most published phylogenies of this group were constructed based on cytoplasmic DNA markers and/or the multi-copy nuclear ribosomal DNA. In this study, we use LFY and NLY, two single-copy nuclear genes that originated from an ancient gene duplication in the ancestor of seed plants, to reconstruct the phylogeny and estimate divergence times of gymnosperms based on a complete sampling of extant genera. The results indicate that the combined LFY and NLY coding sequences can resolve interfamilial relationships of gymnosperms and intergeneric relationships of most families. Moreover, the addition of intron sequences can improve the resolution in Podocarpaceae but not in cycads, although divergence times of the cycad genera are similar to or longer than those of the Podocarpaceae genera. Our study strongly supports cycads as the basal-most lineage of gymnosperms rather than sister to Ginkgoaceae, and a sister relationship between Podocarpaceae and Araucariaceae and between Cephalotaxaceae-Taxaceae and Cupressaceae. In addition, intergeneric relationships of some families that were controversial, and the relationships between Taxaceae and Cephalotaxaceae and between conifers and Gnetales are discussed based on the nuclear gene evidence. The molecular dating analysis suggests that drastic extinctions occurred in the early evolution of gymnosperms, and extant coniferous genera in the Northern Hemisphere are older than those in the Southern Hemisphere on average. This study provides an evolutionary framework for future studies on gymnosperms.


Subject(s)
Cycadopsida/genetics , DNA, Ribosomal/genetics , Evolution, Molecular , Phylogeny , Cell Nucleus/genetics , Gene Dosage , Introns/genetics , Molecular Sequence Data , Plant Proteins/genetics , Sequence Analysis, DNA
11.
Mol Phylogenet Evol ; 64(3): 452-70, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22609823

ABSTRACT

Phylogenetic information is essential to interpret the evolution of species. While DNA sequences from different genomes have been widely utilized in phylogenetic reconstruction, it is still difficult to use nuclear genes to reconstruct phylogenies of plant groups with large genomes and complex gene families, such as gymnosperms. Here, we use two single-copy nuclear genes, together with chloroplast and mitochondrial genes, to reconstruct the phylogeny of the ecologically-important conifer family Cupressaceae s.l., based on a complete sampling of its 32 genera. The different gene trees generated are highly congruent in topology, supporting the basal position of Cunninghamia and the seven-subfamily classification, and the estimated divergence times based on different datasets correspond well with each other and with the oldest fossil record. These results imply that we have obtained the species phylogeny of Cupressaceae s.l. In addition, possible origins of all three polyploid conifers were investigated, and a hybrid origin was suggested for Cupressus, Fitzroya and Sequoia. Moreover, we found that the biogeographic history of Cupressaceae s.l. is associated with the separation between Laurasia and Gondwana and the further break-up of the latter. Our study also provides new evidence for the gymnosperm phylogeny.


Subject(s)
Biological Evolution , Cupressaceae/classification , Phylogeny , Cell Nucleus/genetics , Cupressaceae/genetics , DNA, Plant/genetics , Fossils , Genes, Chloroplast , Genes, Mitochondrial , Genes, Plant , Genome, Plant , Likelihood Functions , Phylogeography , Sequence Analysis, DNA
12.
Mol Phylogenet Evol ; 55(3): 776-85, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20214996

ABSTRACT

Climatic oscillations and geological events play major roles in shaping species diversity and the distribution of plants. The mechanisms underlying the high level of plant species diversity in eastern Asia are hotly debated. In this study, five cpDNA regions, two mtDNA fragments and one nuclear gene (LEAFY) were employed to investigate species diversification and the historical biogeography of Pseudotsuga (Pinaceae), a genus with a typical eastern Asia and western North America disjunct distribution. Both the nuclear LEAFY gene and cpDNA phylogenies strongly suggest that eastern Asian and North American species are monophyletic, respectively. Within the eastern Asia clade, the cpDNA tree placed P. japonica as sister to the rest of the Asian species, but the LEAFY gene tree showed a sister relationship between P. japonica-P. sinensis-P. gaussenii and P. brevifolia-P. forrestii. Molecular dating indicated that the Asian species last shared a common ancestor 20.26 ± 5.84 mya and the species diversification of Pseudotsuga was correlated with the Tertiary climatic and tectonic changes. These results, together with the fossil evidence, suggest that Pseudotsuga might have originated from North America and then migrated to eastern Asia by the Bering land bridge during the early Miocene. The Taiwanese species P. wilsoniana harbored two divergent types of LEAFY sequences, which implies that this species might have originated by hybridization between P. brevifolia or its ancestor and the ancestor of P. japonica-P. sinensis-P. gaussenii. Our study also suggests that Taiwan is closely related to both southwest and east China in flora.


Subject(s)
Evolution, Molecular , Phylogeny , Pseudotsuga/genetics , Cell Nucleus/genetics , DNA, Chloroplast/genetics , DNA, Mitochondrial/genetics , DNA, Plant/genetics , Asia, Eastern , Genes, Plant , Genetics, Population , Geography , North America , Sequence Alignment , Sequence Analysis, DNA , Taiwan