ABSTRACT
Gene transfer agents (GTAs) randomly transfer short fragments of a bacterial genome. A novel putative GTA was recently discovered in the mouse-infecting bacterium Bartonella grahamii. Although GTAs are widespread in phylogenetically diverse bacteria, their role in evolution is largely unknown. Here, we present a comparative analysis of 16 Bartonella genomes ranging from 1.4 to 2.6 Mb in size, including six novel genomes from Bartonella isolated from a cow, two moose, two dogs, and a kangaroo. A phylogenetic tree inferred from 428 orthologous core genes indicates that the deadly human pathogen B. bacilliformis is related to the ruminant-adapted clade, rather than being the earliest diverging species in the genus as previously thought. A gene flux analysis identified 12 genes for a GTA and a phage-derived origin of replication as the most conserved innovations. These are located in a region of a few hundred kb that also contains 8 insertions of gene clusters for type III, IV, and V secretion systems, and genes for putatively secreted molecules such as cholera-like toxins. The phylogenies indicate a recent transfer of seven genes in the virB gene cluster for a type IV secretion system from a cat-adapted B. henselae to a dog-adapted B. vinsonii strain. We show that the B. henselae GTA is functional and can transfer genes in vitro. We suggest that the maintenance of the GTA is driven by selection to increase the likelihood of horizontal gene transfer and argue that this process is beneficial at the population level, by facilitating adaptive evolution of the host-adaptation systems and thereby expansion of the host range size. The process counters gene loss and forces all cells to contribute to the production of the GTA and the secreted molecules. The results advance our understanding of the role that GTAs play for the evolution of bacterial genomes.
Subject(s)
Bartonella, Biological Evolution, Horizontal Gene Transfer, Bacterial Genome, Animals, Bartonella/genetics, Bartonella/pathogenicity, Cats, Dogs, Electromagnetic Radiation, Humans, Macropodidae/genetics, Macropodidae/microbiology, Mice, Multigene Family, Phylogeny, DNA Sequence Analysis
ABSTRACT
Genomic characterization of pediatric acute lymphoblastic leukemia (ALL) has identified distinct patterns of genes and pathways altered in patients with well-defined genetic aberrations. To extend the spectrum of known somatic variants in ALL, we performed whole genome and transcriptome sequencing of three B-cell precursor patients, of which one carried the t(12;21)ETV6-RUNX1 translocation and two lacked a known primary genetic aberration, and one T-ALL patient. We found that each patient had a unique genome, with a combination of well-known and previously undetected genomic aberrations. By targeted sequencing in 168 patients, we identified KMT2D and KIF1B as novel putative driver genes. We also identified a putative regulatory non-coding variant that coincided with overexpression of the growth factor MDK. Our results contribute to an increased understanding of the biological mechanisms that lead to ALL and suggest that regulatory variants may be more important for cancer development than recognized to date. The heterogeneity of the genetic aberrations in ALL renders whole genome sequencing particularly well suited for analysis of somatic variants in both research and diagnostic applications.
Subject(s)
DNA-Binding Proteins/genetics, High-Throughput Nucleotide Sequencing/methods, Kinesins/genetics, Mutation, Neoplasm Proteins/genetics, Nerve Growth Factors/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Child, Preschool Child, Female, Human Genome, Humans, Infant, Male, Midkine, DNA Sequence Analysis/methods, RNA Sequence Analysis/methods
ABSTRACT
BACKGROUND: Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing. RESULTS: We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792-1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that neither capture specificity nor variant calling is affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools. CONCLUSION: Our work demonstrates that whole genome amplified DNA can be used for target enrichment as effectively as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, which significantly increases the chances of identifying recurrent mutations in cancer samples.
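The dilution arithmetic behind detecting a single-sample variant in a non-indexed pool can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that every sample contributes an equal amount of DNA, that the reported allele fraction of 0.1 refers to the fraction within the single carrier sample, that reads sample alleles binomially with no sequencing error, and that a variant counts as detected once a hypothetical threshold of `min_reads` supporting reads is reached. The function names are illustrative; only the depths 792 and 1752 and the pool size of ten come from the abstract.

```python
from math import comb


def pooled_allele_fraction(sample_fraction: float, n_samples: int) -> float:
    """Allele fraction observed in the pool for a variant unique to one sample.

    Assumption: all samples contribute equal amounts of DNA to the pool.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return sample_fraction / n_samples


def detection_probability(fraction: float, depth: int, min_reads: int) -> float:
    """Probability of seeing at least `min_reads` variant reads at a site.

    Assumption: reads are independent binomial draws with success
    probability `fraction`; sequencing errors are ignored.
    """
    # Probability of observing fewer than min_reads variant reads.
    p_below = sum(
        comb(depth, k) * fraction**k * (1 - fraction) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # A variant at fraction 0.1 in one sample, diluted in a pool of ten.
    pooled = pooled_allele_fraction(0.1, 10)
    for depth in (792, 1752):  # depth range reported in the abstract
        p = detection_probability(pooled, depth, min_reads=5)  # threshold is hypothetical
        print(f"depth={depth}: pooled fraction={pooled:.3f}, P(detect)={p:.3f}")
```

Under these assumptions the pooled fraction falls to 0.01, so detection depends strongly on depth. That is consistent with the high sequencing depths reported, although the real detection rate also depends on error rates and on the variant caller used.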
Subject(s)
Human Genome, Genome-Wide Association Study/methods, Neoplasms/genetics, Single Nucleotide Polymorphism, Alleles, Child, Preschool Child, Gene Frequency, Genotype, Germ Cells/metabolism, High-Throughput Nucleotide Sequencing, Humans, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Reproducibility of Results, Sensitivity and Specificity
ABSTRACT
BACKGROUND: Rates of recombination vary by three orders of magnitude in bacteria but the reasons for this variation are unclear. We performed a genome-wide study of recombination rate variation among genes in the intracellular bacterium Bartonella henselae, which has one of the lowest estimated ratios of recombination relative to mutation among prokaryotes. RESULTS: The 1.9 Mb genomes of B. henselae strains IC11, UGA10 and Houston-1 showed only minor gene content variation. Nucleotide sequence divergence levels were less than 1% and the relative rate of recombination to mutation was estimated at 1.1 for the genome overall. Four to eight segments per genome presented significantly enhanced divergences, the most pronounced of which were the virB and trw gene clusters for type IV secretion systems that play essential roles in the infection process. Consistently, multiple recombination events were identified inside these gene clusters. High recombination frequencies were also observed for a gene putatively involved in iron metabolism. A phylogenetic study of this gene in 80 strains of Bartonella quintana, B. henselae and B. grahamii indicated different population structures for each species and revealed horizontal gene transfers across Bartonella species with different host preferences. CONCLUSIONS: Our analysis has shown little novel gene acquisition in B. henselae, indicative of a closed pan-genome, but higher recombination frequencies within the population than previously estimated. We propose that the dramatically increased fixation rate for recombination events at gene clusters for type IV secretion systems is driven by selection for sequence variability.
Subject(s)
Bartonella henselae/genetics, Comparative Genomic Hybridization, Horizontal Gene Transfer, Bacterial Genome, Bacterial Secretion Systems/genetics, Bacterial DNA/genetics, Multigene Family, Phylogeny, Single Nucleotide Polymorphism, DNA Sequence Analysis
ABSTRACT
The genus Bartonella comprises facultative intracellular bacteria adapted to mammals, including previously recognized and emerging human pathogens. We report the 2,341,328 bp genome sequence of Bartonella grahamii, one of the most prevalent Bartonella species in wild rodents. Comparative genomics revealed that rodent-associated Bartonella species have higher copy numbers of genes for putative host-adaptability factors than the related human-specific pathogens. Many of these gene clusters are located in a highly dynamic region of 461 kb. Using hybridization to a microarray designed for the B. grahamii genome, we observed a massive, putatively phage-derived run-off replication of this region. We also identified a novel gene transfer agent, which packages the bacterial genome, with an over-representation of the amplified DNA, in 14 kb pieces. This is the first observation associating the products of run-off replication with a gene transfer agent. Because of the high concentration of gene clusters for host-adaptation proteins in the amplified region, and since the genes encoding the gene transfer agent and the phage origin are well conserved in Bartonella, we hypothesize that these systems are driven by selection. We propose that the coupling of run-off replication with gene transfer agents promotes diversification and rapid spread of host-adaptability factors, facilitating host shifts in Bartonella.
Subject(s)
Bacteriophages/physiology, Bartonella Infections/microbiology, Bartonella/virology, Disease Reservoirs/microbiology, Horizontal Gene Transfer, Bacterial Genome, Mice/microbiology, Virus Replication, Animals, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Bacteriophages/genetics, Bartonella/classification, Bartonella/genetics, Bartonella/metabolism, Host-Pathogen Interactions, Humans, Molecular Sequence Data, Phylogeny
ABSTRACT
The mechanisms driving clonal heterogeneity and evolution in relapsed pediatric acute lymphoblastic leukemia (ALL) are not fully understood. We performed whole genome sequencing of samples collected at diagnosis, relapse(s) and remission from 29 Nordic patients. Somatic point mutations and large-scale structural variants were called using individually matched remission samples as controls, and allelic expression of the mutations was assessed in ALL cells using RNA-sequencing. We observed an increased burden of somatic mutations at relapse, compared to diagnosis, and at second relapse compared to first relapse. In addition to 29 known ALL driver genes, of which nine genes carried recurrent protein-coding mutations in our sample set, we identified putative non-protein coding mutations in regulatory regions of seven additional genes that have not previously been described in ALL. Cluster analysis of hundreds of somatic mutations per sample revealed three distinct evolutionary trajectories during ALL progression from diagnosis to relapse. The evolutionary trajectories provide insight into the mutational mechanisms leading to relapse in ALL and could offer biomarkers for improved risk prediction in individual patients.
Subject(s)
Tumor Biomarkers/genetics, Clonal Evolution, Mutation, Local Neoplasm Recurrence/pathology, Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology, Child, Humans, Local Neoplasm Recurrence/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, RNA Sequence Analysis/methods, Whole Genome Sequencing/methods
ABSTRACT
BACKGROUND: Rodents represent a high-risk reservoir for the emergence of new human pathogens. The recent completion of the 2.3 Mb genome of Bartonella grahamii, one of the most prevalent blood-borne bacteria in wild rodents, revealed a higher abundance of genes for host-cell interaction systems than in the genomes of closely related human pathogens. The sequence variability within the global B. grahamii population was recently investigated by multilocus sequence typing, but no study on the variability of putative host-cell interaction systems has been performed. RESULTS: To study the population dynamics of B. grahamii, we analyzed the genomic diversity on a whole-genome scale of 27 B. grahamii strains isolated from four different species of wild rodents in three geographic locations separated by less than 30 km. Even using highly variable spacer regions, only 3 sequence types were identified. This low sequence diversity contrasted with a high variability in genome content. Microarray comparative genome hybridizations identified genes for outer surface proteins, including a repeated region containing the fha gene for filamentous hemagglutinin and a plasmid that encodes a type IV secretion system, as the most variable. The estimated generation times in liquid culture medium for a subset of strains ranged from 5 to 22 hours, but did not correlate with sequence type or presence/absence patterns of the fha gene or the plasmid. CONCLUSION: Our study has revealed a geographic microstructure of B. grahamii in wild rodents. Despite near-identity in nucleotide sequence, major differences were observed in gene presence/absence patterns that did not segregate with host species. This suggests that genetically similar strains can infect a range of different hosts.
Subject(s)
Bartonella/genetics, Population Genetics, Bacterial Genome, Rodentia/microbiology, Animals, Bartonella/growth & development, Bartonella/isolation & purification, Bartonella Infections/microbiology, Comparative Genomic Hybridization, Bacterial DNA/genetics, Geography, Host-Pathogen Interactions, Oligonucleotide Array Sequence Analysis, Single Nucleotide Polymorphism, DNA Sequence Analysis, Species Specificity
ABSTRACT
Bartonella is a genus of vector-borne bacteria that infect the red blood cells of mammals, and includes several human-specific and zoonotic pathogens. Bartonella grahamii has a wide host range and is one of the most prevalent Bartonella species in wild rodents. We studied the population structure, genome content and genome plasticity of a collection of 26 B. grahamii isolates from 11 species of wild rodents in seven countries. We found strong geographic patterns, high recombination frequencies and large variations in genome size in B. grahamii compared with previously analysed cat- and human-associated Bartonella species. The extent of sequence divergence in B. grahamii populations was markedly lower in Europe and North America than in Asia, and several recombination events were predicted between the Asian strains. We discuss environmental and demographic factors that may underlie the observed differences.
Subject(s)
Bartonella/genetics, Bacterial Genome, Genetic Recombination, Rodentia/microbiology, Animals, Asia, Bartonella/classification, Bartonella Infections/microbiology, Cluster Analysis, Comparative Genomic Hybridization, Bacterial DNA/genetics, Europe, Population Genetics, Genomic Islands, Geography, North America, Oligonucleotide Array Sequence Analysis, Phylogeny, Plasmids, Prophages/genetics, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. METHODS: We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. RESULTS: We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. CONCLUSION: Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.
Subject(s)
DNA Methylation/genetics, Oncogene Fusion Proteins/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Adolescent, Child, Preschool Child, Female, Humans, Infant, Male, Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology, Transcription Factors, Transcriptome
ABSTRACT
To characterize the mutational patterns of acute lymphoblastic leukemia (ALL) we performed deep next generation sequencing of 872 cancer genes in 172 diagnostic and 24 relapse samples from 172 pediatric ALL patients. We found an overall greater mutational burden and more driver mutations in T-cell ALL (T-ALL) patients compared to B-cell precursor ALL (BCP-ALL) patients. In addition, the majority of the mutations in T-ALL had occurred in the original leukemic clone, while most of the mutations in BCP-ALL were subclonal. BCP-ALL patients carrying any of the recurrent translocations ETV6-RUNX1, BCR-ABL or TCF3-PBX1 harbored few mutations in driver genes compared to other BCP-ALL patients. Specifically in BCP-ALL, we identified ATRX as a novel putative driver gene and uncovered an association between somatic mutations in the Notch signaling pathway at ALL diagnosis and increased risk of relapse. Furthermore, we identified EP300, ARID1A and SH2B3 as relapse-associated genes. The genes highlighted in our study were frequently involved in epigenetic regulation, associated with germline susceptibility to ALL, and present in minor subclones at diagnosis that became dominant at relapse. We observed a high degree of clonal heterogeneity and evolution between diagnosis and relapse in both BCP-ALL and T-ALL, which could have implications for the treatment efficiency.
Subject(s)
High-Throughput Nucleotide Sequencing, Mutation, Local Neoplasm Recurrence/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Signal Transducing Adaptor Proteins, Child, Preschool Child, Cohort Studies, DNA Mutational Analysis, DNA-Binding Proteins, E1A-Associated p300 Protein/genetics, Genetic Epigenesis, Humans, Immunophenotyping, Infant, Intracellular Signaling Peptides and Proteins, Nuclear Proteins/genetics, Oncogene Fusion Proteins/genetics, Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics, Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics, Proteins/genetics, Recurrence, Remission Induction, DNA Sequence Analysis, Transcription Factors/genetics, Genetic Translocation
Subject(s)
Bacterial Infections/genetics, Communicable Diseases/diagnosis, Communicable Diseases/genetics, Population Surveillance/methods, Animals, Bacterial Infections/diagnosis, Communicable Disease Control/methods, Communicable Diseases/transmission, Disease Outbreaks, Disease Reservoirs, Genome, Humans, Internationality, Internet, Microbiology, Program Development, United States, Virulence
ABSTRACT
BACKGROUND: Although aberrant DNA methylation has been observed previously in acute lymphoblastic leukemia (ALL), the patterns of differential methylation have not been comprehensively determined in all subtypes of ALL on a genome-wide scale. The relationship between DNA methylation, cytogenetic background, drug resistance and relapse in ALL is poorly understood. RESULTS: We surveyed the DNA methylation levels of 435,941 CpG sites in samples from 764 children at diagnosis of ALL and from 27 children at relapse. This survey uncovered four characteristic methylation signatures. First, compared with control blood cells, the methylomes of ALL cells shared 9,406 predominantly hypermethylated CpG sites, independent of cytogenetic background. Second, each cytogenetic subtype of ALL displayed a unique set of hyper- and hypomethylated CpG sites. The CpG sites that constituted these two signatures differed in their functional genomic enrichment to regions with marks of active or repressed chromatin. Third, we identified subtype-specific differential methylation in promoter and enhancer regions that were strongly correlated with gene expression. Fourth, a set of 6,612 CpG sites was predominantly hypermethylated in ALL cells at relapse, compared with matched samples at diagnosis. Analysis of relapse-free survival identified CpG sites with subtype-specific differential methylation that divided the patients into different risk groups, depending on their methylation status. CONCLUSIONS: Our results suggest an important biological role for DNA methylation in the differences between ALL subtypes and in their clinical outcome after treatment.
Subject(s)
Chromatin/metabolism, Chromosome Aberrations, DNA Methylation, Human Genome, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Adolescent, Antineoplastic Agents/therapeutic use, Child, Preschool Child, Chromatin/chemistry, CpG Islands, Disease-Free Survival, Genetic Enhancer Elements, Female, Gene Expression Profiling, Genome-Wide Association Study, Humans, Male, Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis, Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy, Precursor Cell Lymphoblastic Leukemia-Lymphoma/mortality, Prognosis, Genetic Promoter Regions, Recurrence, Risk
ABSTRACT
Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics.