ABSTRACT
The exchange of genes between cells is known to play an important physiological and pathological role in many organisms. We show that circulating tumor DNA (ctDNA) facilitates cell-specific gene transfer between human cancer cells and explain part of the mechanisms behind this phenomenon. As ctDNA migrates into the nucleus, genetic information is transferred. Cell targeting and ctDNA integration require ERVL, SINE or LINE DNA sequences. Chemically manufactured AluSp and MER11C sequences replicated multiple myeloma (MM) ctDNA cell targeting and integration. Additionally, we found that ctDNA may alter the treatment response of MM and pancreatic cancer models. This study shows that retrotransposon DNA sequences promote cancer gene transfer. However, because cell-free DNA has been detected in physiological and other pathological conditions, our findings have a broader impact than just cancer. Furthermore, the discovery that transposon DNA sequences mediate tissue-specific targeting will open up a new avenue for the delivery of genes and therapies.
Subjects
Circulating Tumor DNA , DNA Transposable Elements , Humans , Circulating Tumor DNA/genetics , Circulating Tumor DNA/blood , DNA Transposable Elements/genetics , Cell Line, Tumor , Multiple Myeloma/genetics , Multiple Myeloma/therapy , Animals , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/therapy , Mice , Organ Specificity/genetics , Retroelements/genetics , Gene Transfer Techniques
ABSTRACT
BACKGROUND: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. RESULTS: We show that scoring the entire contents of dbSNP (> 155 million variants) requires only 95 min using a machine with 4 cpus and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. CONCLUSIONS: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences.
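The throughput implied by the dbSNP figure can be worked out directly. The sketch below is only arithmetic on the numbers quoted in the abstract (155 million variants, 95 min, 4 CPUs); the function name is hypothetical, and because the abstract says "> 155 million", the result is a lower bound.

```python
def variants_per_second(n_variants: int, minutes: float) -> float:
    """Average scoring rate implied by a variant count and a run time in minutes."""
    return n_variants / (minutes * 60)


# Figures quoted in the abstract; 155 million is a stated lower bound.
rate = variants_per_second(155_000_000, 95)
rate_per_cpu = rate / 4  # the abstract reports a machine with 4 CPUs

print(f"{rate:,.0f} variants/s overall, {rate_per_cpu:,.0f} variants/s per CPU")
```

This gives roughly 27,000 variants per second overall, assuming the 95 min figure covers the whole scoring run.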
Subjects
Computational Biology/methods , Genetic Variation , Genome, Human , Software , Databases, Genetic , Humans , Polymorphism, Single Nucleotide/genetics , ROC Curve , Time Factors , Whole Genome Sequencing , Zygote/metabolism
ABSTRACT
Background: Community-acquired pneumonia (CAP) is a leading cause of pediatric hospitalization. Pathogen identification fails in approximately 20% of children but is critical for optimal treatment and prevention of hospital-acquired infections. We used two broad-spectrum detection strategies to identify pathogens in test-negative children with CAP and asymptomatic controls. Methods: Nasopharyngeal/oropharyngeal (NP/OP) swabs from 70 children <5 years with CAP of unknown etiology and 90 asymptomatic controls were tested by next-generation sequencing (RNA-seq) and pan viral group (PVG) PCR for 19 viral families. Association of viruses with CAP was assessed by adjusted odds ratios (aOR) and 95% confidence intervals controlling for season and age group. Results: RNA-seq/PVG PCR detected previously missed, putative pathogens in 34% of patients. Putative viral pathogens included human parainfluenza virus 4 (aOR 9.3, P = .12), human bocavirus (aOR 9.1, P < .01), Coxsackieviruses (aOR 5.1, P = .09), rhinovirus A (aOR 3.5, P = .34), and rhinovirus C (aOR 2.9, P = .57). RNA-seq was more sensitive for RNA viruses whereas PVG PCR detected more DNA viruses. Conclusions: RNA-seq and PVG PCR identified additional viruses, some known to be pathogenic, in NP/OP specimens from one-third of children hospitalized with CAP without a previously identified etiology. Both broad-range methods could be useful tools in future epidemiologic and diagnostic studies.
Subjects
Community-Acquired Infections/virology , Metagenomics/methods , Pneumonia, Viral/virology , Polymerase Chain Reaction/methods , Viruses/genetics , Child, Preschool , Cohort Studies , Community-Acquired Infections/diagnosis , Humans , Infant , Infant, Newborn , Pneumonia, Viral/diagnosis , Sequence Analysis, RNA/methods
ABSTRACT
Background: The role of human bocavirus (HBoV) in respiratory illness is uncertain. HBoV genomic DNA is frequently detected in both ill and healthy children. We hypothesized that spliced viral capsid messenger RNA (mRNA) produced during active replication might be a better marker for acute infection. Methods: As part of the Etiology of Pneumonia in the Community (EPIC) study, children aged <18 years who were hospitalized with community-acquired pneumonia (CAP) and children asymptomatic at the time of elective outpatient surgery (controls) were enrolled. Nasopharyngeal/oropharyngeal specimens were tested for HBoV mRNA and genomic DNA by quantitative polymerase chain reaction. Results: HBoV DNA was detected in 10.4% of 1295 patients with CAP and 7.5% of 721 controls (odds ratio [OR], 1.4 [95% confidence interval {CI}, 1.0-2.0]); HBoV mRNA was detected in 2.1% and 0.4%, respectively (OR, 5.1 [95% CI, 1.6-26]). When adjusted for age, enrollment month, and detection of other respiratory viruses, HBoV mRNA detection (adjusted OR, 7.6 [95% CI, 1.5-38.4]) but not DNA (adjusted OR, 1.2 [95% CI, .6-2.4]) was associated with CAP. Among children with no other pathogens detected, HBoV mRNA (OR, 9.6 [95% CI, 1.9-82]) was strongly associated with CAP. Conclusions: Detection of HBoV mRNA but not DNA was associated with CAP, supporting a pathogenic role for HBoV in CAP. HBoV mRNA could be a useful target for diagnostic testing.
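The unadjusted odds ratio for HBoV DNA can be approximated from the detection percentages alone. The sketch below is a generic odds-ratio calculation, not the study's analysis code; the function names are hypothetical, and the inputs are the rounded percentages from the abstract rather than the underlying counts, which the abstract does not give.

```python
def odds(p: float) -> float:
    """Convert a proportion (0 < p < 1) into odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1 - p)


def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Unadjusted odds ratio of detection in cases versus controls."""
    return odds(p_cases) / odds(p_controls)


# HBoV DNA: detected in 10.4% of CAP patients and 7.5% of controls.
dna_or = odds_ratio(0.104, 0.075)
print(round(dna_or, 2))  # about 1.4, in line with the reported OR
```

Because the inputs are rounded percentages, the same calculation on the mRNA figures (2.1% and 0.4%) will not reproduce the reported odds ratio exactly. The adjusted odds ratios come from a model with covariates and cannot be recovered this way.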
Subjects
Bocavirus/isolation & purification , Capsid Proteins/genetics , Parvoviridae Infections/diagnosis , Pneumonia, Viral/diagnosis , RNA, Messenger/isolation & purification , RNA, Viral/isolation & purification , Acute Disease , Bocavirus/genetics , Case-Control Studies , Child , Child, Preschool , Community-Acquired Infections/diagnosis , Community-Acquired Infections/virology , Hospitalization , Humans , Infant , Male , Nasopharynx/virology , Oropharynx/virology , Prospective Studies , Specimen Handling
ABSTRACT
Current infectious disease molecular tests are largely pathogen specific, requiring test selection based on the patient's symptoms. For many syndromes caused by a large number of viral, bacterial, or fungal pathogens, such as respiratory tract infections, this necessitates large panels of tests and has limited yield. In contrast, next-generation sequencing-based metagenomics can be used for unbiased detection of any expected or unexpected pathogen. However, barriers for its diagnostic implementation include incomplete understanding of analytical performance and complexity of sequence data analysis. We compared detection of known respiratory virus-positive (n = 42) and unselected (n = 67) pediatric nasopharyngeal swabs using an RNA sequencing (RNA-seq)-based metagenomics approach and Taxonomer, an ultrarapid, interactive, web-based metagenomics data analysis tool, with an FDA-cleared respiratory virus panel (RVP; GenMark eSensor). Untargeted metagenomics detected 86% of known respiratory virus infections, and additional PCR testing confirmed RVP results for only 2 (33%) of the discordant samples. In unselected samples, untargeted metagenomics had excellent agreement with the RVP (93%). In addition, untargeted metagenomics detected an additional 12 viruses that were either not targeted by the RVP or missed due to highly divergent genome sequences. Normalized viral read counts for untargeted metagenomics correlated with viral burden determined by quantitative PCR and showed high intrarun and interrun reproducibility. Partial or full-length viral genome sequences were generated in 86% of RNA-seq-positive samples, allowing assessment of antiviral resistance, strain-level typing, and phylogenetic relatedness. Overall, untargeted metagenomics had high agreement with a sensitive RVP, detected viruses not targeted by the RVP, and yielded epidemiologically and clinically valuable sequence information.
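Agreement between two assays, as reported above for the unselected samples, is commonly summarized as overall percent agreement. The sketch below shows that calculation under the assumption that each sample has one call per method; the function name and the sample calls are hypothetical illustrations, not study data.

```python
def percent_agreement(calls_a: list[str], calls_b: list[str]) -> float:
    """Percentage of samples for which two methods report the same call."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both methods must have one call per sample")
    if not calls_a:
        raise ValueError("at least one sample is required")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)


# Hypothetical per-sample calls, for illustration only.
panel_calls = ["RSV", "negative", "HRV", "negative", "FluA"]
metagenomics_calls = ["RSV", "negative", "HRV", "HRV", "FluA"]

agreement = percent_agreement(panel_calls, metagenomics_calls)
print(agreement)  # 80.0 for the hypothetical calls above
```

Overall agreement does not say which method is right on a discordant sample, which is why the study resolved discordant results with additional PCR testing.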
Subjects
Metagenomics/methods , Polymerase Chain Reaction/methods , Respiratory Tract Infections/diagnosis , Respiratory Tract Infections/virology , Sequence Analysis, RNA/methods , Viruses/classification , Viruses/isolation & purification , Child, Preschool , Computational Biology/methods , Female , Humans , Infant , Infant, Newborn , Male , Nasopharynx/virology , Retrospective Studies , Viruses/genetics
ABSTRACT
Genome-wide association studies suggest that common genetic variants explain only a modest fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability. Although DNA sequencing costs have fallen markedly, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of 12 humans. These include eight HapMap individuals representing three populations, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS). We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for Mendelian disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact.
Subjects
Exons/genetics , Genetic Predisposition to Disease/genetics , Genetic Testing/methods , Genetic Variation/genetics , Genome, Human/genetics , Sequence Analysis, DNA/methods , Gene Frequency/genetics , Gene Library , Genes, Dominant/genetics , Haplotypes/genetics , Humans , INDEL Mutation/genetics , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide/genetics , RNA Splice Sites/genetics , Sample Size , Sensitivity and Specificity , Syndrome
ABSTRACT
The need for improved algorithmic support for variant prioritization and disease-gene identification in personal genomes data is widely acknowledged. We previously presented the Variant Annotation, Analysis, and Search Tool (VAAST), which employs an aggregative variant association test that combines both amino acid substitution (AAS) and allele frequencies. Here we describe and benchmark VAAST 2.0, which uses a novel conservation-controlled AAS matrix (CASM) to incorporate information about phylogenetic conservation. We show that the CASM approach improves VAAST's variant prioritization accuracy compared to its previous implementation, and compared to SIFT, PolyPhen-2, and MutationTaster. We also show that VAAST 2.0 outperforms KBAC, WSS, SKAT, and variable threshold (VT) using published case-control datasets for Crohn disease (NOD2), hypertriglyceridemia (LPL), and breast cancer (CHEK2). VAAST 2.0 also improves search accuracy on simulated datasets across a wide range of allele frequencies, population-attributable disease risks, and allelic heterogeneity, factors that compromise the accuracies of other aggregative variant association tests. We also demonstrate that, although most aggregative variant association tests are designed for common genetic diseases, these tests can be easily adopted as rare Mendelian disease-gene finders with a simple ranking-by-statistical-significance protocol, and the performance compares very favorably to state-of-the-art filtering approaches. The latter, despite their popularity, have suboptimal performance, especially as case sample size increases.
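The ranking-by-statistical-significance idea can be sketched in a few lines: run any aggregative test per gene, then sort genes by p-value and inspect the top of the list. The code below assumes per-gene p-values are already available; the function name, gene labels, and p-values are hypothetical placeholders, and this is not VAAST's own implementation.

```python
def rank_genes_by_significance(p_values: dict[str, float]) -> list[tuple[str, float]]:
    """Return (gene, p-value) pairs ordered from most to least significant."""
    return sorted(p_values.items(), key=lambda item: item[1])


# Hypothetical per-gene p-values from some aggregative association test.
gene_p_values = {
    "GENE_A": 0.21,
    "GENE_B": 0.004,
    "GENE_C": 0.00002,
    "GENE_D": 0.63,
}

ranked = rank_genes_by_significance(gene_p_values)
for rank, (gene, p) in enumerate(ranked, start=1):
    print(rank, gene, p)
```

Unlike hard filtering, which discards any gene failing a fixed criterion, a ranking keeps every gene in the list, so a true disease gene with one missed or misannotated variant is demoted rather than lost.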
Subjects
Algorithms , Amino Acid Substitution , Genetic Predisposition to Disease , Genetic Variation , Breast Neoplasms/genetics , Case-Control Studies , Checkpoint Kinase 2/genetics , Crohn Disease/genetics , Databases, Factual , Female , Gene Frequency , Humans , Hypertriglyceridemia/genetics , Lipoprotein Lipase/genetics , Nod2 Signaling Adaptor Protein/genetics , Phylogeny , Sample Size , Software
ABSTRACT
BACKGROUND: High-throughput sequencing enables unbiased profiling of microbial communities, universal pathogen detection, and host response to infectious diseases. However, computation times and algorithmic inaccuracies have hindered adoption. RESULTS: We present Taxonomer, an ultrafast web tool for comprehensive metagenomics data analysis and interactive results visualization. Taxonomer is unique in providing integrated nucleotide and protein-based classification and simultaneous host messenger RNA (mRNA) transcript profiling. Using real-world case studies, we show that Taxonomer detects previously unrecognized infections and reveals antiviral host mRNA expression profiles. To facilitate data sharing across geographic distances in outbreak settings, Taxonomer is publicly available through a web-based user interface. CONCLUSIONS: Taxonomer enables rapid, accurate, and interactive analyses of metagenomics data on personal computers and mobile devices.
Subjects
Gene Expression Profiling , Host-Pathogen Interactions/genetics , Metagenomics/methods , Software , Transcriptome , Algorithms , Bacteria/classification , Bacteria/genetics , Databases, Nucleic Acid , Fungi/classification , Fungi/genetics , High-Throughput Nucleotide Sequencing , User-Computer Interface , Viruses/classification , Viruses/genetics , Web Browser
ABSTRACT
Skeletal muscle is essential for mobility, stability and whole body metabolism, and muscle loss, for instance during sarcopenia, has profound consequences. Satellite cells (muscle stem cells) have been hypothesized, but not yet demonstrated, to contribute to muscle homeostasis, and a decline in their contribution to myofibre homeostasis has been hypothesized to play a part in sarcopenia. To test their role in muscle maintenance, we genetically labelled and ablated satellite cells in adult sedentary mice. We demonstrate via genetic lineage experiments that, even in the absence of injury, satellite cells contribute to myofibres in all adult muscles, although the extent and timing differ. However, genetic ablation experiments showed that satellite cells are not globally required to maintain myofibre cross-sectional area of uninjured adult muscle.
Subjects
Muscle Fibers, Skeletal/pathology , Alleles , Animals , Crosses, Genetic , Green Fluorescent Proteins/metabolism , Homeostasis , Male , Mice , Mice, Inbred C57BL , PAX7 Transcription Factor/metabolism , Regeneration , Sarcopenia/genetics , Satellite Cells, Skeletal Muscle/cytology , Time Factors
ABSTRACT
Adult muscle's exceptional capacity for regeneration is mediated by muscle stem cells, termed satellite cells. As with many stem cells, Wnt/β-catenin signaling has been proposed to be critical in satellite cells during regeneration. Using new genetic reagents, we explicitly test in vivo whether Wnt/β-catenin signaling is necessary and sufficient within satellite cells and their derivatives for regeneration. We find that signaling is transiently active in transit-amplifying myoblasts, but is not required for regeneration or satellite cell self-renewal. Instead, downregulation of transiently activated β-catenin is important to limit the regenerative response, as continuous regeneration is deleterious. Wnt/β-catenin activation in adult satellite cells may simply be a vestige of their developmental lineage, in which β-catenin signaling is critical for fetal myogenesis. In the adult, surprisingly, we show that it is not activation but rather silencing of Wnt/β-catenin signaling that is important for muscle regeneration.
Subjects
Gene Silencing , Muscles/physiology , Regeneration , Stem Cells/cytology , Wnt Signaling Pathway , beta Catenin/genetics , Animals , Cell Line , Mice , Mice, Inbred C57BL , Muscle Development , Muscles/injuries , Myoblasts/cytology , Myoblasts/metabolism , Stem Cells/metabolism
ABSTRACT
The VAAST pipeline is specifically designed to identify disease-associated alleles in next-generation sequencing data. In the protocols presented in this paper, we outline the best practices for variant prioritization using VAAST. Examples and test data are provided for case-control, small pedigree, and large pedigree analyses. These protocols will teach users the fundamentals of VAAST, VAAST 2.0, and pVAAST analyses.
Subjects
Genetic Variation , High-Throughput Nucleotide Sequencing , Case-Control Studies , Female , Humans , Male , Pedigree
ABSTRACT
ImagePlane is a modular pipeline for automated, high-throughput image analysis and information extraction. Designed to support planarian research, ImagePlane offers a self-parameterizing adaptive thresholding algorithm; an algorithm that can automatically segment animals into anterior-posterior/left-right quadrants for automated identification of region-specific differences in gene and protein expression; and a novel algorithm for quantification of morphology of animals, independent of their orientations and sizes. ImagePlane also provides methods for automatic report generation, and its outputs can be easily imported into third-party tools such as R and Excel. Here we demonstrate the pipeline's utility for identification of genes involved in stem cell proliferation in the planarian Schmidtea mediterranea. Although designed to support planarian studies, ImagePlane will prove useful for cell-based studies as well.
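The abstract does not describe how ImagePlane's self-parameterizing thresholding works, so the sketch below substitutes a well-known stand-in: iterative (isodata-style) global threshold selection, which picks a cutoff from the pixel intensities themselves rather than from a user-supplied parameter. The function name, tolerance, and pixel values are all hypothetical.

```python
def isodata_threshold(pixels: list[float], tol: float = 0.5, max_iter: int = 100) -> float:
    """Choose a threshold as the midpoint between the mean intensities of
    the two classes it separates, iterating until the value stabilizes."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    threshold = sum(pixels) / len(pixels)  # start from the global mean
    for _ in range(max_iter):
        low = [p for p in pixels if p <= threshold]
        high = [p for p in pixels if p > threshold]
        if not low or not high:
            break  # all pixels fall on one side; nothing left to separate
        updated = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(updated - threshold) < tol:
            threshold = updated
            break
        threshold = updated
    return threshold


# Hypothetical intensities: dark background and a bright animal.
image_pixels = [10, 12, 11, 200, 210, 205]
cutoff = isodata_threshold(image_pixels)
mask = [p > cutoff for p in image_pixels]  # True where the pixel is foreground
```

This is a single global cutoff; a locally adaptive method would compute a separate threshold per image neighbourhood, which may be closer to what the pipeline does.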