1.
Brief Bioinform ; 23(5), 2022 09 20.
Article in English | MEDLINE | ID: mdl-35945154

ABSTRACT

As recently demonstrated by the COVID-19 pandemic, large-scale pathogen genomic data are crucial to characterize transmission patterns of human infectious diseases. Yet, current methods to process raw sequence data into analysis-ready variants remain slow to scale, hampering rapid surveillance efforts and epidemiological investigations for disease control. Here, we introduce an accelerated, scalable, reproducible, and cost-effective framework for pathogen genomic variant identification and present an evaluation of its performance and accuracy across benchmark datasets of Plasmodium falciparum malaria genomes. We demonstrate superior performance of the GPU framework relative to standard pipelines with mean execution time and computational costs reduced by 27× and 4.6×, respectively, while delivering 99.9% accuracy at enhanced reproducibility.


Subject(s)
COVID-19 , Communicable Diseases , Malaria , COVID-19/epidemiology , COVID-19/genetics , Genomics/methods , Humans , Pandemics , Reproducibility of Results
2.
Hum Mutat ; 43(12): 1979-1993, 2022 12.
Article in English | MEDLINE | ID: mdl-36054329

ABSTRACT

Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing unit (GPU)-based workflow. We applied our workflow to whole-genome sequencing data from three parent-child sequenced cohorts including the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G) that were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at ClinVar pathogenic or likely-pathogenic sites and significant excess of protein-coding DNVs in IGLL5, a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.


Subject(s)
Neoplasms , Humans , Alleles , Mutation , Neoplasms/genetics , Whole Genome Sequencing
3.
PLoS Genet ; 12(3): e1005851, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26943675

ABSTRACT

Controlling for background demographic effects is important for accurately identifying loci that have recently undergone positive selection. To date, the effects of demography have not yet been explicitly considered when identifying loci under selection during dog domestication. To investigate positive selection on the dog lineage early in domestication, we examined patterns of polymorphism in six canid genomes that were previously used to infer a demographic model of dog domestication. Using an inferred demographic model, we computed false discovery rates (FDR) and identified 349 outlier regions consistent with positive selection at a low FDR. The signals in the top 100 regions were frequently centered on candidate genes related to brain function and behavior, including LHFPL3, CADM2, GRIK3, SH3GL2, MBP, PDE7B, NTAN1, and GLRA1. These regions contained significant enrichments in behavioral ontology categories. The 3rd top hit, CCRN4L, plays a major role in lipid metabolism, which is supported by additional metabolism-related candidates revealed in our scan, including SCP2D1 and PDXC1. Comparing our method to an empirical outlier approach that does not directly account for demography, we found only modest overlaps between the two methods, with 60% of empirical outliers having no overlap with our demography-based outlier detection approach. Demography-aware approaches have lower rates of false discovery. Our top candidates for selection, in addition to expanding the set of neurobehavioral candidate genes, include genes related to lipid metabolism, suggesting a dietary target of selection that was important during the period when proto-dogs hunted and fed alongside hunter-gatherers.


Subject(s)
Genetics, Population , Genomics , Lipid Metabolism/genetics , Selection, Genetic , Animals , Demography , Dogs , Genome , Polymorphism, Single Nucleotide
4.
Int J Gynecol Cancer ; 28(3): 479-485, 2018 03.
Article in English | MEDLINE | ID: mdl-29324546

ABSTRACT

OBJECTIVES: The objectives of this study were to assess if targeted investigation for tumor-specific mutations by ultradeep DNA sequencing of peritoneal washes of ovarian cancer patients after primary surgical debulking and chemotherapy, and clinically diagnosed as disease free, provides a more sensitive and specific method to assess actual treatment response and tailor future therapy and to compare this "molecular second look" with conventional cytology and histopathology-based findings. METHODS/MATERIALS: We identified 10 patients with advanced-stage, high-grade serous ovarian cancer who had undergone second-look laparoscopy and for whom DNA could be isolated from biobanked paired blood, primary and recurrent tumor, and second-look peritoneal washes. A targeted 56 gene cancer-relevant panel was used for next-generation sequencing (average coverage, >6500×). Mutations were validated using either digital droplet polymerase chain reaction (ddPCR) or Sanger sequencing. RESULTS: A total of 25 tumor-specific mutations were identified (median, 2/patient; range, 1-8). TP53 mutations were identified in at least 1 sample from all patients. All 5 pathology-based second-look positive patients were confirmed positive by molecular second look. Genetic analysis revealed that 3 of the 5 pathology-based negative second looks were actually positive. In 2 of these patients, the second-look mutations were present in either the original primary or recurrent tumors. In the third, 2 high-frequency, novel frameshift mutations in MSH6 and HNF1A were identified. CONCLUSIONS: The molecular second look detects tumor-specific evidence of residual disease and provides genetic insight into tumor evolution and future recurrences beyond standard pathology. In the precision medicine era, detecting and genetically characterizing residual disease after standard treatment will be invaluable for improving patient outcomes.


Subject(s)
Cystadenocarcinoma, Serous/genetics , Ovarian Neoplasms/genetics , Aged , Alleles , Cystadenocarcinoma, Serous/pathology , DNA Mutational Analysis , DNA, Neoplasm/genetics , DNA, Neoplasm/isolation & purification , Female , High-Throughput Nucleotide Sequencing , Humans , Middle Aged , Mutation , Ovarian Neoplasms/pathology , Precision Medicine/methods , Proof of Concept Study
5.
Genome Res ; 24(2): 200-11, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24221193

ABSTRACT

Intra-tumor heterogeneity is a hallmark of many cancers and may lead to therapy resistance or interfere with personalized treatment strategies. Here, we combined topographic mapping of somatic breakpoints and transcriptional profiling to probe intra-tumor heterogeneity of treatment-naïve stage IIIC/IV epithelial ovarian cancer. We observed that most substantial differences in genomic rearrangement landscapes occurred between metastases in the omentum and peritoneum versus tumor sites in the ovaries. Several cancer genes such as NF1, CDKN2A, and FANCD2 were affected by lesion-specific breakpoints. Furthermore, the intra-tumor variability involved different mutational hallmarks including lesion-specific kataegis (local mutation shower coinciding with genomic breakpoints), rearrangement classes, and coding mutations. In one extreme case, we identified two independent TP53 mutations in ovary tumors and omentum/peritoneum metastases, respectively. Examination of gene expression dynamics revealed up-regulation of key cancer pathways including WNT, integrin, chemokine, and Hedgehog signaling in only subsets of tumor samples from the same patient. Finally, we took advantage of the multilevel tumor analysis to understand the effects of genomic breakpoints on qualitative and quantitative gene expression changes. We show that intra-tumor gene expression differences are caused by site-specific genomic alterations, including formation of in-frame fusion genes. These data highlight the plasticity of ovarian cancer genomes, which may contribute to their strong capacity to adapt to changing environmental conditions and give rise to the high rate of recurrent disease following standard treatment regimes.


Subject(s)
Chromosome Aberrations , Gene Expression Regulation, Neoplastic , Genome, Human , Ovarian Neoplasms/genetics , Aged , Cyclin-Dependent Kinase Inhibitor p16/genetics , Fanconi Anemia Complementation Group D2 Protein/genetics , Female , Gene Expression Profiling , Humans , Middle Aged , Neoplasm Metastasis , Neoplasm Staging , Neurofibromatosis 1/genetics , Omentum/metabolism , Omentum/pathology , Oncogene Proteins, Fusion/genetics , Ovarian Neoplasms/pathology , Peritoneum/metabolism , Peritoneum/pathology , Tumor Suppressor Protein p53/genetics
6.
Genome Res ; 24(5): 733-42, 2014 May.
Article in English | MEDLINE | ID: mdl-24760347

ABSTRACT

The somatic mutation burden in healthy white blood cells (WBCs) is not well known. Based on deep whole-genome sequencing, we estimate that approximately 450 somatic mutations accumulated in the nonrepetitive genome within the healthy blood compartment of a 115-yr-old woman. The detected mutations appear to have been harmless passenger mutations: They were enriched in noncoding, AT-rich regions that are not evolutionarily conserved, and they were depleted for genomic elements where mutations might have favorable or adverse effects on cellular fitness, such as regions with actively transcribed genes. The distribution of variant allele frequencies of these mutations suggests that the majority of the peripheral white blood cells were offspring of two related hematopoietic stem cell (HSC) clones. Moreover, telomere lengths of the WBCs were significantly shorter than telomere lengths from other tissues. Together, this suggests that the finite lifespan of HSCs, rather than somatic mutation effects, may lead to hematopoietic clonal evolution at extreme ages.


Subject(s)
Clonal Evolution , Hematopoiesis , Leukocytes/metabolism , Longevity/genetics , Mutation , AT Rich Sequence , Aged, 80 and over , Cell Lineage , Conserved Sequence , Female , Gene Frequency , Genome , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/physiology , Humans , Leukocytes/cytology , Leukocytes/physiology , Telomere/genetics , Telomere Shortening
7.
PLoS Genet ; 10(5): e1004353, 2014 May.
Article in English | MEDLINE | ID: mdl-24809476

ABSTRACT

Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.


Subject(s)
Fossils , Genetics, Population , Genome, Human , Europe , Female , Humans , Polymorphism, Single Nucleotide
8.
PLoS Genet ; 10(1): e1004016, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24453982

ABSTRACT

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11-16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation in amylase copy number in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary.


Subject(s)
Amylases/genetics , Animals, Domestic/genetics , DNA Copy Number Variations/genetics , Evolution, Molecular , Animals , DNA, Mitochondrial/genetics , Diet , Dogs , Genetic Variation , Phylogeny , Population Density , Wolves/classification , Wolves/genetics
9.
PLoS Med ; 13(12): e1002206, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28027320

ABSTRACT

BACKGROUND: Endometrial cancer is the most common gynecologic malignancy, and its incidence and associated mortality are increasing. Despite the immediate need to detect these cancers at an earlier stage, there is no effective screening methodology or protocol for endometrial cancer. The comprehensive, genomics-based analysis of endometrial cancer by The Cancer Genome Atlas (TCGA) revealed many of the molecular defects that define this cancer. Based on these cancer genome results, and in a prospective study, we hypothesized that the use of ultra-deep, targeted gene sequencing could detect somatic mutations in uterine lavage fluid obtained from women undergoing hysteroscopy as a means of molecular screening and diagnosis. METHODS AND FINDINGS: Uterine lavage and paired blood samples were collected and analyzed from 107 consecutive patients who were undergoing hysteroscopy and curettage for diagnostic evaluation from this single-institution study. The lavage fluid was separated into cellular and acellular fractions by centrifugation. Cellular and cell-free DNA (cfDNA) were isolated from each lavage. Two targeted next-generation sequencing (NGS) gene panels, one composed of 56 genes and the other of 12 genes, were used for ultra-deep sequencing. To rule out potential NGS-based errors, orthogonal mutation validation was performed using digital PCR and Sanger sequencing. Seven patients were diagnosed with endometrial cancer based on classic histopathologic analysis. Six of these patients had stage IA cancer, and one of these cancers was only detectable as a microscopic focus within a polyp. All seven patients were found to have significant cancer-associated gene mutations in both cell pellet and cfDNA fractions. In the four patients in whom adequate tumor sample was available, all tumor mutations above a specific allele fraction were present in the uterine lavage DNA samples. Mutations originally only detected in lavage fluid fractions were later confirmed to be present in tumor but at allele fractions significantly less than 1%. Of the remaining 95 patients diagnosed with benign or non-cancer pathology, 44 had no significant cancer mutations detected. Intriguingly, 51 patients without histopathologic evidence of cancer had relatively high allele fraction (1.0%-30.4%), cancer-associated mutations. Participants with detected driver and potential driver mutations were significantly older (mean age mutated = 57.96, 95% confidence interval [CI]: 3.30-∞, mean age no mutations = 50.35; p-value = 0.002; Benjamini-Hochberg [BH] adjusted p-value = 0.015) and more likely to be post-menopausal (p-value = 0.004; BH-adjusted p-value = 0.015) than those without these mutations. No associations were detected between mutation status and race/ethnicity, body mass index, diabetes, parity, and smoking status. Long-term follow-up was not presently available in this prospective study for those women without histopathologic evidence of cancer. CONCLUSIONS: Using ultra-deep NGS, we identified somatic mutations in DNA extracted both from cell pellets and a never previously reported cfDNA fraction from the uterine lavage. Using our targeted sequencing approach, endometrial driver mutations were identified in all seven women who received a cancer diagnosis based on classic histopathology of tissue curettage obtained at the time of hysteroscopy. In addition, relatively high allele fraction driver mutations were identified in the lavage fluid of approximately half of the women without a cancer diagnosis. Increasing age and post-menopausal status were associated with the presence of these cancer-associated mutations, suggesting the prevalent existence of a premalignant landscape in women without clinical evidence of cancer. Given that a uterine lavage can be easily and quickly performed even outside of the operating room and in a physician's office-based setting, our findings suggest the future possibility of this approach for screening women for the earliest stages of endometrial cancer. However, our findings suggest that further insight into development of cancer or its interruption is needed before translation to the clinic.


Subject(s)
DNA, Neoplasm , Endometrial Neoplasms/genetics , Genome , Mutation , Uterus/metabolism , Adult , Aged , Aged, 80 and over , Cross-Sectional Studies , Endometrial Neoplasms/pathology , Female , Humans , Middle Aged , Prospective Studies , Therapeutic Irrigation
10.
Nature ; 463(7283): 943-7, 2010 Feb 18.
Article in English | MEDLINE | ID: mdl-20164927

ABSTRACT

The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern human, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.


Subject(s)
Black People/genetics , Ethnicity/genetics , Genome, Human/genetics , Asian People/genetics , Exons/genetics , Genetics, Medical , Humans , Phylogeny , Polymorphism, Single Nucleotide/genetics , South Africa/ethnology , White People/genetics
11.
Scand J Gastroenterol ; 50(9): 1076-87, 2015.
Article in English | MEDLINE | ID: mdl-25865706

ABSTRACT

OBJECTIVE: Breath testing and duodenal culture studies suggest that a significant proportion of irritable bowel syndrome (IBS) patients have small intestinal bacterial overgrowth. In this study, we extended these data through 16S rDNA amplicon sequencing and quantitative PCR (qPCR) analyses of duodenal aspirates from a large cohort of IBS, non-IBS and control subjects. MATERIALS AND METHODS: Consecutive subjects presenting for esophagogastroduodenoscopy only and healthy controls were recruited. Exclusion criteria included recent antibiotic or probiotic use. Following extensive medical work-up, patients were evaluated for symptoms of IBS. DNAs were isolated from duodenal aspirates obtained during endoscopy. Microbial populations in a subset of IBS subjects and controls were compared by 16S profiling. Duodenal microbes were then quantitated in the entire cohort by qPCR and the results compared with quantitative live culture data. RESULTS: A total of 258 subjects were recruited (21 healthy, 163 non-healthy non-IBS, and 74 IBS). 16S profiling in five IBS and five control subjects revealed significantly lower microbial diversity in the duodenum in IBS, with significant alterations in 12 genera (false discovery rate < 0.15), including overrepresentation of Escherichia/Shigella (p = 0.005) and Aeromonas (p = 0.051) and underrepresentation of Acinetobacter (p = 0.024), Citrobacter (p = 0.031) and Microvirgula (p = 0.036). qPCR in all 258 subjects confirmed greater levels of Escherichia coli in IBS and also revealed increases in Klebsiella spp, which correlated strongly with quantitative culture data. CONCLUSIONS: 16S rDNA sequencing confirms microbial overgrowth in the small bowel in IBS, with a concomitant reduction in diversity. qPCR supports alterations in specific microbial populations in IBS.


Subject(s)
DNA, Bacterial/analysis , DNA, Bacterial/isolation & purification , Duodenum/microbiology , Feces/microbiology , Gastrointestinal Microbiome/genetics , Irritable Bowel Syndrome/microbiology , Adult , Aged , Aged, 80 and over , Case-Control Studies , Endoscopy, Gastrointestinal , Female , Humans , Male , Middle Aged , Prospective Studies , Real-Time Polymerase Chain Reaction
12.
Proc Natl Acad Sci U S A ; 109(20): 7693-8, 2012 May 15.
Article in English | MEDLINE | ID: mdl-22529356

ABSTRACT

Using a combination of whole-genome resequencing and high-density genotyping arrays, genome-wide haplotypes were reconstructed for two of the most important bulls in the history of the dairy cattle industry, Pawnee Farm Arlinda Chief ("Chief") and his son Walkway Chief Mark ("Mark"), each accounting for ∼7% of all current genomes. We aligned 20.5 Gbp (∼7.3× coverage) and 37.9 Gbp (∼13.5× coverage) of the Chief and Mark genomic sequences, respectively. More than 1.3 million high-quality SNPs were detected in Chief and Mark sequences. The genome-wide haplotypes inherited by Mark from Chief were reconstructed using ∼1 million informative SNPs. Comparison of a set of 15,826 SNPs that overlapped in the sequence-based and BovineSNP50 SNPs showed the accuracy of the sequence-based haplotype reconstruction to be as high as 97%. By using the BovineSNP50 genotypes, the frequencies of Chief alleles on his two haplotypes then were determined in 1,149 of his descendants, and the distribution was compared with the frequencies that would be expected assuming no selection. We identified 49 chromosomal segments in which Chief alleles showed strong evidence of selection. Candidate polymorphisms for traits that have been under selection in the dairy cattle population then were identified by referencing Chief's DNA sequence within these selected chromosome blocks. Eleven candidate genes were identified with functions related to milk-production, fertility, and disease-resistance traits. These data demonstrate that haplotype reconstruction of an ancestral proband by whole-genome resequencing in combination with high-density SNP genotyping of descendants can be used for rapid, genome-wide identification of the ancestor's alleles that have been subjected to artificial selection.


Subject(s)
Breeding/methods , Cattle/genetics , Genome/genetics , Haplotypes/genetics , Selection, Genetic , Animals , Base Sequence , Genetic Association Studies/veterinary , Genotype , Male , Molecular Sequence Data , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
13.
BMC Genomics ; 15: 654, 2014 Aug 05.
Article in English | MEDLINE | ID: mdl-25096633

ABSTRACT

BACKGROUND: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions: space, time, and habitat. RESULTS: Taking an innovative approach of genome-wide association applicable to microbial genomes (GWAS-M), we classify 274 complete V. cholerae genomes by niche, including 39 newly sequenced for this study with the Ion Torrent DNA-sequencing platform. Niche metadata were collected for each strain and analyzed together with comprehensive annotations of genetic and genomic attributes, including point mutations (single-nucleotide polymorphisms, SNPs), protein families, functions and prophages. CONCLUSIONS: Our analysis revealed that genomic variations, in particular mobile functions including phages, prophages, transposable elements, and plasmids, underlie the metadata structuring in each of the three niche dimensions. This underscores the role of phages and mobile elements as the most rapidly evolving elements in bacterial genomes, creating local endemicity (space), leading to temporal divergence (time), and allowing the invasion of new habitats. Together, we take a data-driven approach for comparative functional genomics that exploits high-volume genome sequencing and annotation, in conjunction with novel statistical and machine learning analyses to identify connections between genotype and phenotype on a genome-wide scale.


Subject(s)
Genome, Bacterial , Vibrio cholerae/genetics , Cholera/epidemiology , Cholera/microbiology , DNA Transposable Elements , Environmental Microbiology , Evolution, Molecular , Genetic Variation , Genotype , Humans , Molecular Sequence Annotation , Phylogeny , Phylogeography , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Vibrio cholerae/isolation & purification
14.
Appl Environ Microbiol ; 80(24): 7583-91, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25261520

ABSTRACT

High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common "benchtop" sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone.


Subject(s)
Bacteria/isolation & purification , Bacterial Infections/microbiology , DNA, Bacterial/genetics , High-Throughput Nucleotide Sequencing/methods , RNA, Ribosomal, 16S/genetics , Bacteria/classification , Bacteria/genetics , High-Throughput Nucleotide Sequencing/instrumentation , Humans
15.
Nature ; 456(7220): 387-90, 2008 Nov 20.
Article in English | MEDLINE | ID: mdl-19020620

ABSTRACT

In 1994, two independent groups extracted DNA from several Pleistocene epoch mammoths and noted differences among individual specimens. Subsequently, DNA sequences have been published for a number of extinct species. However, such ancient DNA is often fragmented and damaged, and studies to date have typically focused on short mitochondrial sequences, never yielding more than a fraction of a per cent of any nuclear genome. Here we describe 4.17 billion bases (Gb) of sequence from several mammoth specimens, 3.3 billion (80%) of which are from the woolly mammoth (Mammuthus primigenius) genome and thus comprise an extensive set of genome-wide sequence from an extinct species. Our data support earlier reports that elephantid genomes exceed 4 Gb. The estimated divergence rate between mammoth and African elephant is half of that between human and chimpanzee. The observed number of nucleotide differences between two particular mammoths was approximately one-eighth of that between one of them and the African elephant, corresponding to a separation between the mammoths of 1.5-2.0 Myr. The estimated probability that orthologous elephant and mammoth amino acids differ is 0.002, corresponding to about one residue per protein. Differences were discovered between mammoth and African elephant in amino-acid positions that are otherwise invariant over several billion years of combined mammalian evolution. This study shows that nuclear genome sequencing of extinct species can reveal population differences not evident from the fossil record, and perhaps even discover genetic factors that affect extinction.


Subject(s)
Cell Nucleus/genetics , Elephants/genetics , Evolution, Molecular , Extinction, Biological , Fossils , Genome/genetics , Genomics , Sequence Analysis, DNA/methods , Africa , Animals , Conserved Sequence/genetics , Elephants/anatomy & histology , Female , Hair/metabolism , Humans , India , Male , Phylogeny
16.
Proc Natl Acad Sci U S A ; 108(30): 12449-54, 2011 Jul 26.
Article in English | MEDLINE | ID: mdl-21746916

ABSTRACT

Anticancer drugs are effective against tumors that depend on the molecular target of the drug. Known targets of cytotoxic anticancer drugs are involved in cell proliferation; drugs acting on such targets are ineffective against nonproliferating tumor cells, survival of which leads to eventual therapy failure. Function-based genomic screening identified the coatomer protein complex ζ1 (COPZ1) gene as essential for different tumor cell types but not for normal cells. COPZ1 encodes a subunit of coatomer protein complex 1 (COPI) involved in intracellular traffic and autophagy. The knockdown of COPZ1, but not of COPZ2 encoding isoform coatomer protein complex ζ2, caused Golgi apparatus collapse, blocked autophagy, and induced apoptosis in both proliferating and nondividing tumor cells. In contrast, inhibition of normal cell growth required simultaneous knockdown of both COPZ1 and COPZ2. COPZ2 (but not COPZ1) was down-regulated in the majority of tumor cell lines and in clinical samples of different cancer types. Reexpression of COPZ2 protected tumor cells from killing by COPZ1 knockdown, indicating that tumor cell dependence on COPZ1 is the result of COPZ2 silencing. COPZ2 displays no tumor-suppressive activities, but it harbors microRNA 152, which is silenced in tumor cells concurrently with COPZ2 and acts as a tumor suppressor in vitro and in vivo. Silencing of microRNA 152 in different cancers and the ensuing down-regulation of its host gene COPZ2 offer a therapeutic opportunity for proliferation-independent selective killing of tumor cells by COPZ1-targeting agents.


Subject(s)
Coatomer Protein/genetics , Neoplasms/genetics , Apoptosis/genetics , Autophagy/genetics , Base Sequence , Cell Line, Tumor , DNA, Neoplasm/genetics , Female , Gene Knockdown Techniques , Gene Silencing , Golgi Apparatus/genetics , Golgi Apparatus/pathology , Humans , Male , MicroRNAs/genetics , Neoplasms/pathology , Protein Isoforms/genetics , RNA, Neoplasm/genetics , RNA, Small Interfering/genetics , Suppression, Genetic
17.
PLoS Genet ; 7(2): e1002007, 2011 Feb 10.
Article in English | MEDLINE | ID: mdl-21347285

ABSTRACT

Leaf-cutter ants are one of the most important herbivorous insects in the Neotropics, harvesting vast quantities of fresh leaf material. The ants use leaves to cultivate a fungus that serves as the colony's primary food source. This obligate ant-fungus mutualism is one of the few occurrences of farming by non-humans and likely facilitated the formation of their massive colonies. Mature leaf-cutter ant colonies contain millions of workers ranging in size from small garden tenders to large soldiers, resulting in one of the most complex polymorphic caste systems within ants. To begin uncovering the genomic underpinnings of this system, we sequenced the genome of Atta cephalotes using 454 pyrosequencing. One prediction from this ant's lifestyle is that it has undergone genetic modifications that reflect its obligate dependence on the fungus for nutrients. Analysis of this genome sequence is consistent with this hypothesis, as we find evidence for reductions in genes related to nutrient acquisition. These include extensive reductions in serine proteases (which are likely unnecessary because proteolysis is not a primary mechanism used to process nutrients obtained from the fungus), a loss of genes involved in arginine biosynthesis (suggesting that this amino acid is obtained from the fungus), and the absence of a hexamerin (which sequesters amino acids during larval development in other insects). Following recent reports of genome sequences from other insects that engage in symbioses with beneficial microbes, the A. cephalotes genome provides new insights into the symbiotic lifestyle of this ant and advances our understanding of host-microbe symbioses.


Subject(s)
Ants/physiology , Genome, Insect/genetics , Plant Leaves/physiology , Symbiosis , Animals , Ants/genetics , Arginine/genetics , Arginine/metabolism , Base Sequence , Fungi/genetics , Insect Proteins/genetics , Insect Proteins/metabolism , Sequence Analysis, DNA , Serine Proteases/genetics , Serine Proteases/metabolism
18.
Hum Mutat ; 34(9): 1231-41, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23636849

ABSTRACT

Metastatic castration-resistant prostate cancer (mCRPC) is a lethal disease, and molecular markers that differentiate indolent from aggressive subtypes are needed. We sequenced the exomes of five metastatic tumors and healthy kidney tissue from an index case with mCRPC to identify lesions associated with disease progression and metastasis. An Ashkenazi Jewish (AJ) germline founder mutation, del185AG in BRCA1, was observed and AJ ancestry was confirmed. Sixty-two somatic variants altered proteins in tumors, including cancer-associated genes, TMPRSS2-ERG, PBRM1, and TET2. The majority (n = 53) of somatic variants were present in all metastases and only a subset (n = 31) was observed in the primary tumor. Integrating tumor next-generation sequencing and DNA copy number showed somatic loss of BRCA1 and TMPRSS2-ERG. We sequenced 19 genes with deleterious mutations in the index case in additional mCRPC samples and detected a frameshift, two somatic missense alterations, tumor loss of heterozygosity, and combinations of germline missense SNPs in TET2. In summary, genetic analysis of metastases from an index case permitted us to infer a chronology for the clonal spread of disease based on sequential accrual of somatic lesions. The role of TET2 in mCRPC deserves additional analysis and may define a subset of metastatic disease.


Subject(s)
DNA-Binding Proteins/genetics , Genes, BRCA1 , Neoplasm Metastasis/genetics , Nuclear Proteins/genetics , Oncogene Proteins, Fusion/genetics , Prostatic Neoplasms, Castration-Resistant/genetics , Prostatic Neoplasms, Castration-Resistant/pathology , Proto-Oncogene Proteins/genetics , Transcription Factors/genetics , Aged , Amino Acid Sequence , Dioxygenases , Frameshift Mutation , Germ-Line Mutation , Humans , Loss of Heterozygosity , Male , Middle Aged , Molecular Sequence Data , Mutation, Missense , Neoplasm Metastasis/pathology , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
19.
Dev Biol ; 370(1): 42-51, 2012 Oct 01.
Article in English | MEDLINE | ID: mdl-22841627

ABSTRACT

The capacity for tissue and organ regeneration in humans is dwarfed by comparison to that of salamanders. Emerging evidence suggests that mechanisms learned from the early phase of salamander limb regeneration (wound healing, cellular dedifferentiation and blastemal formation) will reveal therapeutic approaches for tissue regeneration in humans. Here we describe a unique transcriptional fingerprint of regenerating limb tissue in the Mexican axolotl (Ambystoma mexicanum) that is indicative of cellular reprogramming of differentiated cells to a germline-like state. Two genes that are required for self-renewal of germ cells in mice and flies, Piwi-like 1 (PL1) and Piwi-like 2 (PL2), are expressed in limb blastemal cells, the basal layer keratinocytes and the thickened apical epithelial cap in the wound epidermis in the regenerating limb. Depletion of PL1 and PL2 by morpholino oligonucleotides decreased cell proliferation and increased cell death in the blastema leading to a significant retardation of regeneration. Examination of key molecules that are known to be required for limb development or regeneration further revealed that FGF8 is transcriptionally downregulated in the presence of the morpholino oligos, indicating PL1 and PL2 might participate in FGF signaling during limb regeneration. Given the requirement for FGF signaling in limb development and regeneration, the results suggest that PL1 and PL2 function to establish a unique germline-like state that is associated with successful regeneration.


Subject(s)
Ambystoma mexicanum/physiology , Extremities/physiology , Gene Expression Regulation, Developmental/physiology , Germ Cells/metabolism , Regeneration/physiology , Ambystoma mexicanum/genetics , Amino Acid Sequence , Animals , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , Cell Differentiation/physiology , Cell Proliferation , Gene Expression Regulation, Developmental/genetics , Gene Knockdown Techniques , Molecular Sequence Data , Morpholinos/genetics , Regeneration/genetics , Wound Healing/physiology
20.
BMC Genomics ; 14: 257, 2013 Apr 16.
Article in English | MEDLINE | ID: mdl-23590730

ABSTRACT

BACKGROUND: Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. RESULTS: Here, we systematically assessed the utility of paired-end and mate-pair (MP) next-generation sequencing libraries with insert sizes ranging from 170 bp to 25 kb, for genome coverage and for improving scaffolding of a mammalian genome (Rattus norvegicus). Despite a lower library complexity, large insert MP libraries (20 or 25 kb) provided very high physical genome coverage and were found to efficiently span repeat elements in the genome. Medium-sized (5, 8 or 15 kb) MP libraries were much more efficient for genome structure analysis than the more commonly used shorter insert paired-end and 3 kb MP libraries. Furthermore, the combination of medium- and large insert libraries resulted in a 3-fold increase in N50 in scaffolding processes. Finally, we show that our data can be used to evaluate and improve contig order and orientation in the current rat reference genome assembly. CONCLUSIONS: We conclude that applying combinations of mate-pair libraries with insert sizes that match the distributions of repetitive elements improves contig scaffolding and can contribute to the finishing of draft genomes.


Subject(s)
Gene Library , Genome , Rats/genetics , Sequence Analysis, DNA/methods , Animals , Base Sequence , Contig Mapping/methods , Interspersed Repetitive Sequences/genetics