ABSTRACT
Murine leukemia virus (MLV)-derived vectors are widely used for hematopoietic stem cell (HSC) gene transfer, but lentiviral vectors such as the simian immunodeficiency virus (SIV) may allow higher-efficiency transfer and better expression. Recent studies in cell lines have challenged the notion that retroviruses and retroviral vectors integrate randomly into their host genome. Medical applications using these vectors are aimed at HSCs, and thus large-scale comprehensive analysis of MLV and SIV integration in long-term repopulating HSCs is crucial to help develop improved integrating vectors. We studied integration sites in HSCs of rhesus monkeys that had been transplanted 6 mo to 6 y prior with MLV- or SIV-transduced CD34(+) cells. Unique MLV (491) and SIV (501) insertions were compared to a set of in silico-generated random integration sites. While MLV integrants were located predominantly around transcription start sites, SIV integrants strongly favored transcription units and gene-dense regions of the genome. These integration patterns suggest different mechanisms for integration as well as distinct safety implications for MLV versus SIV vectors.
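The comparison of observed insertions against in silico-generated random sites can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the single-chromosome coordinate model, the window size, and the toy positions are all assumptions made for illustration.

```python
import random
from bisect import bisect_left


def fraction_near_tss(sites, tss_positions, window):
    """Return the fraction of sites within `window` bp of any TSS.

    Assumes all coordinates lie on one hypothetical chromosome.
    """
    tss = sorted(tss_positions)
    near = 0
    for pos in sites:
        i = bisect_left(tss, pos)
        # The nearest TSS is either just left (i - 1) or just right (i).
        candidates = tss[max(i - 1, 0):i + 1]
        if any(abs(pos - t) <= window for t in candidates):
            near += 1
    return near / len(sites) if sites else 0.0


def random_sites(n, genome_length, seed=0):
    """Draw n uniformly random positions as an in silico control set."""
    rng = random.Random(seed)
    return [rng.randrange(genome_length) for _ in range(n)]


if __name__ == "__main__":
    # Toy data only; these positions are invented for the example.
    tss = [1_000, 50_000, 120_000]
    observed = [900, 1_100, 49_500, 80_000]
    control = random_sites(1_000, genome_length=200_000)
    print("observed:", fraction_near_tss(observed, tss, window=1_000))
    print("random:  ", fraction_near_tss(control, tss, window=1_000))
```

An observed fraction well above the control fraction would point to a preference for transcription start sites; a real analysis would also need per-chromosome handling and a significance test.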
Subject(s)
Genetic Vectors , Genome , Hematopoietic Stem Cells/virology , Leukemia Virus, Murine/metabolism , Simian Immunodeficiency Virus/metabolism , Stem Cells/virology , Animals , Antigens, CD34/biosynthesis , Binding Sites , Cell Line , Cloning, Molecular , Cluster Analysis , DNA Primers/chemistry , Gene Transfer Techniques , Macaca mulatta , Molecular Sequence Data , Mutation , Polymerase Chain Reaction , Retroviridae/genetics , Time Factors , Transcription, Genetic
ABSTRACT
Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular, substantially reducing emissions of biologically generated methane to the atmosphere. Despite their importance, and in contrast to organisms that play roles in other major parts of the carbon cycle such as photosynthesis, no genome-level studies have been published on the biology of methanotrophs. We report the first complete genome sequence to our knowledge from an obligate methanotroph, Methylococcus capsulatus (Bath), obtained by the shotgun sequencing approach. Analysis revealed a 3.3-Mb genome highly specialized for a methanotrophic lifestyle, including redundant pathways predicted to be involved in methanotrophy and duplicated genes for essential enzymes such as the methane monooxygenases. We used phylogenomic analysis, gene order information, and comparative analysis with the partially sequenced methylotroph Methylobacterium extorquens to detect genes of unknown function likely to be involved in methanotrophy and methylotrophy. Genome analysis suggests the ability of M. capsulatus to scavenge copper (including a previously unreported nonribosomal peptide synthetase) and to use copper in regulation of methanotrophy, but the exact regulatory mechanisms remain unclear. One of the most surprising outcomes of the project is evidence suggesting the existence of previously unsuspected metabolic flexibility in M. capsulatus, including an ability to grow on sugars, oxidize chemolithotrophic hydrogen and sulfur, and live under reduced oxygen tension, all of which have implications for methanotroph ecology. The availability of the complete genome of M. capsulatus (Bath) deepens our understanding of methanotroph biology and its relationship to global carbon cycles. 
We have gained evidence for greater metabolic flexibility than was previously known, and for genetic components that may have biotechnological potential.
Subject(s)
Gene Expression Regulation, Bacterial , Genome , Methane/metabolism , Methylococcus capsulatus/genetics , Bacterial Proteins/chemistry , Carbon/chemistry , Electron Transport , Fatty Acids/chemistry , Genome, Bacterial , Genomics/methods , Methane/chemistry , Models, Biological , Molecular Sequence Data , Nitrogen/chemistry , Oxygen/chemistry , Oxygen/metabolism , Peptides/chemistry , Phylogeny , Sequence Analysis, DNA
ABSTRACT
A major goal in genomics is to understand how genes are regulated in different tissues, stages of development, diseases, and species. Mapping DNase I hypersensitive (HS) sites within nuclear chromatin is a powerful and well-established method of identifying many different types of regulatory elements, but in the past it has been limited to analysis of single loci. We have recently described a protocol to generate a genome-wide library of DNase HS sites. Here, we report high-throughput analysis, using massively parallel signature sequencing (MPSS), of 230,000 tags from a DNase library generated from quiescent human CD4+ T cells. Of the tags that uniquely map to the genome, we identified 14,190 clusters of sequences that group within close proximity to each other. By using a real-time PCR strategy, we determined that the majority of these clusters represent valid DNase HS sites. Approximately 80% of these DNase HS sites uniquely map within one or more annotated regions of the genome believed to contain regulatory elements, including regions 2 kb upstream of genes, CpG islands, and highly conserved sequences. Most DNase HS sites identified in CD4+ T cells are also HS in CD8+ T cells, B cells, hepatocytes, human umbilical vein endothelial cells (HUVECs), and HeLa cells. However, approximately 10% of the DNase HS sites are lymphocyte specific, indicating that this procedure can identify gene regulatory elements that control cell type specificity. This strategy, which can be applied to any cell line or tissue, will enable a better understanding of how chromatin structure dictates cell function and fate.
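Grouping uniquely mapped tags into clusters "within close proximity" can be sketched as a simple gap-based pass over sorted positions. The abstract does not state the distance threshold or the minimum cluster size, so `max_gap`, `min_tags`, and the function name below are assumptions for illustration, not the published procedure.

```python
from collections import defaultdict


def cluster_tags(tags, max_gap, min_tags=2):
    """Group (chrom, position) tags into clusters.

    Consecutive tags on the same chromosome join one cluster when they
    are at most `max_gap` bp apart. Clusters with fewer than `min_tags`
    tags are dropped. Returns (chrom, start, end, tag_count) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in tags:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev <= max_gap:
                count += 1
            else:
                # Gap too large: close the current cluster, start a new one.
                if count >= min_tags:
                    clusters.append((chrom, start, prev, count))
                start = pos
                count = 1
            prev = pos
        if count >= min_tags:
            clusters.append((chrom, start, prev, count))
    return clusters


if __name__ == "__main__":
    # Toy tags; coordinates are invented for the example.
    tags = [("chr1", 100), ("chr1", 150), ("chr1", 1000),
            ("chr2", 50), ("chr2", 60), ("chr2", 90)]
    for cluster in cluster_tags(tags, max_gap=100):
        print(cluster)
```

Requiring at least two tags per cluster reflects the idea that repeated recovery of nearby sequences is less likely to be background, though the appropriate cutoffs depend on library depth.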
Subject(s)
Chromatin/genetics , Genome, Human/genetics , Regulatory Elements, Transcriptional/genetics , Cell Differentiation/genetics , Cell Differentiation/immunology , Chromosome Mapping/methods , Deoxyribonuclease I/chemistry , Endothelial Cells/cytology , Endothelial Cells/physiology , Genome, Human/immunology , Genomic Library , Genomics/methods , HeLa Cells , Hepatocytes/cytology , Hepatocytes/physiology , Humans , Lymphocytes/cytology , Lymphocytes/physiology , Organ Specificity/genetics , Regulatory Elements, Transcriptional/immunology , Sequence Analysis, DNA , Sequence Tagged Sites , Umbilical Veins/cytology , Umbilical Veins/physiology
ABSTRACT
Analysis of the human genome sequence has identified approximately 25,000-30,000 protein-coding genes, but little is known about how most of these are regulated. Mapping DNase I hypersensitive (HS) sites has traditionally represented the gold-standard experimental method for identifying regulatory elements, but the labor-intensive nature of this technique has limited its application to only a small number of human genes. We have developed a protocol to generate a genome-wide library of gene regulatory sequences by cloning DNase HS sites. We generated a library of DNase HS sites from quiescent primary human CD4(+) T cells and analyzed approximately 5,600 of the resulting clones. Compared to sequences from randomly generated in silico libraries, sequences from these clones were found to map more frequently to regions of the genome known to contain regulatory elements, such as regions upstream of genes, within CpG islands, and in sequences that align between mouse and human. These cloned sites also tend to map near genes that have detectable transcripts in CD4(+) T cells, demonstrating that transcriptionally active regions of the genome are being selected. Validation of putative regulatory elements was achieved by repeated recovery of the same sequence and real-time PCR. This cloning strategy, which can be scaled up and applied to any cell line or tissue, will be useful in identifying regulatory elements controlling global expression differences that delineate tissue types, stages of development, and disease susceptibility.
Subject(s)
Deoxyribonucleases/metabolism , Genome , Regulatory Sequences, Nucleic Acid , CD4-Positive T-Lymphocytes/enzymology , Cells, Cultured , Cloning, Molecular , Computational Biology , Humans , Polymerase Chain Reaction
ABSTRACT
Comparative genomics promises to rapidly accelerate the identification and functional classification of biologically important human genes. We developed the TIGR Orthologous Gene Alignment (TOGA;
Subject(s)
Eukaryotic Cells , Genes/genetics , Sequence Alignment/methods , Algorithms , Animals , Cattle , Computational Biology/methods , Consensus Sequence/genetics , Databases, Genetic , Eukaryotic Cells/chemistry , Eukaryotic Cells/metabolism , Genome, Human , Humans , Mice , Phylogeny , Rats , Sequence Homology, Nucleic Acid
ABSTRACT
The complete genome of the green-sulfur eubacterium Chlorobium tepidum TLS was determined to be a single circular chromosome of 2,154,946 bp. This represents the first genome sequence from the phylum Chlorobia, whose members perform anoxygenic photosynthesis by the reductive tricarboxylic acid cycle. Genome comparisons have identified genes in C. tepidum that are highly conserved among photosynthetic species. Many of these have no assigned function and may play novel roles in photosynthesis or photobiology. Phylogenomic analysis reveals likely duplications of genes involved in biosynthetic pathways for photosynthesis and the metabolism of sulfur and nitrogen as well as strong similarities between metabolic processes in C. tepidum and many Archaeal species.