Results 1 - 20 of 154
1.
Clin Pharmacol Ther ; 84(3): 306-9, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18714319

ABSTRACT

The cost of sequencing and genotyping is aggressively decreasing, enabling pervasive personalized genomic screening for drug reactions. Drug-metabolizing genes have been characterized sufficiently to enable practitioners to go beyond simplistic ethnic characterization and into the precisely targeted world of personal genomics. We examine six drug-metabolizing genes in J. Craig Venter and James Watson, two Caucasian men whose genomes were recently sequenced. Their genetic differences underscore the importance of personalized genomics over a race-based approach to medicine. To attain truly personalized medicine, the scientific community must aim to elucidate the genetic and environmental factors that contribute to drug reactions and not be satisfied with a simple race-based approach.


Subject(s)
Anticoagulants/metabolism , Ethnicity/genetics , Genetic Testing/trends , Pharmacogenetics/trends , Warfarin/metabolism , Anticoagulants/administration & dosage , Anticoagulants/adverse effects , Genetic Privacy , Genetic Testing/economics , Humans , Male , Warfarin/administration & dosage , Warfarin/adverse effects
2.
J Bacteriol ; 184(19): 5479-90, 2002 Oct.
Article in English | MEDLINE | ID: mdl-12218036

ABSTRACT

Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies compared to the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.


Subject(s)
Evolution, Molecular , Genome, Bacterial , Mycobacterium tuberculosis/pathogenicity , Sequence Analysis, DNA , Tuberculosis/microbiology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Genetic Variation , Humans , Molecular Sequence Data , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/immunology , Phylogeny , Polymorphism, Genetic , Polymorphism, Single Nucleotide , Sequence Alignment , Tuberculosis/immunology
3.
Arch Neurol ; 58(11): 1772-8, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11708983

ABSTRACT

The recent publication of the sequence of the human genome will accelerate the discovery of new genetic susceptibility factors for human disease, leading to the development of novel diagnostics and therapeutics. The exhaustive analysis of the human genome sequence will be the focus of the biomedical research community for many years to come. In particular, comparative analysis of the available eukaryotic genome sequences is an important approach to further our understanding of gene structure, function, and evolution. Our initial analysis of the human genome sequence has revealed many interesting features that are relevant to nervous system function, evolution, and disease. We analyzed the prominent features of predicted human proteins involved in neuronal function and prepared a comparative analysis of 146 human genes that have alleles (or mutations) conferring susceptibility for 168 neurologic diseases.


Subject(s)
Genome, Human , Nervous System Diseases/genetics , Nervous System Physiological Phenomena , Proteins/genetics , Sequence Analysis, DNA , Animals , Databases, Genetic , Evolution, Molecular , Gene Duplication , Genetic Predisposition to Disease , Humans , Nervous System Diseases/diagnosis , Protein Structure, Tertiary , Proteins/chemistry , Proteins/classification
4.
JAMA ; 286(18): 2296-307, 2001 Nov 14.
Article in English | MEDLINE | ID: mdl-11710896

ABSTRACT

Clinical researchers, practicing physicians, patients, and the general public now live in a world in which the 2.9 billion nucleotide codes of the human genome are available as a resource for scientific discovery. Some of the findings from the sequencing of the human genome were expected, confirming knowledge presaged by many decades of research in both human and comparative genetics. Other findings are unexpected in their scientific and philosophical implications. In either case, the availability of the human genome is likely to have significant implications, first for clinical research and then for the practice of medicine. This article provides our reflections on what the new genomic knowledge might mean for the future of medicine and how the new knowledge relates to what we knew in the era before the availability of the genome sequence. In addition, practicing physicians in many communities are traditionally also ambassadors of science, called on to translate arcane data or the complex ramifications of biology into a language understood by the public at large. This article also may be useful for physicians who serve in this capacity in their communities. We address the following issues: the number of protein-coding genes in the human genome and certain classes of noncoding repeat elements in the genome; features of genome evolution, including large-scale duplications; an overview of the predicted protein set to highlight prominent differences between the human genome and other sequenced eukaryotic genomes; and DNA variation in the human genome. In addition, we show how this information lays the foundations for ongoing and future endeavors that will revolutionize biomedical research and our understanding of human health.


Subject(s)
Clinical Medicine/trends , Genetics, Medical/trends , Genome, Human , Molecular Biology , Gene Duplication , Gene Expression , Genetic Code , Genetic Variation , Humans , Molecular Sequence Data , Proteome , Research/trends , Sequence Analysis, DNA
5.
Science ; 293(5529): 498-506, 2001 Jul 20.
Article in English | MEDLINE | ID: mdl-11463916

ABSTRACT

The 2,160,837-base pair genome sequence of an isolate of Streptococcus pneumoniae, a Gram-positive pathogen that causes pneumonia, bacteremia, meningitis, and otitis media, contains 2236 predicted coding regions; of these, 1440 (64%) were assigned a biological role. Approximately 5% of the genome is composed of insertion sequences that may contribute to genome rearrangements through uptake of foreign DNA. Extracellular enzyme systems for the metabolism of polysaccharides and hexosamines provide a substantial source of carbon and nitrogen for S. pneumoniae and also damage host tissues and facilitate colonization. A motif identified within the signal peptide of proteins is potentially involved in targeting these proteins to the cell surface of low-guanine/cytosine (GC) Gram-positive species. Several surface-exposed proteins that may serve as potential vaccine candidates were identified. Comparative genome hybridization with DNA arrays revealed strain differences in S. pneumoniae that could contribute to differences in virulence and antigenicity.


Subject(s)
Genome, Bacterial , Sequence Analysis, DNA , Streptococcus pneumoniae/genetics , Streptococcus pneumoniae/pathogenicity , Antigens, Bacterial , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/immunology , Bacterial Proteins/metabolism , Bacterial Vaccines , Base Composition , Carbohydrate Metabolism , Carrier Proteins/genetics , Carrier Proteins/metabolism , Chromosomes, Bacterial/genetics , Computational Biology , DNA Transposable Elements , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Gene Duplication , Genes, Bacterial , Hexosamines/metabolism , Oligonucleotide Array Sequence Analysis , Recombination, Genetic , Repetitive Sequences, Nucleic Acid , Species Specificity , Streptococcus pneumoniae/immunology , Streptococcus pneumoniae/metabolism , Virulence , rRNA Operon
6.
Calcif Tissue Int ; 68(3): 151-5, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11351498

ABSTRACT

Paget's disease of bone (PDB) is a common disorder characterized by focal areas of increased and disorganized osteoclastic bone resorption, leading to bone pain, deformity, pathological fracture, and an increased risk of osteosarcoma. Genetic factors play an important role in the pathogenesis of Paget's disease. In some families, the disease has been found to be linked to a susceptibility locus on chromosome 18q21-22, which also contains the gene responsible for familial expansile osteolysis (FEO)--a rare bone dysplasia with many similarities to Paget's disease. Insertion mutations of the TNFRSF11A gene encoding Receptor Activator of NF kappa B (RANK) have recently been found to be responsible for FEO and rare cases of early onset familial Paget's disease. Loss of heterozygosity (LOH) affecting the PDB/FEO critical region has also been described in osteosarcomas suggesting that TNFRSF11A might also be involved in the development of osteosarcoma. In order to investigate the possible role of TNFRSF11A in the pathogenesis of Paget's disease and osteosarcoma, we conducted mutation screening of the TNFRSF11A gene in patients with familial and sporadic Paget's disease as well as DNA extracted from Pagetic bone lesions, an osteosarcoma arising in Pagetic bone and six osteosarcoma cell lines. No specific abnormalities of the TNFRSF11A gene were identified in a Pagetic osteosarcoma, the osteosarcoma cell lines, DNA extracted from Pagetic bone lesions, or DNA extracted from peripheral blood in patients with familial or sporadic Paget's disease including several individuals with early onset Paget's disease. These data indicate that TNFRSF11A mutations contribute neither to the vast majority of cases of sporadic or familial PDB, nor to the development of osteosarcoma.


Subject(s)
Bone Neoplasms/genetics , Genetic Predisposition to Disease , Glycoproteins/genetics , Osteitis Deformans/genetics , Osteosarcoma/genetics , Receptors, Cytoplasmic and Nuclear/genetics , Adult , DNA/analysis , DNA Mutational Analysis , DNA Primers/chemistry , Genetic Testing , Humans , Osteoprotegerin , Point Mutation , Polymerase Chain Reaction , Receptors, Tumor Necrosis Factor
7.
Proc Natl Acad Sci U S A ; 98(7): 4136-41, 2001 Mar 27.
Article in English | MEDLINE | ID: mdl-11259647

ABSTRACT

The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living alpha-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.


Subject(s)
Caulobacter crescentus/genetics , Genome, Bacterial , Adaptation, Biological/genetics , Cell Cycle/genetics , DNA Methylation , Dinucleotide Repeats , Molecular Sequence Data , Peptide Hydrolases/genetics , Phylogeny , Signal Transduction , Transcription, Genetic
8.
Science ; 291(5507): 1304-51, 2001 Feb 16.
Article in English | MEDLINE | ID: mdl-11181995

ABSTRACT

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.


Subject(s)
Genome, Human , Human Genome Project , Sequence Analysis, DNA , Algorithms , Animals , Chromosome Banding , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Computational Biology , Consensus Sequence , CpG Islands , DNA, Intergenic , Databases, Factual , Evolution, Molecular , Exons , Female , Gene Duplication , Genes , Genetic Variation , Humans , Introns , Male , Phenotype , Physical Chromosome Mapping , Polymorphism, Single Nucleotide , Proteins/genetics , Proteins/physiology , Pseudogenes , Repetitive Sequences, Nucleic Acid , Retroelements , Sequence Analysis, DNA/methods , Species Specificity
9.
Mol Diagn ; 6(4): 243-52, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11774190

ABSTRACT

The approach of whole-genome shotgun sequencing coupled with the availability of computational algorithms to facilitate the assembly, gene prediction, and functional annotation of entire genomes has sparked a revolution in our understanding of the biology of free-living organisms. More than 40 bacterial genomes have been sequenced to date, of which several are important human pathogens. The capacity to sequence and assemble entire genomes of bacteria, pathogenic protozoans, and fungi in a rapid and cost-effective way has energized every aspect of microbial science. Comparative genome analysis allows us to dissect the evolutionary forces at work and provides insights into adaptations of microbes to their unique ecological niches. Factors that shape host-pathogen interactions and their outcomes include genetic polymorphisms in the microbial pathogen and host, both of which can impact on microbial virulence or host immune responses to infection. The availability of the genome sequence of entire organisms, together with the use of high-throughput sequence-based genomic technologies to define microbial and host physiological states, provides the unparalleled opportunity to better define clinical outcomes in the field of infectious diseases. There is one overarching lesson: completion of the genomic sequence of any species answers many questions, while at the same time it invites totally new questions.


Subject(s)
Communicable Diseases/genetics , Genome , Animals , Bacterial Infections/genetics , Communicable Diseases/diagnosis , Genome, Bacterial , Genome, Fungal , Genome, Protozoan , Humans , Mycoses/genetics , Protozoan Infections/genetics
10.
Curr Opin Biotechnol ; 11(6): 581-5, 2000 Dec.
Article in English | MEDLINE | ID: mdl-11102793

ABSTRACT

Our genomic DNA sequence provides a unique glimpse of the provenance and evolution of our species, the migration of peoples, and the causation of disease. Understanding the genome may help resolve previously unanswerable questions, including perhaps which human characteristics are innate or acquired. Such an understanding will make it possible to study how genomic DNA sequence varies among populations and among individuals, including the role of such variation in the pathogenesis of important illnesses and responses to pharmaceuticals. The study of the genome and the associated proteomics of free-living organisms will eventually make it possible to localize and annotate every human gene, as well as the regulatory elements that control the timing, organ-site specificity, extent of gene expression, protein levels, and post-translational modifications. For any given physiological process, we will have a new paradigm for addressing its evolution, development, function, and mechanism.


Subject(s)
Genome , Animals , Clinical Medicine , DNA/chemistry , DNA/genetics , Genome, Human , Humans , Molecular Biology , Sequence Analysis, DNA
11.
12.
Novartis Found Symp ; 229: 14-5; discussion 15-8, 2000.
Article in English | MEDLINE | ID: mdl-11084924
13.
Nature ; 406(6795): 477-83, 2000 Aug 03.
Article in English | MEDLINE | ID: mdl-10952301

ABSTRACT

Here we determine the complete genomic sequence of the gram negative, gamma-Proteobacterium Vibrio cholerae El Tor N16961 to be 4,033,460 base pairs (bp). The genome consists of two circular chromosomes of 2,961,146 bp and 1,072,314 bp that together encode 3,885 open reading frames. The vast majority of recognizable genes for essential cell functions (such as DNA replication, transcription, translation and cell-wall biosynthesis) and pathogenicity (for example, toxins, surface antigens and adhesins) are located on the large chromosome. In contrast, the small chromosome contains a larger fraction (59%) of hypothetical genes compared with the large chromosome (42%), and also contains many more genes that appear to have origins other than the gamma-Proteobacteria. The small chromosome also carries a gene capture system (the integron island) and host 'addiction' genes that are typically found on plasmids; thus, the small chromosome may have originally been a megaplasmid that was captured by an ancestral Vibrio species. The V. cholerae genomic sequence provides a starting point for understanding how a free-living, environmental organism emerged to become a significant human bacterial pathogen.


Subject(s)
Chromosomes, Bacterial , DNA, Bacterial , Vibrio cholerae/genetics , Base Sequence , Biological Transport , Cholera/microbiology , DNA Repair , Energy Metabolism , Evolution, Molecular , Gene Expression Regulation, Bacterial , Genome, Bacterial , Humans , Molecular Sequence Data , Phylogeny , Sequence Analysis, DNA , Vibrio cholerae/classification , Vibrio cholerae/pathogenicity
14.
Annu Rev Pharmacol Toxicol ; 40: 97-132, 2000.
Article in English | MEDLINE | ID: mdl-10836129

ABSTRACT

The power and effectiveness of clinical pharmacology are about to be transformed with a speed that earlier in this decade could not have been foreseen even by the most astute visionaries. In the very near future, we will have at our disposal the reference DNA sequence for the entire human genome, estimated to contain approximately 3.5 billion bp. At the same time, the science of whole genome sequencing is fostering the computational science of bioinformatics needed to develop practical applications for pharmacology and toxicology. Indeed, it is likely that pharmacology, toxicology, bioinformatics, and genomics will merge into a new branch of medical science for studying and developing pharmaceuticals from molecule to bedside.


Subject(s)
Base Sequence , Genome, Human , Pharmacology , Animals , Genetic Variation , Humans , Pan troglodytes/genetics , Pharmacogenetics , Polymorphism, Genetic
15.
Science ; 287(5461): 2196-204, 2000 Mar 24.
Article in English | MEDLINE | ID: mdl-10731133

ABSTRACT

We report on the quality of a whole-genome assembly of Drosophila melanogaster and the nature of the computer algorithms that accomplished it. Three independent external data sources essentially agree with and support the assembly's sequence and ordering of contigs across the euchromatic portion of the genome. In addition, there are isolated contigs that we believe represent nonrepetitive pockets within the heterochromatin of the centromeres. Comparison with a previously sequenced 2.9-megabase region indicates that sequencing accuracy within nonrepetitive segments is greater than 99.99% without manual curation. As such, this initial reconstruction of the Drosophila sequence should be of substantial value to the scientific community.


Subject(s)
Computational Biology , Drosophila melanogaster/genetics , Genome , Sequence Analysis, DNA , Algorithms , Animals , Chromatin/genetics , Contig Mapping , Euchromatin , Genes, Insect , Heterochromatin/genetics , Molecular Sequence Data , Physical Chromosome Mapping , Repetitive Sequences, Nucleic Acid , Sequence Tagged Sites
16.
Science ; 287(5459): 1809-15, 2000 Mar 10.
Article in English | MEDLINE | ID: mdl-10710307

ABSTRACT

The 2,272,351-base pair genome of Neisseria meningitidis strain MC58 (serogroup B), a causative agent of meningitis and septicemia, contains 2158 predicted coding regions, 1158 (53.7%) of which were assigned a biological role. Three major islands of horizontal DNA transfer were identified; two of these contain genes encoding proteins involved in pathogenicity, and the third island contains coding sequences only for hypothetical proteins. Insights into the commensal and virulence behavior of N. meningitidis can be gleaned from the genome, in which sequences for structural proteins of the pilus are clustered and several coding regions unique to serogroup B capsular polysaccharide synthesis can be identified. Finally, N. meningitidis contains more genes that undergo phase variation than any pathogen studied to date, a mechanism that controls their expression and contributes to the evasion of the host immune system.


Subject(s)
Genome, Bacterial , Neisseria meningitidis/genetics , Neisseria meningitidis/pathogenicity , Sequence Analysis, DNA , Antigenic Variation , Antigens, Bacterial/immunology , Bacteremia/microbiology , Bacterial Capsules/genetics , Bacterial Proteins/genetics , Bacterial Proteins/physiology , DNA Transposable Elements , Evolution, Molecular , Fimbriae, Bacterial/genetics , Humans , Meningitis, Meningococcal/microbiology , Meningococcal Infections/microbiology , Molecular Sequence Data , Mutation , Neisseria meningitidis/classification , Neisseria meningitidis/physiology , Open Reading Frames , Operon , Phylogeny , Recombination, Genetic , Serotyping , Transformation, Bacterial , Virulence/genetics
17.
Science ; 287(5459): 1816-20, 2000 Mar 10.
Article in English | MEDLINE | ID: mdl-10710308

ABSTRACT

Neisseria meningitidis is a major cause of bacterial septicemia and meningitis. Sequence variation of surface-exposed proteins and cross-reactivity of the serogroup B capsular polysaccharide with human tissues have hampered efforts to develop a successful vaccine. To overcome these obstacles, the entire genome sequence of a virulent serogroup B strain (MC58) was used to identify vaccine candidates. A total of 350 candidate antigens were expressed in Escherichia coli, purified, and used to immunize mice. The sera allowed the identification of proteins that are surface exposed, that are conserved in sequence across a range of strains, and that induce a bactericidal antibody response, a property known to correlate with vaccine efficacy in humans.


Subject(s)
Antigens, Bacterial/immunology , Bacterial Proteins/immunology , Bacterial Vaccines , Genome, Bacterial , Neisseria meningitidis/genetics , Neisseria meningitidis/immunology , Amino Acid Sequence , Animals , Antibodies, Bacterial/biosynthesis , Antibodies, Bacterial/blood , Antigens, Bacterial/chemistry , Antigens, Bacterial/genetics , Antigens, Surface/chemistry , Antigens, Surface/genetics , Antigens, Surface/immunology , Bacterial Capsules , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Vaccines/genetics , Bacterial Vaccines/immunology , Conserved Sequence , Escherichia coli/genetics , Humans , Immune Sera/immunology , Mice , Neisseria meningitidis/classification , Neisseria meningitidis/pathogenicity , Open Reading Frames , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/immunology , Recombinant Fusion Proteins/isolation & purification , Recombination, Genetic , Sequence Analysis, DNA , Serotyping , Vaccination , Virulence
18.
Genomics ; 63(3): 321-32, 2000 Feb 01.
Article in English | MEDLINE | ID: mdl-10704280

ABSTRACT

End sequences from bacterial artificial chromosomes (BACs) provide highly specific sequence markers in large-scale sequencing projects. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of >460 bp for a total of 141 Mb covering approximately 4.7% of the genome. Over 60% of the clones have BAC end sequences (BESs) from both ends representing more than fivefold coverage of the human genome by the paired-end clones. Our quality assessments and sequence analyses indicate that BESs from human BAC libraries developed at The California Institute of Technology (CalTech) and Roswell Park Cancer Institute have similar properties. The analyses have highlighted differences in insert size for different segments of the CalTech library. Problems with the fidelity of tracking of sequence data back to physical clones have been observed in some subsets of the overall BES dataset. The annotation results of BESs for the contents of available genomic sequences, sequence tagged sites, expressed sequence tags, protein encoding regions, and repeats indicate that this resource will be valuable in many areas of genome research.


Subject(s)
Chromosomes, Bacterial , Genetic Markers , Genetic Vectors , Sequence Analysis, DNA/methods , Chromosome Mapping , Evaluation Studies as Topic , Expressed Sequence Tags , Gene Library , Genome, Human , Humans , Quality Control , Repetitive Sequences, Nucleic Acid , Reproducibility of Results , Sensitivity and Specificity
20.
Science ; 286(5447): 2165-9, 1999 Dec 10.
Article in English | MEDLINE | ID: mdl-10591650

ABSTRACT

Mycoplasma genitalium with 517 genes has the smallest gene complement of any independently replicating cell so far identified. Global transposon mutagenesis was used to identify nonessential genes in an effort to learn whether the naturally occurring gene complement is a true minimal genome under laboratory growth conditions. The positions of 2209 transposon insertions in the completely sequenced genomes of M. genitalium and its close relative M. pneumoniae were determined by sequencing across the junction of the transposon and the genomic DNA. These junctions defined 1354 distinct sites of insertion that were not lethal. The analysis suggests that 265 to 350 of the 480 protein-coding genes of M. genitalium are essential under laboratory growth conditions, including about 100 genes of unknown function.


Subject(s)
DNA Transposable Elements , Genes, Essential , Genome, Bacterial , Mutagenesis, Insertional , Mycoplasma/genetics , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Amino Acyl-tRNA Synthetases/genetics , Bacterial Proteins/genetics , Chromosome Mapping , DNA Polymerase III/genetics , DNA Polymerase III/metabolism , DNA Replication/genetics , Glycolysis/genetics , Lipoproteins/genetics , Mycoplasma/metabolism , Mycoplasma pneumoniae/genetics , Mycoplasma pneumoniae/metabolism , Ribosomal Proteins/genetics , Transcription, Genetic