2.
Pharmacogenetics ; 11(7): 555-72, 2001 Oct.
Article in English | MEDLINE | ID: mdl-11668216

ABSTRACT

The pregnane X receptor (PXR)/steroid and xenobiotic receptor (SXR) transcriptionally activates cytochrome P4503A4 (CYP3A4) when ligand activated by endobiotics and xenobiotics. We cloned the human PXR gene and analysed the sequence in DNAs of individuals whose CYP3A phenotype was known. The PXR gene spans 35 kb, contains nine exons, and mapped to chromosome 13q11-13. Thirty-eight single nucleotide polymorphisms (SNPs) were identified including six SNPs in the coding region. Three of the coding SNPs are non-synonymous creating new PXR alleles [PXR*2, P27S (79C to T); PXR*3, G36R (106G to A); and PXR*4, R122Q (4321G to A)]. The frequency of PXR*2 was 0.20 in African Americans and was never found in Caucasians. Hepatic expression of CYP3A4 protein was not significantly different between African Americans homozygous for PXR*1 compared to those with one PXR*2 allele. PXR*4 was a rare variant found in only one Caucasian person. Homology modelling suggested that R122Q (PXR*4) is a direct DNA contact site variation in the third alpha-helix in the DNA binding domain. Compared with PXR*1, and variants PXR*2 and PXR*3, only the variant PXR*4 protein had significantly decreased affinity for the PXR binding sequence in electromobility shift assays and attenuated ligand activation of the CYP3A4 reporter plasmids in transient transfection assays. However, the person heterozygous for PXR*4 is normal for CYP3A4 metabolism phenotype. The relevance of each of the 38 PXR SNPs identified in DNA of individuals whose CYP3A basal and rifampin-inducible CYP3A4 expression was determined in vivo and/or in vitro was demonstrated by univariate statistical analysis. Because ligand activation of PXR and upregulation of a system of drug detoxification genes are major determinants of drug interactions, it will now be useful to extend this work to determine the association of these common PXR SNPs to human variation in induction of other drug detoxification gene targets.


Subject(s)
Alleles , Aryl Hydrocarbon Hydroxylases , Receptors, Cytoplasmic and Nuclear/chemistry , Receptors, Cytoplasmic and Nuclear/genetics , Receptors, Steroid/chemistry , Receptors, Steroid/genetics , Xenobiotics/metabolism , Amino Acid Sequence , Animals , Chromosome Mapping/methods , Cytochrome P-450 CYP3A , Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Humans , Models, Molecular , Molecular Sequence Data , Oxidoreductases, N-Demethylating/genetics , Oxidoreductases, N-Demethylating/metabolism , Polymorphism, Single Nucleotide/genetics , Pregnane X Receptor , Receptors, Cytoplasmic and Nuclear/physiology , Receptors, Steroid/physiology , Sequence Homology, Amino Acid , Transcriptional Activation/physiology
3.
Nature ; 409(6822): 922-7, 2001 Feb 15.
Article in English | MEDLINE | ID: mdl-11237012

ABSTRACT

The most important product of the sequencing of a genome is a complete, accurate catalogue of genes and their products, primarily messenger RNA transcripts and their cognate proteins. Such a catalogue cannot be constructed by computational annotation alone; it requires experimental validation on a genome scale. Using 'exon' and 'tiling' arrays fabricated by ink-jet oligonucleotide synthesis, we devised an experimental approach to validate and refine computational gene predictions and define full-length transcripts on the basis of co-regulated expression of their exons. These methods can provide more accurate gene numbers and allow the detection of mRNA splice variants and identification of the tissue- and disease-specific conditions under which genes are expressed. We apply our technique to chromosome 22q under 69 experimental condition pairs, and to the entire human genome under two experimental conditions. We discuss implications for more comprehensive, consistent and reliable genome annotation, more efficient, full-length complementary DNA cloning strategies and application to complex diseases.


Subject(s)
Chromosomes, Human, Pair 22 , Computational Biology , Genome, Human , Oligonucleotide Array Sequence Analysis , Algorithms , Alternative Splicing , Cell Line , DNA, Complementary , Exons , Human Genome Project , Humans , Oligonucleotide Probes
4.
Nat Genet ; 27(4): 383-91, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11279519

ABSTRACT

Variation in the CYP3A enzymes, which act in drug metabolism, influences circulating steroid levels and responses to half of all oxidatively metabolized drugs. CYP3A activity is the sum activity of the family of CYP3A genes, including CYP3A5, which is polymorphically expressed at high levels in a minority of Americans of European descent and Europeans (hereafter collectively referred to as 'Caucasians'). Only people with at least one CYP3A5*1 allele express large amounts of CYP3A5. Our findings show that single-nucleotide polymorphisms (SNPs) in CYP3A5*3 and CYP3A5*6 that cause alternative splicing and protein truncation result in the absence of CYP3A5 from tissues of some people. CYP3A5 was more frequently expressed in livers of African Americans (60%) than in those of Caucasians (33%). Because CYP3A5 represents at least 50% of the total hepatic CYP3A content in people polymorphically expressing CYP3A5, CYP3A5 may be the most important genetic contributor to interindividual and interracial differences in CYP3A-dependent drug clearance and in responses to many medicines.


Subject(s)
Cytochrome P-450 Enzyme System/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Alleles , Alternative Splicing , Cytochrome P-450 CYP3A , Humans , Molecular Sequence Data , Racial Groups
5.
Curr Protoc Hum Genet ; Chapter 6: Unit 6.7, 2001 May.
Article in English | MEDLINE | ID: mdl-18428302

ABSTRACT

This unit describes the NCBI's Entrez database browser. Entrez integrates DNA and protein sequence data, three dimensional structures, and taxonomic information with its associated abstracts and citations contained in PubMed (MEDLINE). It is possible to search the Entrez information space using conventional search queries (authors, gene names, map location) as well as by bibliographic associations (articles that are related to one another) and sequence homology. Also described are the procedures for submission of new data, updates, and corrections to the sequence databases.


Subject(s)
Databases, Genetic , Databases, Nucleic Acid , Databases, Protein , Genetics, Medical , Humans , Information Storage and Retrieval , Information Systems , National Library of Medicine (U.S.) , PubMed , United States
6.
Curr Protoc Mol Biol ; Chapter 19: Unit 19.2, 2001 May.
Article in English | MEDLINE | ID: mdl-18265176

ABSTRACT

This unit provides an overview of biomedical information resources, focusing on sequence data, structure information, and the associated literature, and also discusses how nucleotide sequence data gets into the databases in the first place. Some specific databases covered here are MEDLINE, GenBank, and Entrez.


Subject(s)
Computational Biology/methods , Databases as Topic , Databases, Nucleic Acid , Databases, Protein , Information Storage and Retrieval/methods
7.
Neoplasia ; 2(3): 280-6, 2000.
Article in English | MEDLINE | ID: mdl-10935514

ABSTRACT

We have curated a reference set of cancer-related genes and reanalyzed their sequences in the light of molecular information and resources that have become available since they were first cloned. Homology studies were carried out for human oncogenes and tumor suppressors, compared with the complete proteome of the nematode, Caenorhabditis elegans, and partial proteomes of mouse and rat and the fruit fly, Drosophila melanogaster. Our results demonstrate that simple, semi-automated bioinformatics approaches to identifying putative functionally equivalent gene products in different organisms may often be misleading. An electronic supplement to this article provides an integrated view of our comparative genomics analysis as well as mapping data, physical cDNA resources and links to published literature and reviews, thus creating a "window" into the genomes of humans and other organisms for cancer biology.


Subject(s)
Genes, Tumor Suppressor , Genome , Oncogenes , Animals , Caenorhabditis/genetics , Drosophila melanogaster/genetics , Human Genome Project , Humans , Mice , Rats
9.
Genome Res ; 10(4): 411-5, 2000 Apr.
Article in English | MEDLINE | ID: mdl-10779482

ABSTRACT

Human L1 retrotransposons can produce DNA transduction events in which unique DNA segments downstream of L1 elements are mobilized as part of aberrant retrotransposition events. That L1s are capable of carrying out such a reaction in tissue culture cells was elegantly demonstrated. Using bioinformatic approaches to analyze the structures of L1 element target site duplications and flanking sequence features, we provide evidence suggesting that approximately 15% of full-length L1 elements bear evidence of flanking DNA segment transduction. Extrapolating these findings to the 600,000 copies of L1 in the genome, we predict that the amount of DNA transduced by L1 represents approximately 1% of the genome, a fraction comparable with that occupied by exons.


Subject(s)
DNA/genetics , DNA/metabolism , Long Interspersed Nucleotide Elements/genetics , Mutagenesis, Insertional/genetics , 3' Untranslated Regions/genetics , Computational Biology/methods , Humans , Models, Biological
10.
Science ; 287(5461): 2204-15, 2000 Mar 24.
Article in English | MEDLINE | ID: mdl-10731134

ABSTRACT

A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae (and the proteins they are predicted to encode) was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Genome , Proteome , Saccharomyces cerevisiae/genetics , Animals , Apoptosis/genetics , Biological Evolution , Caenorhabditis elegans/chemistry , Caenorhabditis elegans/physiology , Cell Adhesion/genetics , Cell Cycle/genetics , Drosophila melanogaster/chemistry , Drosophila melanogaster/physiology , Fungal Proteins/chemistry , Fungal Proteins/genetics , Genes, Duplicate , Genetic Diseases, Inborn/genetics , Genetics, Medical , Helminth Proteins/chemistry , Helminth Proteins/genetics , Humans , Immunity/genetics , Insect Proteins/chemistry , Insect Proteins/genetics , Multigene Family , Neoplasms/genetics , Protein Structure, Tertiary , Saccharomyces cerevisiae/chemistry , Saccharomyces cerevisiae/physiology , Signal Transduction/genetics
11.
Hum Mol Genet ; 8(12): 2325-33, 1999 Nov.
Article in English | MEDLINE | ID: mdl-10545614

ABSTRACT

Cerebral cavernous malformations (CCM) are congenital vascular anomalies of the brain that can cause significant neurological disabilities, including intractable seizures and hemorrhagic stroke. One locus for autosomal dominant CCM (CCM1) maps to chromosome 7q21-q22. Recombination events in linked family members define a critical region of approximately 2 Mb and a shared disease haplotype associated with a presumed founder effect in families of Mexican-American descent points to a potentially smaller region of interest. Using a genomic sequence-based positional cloning strategy, we have identified KRIT1, encoding a protein that interacts with the Krev-1/rap1a tumor suppressor, as the CCM1 gene. Seven different KRIT1 mutations have been identified in 23 distinct CCM1 families. The identical mutation is present in 16 of 21 Mexican-American families analyzed, substantiating a founder effect in this population. Other Mexican-American and non-Hispanic Caucasian CCM1 kindreds harbor other KRIT1 mutations. Identification of a common Mexican-American mutation has potential clinical significance for presymptomatic diagnosis of CCM in this population. In addition, these data point to a key role for the Krev-1/rap1a signaling pathway in angiogenesis and cerebrovascular disease.


Subject(s)
Blood Vessels/abnormalities , Brain/blood supply , Microtubule-Associated Proteins , Mutation , Proto-Oncogene Proteins/genetics , Ethnicity , Genetic Linkage , Humans , KRIT1 Protein , Molecular Sequence Data , Physical Chromosome Mapping
12.
Gene ; 238(1): 163-70, 1999 Sep 30.
Article in English | MEDLINE | ID: mdl-10570994

ABSTRACT

Recently, we have defined and analyzed over 1800 orthologous human and rodent genes. Here we extend this work to compare human and Caenorhabditis elegans coding sequences. 1880 human proteins were compared with about 20000 predicted nematode proteins presumably comprising nearly the complete proteome of C. elegans. We found that 44% of human/rodent orthologs have convincing nematode counterparts. On average, the amino acid similarity and identity between aligned human and C. elegans orthologous gene products are 69.3% and 49.1% respectively, and the nucleotide identity is 49.8%. Detailed investigation of our results suggests that some nematode gene predictions are incorrect, leading to erroneous pairing with human genes (e.g. calcineurin and polymerase II elongation factor III). Furthermore, other proteins (i.e. homologs of human ribosomal proteins S20 and L41, thymosin) are missing entirely from the nematode proteome, suggesting that it may not be complete. These results underscore the fact that metazoan gene prediction is a very challenging task and that most computer-predicted nematode genes require supporting evidence of their existence from comparative genomics and/or laboratory investigation.


Subject(s)
Caenorhabditis elegans/genetics , Proteome/genetics , Animals , Evolution, Molecular , Humans , RNA, Messenger/genetics , Sequence Homology, Nucleic Acid
13.
Science ; 286(5439): 453-5, 1999 Oct 15.
Article in English | MEDLINE | ID: mdl-10521334

ABSTRACT

Annotation of large-scale gene sequence data will benefit from comprehensive and consistent application of well-documented, standard analysis methods and from progressive and vigilant efforts to ensure quality and utility and to keep the annotation up to date. However, it is imperative to learn how to apply information derived from functional genomics and proteomics technologies to conceptualize and explain the behaviors of biological systems. Quantitative and dynamical models of systems behaviors will supersede the limited and static forms of single-gene annotation that are now the norm. Molecular biological epistemology will increasingly encompass both teleological and causal explanations.


Subject(s)
Computational Biology , Genetic Techniques , Genome , Proteome , Sequence Analysis, DNA , Animals , Base Sequence , Cloning, Molecular , Databases, Factual , Genome, Human , Human Genome Project , Humans , Molecular Biology
14.
Nat Genet ; 22(4): 388-93, 1999 Aug.
Article in English | MEDLINE | ID: mdl-10431246

ABSTRACT

A physical map of the mouse genome is an essential tool for both positional cloning and genomic sequencing in this key model system for biomedical research. Indeed, the construction of a mouse physical map with markers spaced at an average interval of 300 kb is one of the stated goals of the Human Genome Project. Here we report the results of a project at the Whitehead Institute/MIT Center for Genome Research to construct such a physical map of the mouse. We built the map by screening sequence-tagged sites (STSs) against a large-insert yeast artificial chromosome (YAC) library and then integrating the STS-content information with a dense genetic map. The integrated map shows the location of 9,787 loci, providing landmarks with an average spacing of approximately 300 kb and affording YAC coverage of approximately 92% of the mouse genome. We also report the results of a project at the MRC UK Mouse Genome Centre targeted at chromosome X. The project produced a YAC-based map containing 619 loci (with 121 loci in common with the Whitehead map and 498 additional loci), providing especially dense coverage of this sex chromosome. The YAC-based physical map directly facilitates positional cloning of mouse mutations by providing ready access to most of the genome. More generally, use of this map in addition to a newly constructed radiation hybrid (RH) map provides a comprehensive framework for mouse genomic studies.


Subject(s)
Chromosomes, Artificial, Yeast , Genome , Mice/genetics , Physical Chromosome Mapping , Animals , Chromosome Mapping , Contig Mapping , Genetic Markers , Models, Genetic
15.
Oncogene ; 18(1): 211-8, 1999 Jan 07.
Article in English | MEDLINE | ID: mdl-9926936

ABSTRACT

Missense mutations in p53 frequently occur at 'hotspot' amino acids which are highly conserved and represent regions of structural or functional importance. Using the p53 mutation database and the p53 DNA sequences for 11 species, we more precisely defined the relationships among conservation, mutation frequency and protein structure. We aligned the p53 sequences codon-by-codon and determined the degree of substitution among them. As a whole, p53 is evolving at an average rate for a mammalian protein-coding gene. As expected, the DNA binding domain is evolving more slowly than the carboxy and amino termini. A detailed map of evolutionary conservation shows that within the DNA binding domain there are repeating peaks and valleys of higher and lower evolutionary constraint. Mutation hotspots were identified by comparing the observed distribution of mutations to the pattern expected from a random multinomial distribution. Seventy-three hotspots were identified; these 19% of codons account for 88% of all reported p53 mutations. Both high evolutionary constraint and mutation hotspots are noted at amino acids close to the protein-DNA interface and at others more distant from DNA, often buried within the core of the folded protein but sometimes on its surface. The results indicate that targeting highly conserved regions for mutational and functional analysis may be efficient strategies for the study of cancer-related genes.
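The hotspot test described in this abstract (observed mutation counts compared with a random multinomial expectation) can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a uniform null in which every codon is equally likely to be hit, approximates each codon's count by its marginal binomial distribution, and applies a Bonferroni correction. The function names, the significance level, and the toy counts are all hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def find_hotspots(counts, alpha=0.05):
    """Return codons whose mutation count exceeds a uniform random expectation.

    counts maps codon position -> observed number of mutations.
    Assumption (not from the abstract): under the null, each of the n
    mutations lands on any of the m codons with probability 1/m, so a single
    codon's count is marginally Binomial(n, 1/m). The threshold alpha / m is
    a Bonferroni correction for testing m codons.
    """
    n = sum(counts.values())
    m = len(counts)
    p = 1.0 / m
    threshold = alpha / m
    return sorted(
        codon
        for codon, k in counts.items()
        if k > 0 and binomial_tail(k, n, p) < threshold
    )


# Hypothetical toy data: 20 codons, two of them heavily mutated.
toy_counts = {codon: 1 for codon in range(1, 21)}
toy_counts[7] = 30
toy_counts[13] = 25

hotspots = find_hotspots(toy_counts)
print(hotspots)
```

With these toy counts only codons 7 and 13 fall below the corrected threshold. A real analysis would take its counts from a mutation database and might use a non-uniform null, for example one weighted by codon mutability.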


Subject(s)
Evolution, Molecular , Mutation , Protein Conformation , Tumor Suppressor Protein p53/chemistry , Tumor Suppressor Protein p53/genetics , Chromosome Mapping , Tumor Suppressor Protein p53/physiology
16.
Genome Res ; 9(2): 189-94, 1999 Feb.
Article in English | MEDLINE | ID: mdl-10022984

ABSTRACT

Ongoing efforts to sequence the human genome are already generating large amounts of data, with substantial increases anticipated over the next few years. In most cases, a shotgun sequencing strategy is being used, which rapidly yields most of the primary sequence in incompletely assembled sequence contigs ("prefinished" sequence) and more slowly produces the final, completely assembled sequence ("finished" sequence). Thus, in general, prefinished sequence is produced in excess of finished sequence, and this trend is certain to continue and even accelerate over the next few years. Even at a prefinished stage, genomic sequence represents a rich source of important biological information that is of great interest to many investigators. However, analyzing such data is a challenging and daunting task, both because of its sheer volume and because it can change on a day-by-day basis. To facilitate the discovery and characterization of genes and other important elements within prefinished sequence, we have developed an analytical strategy and system that uses readily available software tools in new combinations. Implementation of this strategy for the analysis of prefinished sequence data from human chromosome 7 has demonstrated that this is a convenient, inexpensive, and extensible solution to the problem of analyzing the large amounts of preliminary data being produced by large-scale sequencing efforts. Our approach is accessible to any investigator who wishes to assimilate additional information about particular sequence data en route to developing richer annotations of a finished sequence.


Subject(s)
Base Sequence , DNA/analysis , Sequence Analysis, DNA/methods , Software , Algorithms , Databases, Factual , Genome, Human , Humans , Internet
17.
Science ; 283(5398): 83-7, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9872747

ABSTRACT

The temporal program of gene expression during a model physiological response of human cells, the response of fibroblasts to serum, was explored with a complementary DNA microarray representing about 8600 different human genes. Genes could be clustered into groups on the basis of their temporal patterns of expression in this program. Many features of the transcriptional program appeared to be related to the physiology of wound repair, suggesting that fibroblasts play a larger and richer role in this complex multicellular response than had previously been appreciated.


Subject(s)
Blood , Cell Cycle/genetics , Fibroblasts/physiology , Gene Expression Regulation , Transcription, Genetic , Wound Healing/genetics , Calcium-Calmodulin-Dependent Protein Kinases/genetics , Calcium-Calmodulin-Dependent Protein Kinases/metabolism , Cell Line , Cholesterol/biosynthesis , Culture Media , Culture Media, Serum-Free , Expressed Sequence Tags , Fibroblasts/cytology , Fluorescent Dyes , Genes, Immediate-Early , Humans , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction/methods , Software , Time Factors , Transcription Factors/genetics
18.
Nat Genet ; 21(1 Suppl): 51-5, 1999 Jan.
Article in English | MEDLINE | ID: mdl-9915502

ABSTRACT

Technologies for whole-genome RNA expression studies are becoming increasingly reliable and accessible. However, universal standards to make the data more suitable for comparative analysis and for inter-operability with other information resources have yet to emerge. Improved access to large electronic data sets, reliable and consistent annotation and effective tools for 'data mining' are critical. Analysis methods that exploit large data warehouses of gene expression experiments will be necessary to realize the full potential of this technology.


Subject(s)
Computational Biology , Database Management Systems , Databases, Factual , Gene Expression , Animals , Data Interpretation, Statistical , Database Management Systems/standards , Databases, Factual/standards , Genome , Humans , Image Processing, Computer-Assisted , Information Storage and Retrieval , RNA, Messenger/genetics , Software
19.
Nucleic Acids Res ; 27(1): 12-7, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847132

ABSTRACT

The GenBank® sequence database incorporates DNA sequences from all available public sources, primarily through the direct submission of sequence data from individual laboratories and from large-scale sequencing projects. Most submitters use the BankIt (Web) or Sequin programs to format and send sequence data. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank data is accessible through NCBI's integrated retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome and protein structure information. MEDLINE® abstracts from published articles describing the sequences are included as an additional source of biological annotation through the PubMed search system. Sequence similarity searching is offered through the BLAST series of database search programs. In addition to FTP, Email, and server/client versions of Entrez and BLAST, NCBI offers a wide range of World Wide Web retrieval and analysis services based on GenBank data. The GenBank database and related resources are freely accessible via the URL: http://www.ncbi.nlm.nih.gov


Subject(s)
Databases, Factual , Genome , Information Storage and Retrieval , Amino Acid Sequence , Animals , Base Sequence , Classification , Expressed Sequence Tags , Gene Library , Humans , Internet , National Library of Medicine (U.S.) , Proteins/genetics , Sequence Homology , Sequence Tagged Sites , United States
20.
Nat Genet ; 20(1): 19-23, 1998 Sep.
Article in English | MEDLINE | ID: mdl-9731524

ABSTRACT

Microarray technology makes it possible to simultaneously study the expression of thousands of genes during a single experiment. We have developed an information system, ArrayDB, to manage and analyse large-scale expression data. The underlying relational database was designed to allow flexibility in the nature and structure of data input and also in the generation of standard or customized reports through a web-browser interface. ArrayDB provides varied options for data retrieval and analysis tools that should facilitate the interpretation of complex hybridization results. A sampling of ArrayDB storage, retrieval and analysis capabilities is available (www.nhgri.nih.gov/DIR/LCG/15K/HTML/), along with information on a set of approximately 15,000 genes used to fabricate several widely used microarrays. Information stored in ArrayDB is used to provide integrated gene expression reports by linking array target sequences with NCBI's Entrez retrieval system, UniGene and KEGG pathway views. The integration of external information resources is essential in interpreting intrinsic patterns and relationships in large-scale gene expression data.


Subject(s)
Database Management Systems , Gene Expression , Molecular Biology/methods , Computer Communication Networks , Databases, Factual , Information Storage and Retrieval , Online Systems , User-Computer Interface