Prediction of protein-destabilizing polymorphisms by manual curation with protein structure.

Gough, Craig Alan; Homma, Keiichi; Yamaguchi-Kabata, Yumi; Shimada, Makoto K; Chakraborty, Ranajit; Fujii, Yasuyuki; Iwama, Hisakazu; Minoshima, Shinsei; Sakamoto, Shigetaka; Sato, Yoshiharu; Suzuki, Yoshiyuki; Tada-Umezaki, Masahito; Nishikawa, Ken; Imanishi, Tadashi; Gojobori, Takashi.

PLoS One ; 7(11): e50445, 2012.

Article in English | MEDLINE | ID: mdl-23189203

ABSTRACT

The relationship between sequence polymorphisms and human disease has been studied mostly in terms of effects of single nucleotide polymorphisms (SNPs) leading to single amino acid substitutions that change protein structure and function. However, less attention has been paid to more drastic sequence polymorphisms which cause premature termination of a protein's sequence or large changes, insertions, or deletions in the sequence. We have analyzed a large set (n = 512) of insertions and deletions (indels) and single nucleotide polymorphisms causing premature termination of translation in disease-related genes. Prediction of protein-destabilization effects was performed by graphical presentation of the locations of polymorphisms in the protein structure, using the Genomes TO Protein (GTOP) database, and manual annotation with a set of specific criteria. Protein-destabilization was predicted for 44.4% of the nonsense SNPs, 32.4% of the frameshifting indels, and 9.1% of the non-frameshifting indels. A prediction of nonsense-mediated decay allowed us to infer which truncated proteins would actually be translated as defective proteins. These cases included the proteins linked to diseases inherited dominantly, suggesting a relation between these diseases and toxic aggregation. Our approach would be useful in identifying potentially aggregation-inducing polymorphisms that may have pathological effects.

Subject(s)

Polymorphism, Single Nucleotide , Proteins/chemistry , Proteins/genetics , Databases, Protein , Genetic Predisposition to Disease , Humans , Hydrophobic and Hydrophilic Interactions , INDEL Mutation , Models, Molecular , Protein Conformation , Protein Stability

VarySysDB: a human genetic polymorphism database based on all H-InvDB transcripts.

Shimada, Makoto K; Matsumoto, Ryuzou; Hayakawa, Yosuke; Sanbonmatsu, Ryoko; Gough, Craig; Yamaguchi-Kabata, Yumi; Yamasaki, Chisato; Imanishi, Tadashi; Gojobori, Takashi.

Nucleic Acids Res ; 37(Database issue): D810-5, 2009 Jan.

Article in English | MEDLINE | ID: mdl-18953038

ABSTRACT

Creation of a vast variety of proteins is accomplished by genetic variation and a variety of alternative splicing transcripts. Currently, however, the abundant available data on genetic variation and the transcriptome are stored independently and in a dispersed fashion. In order to provide a research resource regarding the effects of human genetic polymorphism on various transcripts, we developed VarySysDB, a genetic polymorphism database based on 187,156 extensively annotated matured mRNA transcripts from 36,073 loci provided by H-InvDB. VarySysDB offers information encompassing published human genetic polymorphisms for each of these transcripts separately. This allows comparisons of effects derived from a polymorphism on different transcripts. The published information we analyzed includes single nucleotide polymorphisms and deletion-insertion polymorphisms from dbSNP, copy number variations from Database of Genomic Variants, short tandem repeats and single amino acid repeats from H-InvDB and linkage disequilibrium regions from D-HaploDB. The information can be searched and retrieved by features, functions and effects of polymorphisms, as well as by keywords. VarySysDB combines two kinds of viewers, GBrowse and Sequence View, to facilitate understanding of the positional relationship among polymorphisms, genome, transcripts, loci and functional domains. We expect that VarySysDB will yield useful information on polymorphisms affecting gene expression and phenotypes. VarySysDB is available at http://h-invitational.jp/varygene/.

Subject(s)

Alternative Splicing , Databases, Nucleic Acid , Polymorphism, Genetic , RNA, Messenger/chemistry , Humans , Polymorphism, Single Nucleotide , Repetitive Sequences, Nucleic Acid , User-Computer Interface

The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts.

Yamasaki, Chisato; Murakami, Katsuhiko; Fujii, Yasuyuki; Sato, Yoshiharu; Harada, Erimi; Takeda, Jun-ichi; Taniya, Takayuki; Sakate, Ryuichi; Kikugawa, Shingo; Shimada, Makoto; Tanino, Motohiko; Koyanagi, Kanako O; Barrero, Roberto A; Gough, Craig; Chun, Hong-Woo; Habara, Takuya; Hanaoka, Hideki; Hayakawa, Yosuke; Hilton, Phillip B; Kaneko, Yayoi; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Matsuya, Akihiro; Nagata, Naoki; Nishikata, Kensaku; Noda, Akiko Ogura; Nurimoto, Shin; Saichi, Naomi; Sakai, Hiroaki; Sanbonmatsu, Ryoko; Shiba, Rie; Suzuki, Mami; Takabayashi, Kazuhiko; Takahashi, Aiko; Tamura, Takuro; Tanaka, Masayuki; Tanaka, Susumu; Todokoro, Fusano; Yamaguchi, Kaori; Yamamoto, Naoyuki; Okido, Toshihisa; Mashima, Jun; Hashizume, Aki; Jin, Lihua; Lee, Kyung-Bum; Lin, Yi-Chueh; Nozaki, Asami; Sakai, Katsunaga; Tada, Masahito.

Nucleic Acids Res ; 36(Database issue): D793-9, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18089548

ABSTRACT

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, Transcript view and Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.

Subject(s)

Databases, Genetic , Genes , RNA, Messenger/chemistry , Animals , Chromosome Mapping , DNA, Complementary/chemistry , Humans , Internet , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , RNA, Messenger/genetics , User-Computer Interface

Cancer-related mutations in BRCA1-BRCT cause long-range structural changes in protein-protein binding sites: a molecular dynamics study.

Gough, Craig A; Gojobori, Takashi; Imanishi, Tadashi.

Proteins ; 66(1): 69-86, 2007 Jan 01.

Article in English | MEDLINE | ID: mdl-17063491

ABSTRACT

Cancer-associated mutations in the BRCT domain of BRCA1 (BRCA1-BRCT) abolish its tumor suppressor function by disrupting interactions with other proteins such as BACH1. Many cancer-related mutations do not cause sufficient destabilization to lead to global unfolding under physiological conditions, and thus abrogation of function probably is due to localized structural changes. To explore the reasons for mutation-induced loss of function, the authors performed molecular dynamics simulations on three cancer-associated mutants, A1708E, M1775R, and Y1853ter, and on the wild type and benign M1652I mutant, and compared the structures and fluctuations. Only the cancer-associated mutants exhibited significant backbone structure differences from the wild-type crystal structure in BACH1-binding regions, some of which are far from the mutation sites. Backbone differences of the A1708E mutant from the liganded wild type structure in these regions are much larger than those of the unliganded wild type X-ray or molecular dynamics structures. These BACH1-binding regions of the cancer-associated mutants also exhibited increases in their fluctuation magnitudes compared with the same regions in the wild type and M1652I mutant, as quantified by quasiharmonic analysis. Several of the regions of increased fluctuation magnitude correspond to correlated motions of residues in contact that provide a continuous path of fluctuating amino acids in contact from the A1708E and Y1853ter mutation sites to the BACH1-binding sites with altered structure and dynamics. The increased fluctuations in the disease-related mutants suggest an increase in vibrational entropy in the unliganded state that could result in a larger entropy loss in the disease-related mutants upon binding BACH1 than in the wild type.
To investigate this possibility, vibrational entropies of the A1708E and wild type in the free state and bound to a BACH1-derived phosphopeptide were calculated using quasiharmonic analysis, to determine the binding entropy difference ΔΔS between the A1708E mutant and the wild type. ΔΔS was determined to be -4.0 cal mol⁻¹ K⁻¹, with an uncertainty of 2 cal mol⁻¹ K⁻¹; that is, the entropy loss upon binding the peptide is 4.0 cal mol⁻¹ K⁻¹ greater for the A1708E mutant, corresponding to an entropic contribution to the ΔΔG of binding (-TΔΔS) 1.1 kcal mol⁻¹ more positive for the mutant. The observed differences in structure, flexibility, and entropy of binding likely are responsible for abolition of BACH1 binding, and illustrate that many disease-related mutations could have very long-range effects. The methods described here have potential for identifying correlated motions responsible for other long-range effects of deleterious mutations.

Subject(s)

BRCA1 Protein/chemistry , BRCA1 Protein/genetics , Breast Neoplasms/genetics , Mutation , Algorithms , BRCA1 Protein/metabolism , Binding Sites , Computer Simulation , Crystallography, X-Ray , Databases, Protein , Entropy , Female , Humans , Models, Molecular , Protein Binding , Protein Conformation , Protein Structure, Tertiary/genetics

Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J; Debily, Marie-Anne.

PLoS Biol ; 2(6): e162, 2004 Jun.

Article in English | MEDLINE | ID: mdl-15103394

ABSTRACT

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.

Subject(s)

Computational Biology/methods , DNA, Complementary/genetics , Databases, Genetic , Genes/physiology , Genome, Human , Alternative Splicing/genetics , Genes/genetics , Humans , Internet , Microsatellite Repeats/genetics , Open Reading Frames/genetics , Polymorphism, Genetic , Polymorphism, Single Nucleotide , Protein Structure, Tertiary
