Search | VHL Regional Portal

1.

Rare disease variant curation from literature: assessing gaps with creatine transport deficiency in focus.

Lyons, Erica L; Watson, Daniel; Alodadi, Mohammad S; Haugabook, Sharie J; Tawa, Gregory J; Hannah-Shmouni, Fady; Porter, Forbes D; Collins, Jack R; Ottinger, Elizabeth A; Mudunuri, Uma S.

BMC Genomics ; 24(1): 460, 2023 Aug 16.

Article in English | MEDLINE | ID: mdl-37587458

ABSTRACT

BACKGROUND: Approximately 4-8% of the world suffers from a rare disease. Rare diseases are often difficult to diagnose, and many do not have approved therapies. Genetic sequencing has the potential to shorten the current diagnostic process, increase mechanistic understanding, and facilitate research on therapeutic approaches but is limited by the difficulty of novel variant pathogenicity interpretation and the communication of known causative variants. It is unknown how many published rare disease variants are currently accessible in the public domain. RESULTS: This study investigated the translation of knowledge of variants reported in published manuscripts to publicly accessible variant databases. Variants, symptoms, biochemical assay results, and protein function from literature on the SLC6A8 gene associated with X-linked Creatine Transporter Deficiency (CTD) were curated and reported as a highly annotated dataset of variants with clinical context and functional details. Variants were harmonized, their availability in existing variant databases was analyzed and pathogenicity assignments were compared with impact algorithm predictions. 24% of the pathogenic variants found in PubMed articles were not captured in any database used in this analysis while only 65% of the published variants received an accurate pathogenicity prediction from at least one impact prediction algorithm. CONCLUSIONS: Despite being published in the literature, pathogenicity data on patient variants may remain inaccessible for genetic diagnosis, therapeutic target identification, mechanistic understanding, or hypothesis generation. Clinical and functional details presented in the literature are important to make pathogenicity assessments. Impact predictions remain imperfect but are improving, especially for single nucleotide exonic variants, however such predictions are less accurate or unavailable for intronic and multi-nucleotide variants. 
Developing text mining workflows that use natural language processing for identifying diseases, genes and variants, along with impact prediction algorithms and integrating with details on clinical phenotypes and functional assessments might be a promising approach to scale literature mining of variants and assigning correct pathogenicity. The curated variants list created by this effort includes context details to improve any such efforts on variant curation for rare diseases.

Subject(s)

Creatine , Rare Diseases , Humans , Rare Diseases/genetics , Introns , Algorithms , Nucleotides

2.

Immuno-transcriptomic profiling of extracranial pediatric solid malignancies.

Brohl, Andrew S; Sindiri, Sivasish; Wei, Jun S; Milewski, David; Chou, Hsien-Chao; Song, Young K; Wen, Xinyu; Kumar, Jeetendra; Reardon, Hue V; Mudunuri, Uma S; Collins, Jack R; Nagaraj, Sushma; Gangalapudi, Vineela; Tyagi, Manoj; Zhu, Yuelin J; Masih, Katherine E; Yohe, Marielle E; Shern, Jack F; Qi, Yue; Guha, Udayan; Catchpoole, Daniel; Orentas, Rimas J; Kuznetsov, Igor B; Llosa, Nicolas J; Ligon, John A; Turpin, Brian K; Leino, Daniel G; Iwata, Shintaro; Andrulis, Irene L; Wunder, Jay S; Toledo, Silvia R C; Meltzer, Paul S; Lau, Ching; Teicher, Beverly A; Magnan, Heather; Ladanyi, Marc; Khan, Javed.

Cell Rep ; 37(8): 110047, 2021 Nov 23.

Article in English | MEDLINE | ID: mdl-34818552

ABSTRACT

We perform an immunogenomics analysis utilizing whole-transcriptome sequencing of 657 pediatric extracranial solid cancer samples representing 14 diagnoses, and additionally utilize transcriptomes of 131 pediatric cancer cell lines and 147 normal tissue samples for comparison. We describe patterns of infiltrating immune cells, T cell receptor (TCR) clonal expansion, and translationally relevant immune checkpoints. We find that tumor-infiltrating lymphocytes and TCR counts vary widely across cancer types and within each diagnosis, and notably are significantly predictive of survival in osteosarcoma patients. We identify potential cancer-specific immunotherapeutic targets for adoptive cell therapies including cell-surface proteins, tumor germline antigens, and lineage-specific transcription factors. Using an orthogonal immunopeptidomics approach, we find several potential immunotherapeutic targets in osteosarcoma and Ewing sarcoma and validated PRAME as a bona fide multi-pediatric cancer target. Importantly, this work provides a critical framework for immune targeting of extracranial solid tumors using parallel immuno-transcriptomic and -peptidomic approaches.

Subject(s)

Neoplasms/genetics , Neoplasms/immunology , Transcriptome/genetics , Adolescent , Antigens, Neoplasm , Cell Line, Tumor , Child , Child, Preschool , Female , Gene Expression/genetics , Gene Expression Profiling/methods , Humans , Immune Checkpoint Proteins/genetics , Immune Checkpoint Proteins/immunology , Immunogenetics/methods , Immunotherapy, Adoptive , Infant , Lymphocytes, Tumor-Infiltrating/immunology , Male , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell/immunology , Transcriptome/immunology , Tumor Microenvironment , Exome Sequencing/methods

3.

AVIA 3.0: interactive portal for genomic variant and sample level analysis.

Reardon, Hue V; Che, Anney; Luke, Brian T; Ravichandran, Sarangan; Collins, Jack R; Mudunuri, Uma S.

Bioinformatics ; 37(16): 2467-2469, 2021 Aug 25.

Article in English | MEDLINE | ID: mdl-33289511

ABSTRACT

SUMMARY: The Annotation, Visualization and Impact Analysis (AVIA) is a web application combining multiple features to annotate and visualize genomic variant data. Users can investigate functional significance of their genetic alterations across samples, genes and pathways. Version 3.0 of AVIA offers filtering options through interactive charts and by linking disease relevant data sources. Newly incorporated services include gene, variant and sample level reporting, literature and functional correlations among impacted genes, comparative analysis across samples and against data sources such as TCGA and ClinVar, and cohort building. Sample and data management is now feasible through the application, which allows greater flexibility with sharing, reannotating and organizing data. Most importantly, AVIA's utility stems from its convenience for allowing users to upload and explore results without any a priori knowledge or the need to install, update and maintain software or databases. Together, these enhancements strengthen AVIA as a comprehensive, user-driven variant analysis portal. AVAILABILITY AND IMPLEMENTATION: AVIA is accessible online at https://avia-abcc.ncifcrf.gov.

Subject(s)

Databases, Genetic , Genetic Variation , Data Management , Genome , Genomics , Humans , Internet , Software

4.

Erratum to: The somatic autosomal mutation matrix in cancer genomes.

Temiz, Nuri A; Donohue, Duncan E; Bacolla, Albino; Vasquez, Karen M; Cooper, David N; Mudunuri, Uma; Ivanic, Joseph; Cer, Regina Z; Yi, Ming; Stephens, Robert M; Collins, Jack R; Luke, Brian T.

Hum Genet ; 134(8): 865-7, 2015 Aug.

Article in English | MEDLINE | ID: mdl-26071096

5.

The somatic autosomal mutation matrix in cancer genomes.

Temiz, Nuri A; Donohue, Duncan E; Bacolla, Albino; Vasquez, Karen M; Cooper, David N; Mudunuri, Uma; Ivanic, Joseph; Cer, Regina Z; Yi, Ming; Stephens, Robert M; Collins, Jack R; Luke, Brian T.

Hum Genet ; 134(8): 851-64, 2015 Aug.

Article in English | MEDLINE | ID: mdl-26001532

ABSTRACT

DNA damage in somatic cells originates from both environmental and endogenous sources, giving rise to mutations through multiple mechanisms. When these mutations affect the function of critical genes, cancer may ensue. Although identifying genomic subsets of mutated genes may inform therapeutic options, a systematic survey of tumor mutational spectra is required to improve our understanding of the underlying mechanisms of mutagenesis involved in cancer etiology. Recent studies have presented genome-wide sets of somatic mutations as a 96-element vector, a procedure that only captures the immediate neighbors of the mutated nucleotide. Herein, we present a 32 × 12 mutation matrix that captures the nucleotide pattern two nucleotides upstream and downstream of the mutation. A somatic autosomal mutation matrix (SAMM) was constructed from tumor-specific mutations derived from each of 909 individual cancer genomes harboring a total of 10,681,843 single-base substitutions. In addition, mechanistic template mutation matrices (MTMMs) representing oxidative DNA damage, ultraviolet-induced DNA damage, (5m)CpG deamination, and APOBEC-mediated cytosine mutation, are presented. MTMMs were mapped to the individual tumor SAMMs to determine the maximum contribution of each mutational mechanism to the overall mutation pattern. A Manhattan distance across all SAMM elements between any two tumor genomes was used to determine their relative distance. Employing this metric, 89.5% of all tumor genomes were found to have a nearest neighbor from the same tissue of origin. When a distance-dependent 6-nearest neighbor classifier was used, 10.4% of the SAMMs had an Undetermined tissue of origin, and 92.2% of the remaining SAMMs were assigned to the correct tissue of origin. [corrected]. Thus, although tumors from different tissues may have similar mutation patterns, their SAMMs often display signatures that are characteristic of specific tissues.
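The distance step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each SAMM has been flattened into a plain list of numbers (384 values for a 32 × 12 matrix) and, as a further assumption, normalized so that tumors with different mutation counts are comparable. The function names and the toy four-element profiles are hypothetical, and only the plain nearest-neighbor check is shown, not the distance-dependent 6-nearest neighbor classifier.

```python
from typing import List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute element-wise differences between two flattened matrices."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of elements")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor_tissue(index: int,
                            profiles: List[Sequence[float]],
                            tissues: List[str]) -> str:
    """Tissue label of the profile closest to profiles[index], excluding itself."""
    best_distance = None
    best_index = None
    for j, other in enumerate(profiles):
        if j == index:
            continue
        d = manhattan(profiles[index], other)
        if best_distance is None or d < best_distance:
            best_distance, best_index = d, j
    if best_index is None:
        raise ValueError("need at least two profiles")
    return tissues[best_index]


def fraction_same_tissue(profiles: List[Sequence[float]],
                         tissues: List[str]) -> float:
    """Fraction of profiles whose nearest neighbor shares their tissue label."""
    hits = sum(
        nearest_neighbor_tissue(i, profiles, tissues) == tissues[i]
        for i in range(len(profiles))
    )
    return hits / len(profiles)


# Hypothetical toy data: four-element stand-ins for 384-element SAMMs.
toy_profiles = [
    [0.5, 0.5, 0.0, 0.0],
    [0.4, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.6, 0.4],
]
toy_tissues = ["lung", "lung", "skin", "skin"]
print(fraction_same_tissue(toy_profiles, toy_tissues))
```

A statistic of this kind corresponds to the abstract's figure for tumor genomes whose nearest neighbor comes from the same tissue of origin; the toy data here are invented and say nothing about that result.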

Subject(s)

DNA Damage , DNA, Neoplasm/genetics , Databases, Genetic , Genome, Human , Mutation, Missense , Neoplasms/genetics , Female , Humans , Male

6.

AVIA v2.0: annotation, visualization and impact analysis of genomic variants and genes.

Vuong, Hue; Che, Anney; Ravichandran, Sarangan; Luke, Brian T; Collins, Jack R; Mudunuri, Uma S.

Bioinformatics ; 31(16): 2748-50, 2015 Aug 15.

Article in English | MEDLINE | ID: mdl-25861966

ABSTRACT

UNLABELLED: As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA's improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants. AVAILABILITY AND IMPLEMENTATION: AVIA is freely available through the web at http://avia.abcc.ncifcrf.gov. CONTACT: Hue.Vuong@fnlcr.nih.gov SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genes , Genetic Variation , Molecular Sequence Annotation , Software , Databases, Genetic , Internet

7.

Can structural features of kinase receptors provide clues on selectivity and inhibition? A molecular modeling study.

Ravichandran, Sarangan; Luke, Brian T; Collins, Jack R.

J Mol Graph Model ; 57: 36-48, 2015 Apr.

Article in English | MEDLINE | ID: mdl-25635590

ABSTRACT

Cancer is a complex disease resulting from the uncontrolled proliferation of cell signaling events. Protein kinases have been identified as central molecules that participate overwhelmingly in oncogenic events, thus becoming key targets for anticancer drugs. A majority of studies converged on the idea that ligand-binding pockets of kinases retain clues to the inhibiting abilities and cross-reacting tendencies of inhibitor drugs. Even though these ideas are critical for drug discovery, validating them using experiments is not only difficult, but also in some cases infeasible. To overcome these limitations and to test these ideas at the molecular level, we present here the results of receptor-focused in-silico docking of nine marketed drugs to 19 different wild-type and mutated kinases chosen from a wide range of families. This investigation highlights the need for using relevant models to explain the correct inhibition trends and the results are used to make predictions that might be able to influence future experiments. Our simulation studies are able to correctly predict the primary targets for each drug studied in majority of cases and our results agree with the existing findings. Our study shows that the conformations a given receptor acquires during kinase activation, and their micro-environment, defines the ligand partners. Type II drugs display high compatibility and selectivity for DFG-out kinase conformations. On the other hand Type I drugs are less selective and show binding preferences for both the open and closed forms of selected kinases. Using this receptor-focused approach, it is possible to capture the observed fold change in binding affinities between the wild-type and disease-centric mutations in ABL kinase for Imatinib and the second-generation ABL drugs. The effects of mutation are also investigated for two other systems, EGFR and B-Raf. 
Finally, by including pathway information in the design it is possible to model kinase inhibitors with potentially fewer side-effects.

Subject(s)

Models, Molecular , Protein Kinase Inhibitors/pharmacology , Receptor Protein-Tyrosine Kinases/antagonists & inhibitors , Receptor Protein-Tyrosine Kinases/chemistry , Amino Acid Sequence , Cell Line, Tumor , Crystallography, X-Ray , Databases, Protein , ErbB Receptors/antagonists & inhibitors , ErbB Receptors/chemistry , Humans , Imatinib Mesylate/chemistry , Imatinib Mesylate/pharmacology , Kinetics , Ligands , Molecular Docking Simulation , Molecular Sequence Data , Protein Kinase Inhibitors/chemistry , Proto-Oncogene Proteins B-raf/antagonists & inhibitors , Proto-Oncogene Proteins B-raf/chemistry , Sequence Alignment , Thermodynamics

8.

Guanine holes are prominent targets for mutation in cancer and inherited disease.

Bacolla, Albino; Temiz, Nuri A; Yi, Ming; Ivanic, Joseph; Cer, Regina Z; Donohue, Duncan E; Ball, Edward V; Mudunuri, Uma S; Wang, Guliang; Jain, Aklank; Volfovsky, Natalia; Luke, Brian T; Stephens, Robert M; Cooper, David N; Collins, Jack R; Vasquez, Karen M.

PLoS Genet ; 9(9): e1003816, 2013.

Article in English | MEDLINE | ID: mdl-24086153

ABSTRACT

Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
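The count of 64 motifs in this abstract follows from fixing G at the second position and letting the other three positions range over four bases (4 × 4 × 4). A minimal sketch of enumerating the motifs and tallying mutated guanines by context is given below. It is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, positions are assumed to be 0-based indices of mutated G bases on the given strand, and mutations at the paired C on the opposite strand (which would need reverse complementing) are not handled.

```python
from itertools import product
from typing import Dict, Iterable, List

BASES = "ACGT"


def ngnn_motifs() -> List[str]:
    """All 5'-NGNN-3' motifs: G fixed at position 2, any base elsewhere."""
    return [f"{n1}G{n3}{n4}" for n1, n3, n4 in product(BASES, repeat=3)]


def count_ngnn_contexts(sequence: str, positions: Iterable[int]) -> Dict[str, int]:
    """Tally mutated guanines by their NGNN context.

    `positions` holds 0-based indices of mutated G bases in `sequence`
    (an assumed input convention). Positions too close to either end,
    or whose context is not a valid NGNN motif, are skipped.
    """
    counts = {motif: 0 for motif in ngnn_motifs()}
    for pos in positions:
        if pos < 1 or pos + 2 >= len(sequence):
            continue  # not enough flanking sequence
        context = sequence[pos - 1:pos + 3].upper()
        if context in counts:
            counts[context] += 1
    return counts


# Hypothetical toy input: two mutated guanines in a short sequence.
toy_counts = count_ngnn_contexts("ACGTTAGCA", [2, 6])
print({motif: n for motif, n in toy_counts.items() if n})
```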

Subject(s)

Amino Acid Substitution/genetics , Genetic Diseases, Inborn/genetics , Guanine , Neoplasms/genetics , Computational Biology , DNA, Neoplasm/genetics , Genetic Diseases, Inborn/pathology , Germ-Line Mutation , Humans , Models, Molecular , Neoplasms/pathology , Nucleotide Motifs/genetics

9.

Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.

Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M.

Nucleic Acids Res ; 41(Database issue): D94-D100, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23125372

ABSTRACT

The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.

Subject(s)

DNA/chemistry , Databases, Nucleic Acid , Animals , Computer Graphics , Dogs , Humans , Internet , Mice , Molecular Sequence Annotation , Nucleotide Motifs , Rats , Repetitive Sequences, Nucleic Acid , Software , User-Computer Interface

10.

A molecular model of the enantioselective liquid chromatographic separation of (R,S)-ifosfamide and its N-dechloroethylated metabolites on a teicoplanin aglycon chiral stationary phase.

Ravichandran, Sarangan; Collins, Jack R; Singh, Nagendra; Wainer, Irving W.

J Chromatogr A ; 1269: 218-25, 2012 Dec 21.

Article in English | MEDLINE | ID: mdl-22917979

ABSTRACT

The enantioselective separations of the chiral oxazaphosphorines (R,S)-ifosfamide (IF), (R,S)-2-N-dechloroethyl-IF (2-DCE-IF) and (R,S)-3-N-dechloroethyl-IF (3-DCE-IF) were achieved on teicoplanin-based chiral stationary phase using isopropanol:methanol (60:40, v/v) as the mobile phase. Computational models of the teicoplanin and teicoplanin aglycon (TAG) chiral selectors were constructed and used in docking experiments to examine the chiral recognition mechanism associated with the observed resolutions. Initial data showed no significant differences between the simulated selector-selectand complexes using teicoplanin and TAG, and the full study was conducted using TAG. The data from the study indicate that hydrophobic interactions arise between the chlorine atom present in the chloroethyl moieties of the oxazaphosphorine molecules and hydrophobic pockets within the TAG basket and that these interactions anchored and positioned the selectands within the selector-selectand complexes. The complexes were stabilized through the formation of a network of hydrogen bond and cation-π interactions, in which the latter involved the phosphorus atom of the phosphoramide moiety and aromatic components of the TAG aglycon basket. The chirality of the oxazaphosphorine molecule determined the number and strength of the stabilizing interactions which resulted in significant differences in the relative mean binding energies between the complexes formed by the (R) and (S) enantiomers of the selectands. These differences were consistent with the observed chromatographic enantioselectivity and suggest a multi-step chiral recognition mechanism involving the tethering of the selectand to the selector followed by conformational adjustments and stabilization of the selectand-selector complex.

Subject(s)

Chromatography, Liquid/methods , Ifosfamide/isolation & purification , Models, Molecular , Teicoplanin/analogs & derivatives , Chromatography, Liquid/instrumentation , Ifosfamide/metabolism , Molecular Docking Simulation , Stereoisomerism , Teicoplanin/chemistry

11.

The role of methylation in the intrinsic dynamics of B- and Z-DNA.

Temiz, Nuri A; Donohue, Duncan E; Bacolla, Albino; Luke, Brian T; Collins, Jack R.

PLoS One ; 7(4): e35558, 2012.

Article in English | MEDLINE | ID: mdl-22530050

ABSTRACT

Methylation of cytosine at the 5-carbon position (5mC) is observed in both prokaryotes and eukaryotes. In humans, DNA methylation at CpG sites plays an important role in gene regulation and has been implicated in development, gene silencing, and cancer. In addition, the CpG dinucleotide is a known hot spot for pathologic mutations genome-wide. CpG tracts may adopt left-handed Z-DNA conformations, which have also been implicated in gene regulation and genomic instability. Methylation facilitates this B-Z transition but the underlying mechanism remains unclear. Herein, four structural models of the dinucleotide d(GC)(5) repeat sequence in B-, methylated B-, Z-, and methylated Z-DNA forms were constructed and an aggregate 100 nanoseconds of molecular dynamics simulations in explicit solvent under physiological conditions was performed for each model. Both unmethylated and methylated B-DNA were found to be more flexible than Z-DNA. However, methylation significantly destabilized the BII, relative to the BI, state through the Gp5mC steps. In addition, methylation decreased the free energy difference between B- and Z-DNA. Comparisons of α/γ backbone torsional angles showed that torsional states changed marginally upon methylation for B-DNA and Z-DNA. Methylation-induced conformational changes and lower energy differences may contribute to the transition to Z-DNA by methylated, over unmethylated, B-DNA and may be a contributing factor to biological function.

Subject(s)

DNA Methylation , DNA, B-Form/chemistry , DNA, Z-Form/chemistry , CpG Islands , Dinucleoside Phosphates/chemistry , Molecular Dynamics Simulation , Nucleic Acid Conformation , Thermodynamics

12.

Non-B DNA-forming sequences and WRN deficiency independently increase the frequency of base substitution in human cells.

Bacolla, Albino; Wang, Guliang; Jain, Aklank; Chuzhanova, Nadia A; Cer, Regina Z; Collins, Jack R; Cooper, David N; Bohr, Vilhelm A; Vasquez, Karen M.

J Biol Chem ; 286(12): 10017-26, 2011 Mar 25.

Article in English | MEDLINE | ID: mdl-21285356

ABSTRACT

Although alternative DNA secondary structures (non-B DNA) can induce genomic rearrangements, their associated mutational spectra remain largely unknown. The helicase activity of WRN, which is absent in the human progeroid Werner syndrome, is thought to counteract this genomic instability. We determined non-B DNA-induced mutation frequencies and spectra in human U2OS osteosarcoma cells and assessed the role of WRN in isogenic knockdown (WRN-KD) cells using a supF gene mutation reporter system flanked by triplex- or Z-DNA-forming sequences. Although both non-B DNA and WRN-KD served to increase the mutation frequency, the increase afforded by WRN-KD was independent of DNA structure despite the fact that purified WRN helicase was found to resolve these structures in vitro. In U2OS cells, ~70% of mutations comprised single-base substitutions, mostly at G·C base-pairs, with the remaining ~30% being microdeletions. The number of mutations at G·C base-pairs in the context of NGNN/NNCN sequences correlated well with predicted free energies of base stacking and ionization potentials, suggesting a possible origin via oxidation reactions involving electron loss and subsequent electron transfer (hole migration) between neighboring bases. A set of ~40,000 somatic mutations at G·C base pairs identified in a lung cancer genome exhibited similar correlations, implying that hole migration may also be involved. We conclude that alternative DNA conformations, WRN deficiency and lung tumorigenesis may all serve to increase the mutation rate by promoting, through diverse pathways, oxidation reactions that perturb the electron orbitals of neighboring bases. It follows that such "hole migration" is likely to play a much more widespread role in mutagenesis than previously anticipated.

Subject(s)

DNA, Z-Form/metabolism , Exodeoxyribonucleases , Genomic Instability , Lung Neoplasms/metabolism , RecQ Helicases , Sequence Deletion , Cell Line, Tumor , DNA, Z-Form/genetics , Gene Knockdown Techniques , Humans , Lung Neoplasms/genetics , Werner Syndrome Helicase

13.

Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M.

Nucleic Acids Res ; 39(Database issue): D383-91, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21097885

ABSTRACT

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.

Subject(s)

DNA/chemistry , Databases, Nucleic Acid , Animals , Base Sequence , Dogs , Genomics , Humans , Macaca , Mice , Nucleic Acid Conformation , Pan troglodytes/genetics , Repetitive Sequences, Nucleic Acid

14.

Structural conservation of interferon gamma among vertebrates.

Savan, Ram; Ravichandran, Sarangan; Collins, Jack R; Sakai, Masahiro; Young, Howard A.

Cytokine Growth Factor Rev ; 20(2): 115-24, 2009 Apr.

Article in English | MEDLINE | ID: mdl-19268624

ABSTRACT

Interferon gamma (IFN-gamma), being the hallmark of the T-cell T(H)1 response, has been extensively studied with respect to its expression and regulation of immune function. This gene has been extensively characterized in many mammalian species, making it one of the most widely cloned immunoregulatory genes. Recently, the gene has been identified in avian and piscine species and we have identified the gene in the frog genome. Based on these identified DNA sequences, we have constructed an evolutionary history of IFN-gamma that shows this molecule can be traced back more than 450 million years ago. Our analysis shows that type II interferon (IFN-gamma) function evolved before the tetrapod-fish split, a finding that contrasts earlier studies showing its origins in tetrapods. The IFN-gamma gene has undergone a further duplication event in teleosts after the tetrapod-fish split suggesting a specific-evolutionary adaptation in fish. The analyses of IFN-gamma, IL-22 and IL-26 genomic region in mammals, chicken, frog and fish reveal an evolutionary conservation of the loci and several regulatory elements controlling IFN-gamma gene transcription. Furthermore, across the vertebrata, the first intron of IFN-gamma gene contains a polymorphic microsatellite that has been closely correlated with disease susceptibility. Comparative-modeling of IFN-gamma structure revealed differences among the representative species but with an overall conservation of the fold, dimer interface and some interactions with the receptor. The structural and functional conservation of IFN-gamma suggests the presence of an innate, natural killer (NK) like response or even an adaptive T(H)1 immune response in lower vertebrates.

Subject(s)

Interferon-gamma/chemistry , Amino Acid Sequence , Animals , Chickens/genetics , Evolution, Molecular , Fishes/genetics , Genes/physiology , Humans , Interferon-gamma/genetics , Models, Molecular , Molecular Sequence Data , Protein Multimerization , Ranidae/genetics , Receptors, Interferon/genetics , Receptors, Interferon/physiology , Sequence Alignment , Interferon gamma Receptor

15.

Examining the significance of fingerprint-based classifiers.

Luke, Brian T; Collins, Jack R.

BMC Bioinformatics ; 9: 545, 2008 Dec 17.

Article in English | MEDLINE | ID: mdl-19091087

ABSTRACT

BACKGROUND: Experimental examinations of biofluids to measure concentrations of proteins or their fragments or metabolites are being explored as a means of early disease detection, distinguishing diseases with similar symptoms, and drug treatment efficacy. Many studies have produced classifiers with a high sensitivity and specificity, and it has been argued that accurate results necessarily imply some underlying biology-based features in the classifier. The simplest test of this conjecture is to examine datasets designed to contain no information with classifiers used in many published studies. RESULTS: The classification accuracy of two fingerprint-based classifiers, a decision tree (DT) algorithm and a medoid classification algorithm (MCA), are examined. These methods are used to examine 30 artificial datasets that contain random concentration levels for 300 biomolecules. Each dataset contains between 30 and 300 Cases and Controls, and since the 300 observed concentrations are randomly generated, these datasets are constructed to contain no biological information. A modest search of decision trees containing at most seven decision nodes finds a large number of unique decision trees with an average sensitivity and specificity above 85% for datasets containing 60 Cases and 60 Controls or less, and for datasets with 90 Cases and 90 Controls many DTs have an average sensitivity and specificity above 80%. For even the largest dataset (300 Cases and 300 Controls) the MCA procedure finds several unique classifiers that have an average sensitivity and specificity above 88% using only six or seven features. CONCLUSION: While it has been argued that accurate classification results must imply some biological basis for the separation of Cases from Controls, our results show that this is not necessarily true. The DT and MCA classifiers are sufficiently flexible and can produce good results from datasets that are specifically constructed to contain no information. 
This means that a chance fitting to the data is possible. All datasets used in this investigation are available on the web.
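The chance-fitting effect described in this abstract can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the authors' method: it swaps in a single-feature threshold rule (a one-node decision stump) for the paper's decision tree and medoid classification algorithms, draws concentrations from a uniform distribution, and uses hypothetical function names. It searches random, information-free data for the feature and cutoff with the highest average of sensitivity and specificity.

```python
import random
from typing import List, Tuple


def make_random_dataset(n_cases: int, n_controls: int, n_features: int,
                        seed: int) -> Tuple[List[List[float]], List[int]]:
    """Random 'concentrations' with labels assigned independently of the data."""
    rng = random.Random(seed)
    rows = [[rng.random() for _ in range(n_features)]
            for _ in range(n_cases + n_controls)]
    labels = [1] * n_cases + [0] * n_controls  # 1 = Case, 0 = Control
    return rows, labels


def best_stump_score(rows: List[List[float]], labels: List[int]) -> float:
    """Best mean of sensitivity and specificity over all one-feature cutoffs.

    Every observed value of every feature is tried as a threshold, in both
    directions (high values predict Case, or low values predict Case).
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    best = 0.0
    for f in range(len(rows[0])):
        values = [row[f] for row in rows]
        for t in values:
            true_pos = sum(1 for v, y in zip(values, labels) if y == 1 and v >= t)
            true_neg = sum(1 for v, y in zip(values, labels) if y == 0 and v < t)
            score = (true_pos / n_cases + true_neg / n_controls) / 2
            # 1 - score is the same cutoff with the prediction direction flipped.
            best = max(best, score, 1 - score)
    return best


# 30 Cases, 30 Controls and 300 random features, as in the smallest setting
# the abstract mentions; the seed is arbitrary.
rows, labels = make_random_dataset(30, 30, 300, seed=0)
print(best_stump_score(rows, labels))
```

Because the best rule is selected after looking at the labels, the printed score will typically sit well above the 0.5 expected of an uninformative classifier, which is the selection effect the abstract warns about. Evaluating the chosen rule on held-out random data would be expected to bring it back toward 0.5.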

Subject(s)

Computational Biology/methods , Algorithms , Artificial Intelligence , Biomarkers/metabolism , Computer Simulation , Databases, Protein , Decision Trees , Gene Expression Profiling/methods , Humans , Models, Statistical , Models, Theoretical , Neural Networks, Computer , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods , Reproducibility of Results

16.

Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties.

Bacolla, Albino; Larson, Jacquelynn E; Collins, Jack R; Li, Jian; Milosavljevic, Aleksandar; Stenson, Peter D; Cooper, David N; Wells, Robert D.

Genome Res ; 18(10): 1545-53, 2008 Oct.

Article in English | MEDLINE | ID: mdl-18687880

ABSTRACT

Microsatellites are abundant in vertebrate genomes, but their sequence representation and length distributions vary greatly within each family of repeats (e.g., tetranucleotides). Biophysical studies of 82 synthetic single-stranded oligonucleotides comprising all tetra- and trinucleotide repeats revealed an inverse correlation between the stability of folded-back hairpin and quadruplex structures and the sequence representation for repeats ≥30 bp in length in nine vertebrate genomes. Alternatively, the predicted energies of base-stacking interactions correlated directly with the longest length distributions in vertebrate genomes. Genome-wide analyses indicated that unstable sequences, such as CAG:CTG and CCG:CGG, were over-represented in coding regions and that micro/minisatellites were recruited in genes involved in transcription and signaling pathways, particularly in the nervous system. Microsatellite instability (MSI) is a hallmark of cancer, and length polymorphism within genes can confer susceptibility to inherited disease. Sequences that manifest the highest MSI values also displayed the strongest base-stacking interactions; analyses of 62 tri- and tetranucleotide repeat-containing genes associated with human genetic disease revealed enrichments similar to those noted for micro/minisatellite-containing genes. We conclude that DNA structure and base-stacking determined the number and length distributions of microsatellite repeats in vertebrate genomes over evolutionary time and that micro/minisatellites have been recruited to participate in both gene and protein function.

Subject(s)

DNA/chemistry , Genome , Microsatellite Repeats , Trinucleotide Repeats , Animals , Base Pairing , Databases, Nucleic Acid , Humans , Nucleic Acid Conformation , Polymorphism, Genetic , Temperature

17.

Interaction of noncompetitive inhibitors with the alpha3beta2 nicotinic acetylcholine receptor investigated by affinity chromatography and molecular docking.

Jozwiak, Krzysztof; Ravichandran, Sarangan; Collins, Jack R; Moaddel, Ruin; Wainer, Irving W.

J Med Chem ; 50(24): 6279-83, 2007 Nov 29.

Article in English | MEDLINE | ID: mdl-17973360

ABSTRACT

A molecular model of the alpha3beta2 nAChR lumen channel was constructed and hydrophobic clefts were observed near the receptor gate. Docking simulations indicated that ligand-nAChR complexes were formed by hydrophobic interactions with the cleft and hydrogen bond interactions. The equilibrium constants and the association and dissociation rate constants associated with the binding interactions were determined using nonlinear chromatography on an immobilized alpha3beta2 nAChR column. The computational-chromatography approach can be used to predict and describe ligand-nAChR interactions.

Subject(s)

Models, Molecular , Nicotinic Antagonists/chemistry , Receptors, Nicotinic/chemistry , Binding Sites , Cell Line , Chromatography, Liquid , Humans , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Ligands , Receptors, Nicotinic/biosynthesis , Structure-Activity Relationship

18.

The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists.

Huang, Da Wei; Sherman, Brad T; Tan, Qina; Collins, Jack R; Alvord, W Gregory; Roayaei, Jean; Stephens, Robert; Baseler, Michael W; Lane, H Clifford; Lempicki, Richard A.

Genome Biol ; 8(9): R183, 2007.

Article in English | MEDLINE | ID: mdl-17784955

ABSTRACT

The DAVID Gene Functional Classification Tool (http://david.abcc.ncifcrf.gov) uses a novel agglomeration algorithm to condense a list of genes or associated biological terms into organized classes of related genes or biology, called biological modules. This organization is accomplished by mining the complex biological co-occurrences found in multiple sources of functional annotation. It is a powerful method to group functionally related genes and terms into a manageable number of biological modules for efficient interpretation of gene lists in a network context.

Subject(s)

Gene Expression Profiling , Genetic Techniques , Genomics , Algorithms , Cluster Analysis , Computational Biology/methods , Data Interpretation, Statistical , Databases, Genetic , Humans , Models, Theoretical , Oligonucleotide Array Sequence Analysis , Pattern Recognition, Automated , Software

19.

Proteomic analysis identifies oxidative stress induction by adaphostin.

Stockwin, Luke H; Bumke, Maja A; Yu, Sherry X; Webb, Simon P; Collins, Jack R; Hollingshead, Melinda G; Newton, Dianne L.

Clin Cancer Res ; 13(12): 3667-81, 2007 Jun 15.

Article in English | MEDLINE | ID: mdl-17575232

ABSTRACT

PURPOSE: Activities distinct from inhibition of Bcr/abl have led to adaphostin (NSC 680410) being described as "a drug in search of a mechanism." In this study, proteomic analysis of adaphostin-treated myeloid leukemia cell lines was used to further elucidate a mechanism of action. EXPERIMENTAL DESIGN: HL60 and K562 cells treated with adaphostin for 6, 12, or 24 h were analyzed using two-dimensional PAGE. Differentially expressed spots were excised, digested with trypsin, and analyzed by liquid chromatography-tandem mass spectrometry. The contribution of the redox-active hydroquinone group in adaphostin was also examined by carrying out proteomic analysis of HL60 cells treated with a simple hydroquinone (1,4-dihydroxybenzene) or H₂O₂. RESULTS: Analysis of adaphostin-treated cells identified 49 differentially expressed proteins, the majority being implicated in the response to oxidative stress (e.g., CALM, ERP29, GSTP1, PDIA1) or induction of apoptosis (e.g., LAMA, FLNA, TPR, GDIS). Interestingly, modulation of these proteins was almost fully prevented by inclusion of an antioxidant, N-acetylcysteine. Validation of the proteomic data confirmed GSTP1 as an adaphostin resistance gene. Subsequent analysis of HL60 cells treated with 1,4-dihydroxybenzene or H₂O₂ showed similar increases in intracellular peroxides and proteomic profiles almost identical to that of adaphostin treatment. Western blotting of a panel of cell lines identified Cu/Zn superoxide dismutase (SOD) as correlating with adaphostin resistance. The role of SOD as a second adaphostin resistance gene was confirmed by demonstrating that inhibition of SOD using diethyldithiocarbamate increased adaphostin sensitivity, whereas transfection of SOD I attenuated toxicity. Importantly, treatment with 1,4-dihydroxybenzene or H₂O₂ replicated adaphostin-induced Bcr/abl polypeptide degradation, suggesting that kinase inhibition is a ROS-dependent phenomenon. CONCLUSION: Adaphostin should be classified as a redox-active-substituted dihydroquinone.

Subject(s)

Adamantane/analogs & derivatives , Antineoplastic Agents/pharmacology , Gene Expression/drug effects , Hydroquinones/pharmacology , Oxidative Stress/drug effects , Adamantane/classification , Adamantane/pharmacology , Antineoplastic Agents/classification , Blotting, Western , Electrophoresis, Gel, Two-Dimensional , Gene Expression Profiling , HL-60 Cells , Humans , Hydroquinones/classification , Oxidants/classification , Oxidants/pharmacology , Proteomics

20.

Long homopurine*homopyrimidine sequences are characteristic of genes expressed in brain and the pseudoautosomal region.

Bacolla, Albino; Collins, Jack R; Gold, Bert; Chuzhanova, Nadia; Yi, Ming; Stephens, Robert M; Stefanov, Stefan; Olsh, Adam; Jakupciak, John P; Dean, Michael; Lempicki, Richard A; Cooper, David N; Wells, Robert D.

Nucleic Acids Res ; 34(9): 2663-75, 2006.

Article in English | MEDLINE | ID: mdl-16714445

ABSTRACT

Homo(purine*pyrimidine) sequences (R*Y tracts) with mirror repeat symmetries form stable triplexes that block replication and transcription and promote genetic rearrangements. A systematic search was conducted to map the location of the longest R*Y tracts in the human genome in order to assess their potential function(s). The 814 R*Y tracts with ≥250 uninterrupted base pairs were preferentially clustered in the pseudoautosomal region of the sex chromosomes and located in the introns of 228 annotated genes whose protein products were associated with functions at the cell membrane. These genes were highly expressed in the brain and particularly in genes associated with susceptibility to mental disorders, such as schizophrenia. The set of 1957 genes harboring the 2886 R*Y tracts with ≥100 uninterrupted base pairs was additionally enriched in proteins associated with phosphorylation, signal transduction, development and morphogenesis. Comparisons of the ≥250 bp R*Y tracts in the mouse and chimpanzee genomes indicated that these sequences have mutated faster than the surrounding regions and are longer in humans than in chimpanzees. These results support a role for long R*Y tracts in promoting recombination and genome diversity during evolution through destabilization of chromosomal DNA, thereby inducing repair and mutation.
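
The abstract does not say how the systematic search was implemented. As a minimal sketch of the underlying idea, an uninterrupted R*Y tract appears on a single strand as a run made only of purines (A, G) or only of pyrimidines (C, T), so one pass of a regular expression can locate candidates. The function name `find_ry_tracts` and its `min_len` parameter are hypothetical, and the sketch does not test for the mirror repeat symmetry mentioned above.

```python
import re


def find_ry_tracts(seq, min_len=250):
    """Return (start, end, kind) for each uninterrupted purine or pyrimidine run.

    start/end are 0-based, end-exclusive positions on the given strand.
    kind is "R" for an all-purine run (A/G) and "Y" for an all-pyrimidine
    run (C/T). Any other character, such as N, interrupts a run.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq):
        kind = "R" if match.group()[0] in "AG" else "Y"
        tracts.append((match.start(), match.end(), kind))
    return tracts


if __name__ == "__main__":
    # Toy sequence with a 10 bp purine run followed by a 7 bp pyrimidine run.
    toy = "ACGT" + "AG" * 5 + "C" + "TC" * 3
    print(find_ry_tracts(toy, min_len=6))
```

With `min_len=250` or `min_len=100` the same scan would apply the two length thresholds quoted in the abstract; the small threshold in the usage example only keeps the toy sequence short.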

Subject(s)

Brain/metabolism , DNA/chemistry , Gene Expression , Sex Chromosomes , Animals , Evolution, Molecular , Genome, Human , Humans , Pan troglodytes/genetics , Proteins/genetics , Purines/chemistry , Pyrimidines/chemistry , Repetitive Sequences, Nucleic Acid , Tissue Distribution
