1.
Hum Mutat ; 38(9): 1155-1168, 2017 09.
Article in English | MEDLINE | ID: mdl-28397312

ABSTRACT

The CAGI-4 Hopkins clinical panel challenge was an attempt to assess state-of-the-art methods for clinical phenotype prediction from DNA sequence. Participants were provided with exonic sequences of 83 genes for 106 patients from the Johns Hopkins DNA Diagnostic Laboratory. Five groups participated in the challenge, predicting both the probability that each patient had each of the 14 possible classes of disease, as well as one or more causal variants. In cases where the Hopkins laboratory reported a variant, at least one predictor correctly identified the disease class in 36 of the 43 patients (84%). Even in cases where the Hopkins laboratory did not find a variant, at least one predictor correctly identified the class in 39 of the 63 patients (62%). Each prediction group correctly diagnosed at least one patient that was not successfully diagnosed by any other group. We discuss the causal variant predictions by different groups and their implications for further development of methods to assess variants of unknown significance. Our results suggest that clinically relevant variants may be missed when physicians order small panels targeted on a specific phenotype. We also quantify the false-positive rate of DNA-guided analysis in the absence of prior phenotypic indication.


Subject(s)
Computational Biology/methods , Sequence Analysis, DNA/methods , Databases, Genetic , Genetic Predisposition to Disease , Genetic Testing , Humans , Phenotype
2.
PLoS Comput Biol ; 10(6): e1003662, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24921255

ABSTRACT

A recent proliferation of Massive Open Online Courses (MOOCs) and other web-based educational resources has greatly increased the potential for effective self-study in many fields. This article introduces a catalog of several hundred free video courses of potential interest to those wishing to expand their knowledge of bioinformatics and computational biology. The courses are organized into eleven subject areas modeled on university departments and are accompanied by commentary and career advice.


Subject(s)
Computational Biology/education , Internet , Computer-Assisted Instruction , Humans
3.
Biopolymers ; 99(3): 203-17, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23034580

ABSTRACT

Polymeric macromolecules, when viewed abstractly as strings of symbols, can be treated in terms of formal language theory, providing a mathematical foundation for characterizing such strings both as collections and in terms of their individual structures. In addition this approach offers a framework for analysis of macromolecules by tools and conventions widely used in computational linguistics. This article introduces the ways that linguistics can be and has been applied to molecular biology, covering the relevant formal language theory at a relatively nontechnical level. Analogies between macromolecules and human natural language are used to provide intuitive insights into the relevance of grammars, parsing, and analysis of language complexity to biology.


Subject(s)
Macromolecular Substances , Molecular Biology , Proteins/chemistry , Linguistics , Protein Folding
4.
PLoS Comput Biol ; 8(9): e1002632, 2012.
Article in English | MEDLINE | ID: mdl-23028269

ABSTRACT

Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field.


Subject(s)
Algorithms , Computational Biology/education , Computer-Assisted Instruction/methods , Curriculum , Internet , Teaching/methods , User-Computer Interface , Online Systems
6.
PLoS Comput Biol ; 6(6): e1000809, 2010 Jun 24.
Article in English | MEDLINE | ID: mdl-20589079
7.
Nat Rev Drug Discov ; 8(11): 865-78, 2009 11.
Article in English | MEDLINE | ID: mdl-19876041

ABSTRACT

Drug discovery must be guided not only by medical need and commercial potential, but also by the areas in which new science is creating therapeutic opportunities, such as target identification and the understanding of disease mechanisms. To systematically identify such areas of high scientific activity, we use bibliometrics and related data-mining methods to analyse over half a terabyte of data, including PubMed abstracts, literature citation data and patent filings. These analyses reveal trends in scientific activity related to disease studied at varying levels, down to individual genes and pathways, and provide methods to monitor areas in which scientific advances are likely to create new therapeutic opportunities.


Subject(s)
Drug Discovery/methods , Drug Industry/methods , Research Design , Bibliometrics , Drug Delivery Systems , Humans , Publications/trends
10.
Nucleic Acids Res ; 37(Database issue): D680-5, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18948278

ABSTRACT

The IUPHAR database (IUPHAR-DB) integrates peer-reviewed pharmacological, chemical, genetic, functional and anatomical information on the 354 nonsensory G protein-coupled receptors (GPCRs), 71 ligand-gated ion channel subunits and 141 voltage-gated-like ion channel subunits encoded by the human, rat and mouse genomes. These genes represent the targets of approximately one-third of currently approved drugs and are a major focus of drug discovery and development programs in the pharmaceutical industry. IUPHAR-DB provides a comprehensive description of the genes and their functions, with information on protein structure and interactions, ligands, expression patterns, signaling mechanisms, functional assays and biologically important receptor variants (e.g. single nucleotide polymorphisms and splice variants). In addition, the phenotypes resulting from altered gene expression (e.g. in genetically altered animals or in human genetic disorders) are described. The content of the database is peer reviewed by members of the International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR); the data are provided through manual curation of the primary literature by a network of over 60 subcommittees of NC-IUPHAR. Links to other bioinformatics resources, such as NCBI, UniProt, HGNC and the rat and mouse genome databases, are provided. IUPHAR-DB is freely available at http://www.iuphar-db.org.


Subject(s)
Databases, Protein , Ion Channels/genetics , Ion Channels/physiology , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/physiology , Animals , Drug Discovery , Humans , Ion Channels/chemistry , Ligands , Mice , Protein Subunits/chemistry , Protein Subunits/genetics , Protein Subunits/physiology , Rats , Receptors, G-Protein-Coupled/chemistry
11.
Brief Bioinform ; 9(6): 479-92, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18820304

ABSTRACT

The drug discovery enterprise provides strong drivers for data integration. While attention in this arena has tended to focus on integration of primary data from omics and other large platform technologies contributing to drug discovery and development, the scientific literature remains a major source of information valuable to pharmaceutical enterprises, and therefore tools for mining such data and integrating it with other sources are of vital interest and economic impact. This review provides a brief overview of approaches to literature mining as they relate to drug discovery, and offers an illustrative case study of a 'lightweight' approach we have implemented within an industrial context.


Subject(s)
Databases, Bibliographic , Drug Discovery , Information Storage and Retrieval , PubMed , Animals , Artificial Intelligence , Humans , Internet , Medical Subject Headings , Proteins/genetics , Proteins/metabolism
12.
PLoS Comput Biol ; 3(6): e105, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17604444
14.
J Comput Biol ; 13(5): 1077-100, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16796552

ABSTRACT

Since the first application of context-free grammars to RNA secondary structures in 1988, many researchers have used both ad hoc and formal methods from computational linguistics to model RNA and protein structure. We show how nearly all of these methods are based on the same core principles and can be converted into equivalent approaches in the framework of tree-adjoining grammars and related formalisms. We also propose some new approaches that extend these core principles in novel ways.
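As a rough illustration of what a context-free grammar for RNA secondary structure looks like, the sketch below recognizes a toy hairpin grammar, S -> x S y | loop, where x and y are Watson-Crick complementary bases and the loop is a run of unpaired bases. The grammar, the function name `stem_pairs`, and the `min_loop` parameter are assumptions made for this example; they are not taken from the article, which works with tree-adjoining grammars and related formalisms.

```python
# Toy hairpin grammar (an assumption for illustration, not the article's formalism):
#   S -> x S y    where (x, y) is a Watson-Crick pair
#   S -> loop     where loop is at least `min_loop` unpaired bases
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_pairs(seq, min_loop=3):
    """Return the largest number of nested base pairs in a derivation of seq.

    Each application of S -> x S y strips one complementary pair from the
    two ends, so peeling pairs from the outside in yields the maximum.
    """
    seq = seq.upper()
    i, j = 0, len(seq) - 1
    pairs = 0
    # Keep pairing while the ends are complementary and the interior
    # can still hold a loop of at least min_loop bases.
    while j - i - 1 >= min_loop and (seq[i], seq[j]) in PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


if __name__ == "__main__":
    print(stem_pairs("GGGAAACCC"))
    print(stem_pairs("AAAAAAA"))
```

The nesting of paired ends is the kind of dependency that a regular grammar cannot express in general, which is why context-free and more expressive formalisms are used for such structures.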


Subject(s)
Algorithms , Models, Chemical , Nucleic Acid Conformation , Protein Structure, Secondary , Sequence Analysis, Protein , Sequence Analysis, RNA , Computational Biology
15.
Nat Rev Drug Discov ; 4(1): 45-58, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15688072

ABSTRACT

The effective integration of data and knowledge from many disparate sources will be crucial to future drug discovery. Data integration is a key element of conducting scientific investigations with modern platform technologies, managing increasingly complex discovery portfolios and processes, and fully realizing economies of scale in large enterprises. However, viewing data integration as simply an 'IT problem' underestimates the novel and serious scientific and management challenges it embodies - challenges that could require significant methodological and even cultural changes in our approach to data.


Subject(s)
Data Collection/methods , Drug Design , Systems Integration , Database Management Systems/trends , Drug Industry/methods , Models, Theoretical , Terminology as Topic
16.
OMICS ; 9(4): 351-63, 2005.
Article in English | MEDLINE | ID: mdl-16402893

ABSTRACT

Transcriptomic techniques are valuable tools with which to validate genetic and biological hypotheses and are now widely available for research. However, with the exception of tumor biology, comparative genomics analyses have been difficult to use as discovery engines to describe biologically relevant expression changes. We propose that physical proximity of human genes correlates with similar mRNA expression, so that increased expression might include a disease-relevant gene and many other genes in the adjacent region. To increase the efficiency of combining susceptibility gene mapping and interpretation of transcriptomics, we developed a method to identify clusters of adjacent and similarly expressed genes. Gene expression profiles for 28,945 genes across 101 normal human tissues were obtained from the Gene Logic BioExpress system. The expression similarity for genes in sliding windows was measured using average pair-wise Pearson correlation coefficients. We identified 187 clusters (p < 10e-4) of co-regulated genes, including 2648 genes, or 9.1% of all genes considered, and termed these "clusters of adjacent and similarly expressed genes" (CASEGs). Genes in 15 (8.2%) of these clusters demonstrate a significant co-expression enrichment (p < 10e-10). This study demonstrates the coordinate expression of neighboring genes and provides a comprehensive view of expression-based compartmentalization of the human genome, which can be overlaid on genetic susceptibility gene maps.
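The sliding-window similarity measure described in this abstract can be sketched as follows. This is a minimal illustration, assuming genes are already ordered by chromosomal position and each gene has one expression value per tissue; the function names, the window size, and the handling of constant profiles are assumptions, and the significance testing reported in the abstract is not reproduced.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant profile is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def window_scores(profiles, window_size):
    """Average pairwise Pearson correlation for each window of adjacent genes.

    profiles: expression profiles in chromosomal order, one list per gene.
    window_size: number of adjacent genes per window (must be at least 2).
    """
    scores = []
    for start in range(len(profiles) - window_size + 1):
        window = profiles[start:start + window_size]
        pairs = list(combinations(window, 2))
        scores.append(sum(pearson(a, b) for a, b in pairs) / len(pairs))
    return scores


if __name__ == "__main__":
    # Three hypothetical genes measured in three tissues.
    toy_profiles = [[1, 2, 3], [2, 4, 6], [3, 2, 1]]
    print(window_scores(toy_profiles, window_size=2))
```

Windows with high average correlation would be the candidates for clusters of adjacent, similarly expressed genes; how the study set its thresholds is not stated in the abstract.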


Subject(s)
Gene Expression Profiling , Multigene Family , RNA, Messenger/genetics , Humans
17.
Drug Discov Today Technol ; 2(3): 197-204, 2005.
Article in English | MEDLINE | ID: mdl-24981936

ABSTRACT

Genomic and proteomic platform data constitute a hugely important resource to current efforts in disease understanding, systems biology and drug discovery. We review prerequisites for the adequate management of 'omic' data, the means by which such data are analyzed and converted to knowledge relevant to drug discovery, and issues crucial to the integration of such data, particularly with chemical, genetic and clinical data.
