Search | VHL Regional Portal

Causal reasoning on biological networks: interpreting transcriptional changes.

Chindelevitch, Leonid; Ziemek, Daniel; Enayetallah, Ahmed; Randhawa, Ranjit; Sidders, Ben; Brockel, Christoph; Huang, Enoch S.

Bioinformatics ; 28(8): 1114-21, 2012 Apr 15.

Article in English | MEDLINE | ID: mdl-22355083

ABSTRACT

MOTIVATION: The interpretation of high-throughput datasets has remained one of the central challenges of computational biology over the past decade. Furthermore, as the amount of biological knowledge increases, it becomes more and more difficult to integrate this large body of knowledge in a meaningful manner. In this article, we propose a particular solution to both of these challenges. METHODS: We integrate available biological knowledge by constructing a network of molecular interactions of a specific kind: causal interactions. The resulting causal graph can be queried to suggest molecular hypotheses that explain the variations observed in a high-throughput gene expression experiment. We show that a simple scoring function can discriminate between a large number of competing molecular hypotheses about the upstream cause of the changes observed in a gene expression profile. We then develop an analytical method for computing the statistical significance of each score. This analytical method also helps assess the effects of random or adversarial noise on the predictive power of our model. RESULTS: Our results show that the causal graph we constructed from known biological literature is extremely robust to random noise and to missing or spurious information. We demonstrate the power of our causal reasoning model on two specific examples, one from a cancer dataset and the other from a cardiac hypertrophy experiment. We conclude that causal reasoning models provide a valuable addition to the biologist's toolkit for the interpretation of gene expression data. AVAILABILITY AND IMPLEMENTATION: R source code for the method is available upon request.
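
The abstract does not spell out the scoring function, so the following is only a minimal sketch of one plausible form: each hypothesis predicts a direction of change for its downstream genes, and its score is the number of correct predictions minus the number of incorrect ones. The function name, the gene identifiers, and the hypothesis labels are all hypothetical and are not taken from the paper or its R code.

```python
from typing import Dict


def score_hypothesis(predicted: Dict[str, int], observed: Dict[str, int]) -> int:
    """Return (#correct - #incorrect) predictions for one hypothesis.

    predicted: gene -> +1 (expected up) or -1 (expected down) under the hypothesis
    observed:  gene -> +1 (measured up) or -1 (measured down); genes with no
               significant change are absent or mapped to 0
    """
    score = 0
    for gene, sign in predicted.items():
        obs = observed.get(gene, 0)
        if obs == 0:
            continue  # no significant change: counts neither for nor against
        score += 1 if obs == sign else -1
    return score


# Toy data (hypothetical): four differentially expressed genes.
observed = {"G1": 1, "G2": -1, "G3": 1, "G4": -1}

# Two competing upstream hypotheses and the changes each would predict.
hypotheses = {
    "A_up": {"G1": 1, "G2": -1, "G3": -1},
    "B_down": {"G1": -1, "G4": 1, "G5": 1},
}

# Rank hypotheses from best to worst score.
ranked = sorted(
    hypotheses,
    key=lambda name: score_hypothesis(hypotheses[name], observed),
    reverse=True,
)
```

Ranking by raw score alone ignores how many predictions each hypothesis makes, which is presumably why the authors pair the score with a statistical significance calculation.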

Subject(s)

Breast Neoplasms/genetics; Cardiomegaly/genetics; Computational Biology/methods; Gene Expression Profiling; Algorithms; Humans; Models, Biological

Structure-based druggability assessment--identifying suitable targets for small molecule therapeutics.

Fauman, Eric B; Rai, Brajesh K; Huang, Enoch S.

Curr Opin Chem Biol ; 15(4): 463-8, 2011 Aug.

Article in English | MEDLINE | ID: mdl-21704549

ABSTRACT

A target is druggable if it can be modulated in vivo by a drug-like molecule. The general properties of oral drugs are summarized by the 'rule of 5' which specifies parameters related to size and lipophilicity. Structure-based target druggability assessment consists of predicting ligand-binding sites on the protein that are complementary to these drug-like properties. Automated identification of ligand-binding sites can use geometrical considerations alone or include specific physicochemical properties of the protein surface. Features of a pocket's size and shape, together with measures of its hydrophobicity, are most informative in identifying suitable drug-binding pockets. The recent availability of several validation sets of druggable versus undruggable targets has helped fuel the development of more elaborate methods.
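
As a rough illustration of the 'rule of 5' mentioned above, the sketch below counts violations using the commonly cited thresholds (molecular weight over 500 Da, logP over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The thresholds are not stated in the abstract, and the `Descriptors` class and example values are hypothetical; the descriptors are assumed to be precomputed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float       # molecular weight in daltons
    logp: float             # octanol-water partition coefficient (lipophilicity)
    h_bond_donors: int      # count of hydrogen-bond donors
    h_bond_acceptors: int   # count of hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four commonly cited criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical descriptor sets for illustration only.
small_molecule = Descriptors(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
large_molecule = Descriptors(mol_weight=720.0, logp=6.3, h_bond_donors=6, h_bond_acceptors=12)
```

This only characterizes the ligand side; the review's subject, assessing whether a protein pocket is complementary to such molecules, requires structural analysis that is not sketched here.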

Subject(s)

Drug Design; Models, Molecular; Proteins/chemistry; Quantitative Structure-Activity Relationship; Small Molecule Libraries/chemistry; Algorithms; Binding Sites; Drug Evaluation, Preclinical/methods; Hydrophobic and Hydrophilic Interactions; Ligands; Pharmaceutical Preparations/chemistry; Pharmaceutical Preparations/metabolism; Protein Binding; Proteins/metabolism

Beyond data integration.

Slater, Ted; Bouton, Christopher; Huang, Enoch S.

Drug Discov Today ; 13(13-14): 584-9, 2008 Jul.

Article in English | MEDLINE | ID: mdl-18598913

ABSTRACT

Pharmaceutical R&D organizations have no shortage of experimental data or annotation information. However, the sheer volume and complexity of this information results in a paralyzing inability to make effective use of it for predicting drug efficacy and safety. Data integration efforts are legion, but even in the rare instances where they succeed, they are found to be insufficient to advance programs because interpretation of query results becomes a research project in itself. In this review, we propose a coherent, interoperable platform comprising knowledge engineering and hypothesis generation components for rapidly making determinations of confidence in mechanism and safety (among other goals) using experimental data and expert knowledge.

Subject(s)

Data Interpretation, Statistical; Drug Industry/trends; Drug Industry/standards; Knowledge Bases; Semantics

PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.

Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S.

BMC Bioinformatics ; 8: 381, 2007 Oct 11.

Article in English | MEDLINE | ID: mdl-17931421

ABSTRACT

BACKGROUND: By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. RESULTS: Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. CONCLUSION: PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
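
To make the alignment analyses mentioned above concrete, here is a minimal sketch of two standard calculations: percent identity between two aligned sequences, and Shannon entropy of an alignment column as a simple conservation measure. This is not PFAAT code, and the entropy measure is a generic stand-in, not the "novel algorithms" the abstract refers to. The function names and the gap convention (`-`) are assumptions.

```python
import math
from collections import Counter


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one column, ignoring gaps.

    Lower entropy means a more conserved column.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

For example, `percent_identity("ACD-F", "ACE-F")` compares four ungapped columns, three of which match, and `column_entropy("AACC")` describes a column split evenly between two residues.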

Subject(s)

Documentation/methods; Proteins/chemistry; Proteins/ultrastructure; Sequence Analysis, Protein/methods; Software; User-Computer Interface; Amino Acid Sequence; Computer Simulation; Models, Chemical; Models, Molecular; Molecular Sequence Data

Structure-based maximal affinity model predicts small-molecule druggability.

Cheng, Alan C; Coleman, Ryan G; Smyth, Kathleen T; Cao, Qing; Soulard, Patricia; Caffrey, Daniel R; Salzberg, Anna C; Huang, Enoch S.

Nat Biotechnol ; 25(1): 71-5, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17211405

ABSTRACT

Lead generation is a major hurdle in small-molecule drug discovery, with an estimated 60% of projects failing from lack of lead matter or difficulty in optimizing leads for drug-like properties. It would be valuable to identify these less-druggable targets before incurring substantial expenditure and effort. Here we show that a model-based approach using basic biophysical principles yields good prediction of druggability based solely on the crystal structure of the target binding site. We quantitatively estimate the maximal affinity achievable by a drug-like molecule, and we show that these calculated values correlate with drug discovery outcomes. We experimentally test two predictions using high-throughput screening of a diverse compound collection. The collective results highlight the utility of our approach as well as strategies for tackling difficult targets.

Subject(s)

Algorithms; Drug Design; Models, Chemical; Models, Molecular; Pharmaceutical Preparations/chemistry; Protein Interaction Mapping/methods; Proteins/chemistry; Binding Sites; Computer Simulation; Protein Binding

Predicting ligands for orphan GPCRs.

Huang, Enoch S.

Drug Discov Today ; 10(1): 69-73, 2005 Jan 01.

Article in English | MEDLINE | ID: mdl-15676301

ABSTRACT

G-protein-coupled receptors (GPCRs) represent not only one of the most successful target classes for the pharmaceutical industry, but also one of the largest and most structurally and functionally diverse. Many are "orphan" GPCRs that have not yet been paired with their cognate ligands. Computational approaches are well suited for this type of classification problem, but most are confounded when the orphan is dissimilar to characterized GPCRs.

Subject(s)

Receptors, G-Protein-Coupled; Technology, Pharmaceutical/methods; Ligands; Predictive Value of Tests; Receptors, G-Protein-Coupled/classification; Receptors, G-Protein-Coupled/physiology; Technology, Pharmaceutical/trends

Are protein-protein interfaces more conserved in sequence than the rest of the protein surface?

Caffrey, Daniel R; Somaroo, Shyamal; Hughes, Jason D; Mintseris, Julian; Huang, Enoch S.

Protein Sci ; 13(1): 190-202, 2004 Jan.

Article in English | MEDLINE | ID: mdl-14691234

ABSTRACT

Protein interfaces are thought to be distinguishable from the rest of the protein surface by their greater degree of residue conservation. We test the validity of this approach on an expanded set of 64 protein-protein interfaces using conservation scores derived from two multiple sequence alignment types, one of close homologs/orthologs and one of diverse homologs/paralogs. Overall, we find that the interface is slightly more conserved than the rest of the protein surface when using either alignment type, with alignments of diverse homologs showing marginally better discrimination. However, using a novel surface-patch definition, we find that the interface is rarely significantly more conserved than other surface patches when using either alignment type. When an interface is among the most conserved surface patches, it tends to be part of an enzyme active site. The most conserved surface patch overlaps with 39% (+/- 28%) and 36% (+/- 28%) of the actual interface for diverse and close homologs, respectively. Contrary to results obtained from smaller data sets, this work indicates that residue conservation is rarely sufficient for complete and accurate prediction of protein interfaces. Finally, we find that obligate interfaces differ from transient interfaces in that the former have significantly fewer alignment gaps at the interface than the rest of the protein surface, as well as having buried interface residues that are more conserved than partially buried interface residues.

Subject(s)

Protein Binding; Proteins/chemistry; Amino Acid Sequence; Amino Acids/chemistry; Binding Sites; Cluster Analysis; Conserved Sequence; Crystallography, X-Ray; Databases, Factual; Dimerization; Entropy; Enzymes/chemistry; Enzymes/metabolism; Evolution, Molecular; Genomics; Models, Molecular; Molecular Sequence Data; Protein Structure, Secondary; Proteins/genetics; Sequence Alignment; Sequence Homology, Amino Acid; Solvents/metabolism; Surface Properties

Construction of a sequence motif characteristic of aminergic G protein-coupled receptors.

Huang, Enoch S.

Protein Sci ; 12(7): 1360-7, 2003 Jul.

Article in English | MEDLINE | ID: mdl-12824482

ABSTRACT

An approach to discover sequence patterns characteristic of ligand classes is described and applied to aminergic G protein-coupled receptors (GPCRs). Putative ligand-binding residue positions were inferred from considering three lines of evidence: conservation in the subfamily absent or underrepresented in the superfamily, any available mutation data, and the physicochemical properties of the ligand. For aminergic GPCRs, the motif is composed of a conserved aspartic acid in the third transmembrane (TM) domain (rhodopsin position 117) and a conserved tryptophan in the seventh TM domain (rhodopsin position 293); the roles of each are readily justified by molecular modeling of ligand-receptor interactions. This minimally defined motif is an appropriate computational tool for identifying additional, potentially novel aminergic GPCRs from a set of experimentally uncharacterized "orphan" GPCRs, complementing existing sequence matching, clustering, and machine-learning techniques. Motif sensitivity stems from the stepwise addition of residues characteristic of an entire class of ligand (and not tailored for any particular biogenic amine). This sensitivity is balanced by careful consideration of residues (evidence drawn from mutation data, correlation of ligand properties to residue properties, and location with respect to the extracellular face), thereby maintaining specificity for the aminergic class. A number of orphan GPCRs assigned to the aminergic class by this motif were later discovered to be a novel subfamily of trace amine GPCRs, and the motif also successfully classified the histamine H4 receptor.
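
A positional motif of this kind can be checked mechanically once a candidate receptor is aligned to the reference sequence. The sketch below is an illustration under stated assumptions, not the author's implementation: it maps reference residue numbers to alignment columns and tests whether the candidate carries the required residue at each. The function names and the toy sequences are hypothetical; with a real alignment to rhodopsin, the requirements would be `{117: "D", 293: "W"}` as described in the abstract.

```python
from typing import Dict


def column_for_reference_position(ref_aligned: str, position: int) -> int:
    """Map a 1-based residue number in the reference to a 0-based alignment column."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":  # gaps do not consume a residue number
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"reference has fewer than {position} residues")


def has_motif(candidate_aligned: str, ref_aligned: str, requirements: Dict[int, str]) -> bool:
    """True if the candidate has the required residue at every reference position."""
    if len(candidate_aligned) != len(ref_aligned):
        raise ValueError("aligned sequences must have equal length")
    return all(
        candidate_aligned[column_for_reference_position(ref_aligned, pos)] == residue
        for pos, residue in requirements.items()
    )


# Toy alignment (hypothetical, far shorter than a real receptor).
reference = "MK-AFLW"              # reference residues 1..6, with one gap column
toy_requirements = {3: "D", 6: "W"}  # stand-ins for positions 117 and 293

candidate_hit = "MKGDFLW"   # Asp and Trp at the required columns
candidate_miss = "MKGAFLW"  # Ala instead of Asp at the first required column
```

A strict residue match like this is deliberately minimal; the abstract indicates that the actual motif was built from several lines of evidence, which this check does not model.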

Subject(s)

Receptors, G-Protein-Coupled/chemistry; Amino Acid Sequence; Aspartic Acid/chemistry; Conserved Sequence; Models, Molecular; Molecular Sequence Data; Mutation; Peptides, Cyclic/chemistry; Protein Conformation; Receptors, G-Protein-Coupled/classification; Receptors, G-Protein-Coupled/genetics; Receptors, Histamine/chemistry; Receptors, Histamine H4; Rhodopsin/chemistry; Rhodopsin/classification; Sequence Alignment; Tryptophan/chemistry

Protein family annotation in a multiple alignment viewer.

Johnson, Jason M; Mason, Keith; Moallemi, Ciamac; Xi, Hualin; Somaroo, Shyamal; Huang, Enoch S.

Bioinformatics ; 19(4): 544-5, 2003 Mar 01.

Article in English | MEDLINE | ID: mdl-12611813

ABSTRACT

SUMMARY: The Pfaat protein family alignment annotation tool is a Java-based multiple sequence alignment editor and viewer designed for protein family analysis. The application merges display features such as dendrograms, secondary and tertiary protein structure with SRS retrieval, subgroup comparison, and extensive user-annotation capabilities. AVAILABILITY: The program and source code are freely available from the authors under the GNU General Public License at http://www.pfizerdtc.com

Subject(s)

Documentation; Information Storage and Retrieval/methods; Proteins/chemistry; Sequence Alignment/methods; Software; User-Computer Interface; Amino Acid Sequence; Models, Molecular; Molecular Sequence Data; Protein Conformation; Proteins/classification; Sequence Analysis, Protein/methods
