Search | BVS IEC

1.

The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information.

Morris, John H; Soman, Karthik; Akbas, Rabia E; Zhou, Xiaoyuan; Smith, Brett; Meng, Elaine C; Huang, Conrad C; Cerono, Gabriel; Schenk, Gundolf; Rizk-Jackson, Angela; Harroud, Adil; Sanders, Lauren; Costes, Sylvain V; Bharat, Krish; Chakraborty, Arjun; Pico, Alexander R; Mardirossian, Taline; Keiser, Michael; Tang, Alice; Hardi, Josef; Shi, Yongmei; Musen, Mark; Israni, Sharat; Huang, Sui; Rose, Peter W; Nelson, Charlotte A; Baranzini, Sergio E.

Bioinformatics ; 39(2), 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36759942

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Pattern Recognition, Automated , Precision Medicine , Databases, Factual
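
The build step described in the SPOKE abstract above (download each resource, check it for integrity and completeness, then assemble a table of nodes and edges) might be sketched roughly as follows. This is a minimal illustration, not the SPOKE build scripts: the record layout, the function name `build_parent_table`, and every sample identifier are hypothetical.

```python
from collections import Counter


def build_parent_table(node_records, edge_records):
    """Merge per-source records into node and edge tables (hypothetical layout).

    node_records: iterable of (node_id, node_type, source) tuples
    edge_records: iterable of (start_id, edge_type, end_id, source) tuples
    """
    nodes = {}
    for node_id, node_type, source in node_records:
        # Integrity check: one identifier must not carry two different types.
        if node_id in nodes and nodes[node_id]["type"] != node_type:
            raise ValueError(f"conflicting types for {node_id}")
        entry = nodes.setdefault(node_id, {"type": node_type, "sources": set()})
        entry["sources"].add(source)

    edges, dangling = [], []
    for start, edge_type, end, source in edge_records:
        # Completeness check: both endpoints must exist in the node table.
        if start in nodes and end in nodes:
            edges.append((start, edge_type, end, source))
        else:
            dangling.append((start, edge_type, end, source))
    return nodes, edges, dangling


# Made-up sample records standing in for downloaded resources.
node_records = [
    ("gene:1", "Gene", "source_a"),
    ("disease:1", "Disease", "source_b"),
    ("compound:1", "Compound", "source_c"),
    ("gene:1", "Gene", "source_b"),  # same node reported by a second source
]
edge_records = [
    ("gene:1", "ASSOCIATES", "disease:1", "source_b"),
    ("compound:1", "TREATS", "disease:1", "source_c"),
    ("compound:1", "BINDS", "protein:9", "source_c"),  # endpoint missing
]

nodes, edges, dangling = build_parent_table(node_records, edge_records)
node_type_counts = Counter(entry["type"] for entry in nodes.values())
```

Keeping dangling edges in a separate list rather than dropping them silently is one way a periodic build could report incomplete downloads; whether SPOKE does this is not stated in the abstract.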

2.

RRDistMaps: a UCSF Chimera tool for viewing and comparing protein distance maps.

Chen, Jonathan E; Huang, Conrad C; Ferrin, Thomas E.

Bioinformatics ; 31(9): 1484-6, 2015 May 01.

Article in English | MEDLINE | ID: mdl-25540183

ABSTRACT

MOTIVATION: Contact maps are a convenient method for the structural biologists to identify structural features through two-dimensional simplification. Binary (yes/no) contact maps with a single cutoff distance can be generalized to show continuous distance ranges. We have developed a UCSF Chimera tool, RRDistMaps, to compute such generalized maps in order to analyze pairwise variations in intramolecular contacts. An interactive utility, RRDistMaps, visualizes conformational changes, both local (e.g. binding-site residues) and global (e.g. hinge motion), between unbound and bound proteins through distance patterns. Users can target residue pairs in RRDistMaps for further navigation in Chimera. The interface contains the unique features of identifying long-range residue motion and aligning sequences to simultaneously compare distance maps. AVAILABILITY AND IMPLEMENTATION: RRDistMaps was developed as part of UCSF Chimera release 1.10, which is freely available at http://rbvi.ucsf.edu/chimera/download.html, and operates on Linux, Windows, and Mac OS. CONTACT: conrad@cgl.ucsf.edu.

Subjects

Protein Conformation , Software , Binding Sites , Models, Molecular , Protein Binding , Proteins/chemistry
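
The step from binary contact maps to continuous distance maps, and the comparison of two conformations, described in the RRDistMaps abstract above can be illustrated with a small sketch. It is not the RRDistMaps implementation: the function names, the three-point coordinates, and the 4.5 cutoff are assumptions chosen only to keep the arithmetic easy to follow.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between representative atoms (one per residue)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(dmap, cutoff):
    """Binary (yes/no) map: True where the distance is within the cutoff."""
    return [[d <= cutoff for d in row] for row in dmap]


def difference_map(dmap_a, dmap_b):
    """Element-wise change in distance from conformation A to conformation B."""
    return [
        [db - da for da, db in zip(row_a, row_b)]
        for row_a, row_b in zip(dmap_a, dmap_b)
    ]


# Hypothetical coordinates: three residues in an "unbound" and a "bound" state.
unbound = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]

unbound_map = distance_map(unbound)
bound_map = distance_map(bound)
contacts = contact_map(unbound_map, cutoff=4.5)
diff = difference_map(unbound_map, bound_map)
```

The binary map only records whether a pair is within the cutoff, whereas the difference map keeps the sign and size of each change, which is what lets local and global motions show up as patterns.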

3.

Enhancing UCSF Chimera through web services.

Huang, Conrad C; Meng, Elaine C; Morris, John H; Pettersen, Eric F; Ferrin, Thomas E.

Nucleic Acids Res ; 42(Web Server issue): W478-84, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24861624

ABSTRACT

Integrating access to web services with desktop applications allows for an expanded set of application features, including performing computationally intensive tasks and convenient searches of databases. We describe how we have enhanced UCSF Chimera (http://www.rbvi.ucsf.edu/chimera/), a program for the interactive visualization and analysis of molecular structures and related data, through the addition of several web services (http://www.rbvi.ucsf.edu/chimera/docs/webservices.html). By streamlining access to web services, including the entire job submission, monitoring and retrieval process, Chimera makes it simpler for users to focus on their science projects rather than data manipulation. Chimera uses Opal, a toolkit for wrapping scientific applications as web services, to provide scalable and transparent access to several popular software packages. We illustrate Chimera's use of web services with an example workflow that interleaves use of these services with interactive manipulation of molecular sequences and structures, and we provide an example Python program to demonstrate how easily Opal-based web services can be accessed from within an application. Web server availability: http://webservices.rbvi.ucsf.edu/opal2/dashboard?command=serviceList.

Subjects

Molecular Structure , Software , Internet , Models, Molecular

4.

The Structure-Function Linkage Database.

Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E; Barber, Alan E; Custer, Ashley F; Hicks, Michael A; Huang, Conrad C; Lauck, Florian; Mashiyama, Susan T; Meng, Elaine C; Mischel, David; Morris, John H; Ojha, Sunil; Schnoes, Alexandra M; Stryke, Doug; Yunes, Jeffrey M; Ferrin, Thomas E; Holliday, Gemma L; Babbitt, Patricia C.

Nucleic Acids Res ; 42(Database issue): D521-30, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24271399

ABSTRACT

The Structure-Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure-function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies 'look alike', making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.

Subjects

Databases, Protein , Enzymes/chemistry , Enzymes/classification , Enzymes/metabolism , Internet , Molecular Sequence Annotation , Sequence Alignment , Structure-Activity Relationship
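
The superfamily, subgroup, and family levels described in the SFLD abstract above can be pictured as a simple nested classification. The sketch below is only an illustration of that idea: the class, the placeholder names, and the rule for what may be transferred at each level are assumptions, not the SFLD schema or its curation policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    """Position of one sequence in a three-level hierarchy (hypothetical model)."""
    superfamily: str
    subgroup: Optional[str] = None
    family: Optional[str] = None

    def deepest_level(self) -> str:
        # The most specific level at which the sequence has been placed.
        if self.family is not None:
            return "family"
        if self.subgroup is not None:
            return "subgroup"
        return "superfamily"


def transferable_annotation(c: Classification) -> str:
    """Assumed rule: a full reaction is transferred only at family level."""
    if c.deepest_level() == "family":
        return "overall reaction and mechanism"
    return "conserved superfamily-level features only"


# Placeholder names, not real SFLD entries.
characterized = Classification("superfamily_X", "subgroup_X1", "family_X1a")
unknown = Classification("superfamily_X", "subgroup_X1")
```

Restricting reaction-level annotation to sequences placed in a family reflects the abstract's point that superfamily members "look alike" and are therefore easy to misannotate.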

5.

ModBase, a database of annotated comparative protein structure models, and associated resources.

Pieper, Ursula; Webb, Benjamin M; Barkan, David T; Schneidman-Duhovny, Dina; Schlessinger, Avner; Braberg, Hannes; Yang, Zheng; Meng, Elaine C; Pettersen, Eric F; Huang, Conrad C; Datta, Ruchira S; Sampathkumar, Parthasarathy; Madhusudhan, Mallur S; Sjölander, Kimmen; Ferrin, Thomas E; Burley, Stephen K; Sali, Andrej.

Nucleic Acids Res ; 39(Database issue): D465-74, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21097780

ABSTRACT

ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10,355,444 reliable models for domains in 2,421,920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs).

Subjects

Databases, Protein , Models, Molecular , Protein Structure, Tertiary , Bacterial Proteins/chemistry , Computer Graphics , Peptides/chemistry , Protein Interaction Mapping , Proteins/chemistry , Scattering, Small Angle , Sequence Alignment , Software , Structural Homology, Protein , User-Computer Interface , X-Ray Diffraction

6.

UCSF Chimera, MODELLER, and IMP: an integrated modeling system.

Yang, Zheng; Lasker, Keren; Schneidman-Duhovny, Dina; Webb, Ben; Huang, Conrad C; Pettersen, Eric F; Goddard, Thomas D; Meng, Elaine C; Sali, Andrej; Ferrin, Thomas E.

J Struct Biol ; 179(3): 269-78, 2012 Sep.

Article in English | MEDLINE | ID: mdl-21963794

ABSTRACT

Structural modeling of macromolecular complexes greatly benefits from interactive visualization capabilities. Here we present the integration of several modeling tools into UCSF Chimera. These include comparative modeling by MODELLER, simultaneous fitting of multiple components into electron microscopy density maps by IMP MultiFit, computing of small-angle X-ray scattering profiles and fitting of the corresponding experimental profile by IMP FoXS, and assessment of amino acid sidechain conformations based on rotamer probabilities and local interactions by Chimera.

Subjects

Computer Simulation , Models, Molecular , Software , Amino Acid Sequence , Animals , Cattle , Escherichia coli Proteins/chemistry , Heat-Shock Proteins/chemistry , Macromolecular Substances/chemistry , Molecular Sequence Data , Protein Conformation , Protein Subunits/chemistry , Scattering, Small Angle , Structural Homology, Protein , X-Ray Diffraction

7.

UCSF ChimeraX: Structure visualization for researchers, educators, and developers.

Pettersen, Eric F; Goddard, Thomas D; Huang, Conrad C; Meng, Elaine C; Couch, Gregory S; Croll, Tristan I; Morris, John H; Ferrin, Thomas E.

Protein Sci ; 30(1): 70-82, 2021 Jan.

Article in English | MEDLINE | ID: mdl-32881101

ABSTRACT

UCSF ChimeraX is the next-generation interactive visualization program from the Resource for Biocomputing, Visualization, and Informatics (RBVI), following UCSF Chimera. ChimeraX brings (a) significant performance and graphics enhancements; (b) new implementations of Chimera's most highly used tools, many with further improvements; (c) several entirely new analysis features; (d) support for new areas such as virtual reality, light-sheet microscopy, and medical imaging data; (e) major ease-of-use advances, including toolbars with icons to perform actions with a single click, basic "undo" capabilities, and more logical and consistent commands; and (f) an app store for researchers to contribute new tools. ChimeraX includes full user documentation and is free for noncommercial use, with downloads available for Windows, Linux, and macOS from https://www.rbvi.ucsf.edu/chimerax.

Subjects

Computer Graphics , Imaging, Three-Dimensional , Models, Molecular , Software

8.

The human multidrug resistance protein 4 (MRP4, ABCC4): functional analysis of a highly polymorphic gene.

Abla, Nada; Chinn, Leslie W; Nakamura, Tsutomu; Liu, Li; Huang, Conrad C; Johns, Susan J; Kawamoto, Michiko; Stryke, Doug; Taylor, Travis R; Ferrin, Thomas E; Giacomini, Kathleen M; Kroetz, Deanna L.

J Pharmacol Exp Ther ; 325(3): 859-68, 2008 Jun.

Article in English | MEDLINE | ID: mdl-18364470

ABSTRACT

ABCC4 encodes multidrug resistance protein 4 (MRP4), a member of the ATP-binding cassette family of membrane transporters involved in the efflux of endogenous and xenobiotic molecules. The aims of this study were to identify single nucleotide polymorphisms of ABCC4 and to functionally characterize selected nonsynonymous variants. Resequencing was performed in a large ethnically diverse population. Ten nonsynonymous variants were selected for analysis of transport function based on allele frequencies and evolutionary conservation. The reference and variant MRP4 cDNAs were constructed by site-directed mutagenesis and transiently transfected into human embryonic kidney cells (HEK 293T). The function of MRP4 variants was compared by measuring the intracellular accumulation of two antiviral agents, azidothymidine (AZT) and adefovir (PMEA). A total of 98 variants were identified in the coding and flanking intronic regions of ABCC4. Of these, 43 variants are in the coding region, and 22 are nonsynonymous. In a functional screen of ten variants, there was no evidence for a complete loss of function allele. However, two variants (G187W and G487E) showed a significantly reduced function compared to reference with both substrates, as evidenced by higher intracellular accumulation of AZT and PMEA compared to the reference MRP4 (43 and 69% increase in accumulation for G187W compared with the reference MRP4, with AZT and PMEA, respectively). The G187W variant also showed decreased expression following transient transfection of HEK 293T cells. Further studies are required to assess the clinical significance of this altered function and expression and to evaluate substrate specificity of this functional change.

Subjects

Adenine/analogs & derivatives , Antiviral Agents/metabolism , Multidrug Resistance-Associated Proteins/genetics , Multidrug Resistance-Associated Proteins/metabolism , Organophosphonates/metabolism , Polymorphism, Single Nucleotide , Zidovudine/metabolism , Adenine/metabolism , Base Sequence , California , Cell Line , Ethnicity/genetics , Haplotypes , Humans , Molecular Sequence Data , Sequence Alignment , Sequence Analysis, DNA , White People/genetics

9.

structureViz: linking Cytoscape and UCSF Chimera.

Morris, John H; Huang, Conrad C; Babbitt, Patricia C; Ferrin, Thomas E.

Bioinformatics ; 23(17): 2345-7, 2007 Sep 01.

Article in English | MEDLINE | ID: mdl-17623706

ABSTRACT

structureViz is a Cytoscape plug-in that links the visualization of biological networks provided by Cytoscape with the visualization and analysis of macromolecular structures and sequences provided by UCSF Chimera. When combined with Cytoscape and Chimera, structureViz provides the first tool that links these two critical aspects of computational analysis in a straightforward manner. structureViz includes commands to open structures in Chimera and align them using Chimera's sequence-structure analysis tools. When a structure is opened, structureViz provides an alternative interface to Chimera: the Cytoscape Molecular Structure Navigator. This interface uses a tree-based paradigm to allow users to select and affect the display of models, chains and residues, mostly through the use of context menus.

Subjects

Computer Graphics , Protein Interaction Mapping/methods , Proteome/chemistry , Proteome/metabolism , Signal Transduction/physiology , Software , User-Computer Interface , Computer Simulation , Models, Biological , Sequence Analysis, Protein/methods , Systems Integration

10.

The International Gene Trap Consortium Website: a portal to all publicly available gene trap cell lines in mouse.

Nord, Alex S; Chang, Patricia J; Conklin, Bruce R; Cox, Antony V; Harper, Courtney A; Hicks, Geoffrey G; Huang, Conrad C; Johns, Susan J; Kawamoto, Michiko; Liu, Songyan; Meng, Elaine C; Morris, John H; Rossant, Janet; Ruiz, Patricia; Skarnes, William C; Soriano, Philippe; Stanford, William L; Stryke, Doug; von Melchner, Harald; Wurst, Wolfgang; Yamamura, Ken-ichi; Young, Stephen G; Babbitt, Patricia C; Ferrin, Thomas E.

Nucleic Acids Res ; 34(Database issue): D642-8, 2006 Jan 01.

Article in English | MEDLINE | ID: mdl-16381950

ABSTRACT

Gene trapping is a method of generating murine embryonic stem (ES) cell lines containing insertional mutations in known and novel genes. A number of international groups have used this approach to create sizeable public cell line repositories available to the scientific community for the generation of mutant mouse strains. The major gene trapping groups worldwide have recently joined together to centralize access to all publicly available gene trap lines by developing a user-oriented Website for the International Gene Trap Consortium (IGTC). This collaboration provides an impressive public informatics resource comprising approximately 45 000 well-characterized ES cell lines which currently represent approximately 40% of known mouse genes, all freely available for the creation of knockout mice on a non-collaborative basis. To standardize annotation and provide high confidence data for gene trap lines, a rigorous identification and annotation pipeline has been developed combining genomic localization and transcript alignment of gene trap sequence tags to identify trapped loci. This information is stored in a new bioinformatics database accessible through the IGTC Website interface. The IGTC Website (www.genetrap.org) allows users to browse and search the database for trapped genes, BLAST sequences against gene trap sequence tags, and view trapped genes within biological pathways. In addition, IGTC data have been integrated into major genome browsers and bioinformatics sites to provide users with outside portals for viewing this data. The development of the IGTC Website marks a major advance by providing the research community with the data and tools necessary to effectively use public gene trap resources for the large-scale characterization of mammalian gene function.

Subjects

Cell Line , Databases, Nucleic Acid , Mice/genetics , Mutagenesis, Insertional , Animals , Chromosome Mapping , Embryo, Mammalian/cytology , International Cooperation , Internet , Mice/embryology , Mice, Knockout , Mutagenesis, Insertional/methods , RNA, Messenger/analysis , Systems Integration , User-Computer Interface

11.

UCSF ChimeraX: Meeting modern challenges in visualization and analysis.

Goddard, Thomas D; Huang, Conrad C; Meng, Elaine C; Pettersen, Eric F; Couch, Gregory S; Morris, John H; Ferrin, Thomas E.

Protein Sci ; 27(1): 14-25, 2018 Jan.

Article in English | MEDLINE | ID: mdl-28710774

ABSTRACT

UCSF ChimeraX is next-generation software for the visualization and analysis of molecular structures, density maps, 3D microscopy, and associated data. It addresses challenges in the size, scope, and disparate types of data attendant with cutting-edge experimental methods, while providing advanced options for high-quality rendering (interactive ambient occlusion, reliable molecular surface calculations, etc.) and professional approaches to software design and distribution. This article highlights some specific advances in the areas of visualization and usability, performance, and extensibility. ChimeraX is free for noncommercial use and is available from http://www.rbvi.ucsf.edu/chimerax/ for Windows, Mac, and Linux.

Subjects

Imaging, Three-Dimensional , Software , Molecular Structure

12.

Software extensions to UCSF chimera for interactive visualization of large molecular assemblies.

Goddard, Thomas D; Huang, Conrad C; Ferrin, Thomas E.

Structure ; 13(3): 473-82, 2005 Mar.

Article in English | MEDLINE | ID: mdl-15766548

ABSTRACT

Many structures of large molecular assemblies such as virus capsids and ribosomes have been experimentally determined to atomic resolution. We consider four software problems that arise in interactive visualization and analysis of large assemblies: how to represent multimers efficiently, how to make cartoon representations, how to calculate contacts efficiently, and how to select subassemblies. We describe techniques and algorithms we have developed and give examples of their use. Existing molecular visualization programs work well for single protein and nucleic acid molecules and for small complexes. The methods presented here are proposed as features to add to existing programs or include in next-generation visualization software to allow easy exploration of assemblies containing tens to thousands of macromolecules. Our approach is pragmatic, emphasizing simplicity of code, reliability, and speed. The methods described have been distributed as the Multiscale extension of the UCSF Chimera (www.cgl.ucsf.edu/chimera) molecular graphics program.

Subjects

Algorithms , Imaging, Three-Dimensional/methods , Models, Molecular , Multiprotein Complexes/chemistry , Software , Molecular Structure

13.

Biocuration in the structure-function linkage database: the anatomy of a superfamily.

Holliday, Gemma L; Brown, Shoshana D; Akiva, Eyal; Mischel, David; Hicks, Michael A; Morris, John H; Huang, Conrad C; Meng, Elaine C; Pegg, Scott C-H; Ferrin, Thomas E; Babbitt, Patricia C.

Database (Oxford) ; 2017(1), 2017 Jan 01.

Article in English | MEDLINE | ID: mdl-28365730

ABSTRACT

With ever-increasing amounts of sequence data available in both the primary literature and sequence repositories, there is a bottleneck in annotating molecular function to a sequence. This article describes the biocuration process and methods used in the structure-function linkage database (SFLD) to help address some of the challenges. We discuss how the hierarchy within the SFLD allows us to infer detailed functional properties for functionally diverse enzyme superfamilies in which all members are homologous, conserve an aspect of their chemical function and have associated conserved structural features that enable the chemistry. Also presented is the Enzyme Structure-Function Ontology (ESFO), which has been designed to capture the relationships between enzyme sequence, structure and function that underlie the SFLD and is used to guide the biocuration processes within the SFLD. Database URL: http://sfld.rbvi.ucsf.edu/.

Subjects

Databases, Protein , Enzymes/chemistry , Enzymes/genetics , Gene Ontology , Molecular Sequence Annotation , Structural Homology, Protein , Structure-Activity Relationship

14.

Tools for integrated sequence-structure analysis with UCSF Chimera.

Meng, Elaine C; Pettersen, Eric F; Couch, Gregory S; Huang, Conrad C; Ferrin, Thomas E.

BMC Bioinformatics ; 7: 339, 2006 Jul 12.

Article in English | MEDLINE | ID: mdl-16836757

ABSTRACT

BACKGROUND: Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. RESULTS: The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. CONCLUSION: The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. 
UCSF Chimera is free for non-commercial use and is available for Microsoft Windows, Apple Mac OS X, Linux, and other platforms from http://www.cgl.ucsf.edu/chimera.

Subjects

Databases, Protein , Models, Chemical , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , User-Computer Interface , Computer Graphics , Computer Simulation , Database Management Systems , Models, Molecular , Pattern Recognition, Automated/methods , Protein Conformation , Proteins/classification , Proteins/genetics , Proteins/ultrastructure , Structure-Activity Relationship , Systems Integration

15.

Comparison of methods for genomic localization of gene trap sequences.

Harper, Courtney A; Huang, Conrad C; Stryke, Doug; Kawamoto, Michiko; Ferrin, Thomas E; Babbitt, Patricia C.

BMC Genomics ; 7: 236, 2006 Sep 18.

Article in English | MEDLINE | ID: mdl-16982004

ABSTRACT

BACKGROUND: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. RESULTS: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. CONCLUSION: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

Subjects

Algorithms , Computational Biology/methods , Genome/genetics , Sequence Alignment/methods , Animals , Cell Cycle Proteins/genetics , Chromatin Assembly Factor-1 , Chromosomal Proteins, Non-Histone/genetics , Chromosome Mapping/methods , DNA-Binding Proteins/genetics , Exons/genetics , Mice , Nuclear Proteins/genetics , Nucleotides/genetics , Pseudogenes/genetics , Reproducibility of Results , Time Factors

16.

The Structure Superposition Database.

Chiang, Ranyee A; Meng, Elaine C; Huang, Conrad C; Ferrin, Thomas E; Babbitt, Patricia C.

Nucleic Acids Res ; 31(1): 505-10, 2003 Jan 01.

Article in English | MEDLINE | ID: mdl-12520064

ABSTRACT

The need for new tools for investigating biological systems on a large scale is becoming acute, particularly with respect to computationally intensive analyses such as comparisons of many three-dimensional protein structures. Structure superposition is a valuable approach for understanding evolutionary relationships and for the prediction of function. But while available tools are adequate for generating and viewing superpositions of single pairs of protein structures, these tools are generally too cumbersome and time-consuming for examining multiple superpositions. To address this need, we have created the Structure Superposition Database (SSD) for accessing, viewing and understanding large sets of structure superposition data. The initial implementation of the SSD contains the results of pairwise, all-by-all superpositions of a representative set of 115 (beta/alpha)8 barrel structures (TIM barrels). Future plans call for extending the database to include representative structure superpositions for many additional folds. The SSD can be browsed with a user interface module developed as an extension to Chimera, an extensible molecular modeling program. Features of the user interface module facilitate viewing multiple superpositions together. The SSD interface module can be downloaded from http://ssd.rbvi.ucsf.edu.

Subjects

Databases, Protein , Structural Homology, Protein , Animals , Models, Molecular , Protein Structure, Tertiary , User-Computer Interface

17.

BayGenomics: a resource of insertional mutations in mouse embryonic stem cells.

Stryke, Doug; Kawamoto, Michiko; Huang, Conrad C; Johns, Susan J; King, Leslie A; Harper, Courtney A; Meng, Elaine C; Lee, Roy E; Yee, Alice; L'Italien, Larry; Chuang, Pao-Tien; Young, Stephen G; Skarnes, William C; Babbitt, Patricia C; Ferrin, Thomas E.

Nucleic Acids Res ; 31(1): 278-81, 2003 Jan 01.

Article in English | MEDLINE | ID: mdl-12520002

ABSTRACT

The BayGenomics gene-trap resource (http://baygenomics.ucsf.edu) provides researchers with access to thousands of mouse embryonic stem (ES) cell lines harboring characterized insertional mutations in both known and novel genes. Each cell line contains an insertional mutation in a specific gene. The identity of the gene that has been interrupted can be determined from a DNA sequence tag. Approximately 75% of our cell lines contain insertional mutations in known mouse genes or genes that share strong sequence similarities with genes that have been identified in other organisms. These cell lines readily transmit the mutation to the germline of mice and many mutant lines of mice have already been generated from this resource. BayGenomics provides facile access to our entire database, including sequence tags for each mutant ES cell line, through the World Wide Web. Investigators can browse our resource, search for specific entries, download any portion of our database and BLAST sequences of interest against our entire set of cell line sequence tags. They can then obtain the mutant ES cell line for the purpose of generating knockout mice.

Subjects

Cell Line , Databases, Nucleic Acid , Embryo, Mammalian/cytology , Genomics , Mice/genetics , Mutagenesis, Insertional , Stem Cells/cytology , Animals , Internet , Mice, Knockout , Mutation , User-Computer Interface

18.

MODBASE, a database of annotated comparative protein structure models, and associated resources.

Pieper, Ursula; Eswar, Narayanan; Braberg, Hannes; Madhusudhan, M S; Davis, Fred P; Stuart, Ashley C; Mirkovic, Nebojsa; Rossi, Andrea; Marti-Renom, Marc A; Fiser, Andras; Webb, Ben; Greenblatt, Daniel; Huang, Conrad C; Ferrin, Thomas E; Sali, Andrej.

Nucleic Acids Res ; 32(Database issue): D217-22, 2004 Jan 01.

Article in English | MEDLINE | ID: mdl-14681398

ABSTRACT

MODBASE (http://salilab.org/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE uses the MySQL relational database management system for flexible querying and CHIMERA for viewing the sequences and structures (http://www.cgl.ucsf.edu/chimera/). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different data sets. The largest data set contains 1,26,629 models for domains in 659,495 out of 1,182,126 unique protein sequences in the complete Swiss-Prot/TrEMBL database (August 25, 2003); only models based on alignments with significant similarity scores and models assessed to have the correct fold despite insignificant alignments are included. Another model data set supports target selection and structure-based annotation by the New York Structural Genomics Research Consortium; e.g. the 53 new structures produced by the consortium allowed us to characterize structurally 24,113 sequences. MODBASE also contains binding site predictions for small ligands and a set of predicted interactions between pairs of modeled sequences from the same genome. Our other resources associated with MODBASE include a comprehensive database of multiple protein structure alignments (DBALI, http://salilab.org/dbali) as well as web servers for automated comparative modeling with MODPIPE (MODWEB, http://salilab.org/modweb), modeling of loops in protein structures (MODLOOP, http://salilab.org/modloop) and predicting functional consequences of single nucleotide polymorphisms (SNPWEB, http://salilab.org/snpweb).

Subjects

Computational Biology, Protein Databases, Proteins/chemistry, Amino Acid Sequence, Animals, Binding Sites, Genomics, Humans, Internet, Ligands, Molecular Models, Molecular Sequence Data, Single Nucleotide Polymorphism, Protein Binding, Protein Conformation, Proteins/genetics, Sequence Alignment, Software, User-Computer Interface

19.

Functional characterization in yeast of genetic variants in the human equilibrative nucleoside transporter, ENT1.

Osato, Douglas H; Huang, Conrad C; Kawamoto, Michiko; Johns, Susan J; Stryke, Doug; Wang, Joanne; Ferrin, Thomas E; Herskowitz, Ira; Giacomini, Kathleen M.

Pharmacogenetics ; 13(5): 297-301, 2003 May.

Article in English | MEDLINE | ID: mdl-12724623

ABSTRACT

The human equilibrative nucleoside transporter, ENT1, appears to play a critical role in the disposition of nucleosides and nucleoside analogs used clinically as anti-viral and anti-cancer drugs. Recently, we identified variants of ENT1 in an ethnically diverse DNA sample set from 247 individuals, focusing primarily on the coding region. The goal of the present study was to analyse the haplotype structure and functionally characterize the variants of ENT1. We observed that a single haplotype, ENT1*1, accounted for 91.3% of the 494 chromosomes. Functional analysis in Saccharomyces cerevisiae revealed no differences in the kinetics of uptake of nucleosides and nucleoside analogs by the two non-synonymous variant transporters, ENT1-I216T and ENT1-E391K, and the reference ENT1. These results, together with the observation that there are few haplotypes of ENT1, indicate that coding region variants of ENT1 do not contribute to inter-individual differences in response to nucleoside analog drugs.

Subjects

Equilibrative Nucleoside Transporter 1/genetics, Genetic Variation, Amino Acid Sequence, Molecular Cloning/methods, Equilibrative Nucleoside Transporter 1/chemistry, Equilibrative Nucleoside Transporter 1/metabolism, Humans, Kinetics, Molecular Models, Molecular Sequence Data, Nucleosides/pharmacokinetics, Protein Conformation, Recombinant Proteins/chemistry, Recombinant Proteins/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Sequence Alignment

20.

Polymorphisms in a human kidney xenobiotic transporter, OCT2, exhibit altered function.

Leabman, Maya K; Huang, Conrad C; Kawamoto, Michiko; Johns, Susan J; Stryke, Douglas; Ferrin, Thomas E; DeYoung, Joseph; Taylor, Travis; Clark, Andrew G; Herskowitz, Ira; Giacomini, Kathleen M.

Pharmacogenetics ; 12(5): 395-405, 2002 Jul.

Article in English | MEDLINE | ID: mdl-12142729

ABSTRACT

The completion of the Human Genome Project and the development of high-throughput polymorphism identification methods have allowed researchers to carry out full genetic analyses of many clinically relevant genes. However, few studies have combined genetic analysis with in vitro phenotyping to better understand the relationship between genetic variation and protein function. Many transporters in the kidney are thought to play key roles in defense against a variety of foreign substances. The goal of this study was to understand the relationship between variation in a gene encoding a major renal xenobiotic transporter, OCT2, and transporter function. We report a comprehensive genetic analysis and functional characterization of variants of OCT2. Twenty-eight variable sites in the OCT2 gene were identified in a collection of 247 ethnically diverse DNA samples. Eight caused non-synonymous amino acid changes, of which four were present at ≥ 1% in an ethnic population. All four of these altered transporter function assayed in Xenopus laevis oocytes. Analysis of nucleotide diversity (π) revealed a higher prevalence of synonymous (π = 22.4 × 10⁻⁴) versus non-synonymous (π = 2.1 × 10⁻⁴) changes in OCT2 than in other genes. In addition, the non-synonymous sites had a significant tendency to exhibit more skewed allele frequencies (more negative Tajima's D-values) compared to synonymous sites. The population-genetic analysis, together with the functional characterization, suggests that selection has acted against amino acid changes in OCT2. This selection may be due to a necessary role of OCT2 in the renal elimination of endogenous amines or xenobiotics, including environmental toxins, neurotoxic amines and therapeutic drugs.

Subjects

Kidney/physiology, Organic Cation Transport Proteins/genetics, Genetic Polymorphism, Alleles, Amino Acid Sequence, Animals, DNA Primers, Female, Genetic Variation, Humans, Molecular Models, Molecular Sequence Data, Oocytes/physiology, Organic Cation Transport Proteins/chemistry, Organic Cation Transport Proteins/physiology, Organic Cation Transporter 2, Polymerase Chain Reaction, Protein Conformation, Xenobiotics/pharmacokinetics, Xenopus laevis
