Results 1 - 20 of 73
1.
Bioinformatics ; 39(2), 2023 02 03.
Article in English | MEDLINE | ID: mdl-36759942

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Pattern Recognition, Automated , Precision Medicine , Databases, Factual
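
The SPOKE abstract above describes a build that downloads each resource, checks it for integrity and completeness, and then creates a 'parent table' of nodes and edges. The abstract does not give the scripts themselves, so the following is only a minimal sketch of that idea under stated assumptions: the function names, the SHA-256 integrity check, the five-field record layout and the toy records are all hypothetical and are not taken from SPOKE.

```python
import hashlib


def checksum_ok(payload: bytes, expected_sha256: str) -> bool:
    """Integrity check (assumed): compare the payload's SHA-256 digest with an expected value."""
    return hashlib.sha256(payload).hexdigest() == expected_sha256


def build_parent_table(edge_records):
    """Build node and edge tables from 5-field records (hypothetical layout):
    (source id, source type, edge type, target id, target type).

    Records with a missing or empty field are skipped (completeness check).
    """
    nodes = {}  # node id -> node type
    edges = []  # (source id, edge type, target id)
    for record in edge_records:
        if len(record) != 5 or any(not field for field in record):
            continue  # incomplete record, not loaded
        source, source_type, edge_type, target, target_type = record
        nodes[source] = source_type
        nodes[target] = target_type
        edges.append((source, edge_type, target))
    return nodes, edges


# Toy records, invented for illustration only.
records = [
    ("aspirin", "Compound", "TREATS", "headache", "Disease"),
    ("aspirin", "Compound", "BINDS", "PTGS1", "Gene"),
    ("broken", "Compound", "", "x", "Gene"),  # skipped: empty edge type
]
nodes, edges = build_parent_table(records)
```

Keeping nodes in a dictionary keyed by identifier means a concept that appears in several resources is stored once, which is one simple way to connect data from many databases in a single graph.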
2.
AI Mag ; 43(1): 46-58, 2022.
Article in English | MEDLINE | ID: mdl-36093122

ABSTRACT

Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article we describe concrete uses of SPOKE, an open knowledge network that connects curated information from 37 specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research and chronic disease diagnosis and management.

3.
J Comput Chem ; 43(15): 1053-1062, 2022 06 05.
Article in English | MEDLINE | ID: mdl-35394655

ABSTRACT

Pfizer's Crystal Structure Database (CSDB) is a key enabling technology that allows scientists on structure-based projects rapid access to Pfizer's vast library of in-house crystal structures, as well as a significant number of structures imported from the Protein Data Bank. In addition to capturing basic information such as the asymmetric unit coordinates, reflection data, and the like, CSDB employs a variety of automated methods to first ensure a standard level of annotations and error checking, and then to add significant value for design teams by processing the structures through a sequence of algorithms that prepares the structures for use in modeling. The structures are made available, both as the original asymmetric unit as submitted, as well as the final prepared structures, through REST-based web services that are consumed by several client desktop applications. The structures can be searched by keyword, sequence, submission date, ligand substructure and similarity search, and other common queries.


Subjects
Algorithms , Databases, Protein , Humans , Ligands
6.
PLoS Comput Biol ; 15(4): e1006842, 2019 04.
Article in English | MEDLINE | ID: mdl-31009453

ABSTRACT

Many proteins fold into highly regular and repetitive three dimensional structures. The analysis of structural patterns and repeated elements is fundamental to understand protein function and evolution. We present recent improvements to the CE-Symm tool for systematically detecting and analyzing the internal symmetry and structural repeats in proteins. In addition to the accurate detection of internal symmetry, the tool is now capable of i) reporting the type of symmetry, ii) identifying the smallest repeating unit, iii) describing the arrangement of repeats with transformation operations and symmetry axes, and iv) comparing the similarity of all the internal repeats at the residue level. CE-Symm 2.0 helps the user investigate proteins with a robust and intuitive sequence-to-structure analysis, with many applications in protein classification, functional annotation and evolutionary studies. We describe the algorithmic extensions of the method and demonstrate its applications to the study of interesting cases of protein evolution.


Subjects
Algorithms , Computational Biology/methods , Proteins/chemistry , Software , Amino Acid Sequence , Databases, Protein , Models, Molecular , Sequence Analysis, Protein
7.
PLoS Comput Biol ; 15(2): e1006791, 2019 02.
Article in English | MEDLINE | ID: mdl-30735498

ABSTRACT

BioJava is an open-source project that provides a Java library for processing biological data. The project aims to simplify bioinformatic analyses by implementing parsers, data structures, and algorithms for common tasks in genomics, structural biology, ontologies, phylogenetics, and more. Since 2012, we have released two major versions of the library (4 and 5) that include many new features to tackle challenges with increasingly complex macromolecular structure data. BioJava requires Java 8 or higher and is freely available under the LGPL 2.1 license. The project is hosted on GitHub at https://github.com/biojava/biojava. More information and documentation can be found online on the BioJava website (http://www.biojava.org) and tutorial (https://github.com/biojava/biojava-tutorial). All inquiries should be directed to the GitHub page or the BioJava mailing list (http://lists.open-bio.org/mailman/listinfo/biojava-l).


Subjects
Computational Biology/methods , Access to Information , Algorithms , Gene Library , Genome/genetics , Genomics , Information Storage and Retrieval , Internet , Software
8.
Bioinformatics ; 34(21): 3755-3758, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29850778

ABSTRACT

Motivation: The interactive visualization of very large macromolecular complexes on the web is becoming a challenging problem as experimental techniques advance at an unprecedented rate and deliver structures of increasing size. Results: We have tackled this problem by developing highly memory-efficient and scalable extensions for the NGL WebGL-based molecular viewer and by using the Macromolecular Transmission Format (MMTF), a binary and compressed data format. These enable NGL to download and render molecular complexes with millions of atoms interactively on desktop computers and smartphones alike, making it a tool of choice for web-based molecular visualization in research and education. Availability and implementation: The source code is freely available under the MIT license at github.com/arose/ngl and distributed on NPM (npmjs.com/package/ngl). MMTF-JavaScript encoders and decoders are available at github.com/rcsb/mmtf-javascript.


Subjects
Computer Graphics , Internet , Macromolecular Substances , Software
9.
PLoS One ; 13(6): e0197176, 2018.
Article in English | MEDLINE | ID: mdl-29864163

ABSTRACT

The Protein Data Bank (PDB) is the single worldwide archive of experimentally-determined three-dimensional (3D) structures of proteins and nucleic acids. As of January 2017, the PDB housed more than 125,000 structures and was growing by more than 11,000 structures annually. Since the 3D structure of a protein is vital to understand the mechanisms of biological processes, diseases, and drug design, correct oligomeric assembly information is of critical importance. Unfortunately, the biologically relevant oligomeric form of a 3D structure is not directly obtainable by X-ray crystallography, whereas for in-solution methods (NMR or single-particle EM) it is known from the experiment. Instead, this information may be provided by the PDB Depositor as metadata coming from additional experiments, be inferred by sequence-sequence comparisons with similar proteins of known oligomeric state, or predicted using software, such as PISA (Proteins, Interfaces, Structures and Assemblies) or EPPIC (Evolutionary Protein Protein Interface Classifier). Despite significant efforts by professional PDB Biocurators during data deposition, there remain a number of structures in the archive with incorrect quaternary structure descriptions (or annotations). Further investigation is, therefore, needed to evaluate the correctness of quaternary structure annotations. In this study, we aim to identify the most probable oligomeric states for proteins represented in the PDB. Our approach evaluated the performance of four independent prediction methods, including text mining of primary publications, inference from homologous protein structures, and two computational methods (PISA and EPPIC). Aggregating predictions to give consensus results outperformed all four of the independent prediction methods, yielding 83% correct, 9% wrong, and 8% inconclusive predictions, when tested with a well-curated benchmark dataset. We have developed a freely-available web-based tool to make this approach accessible to researchers and PDB Biocurators (http://quatstruct.rcsb.org/).


Subjects
Databases, Protein , Sequence Analysis, Protein/methods , Software , Crystallography, X-Ray , Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Quaternary
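
The abstract above reports that aggregating four independent predictions into a consensus outperformed each method alone, with some cases left inconclusive. It does not state the aggregation rule, so the sketch below assumes a plain plurality vote and treats a tie or an absence of predictions as inconclusive. The function name, the method labels used as dictionary keys and the example values are hypothetical.

```python
from collections import Counter


def consensus_state(predictions):
    """Return the plurality oligomeric state, or None when inconclusive.

    `predictions` maps a method name to a predicted subunit count
    (1 = monomer, 2 = dimer, ...) or to None when the method made no call.
    The plurality rule is an assumption, not the rule used in the study.
    """
    votes = Counter(p for p in predictions.values() if p is not None)
    if not votes:
        return None  # no method made a prediction
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # top two states tie: inconclusive
    return ranked[0][0]


# Invented example values for illustration.
agreeing = {"text_mining": 2, "homology": 2, "PISA": 4, "EPPIC": 2}
split = {"text_mining": None, "homology": None, "PISA": 4, "EPPIC": 2}

print(consensus_state(agreeing))  # plurality of three votes for a dimer
print(consensus_state(split))     # one vote each: inconclusive
```

Returning None for ties keeps disagreement visible instead of forcing a call, which matches the abstract's separate "inconclusive" outcome.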
10.
Br J Gen Pract ; 68(672): e505-e511, 2018 07.
Article in English | MEDLINE | ID: mdl-29739779

ABSTRACT

BACKGROUND: Safety netting is a diagnostic strategy used in UK primary care to ensure patients are monitored until their symptoms or signs are explained. Despite being recommended in cancer diagnosis guidelines, little evidence exists about which components are effective and feasible in modern-day primary care. AIM: To understand the reality of safety netting for cancer in contemporary primary care. DESIGN AND SETTING: A qualitative study of GPs in Oxfordshire primary care. METHOD: In-depth interviews with a purposive sample of 25 qualified GPs were undertaken. Interviews were recorded and transcribed verbatim, and analysed thematically using constant comparison. RESULTS: GPs revealed uncertainty about which aspects of clinical practice are considered safety netting. They use bespoke personal strategies, often developed from past mistakes, without knowledge of their colleagues' practice. Safety netting varied according to the perceived risk of cancer, the perceived reliability of each patient to follow advice, GP working patterns, and time pressures. Increasing workload, short appointments, and a reluctance to overburden hospital systems or create unnecessary patient anxiety have together led to a strategy of selective active follow-up of patients perceived to be at higher risk of cancer or less able to act autonomously. This left patients with low-risk-but-not-no-risk symptoms of cancer with less robust or absent safety netting. CONCLUSION: GPs would benefit from clearer guidance on which aspects of clinical practice contribute to effective safety netting for cancer. Practice systems that enable active follow-up of patients with low-risk-but-not-no-risk symptoms, which could represent malignancy, could reduce delays in cancer diagnosis without increasing GP workload.


Subjects
Early Detection of Cancer , General Practitioners , Practice Patterns, Physicians'/organization & administration , Precancerous Conditions/diagnosis , Primary Health Care , Watchful Waiting/organization & administration , Attitude of Health Personnel , Health Services Research , Humans , Patient Safety , Primary Health Care/organization & administration , Qualitative Research , Reproducibility of Results , United Kingdom , Workload
11.
Br J Gen Pract ; 68(670): e323-e332, 2018 05.
Article in English | MEDLINE | ID: mdl-29686134

ABSTRACT

BACKGROUND: It is unclear to what extent primary care practitioners (PCPs) should retain responsibility for follow-up to ensure that patients are monitored until their symptoms or signs are explained. AIM: To explore the extent to which PCPs retain responsibility for diagnostic follow-up actions across 11 international jurisdictions. DESIGN AND SETTING: A secondary analysis of survey data from the International Cancer Benchmarking Partnership. METHOD: The authors counted the proportion of 2879 PCPs who retained responsibility for each area of follow-up (appointments, test results, and non-attenders). Proportions were weighted by the sample size of each jurisdiction. Pooled estimates were obtained using a random-effects model, and UK estimates were compared with non-UK ones. Free-text responses were analysed to contextualise quantitative findings using a modified grounded theory approach. RESULTS: PCPs varied in their retention of responsibility for follow-up from 19% to 97% across jurisdictions and area of follow-up. Test reconciliation was inadequate in most jurisdictions. Significantly fewer UK PCPs retained responsibility for test result communication (73% versus 85%, P = 0.04) and non-attender follow-up (78% versus 93%, P<0.01) compared with non-UK PCPs. PCPs have developed bespoke, inconsistent solutions to follow-up. In cases of greatest concern, 'double safety netting' is described, where both patient and PCP retain responsibility. CONCLUSION: The degree to which PCPs retain responsibility for follow-up is dependent on their level of concern about the patient and their primary care system's properties. Integrated systems to support follow-up are at present underutilised, and research into their development, uptake, and effectiveness seems warranted.


Subjects
Benchmarking , Continuity of Patient Care/organization & administration , Neoplasms/therapy , Practice Patterns, Physicians'/statistics & numerical data , Primary Health Care , Attitude of Health Personnel , Data Collection , Follow-Up Studies , Health Care Surveys , Health Services Research , Humans , Neoplasms/rehabilitation , Social Responsibility
12.
Nat Biotechnol ; 36(3): 272-281, 2018 03.
Article in English | MEDLINE | ID: mdl-29457794

ABSTRACT

Genome-scale network reconstructions have helped uncover the molecular basis of metabolism. Here we present Recon3D, a computational resource that includes three-dimensional (3D) metabolite and protein structure data and enables integrated analyses of metabolic functions in humans. We use Recon3D to functionally characterize mutations associated with disease, and identify metabolic response signatures that are caused by exposure to certain drugs. Recon3D represents the most comprehensive human metabolic network model to date, accounting for 3,288 open reading frames (representing 17% of functionally annotated human genes), 13,543 metabolic reactions involving 4,140 unique metabolites, and 12,890 protein structures. These data provide a unique resource for investigating molecular mechanisms of human metabolism. Recon3D is available at http://vmh.life.


Subjects
Computational Biology/methods , Databases, Protein , Metabolic Networks and Pathways/genetics , Databases, Genetic , Humans , Internet , Molecular Sequence Annotation , Open Reading Frames/genetics
13.
Eur J Oncol Nurs ; 32: 73-81, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29353635

ABSTRACT

PURPOSE: This study sought to test the acceptability and feasibility of a nurse-led psycho-educational intervention (NLPI) delivered in primary care to prostate cancer survivors, and to provide preliminary estimates of the effectiveness of the intervention. METHODS: Men who reported an ongoing problem with urinary, bowel, sexual or hormone-related functioning/vitality on a self-completion questionnaire were invited to participate. Participants were randomly assigned to the NLPI plus usual care, or to usual care alone. Recruitment and retention rates were assessed. Prostate-related quality of life, self-efficacy, unmet needs, and psychological morbidity were measured at baseline and 9 months. Health-care resource use data were also collected. An integrated qualitative study assessed experiences of the intervention. RESULTS: 61% of eligible men (83/136) participated in the trial, with an 87% (72/83) completion rate. Interviews indicated that the intervention filled an important gap in care following treatment completion, helping men to self-manage, and improving their sense of well-being. However, only a small reduction in unmet needs and a small improvement in self-efficacy were observed, and no difference in prostate-related quality of life or psychological morbidity. Patients receiving the NLPI recorded more primary care visits, while the usual care group recorded more secondary care visits. Most men (70%; 21/30) felt the optimal time for the intervention was around the time of diagnosis/before the end of treatment. CONCLUSIONS: Findings suggest a nurse-led psycho-educational intervention in primary care is feasible, acceptable and potentially useful to prostate cancer survivors.


Subjects
Cancer Survivors/education , Cancer Survivors/psychology , Patient Education as Topic/methods , Primary Health Care/methods , Prostatic Neoplasms/nursing , Prostatic Neoplasms/psychology , Quality of Life/psychology , Aged , England , Female , Humans , Male , Middle Aged , Nurse's Role , Nurse-Patient Relations , Pilot Projects , Surveys and Questionnaires
14.
Genome Med ; 9(1): 113, 2017 Dec 18.
Article in English | MEDLINE | ID: mdl-29254494

ABSTRACT

The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.


Subjects
Genome-Wide Association Study/methods , Polymorphism, Genetic , Protein Conformation , Sequence Analysis, Protein/methods , Algorithms , Congresses as Topic , Genome-Wide Association Study/standards , Humans , Sequence Analysis, Protein/standards
15.
PLoS Comput Biol ; 13(6): e1005575, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28574982

ABSTRACT

Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis.


Subjects
Computational Biology/methods , Databases, Chemical , Macromolecular Substances , Software , Internet , Macromolecular Substances/analysis , Macromolecular Substances/chemistry , Macromolecular Substances/classification , Molecular Structure
16.
PLoS One ; 12(3): e0174846, 2017.
Article in English | MEDLINE | ID: mdl-28362865

ABSTRACT

The size and complexity of 3D macromolecular structures available in the Protein Data Bank is constantly growing. Current tools and file formats have reached limits of scalability. New compression approaches are required to support the visualization of large molecular complexes and enable new and scalable means for data analysis. We evaluated a series of compression techniques for coordinates of 3D macromolecular structures and identified the best performing approaches. By balancing compression efficiency in terms of the decompression speed and compression ratio, and code complexity, our results provide the foundation for a novel standard to represent macromolecular coordinates in a compact and useful file format.


Subjects
Databases, Protein , Algorithms , Data Compression , Magnetic Resonance Spectroscopy , Models, Theoretical , Molecular Structure , Protein Structure, Secondary , Protein Structure, Tertiary
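
The abstract above evaluates compression techniques for 3D coordinates without naming them. As an illustration of the general idea only, the sketch below shows one common approach for such data: store coordinates as fixed-point integers and keep only the difference between consecutive values (delta encoding). The scale factor of 1000 (three decimal places), the function names and the sample coordinates are assumptions, and this is not presented as the method the authors selected.

```python
def delta_encode(coords, scale=1000):
    """Convert floats to fixed-point integers, then store successive differences."""
    ints = [round(c * scale) for c in coords]
    if not ints:
        return []
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


def delta_decode(deltas, scale=1000):
    """Invert delta_encode: running sum of the deltas, scaled back to floats."""
    coords = []
    total = 0
    for d in deltas:
        total += d
        coords.append(total / scale)
    return coords


# Invented x-coordinates (in angstroms) of four neighbouring atoms.
xs = [12.345, 12.346, 12.350, 13.001]
encoded = delta_encode(xs)
decoded = delta_decode(encoded)
```

Neighbouring atoms in a structure file tend to lie close together, so the deltas are small integers that need fewer bits than the full values and compress well with a general-purpose compressor. The trade-off is that precision is limited to the chosen number of decimal places.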
17.
PLoS One ; 12(3): e0171355, 2017.
Article in English | MEDLINE | ID: mdl-28296894

ABSTRACT

The Protein Data Bank (PDB; http://wwpdb.org) was established in 1971 as the first open access digital data resource in biology with seven protein structures as its initial holdings. The global PDB archive now contains more than 126,000 experimentally determined atomic level three-dimensional (3D) structures of biological macromolecules (proteins, DNA, RNA), all of which are freely accessible via the Internet. Knowledge of the 3D structure of the gene product can help in understanding its function and role in disease. Of particular interest in the PDB archive are proteins for which 3D structures of genetic variant proteins have been determined, thus revealing atomic-level structural differences caused by the variation at the DNA level. Herein, we present a systematic and qualitative analysis of such cases. We observe a wide range of structural and functional changes caused by single amino acid differences, including changes in enzyme activity, aggregation propensity, structural stability, binding, and dissociation, some in the context of large assemblies. Structural comparison of wild type and mutated proteins, when both are available, provides insights into atomic-level structural differences caused by the genetic variation.


Subjects
Genetic Variation , Proteins/chemistry , Proteins/physiology , Exome , Polymorphism, Single Nucleotide , Structure-Activity Relationship
18.
Bioinformatics ; 33(13): 2047-2049, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28334105

ABSTRACT

SUMMARY: We developed a new software tool, BioJava-ModFinder, for identifying protein modifications observed in 3D structures archived in the Protein Data Bank (PDB). Information on more than 400 types of protein modifications was collected and curated from annotations in PDB, RESID, and PSI-MOD. We divided these modifications into three categories: modified residues, attachment modifications, and cross-links. We have developed a systematic method to identify these modifications in 3D protein structures. We have integrated this package with the RCSB PDB web application and added protein modification annotations to the sequence diagram and structure display. By scanning all 3D structures in the PDB using BioJava-ModFinder, we identified more than 30 000 structures with protein modifications, which can be searched, browsed, and visualized on the RCSB PDB website. AVAILABILITY AND IMPLEMENTATION: BioJava-ModFinder is available as open source (LGPL license) at https://github.com/biojava/biojava/tree/master/biojava-modfinder. The RCSB PDB can be accessed at http://www.rcsb.org. CONTACT: pwrose@ucsd.edu.


Subjects
Computational Biology/methods , Databases, Protein , Protein Conformation , Software , Internet
19.
Nucleic Acids Res ; 45(D1): D271-D281, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794042

ABSTRACT

The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a 'Structural View of Biology.' Recent developments have improved the User experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive.


Subjects
Computational Biology/methods , Databases, Genetic , Proteins/chemistry , Proteins/genetics , Datasets as Topic , Metabolic Networks and Pathways , Models, Molecular , Protein Conformation , Proteins/metabolism , Software , Structure-Activity Relationship , User-Computer Interface , Web Browser
20.
J Comput Aided Mol Des ; 31(3): 301-304, 2017 03.
Article in English | MEDLINE | ID: mdl-27995514

ABSTRACT

Scientific software engineering is a distinct discipline from both computational chemistry project support and research informatics. A scientific software engineer not only has a deep understanding of the science of drug discovery but also the desire, skills and time to apply good software engineering practices. A good team of scientific software engineers can create a software foundation that is maintainable, validated and robust. If done correctly, this foundation enables the organization to investigate novel computational ideas with a very high level of efficiency.


Subjects
Computer-Aided Design , Drug Discovery/methods , Drug Industry/methods , Software , Chemistry, Pharmaceutical , Computational Biology , Drug Design , Models, Molecular