Results 1 - 20 of 341

1.
J Am Med Inform Assoc ; 27(5): 757-769, 2020 May 01.
Article in English | MEDLINE | ID: mdl-32364237

ABSTRACT

OBJECTIVE: Non-small cell lung cancer is a leading cause of cancer death worldwide, and histopathological evaluation plays the primary role in its diagnosis. However, the morphological patterns associated with the molecular subtypes have not been systematically studied. To bridge this gap, we developed a quantitative histopathology analytic framework to identify the types and gene expression subtypes of non-small cell lung cancer objectively. MATERIALS AND METHODS: We processed whole-slide histopathology images of lung adenocarcinoma (n = 427) and lung squamous cell carcinoma patients (n = 457) in the Cancer Genome Atlas. We built convolutional neural networks to classify histopathology images, evaluated their performance by the areas under the receiver-operating characteristic curves (AUCs), and validated the results in an independent cohort (n = 125). RESULTS: To establish neural networks for quantitative image analyses, we first built convolutional neural network models to identify tumor regions from adjacent dense benign tissues (AUCs > 0.935) and recapitulated expert pathologists' diagnosis (AUCs > 0.877), with the results validated in an independent cohort (AUCs = 0.726-0.864). We further demonstrated that quantitative histopathology morphology features identified the major transcriptomic subtypes of both adenocarcinoma and squamous cell carcinoma (P < .01). DISCUSSION: Our study is the first to classify the transcriptomic subtypes of non-small cell lung cancer using fully automated machine learning methods. Our approach does not rely on prior pathology knowledge and can discover novel clinically relevant histopathology patterns objectively. The developed procedure is generalizable to other tumor types or diseases.
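
Several abstracts in this listing report performance as an area under the ROC curve (AUC or AUROC). As a reminder of what that number measures, here is a minimal sketch that computes it as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The function name and the toy scores are illustrative assumptions and are not taken from any of the listed studies.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy scores (hypothetical): three positives, two negatives.
toy_auc = auroc([0.9, 0.6, 0.4], [0.7, 0.3])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values quoted in these abstracts should be read.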

3.
Clin Transl Sci ; 2020 Feb 26.
Article in English | MEDLINE | ID: mdl-32100936

ABSTRACT

Asians as a group comprise > 60% of the world's population. There is an incredible amount of diversity in Asian and admixed populations that has not been addressed in a pharmacogenetic context. The known pharmacogenetic differences in Asian subgroups generally represent previously known variants that are present at much lower or higher frequencies in Asians compared with other populations. In this review we summarize the main drugs and known genes that appear to have differences in their pharmacogenetic properties in certain Asian populations. Evidence-based guidelines and summary statistics from the US Food and Drug Administration and the Clinical Pharmacogenetics Implementation Consortium were analyzed for ethnic differences in outcomes. Implicated drugs included commonly prescribed drugs such as warfarin, clopidogrel, carbamazepine, and allopurinol. The majority of these associations are due to Asians more commonly being poor metabolizers of cytochrome P450 (CYP) 2C19 and carriers of the human leukocyte antigen (HLA)-B*15:02 allele. The relative risk increase was shown to vary between genes and drugs, but could be > 100-fold higher in Asians. Specifically, there was a 172-fold increased risk of Stevens-Johnson syndrome and toxic epidermal necrolysis with carbamazepine use among HLA-B*15:02 carriers. The effects ranged from relatively benign reactions such as reduced drug efficacy to severe cutaneous skin reactions. These reactions are severe and prevalent enough to warrant pharmacogenetic testing and appropriate changes in dose and medication choice for at-risk populations. Further studies should be done on Asian cohorts to more fully understand pharmacogenetic variants in these populations and to clarify how such differences may influence drug response.

4.
Clin Pharmacol Ther ; 107(1): 203-210, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31306493

ABSTRACT

Pharmacogenomics (PGx) decision support and return of results is an active area of precision medicine. One challenge of implementing PGx is extracting genomic variants and assigning haplotypes in order to apply prescribing recommendations and information from the Clinical Pharmacogenetics Implementation Consortium (CPIC), the US Food and Drug Administration (FDA), the Pharmacogenomics Knowledgebase (PharmGKB), etc. Pharmacogenomics Clinical Annotation Tool (PharmCAT) (i) extracts variants specified in guidelines from a genetic data set derived from sequencing or genotyping technologies, (ii) infers haplotypes and diplotypes, and (iii) generates a report containing genotype/diplotype-based annotations and guideline recommendations. We describe PharmCAT and a pilot validation project comparing results for 1000 Genomes Project sequences of Coriell samples with corresponding Genetic Testing Reference Materials Coordination Program (GeT-RM) sample characterization. PharmCAT was highly concordant with the GeT-RM data. PharmCAT is available in GitHub to evaluate, test, and report results back to the community. As precision medicine becomes more prevalent, our ability to consistently, accurately, and clearly define and report PGx annotations and prescribing recommendations is critical.

7.
Pac Symp Biocomput ; 25: 463-474, 2020.
Article in English | MEDLINE | ID: mdl-31797619

ABSTRACT

Millions of Americans are affected by rare diseases, many of which have poor survival rates. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, have hindered the development of new drugs for these cases. A promising alternative is drug repurposing, whereby existing FDA-approved drugs might be used to treat diseases different from their original indications. In order to generate drug repurposing hypotheses in a systematic and comprehensive fashion, it is essential to integrate information from across the literature of pharmacology, genetics, and pathology. To this end, we leverage a newly developed knowledge graph, the Global Network of Biomedical Relationships (GNBR). GNBR is a large, heterogeneous knowledge graph comprising drug, disease, and gene (or protein) entities linked by a small set of semantic themes derived from the abstracts of biomedical literature. We apply a knowledge graph embedding method that explicitly models the uncertainty associated with literature-derived relationships and uses link prediction to generate drug repurposing hypotheses. This approach achieves high performance on a gold-standard test set of known drug indications (AUROC = 0.89) and is capable of generating novel repurposing hypotheses, which we independently validate using external literature sources and protein interaction networks. Finally, we demonstrate the ability of our model to produce explanations of its predictions.

8.
Pac Symp Biocomput ; 25: 611-622, 2020.
Article in English | MEDLINE | ID: mdl-31797632

ABSTRACT

Precision medicine tailors treatment to individuals' personal data, including differences in their genome. The Pharmacogenomics Knowledgebase (PharmGKB) provides highly curated information on the effect of genetic variation on drug response and side effects for a wide range of drugs. PharmGKB's scientific curators triage, review and annotate a large number of papers each year, but the task is challenging. We present the PGxMine resource, a text-mined resource of pharmacogenomic associations from all accessible published literature to assist in the curation of PharmGKB. We developed a supervised machine learning pipeline to extract associations between a variant (DNA and protein changes, star alleles and dbSNP identifiers) and a chemical. PGxMine covers 452 chemicals and 2,426 variants and contains 19,930 mentions of pharmacogenomic associations across 7,170 papers. An evaluation by PharmGKB curators found that 57 of the top 100 associations not found in PharmGKB led to 83 curatable papers and a further 24 associations would likely lead to curatable papers through citations. The results can be viewed at https://pgxmine.pharmgkb.org/ and code can be downloaded at https://github.com/jakelever/pgxmine.

9.
Pac Symp Biocomput ; 25: 671-682, 2020.
Article in English | MEDLINE | ID: mdl-31797637

ABSTRACT

One in five Americans experience mental illness, and roughly 75% of psychiatric prescriptions do not successfully treat the patient's condition. Extensive evidence implicates genetic factors and signaling disruption in the pathophysiology of these diseases. Changes in transcription often underlie this molecular pathway dysregulation; individual patient transcriptional data can improve the efficacy of diagnosis and treatment. Recent large-scale genomic studies have uncovered shared genetic modules across multiple psychiatric disorders - providing an opportunity for an integrated multi-disease approach for diagnosis. Moreover, network-based models informed by gene expression can represent pathological biological mechanisms and suggest new genes for diagnosis and treatment. Here, we use patient gene expression data from multiple studies to classify psychiatric diseases, integrate knowledge from expert-curated databases and publicly available experimental data to create augmented disease-specific gene sets, and use these to recommend disease-relevant drugs. From Gene Expression Omnibus, we extract expression data from 145 cases of schizophrenia, 82 cases of bipolar disorder, 190 cases of major depressive disorder, and 307 shared controls. We use pathway-based approaches to predict psychiatric disease diagnosis with a random forest model (78% accuracy) and derive important features to augment available drug and disease signatures. Using protein-protein-interaction networks and embedding-based methods, we build a pipeline to prioritize treatments for psychiatric diseases that achieves a 3.4-fold improvement over a background model. Thus, we demonstrate that gene-expression-derived pathway features can diagnose psychiatric diseases and that molecular insights derived from this classification task can inform treatment prioritization for psychiatric diseases.

11.
Elife ; 8, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31674906

ABSTRACT

The small molecule Retro-2 prevents ricin toxicity through a poorly-defined mechanism of action (MOA), which involves halting retrograde vesicle transport to the endoplasmic reticulum (ER). CRISPRi genetic interaction analysis revealed Retro-2 activity resembles disruption of the transmembrane domain recognition complex (TRC) pathway, which mediates post-translational ER-targeting and insertion of tail-anchored (TA) proteins, including SNAREs required for retrograde transport. Cell-based and in vitro assays show that Retro-2 blocks delivery of newly-synthesized TA-proteins to the ER-targeting factor ASNA1 (TRC40). An ASNA1 point mutant identified using CRISPR-mediated mutagenesis abolishes both the cytoprotective effect of Retro-2 against ricin and its inhibitory effect on ASNA1-mediated ER-targeting. Together, our work explains how Retro-2 prevents retrograde trafficking of toxins by inhibiting TA-protein targeting, describes a general CRISPR strategy for predicting the MOA of small molecules, and paves the way for drugging the TRC pathway to treat broad classes of viruses known to be inhibited by Retro-2.

12.
Nat Commun ; 10(1): 4941, 2019 10 30.
Article in English | MEDLINE | ID: mdl-31666519

ABSTRACT

Protein-RNA interaction plays important roles in post-transcriptional regulation. However, the task of predicting these interactions given a protein structure is difficult. Here we show that, by leveraging a deep learning model NucleicNet, attributes such as binding preference of RNA backbone constituents and different bases can be predicted from local physicochemical characteristics of protein structure surface. On a diverse set of challenging RNA-binding proteins, including Fem-3-binding-factor 2, Argonaute 2 and Ribonuclease III, NucleicNet can accurately recover interaction modes discovered by structural biology experiments. Furthermore, we show that, without seeing any in vitro or in vivo assay data, NucleicNet can still achieve consistency with experiments, including RNAcompete, Immunoprecipitation Assay, and siRNA Knockdown Benchmark. NucleicNet can thus serve to provide quantitative fitness of RNA sequences for given binding pockets or to predict potential binding pockets and binding RNAs for previously unknown RNA binding proteins.

13.
Circ Cardiovasc Qual Outcomes ; 12(10): e005595, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31610712

ABSTRACT

BACKGROUND: Atrial fibrillation (AF) increases the risk of stroke 5-fold and there is rising interest to determine if AF severity or burden can further risk stratify these patients, particularly for near-term events. Using continuous remote monitoring data from cardiac implantable electronic devices, we sought to evaluate if machine learned signatures of AF burden could provide prognostic information on near-term risk of stroke when compared to conventional risk scores. METHODS AND RESULTS: We retrospectively identified Veterans Health Administration serviced patients with cardiac implantable electronic device remote monitoring data and at least one day of device-registered AF. The first 30 days of remote monitoring in nonstroke controls were compared against the past 30 days of remote monitoring before stroke in cases. We trained 3 types of models on our data: (1) convolutional neural networks, (2) random forest, and (3) L1 regularized logistic regression (LASSO). We calculated the CHA2DS2-VASc score for each patient and compared its performance against machine learned indices based on AF burden in separate test cohorts. Finally, we investigated the effect of combining our AF burden models with CHA2DS2-VASc. We identified 3114 nonstroke controls and 71 stroke cases, with no significant differences in baseline characteristics. Random forest performed the best in the test data set (area under the curve [AUC]=0.662) and convolutional neural network in the validation data set (AUC=0.702), whereas CHA2DS2-VASc had an AUC of 0.5 or less in both data sets. Combining CHA2DS2-VASc with random forest and convolutional neural network yielded a validation AUC of 0.696 and test AUC of 0.634, yielding the highest average AUC on nontraining data. CONCLUSIONS: This proof-of-concept study found that machine learning and ensemble methods that incorporate daily AF burden signature provided incremental prognostic value for risk stratification beyond CHA2DS2-VASc for near-term risk of stroke.
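
For readers unfamiliar with the baseline used in this study, the sketch below computes the CHA2DS2-VASc score from its standard published point assignments (congestive heart failure 1, hypertension 1, age 75 or older 2, diabetes 1, prior stroke/TIA/thromboembolism 2, vascular disease 1, age 65 to 74 1, female sex 1). The `Patient` class and its field names are hypothetical, and nothing here reflects how the authors implemented the score or their machine-learned models.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only.
    age: int
    female: bool
    congestive_heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia_thromboembolism: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """Return the CHA2DS2-VASc point total (0 to 9)."""
    score = 0
    score += 1 if p.congestive_heart_failure else 0          # C
    score += 1 if p.hypertension else 0                      # H
    if p.age >= 75:                                          # A2: age 75 or older
        score += 2
    elif p.age >= 65:                                        # A: age 65 to 74
        score += 1
    score += 1 if p.diabetes else 0                          # D
    score += 2 if p.prior_stroke_tia_thromboembolism else 0  # S2
    score += 1 if p.vascular_disease else 0                  # VA
    score += 1 if p.female else 0                            # Sc
    return score


example = cha2ds2_vasc(Patient(age=78, female=True, hypertension=True))
```

Because the score depends only on static clinical characteristics, it cannot reflect day-to-day changes in AF burden, which is the information the device-based models in the abstract draw on.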

14.
J Chem Inf Model ; 59(10): 4131-4149, 2019 Oct 28.
Article in English | MEDLINE | ID: mdl-31580672

ABSTRACT

Accurate determination of target-ligand interactions is crucial in the drug discovery process. In this paper, we propose a graph-convolutional (Graph-CNN) framework for predicting protein-ligand interactions. First, we built an unsupervised graph-autoencoder to learn fixed-size representations of protein pockets from a set of representative druggable protein binding sites. Second, we trained two Graph-CNNs to automatically extract features from pocket graphs and 2D ligand graphs, respectively, driven by binding classification labels. We demonstrate that graph-autoencoders can learn fixed-size representations for protein pockets of varying sizes and the Graph-CNN framework can effectively capture protein-ligand binding interactions without relying on target-ligand complexes. Across several metrics, Graph-CNNs achieved better or comparable performance to 3DCNN ligand-scoring, AutoDock Vina, RF-Score, and NNScore on common virtual screening benchmark data sets. Visualization of key pocket residues and ligand atoms contributing to the classification decisions confirms that our networks are able to detect important interface residues and ligand atoms within the pockets and ligands, respectively.

15.
Nat Biotechnol ; 37(11): 1332-1343, 2019 11.
Article in English | MEDLINE | ID: mdl-31611695

ABSTRACT

Accurate prediction of antigen presentation by human leukocyte antigen (HLA) class II molecules would be valuable for vaccine development and cancer immunotherapies. Current computational methods trained on in vitro binding data are limited by insufficient training data and algorithmic constraints. Here we describe MARIA (major histocompatibility complex analysis with recurrent integrated architecture; https://maria.stanford.edu/ ), a multimodal recurrent neural network for predicting the likelihood of antigen presentation from a gene of interest in the context of specific HLA class II alleles. In addition to in vitro binding measurements, MARIA is trained on peptide HLA ligand sequences identified by mass spectrometry, expression levels of antigen genes and protease cleavage signatures. Because it leverages these diverse training data and our improved machine learning framework, MARIA (area under the curve = 0.89-0.92) outperformed existing methods in validation datasets. Across independent cancer neoantigen studies, peptides with high MARIA scores are more likely to elicit strong CD4+ T cell responses. MARIA allows identification of immunogenic epitopes in diverse cancers and autoimmune disease.


Subjects
CD4-Positive T-Lymphocytes/immunology ; Computational Biology/methods ; Histocompatibility Antigens Class II/genetics ; Antigen Presentation ; Deep Learning ; Histocompatibility Antigens Class II/chemistry ; Humans ; K562 Cells ; Mass Spectrometry ; Peptides/metabolism ; Sequence Analysis, RNA

16.
J Biomed Inform ; 99: 103307, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31627020

ABSTRACT

Social media has been identified as a promising potential source of information for pharmacovigilance. The adoption of social media data has been hindered by the massive and noisy nature of the data. Initial attempts to use social media data have relied on exact text matches to drugs of interest, and therefore suffer from the gap between formal drug lexicons and the informal nature of social media. The Reddit comment archive represents an ideal corpus for bridging this gap. We trained a word embedding model, RedMed, to facilitate the identification and retrieval of health entities from Reddit data. We compare the performance of our model trained on a consumer-generated corpus against publicly available models trained on expert-generated corpora. Our automated classification pipeline achieves an accuracy of 0.88 and a specificity of >0.9 across four different term classes. Of all drug mentions, an average of 79% (±0.5%) were exact matches to a generic or trademark drug name, 14% (±0.5%) were misspellings, 6.4% (±0.3%) were synonyms, and 0.13% (±0.05%) were pill marks. We find that our system captures an additional 20% of mentions; these would have been missed by approaches that rely solely on exact string matches. We provide a lexicon of misspellings and synonyms for 2978 drugs and a word embedding model trained on a health-oriented subset of Reddit.
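
The reported breakdown of drug mentions can be checked with simple arithmetic. The sketch below sums the non-exact categories, on the assumption (ours, not stated by the authors) that the "additional 20%" refers to the mentions that are misspellings, synonyms, or pill marks. The dictionary keys are illustrative labels; the percentages are the point estimates quoted in the abstract.

```python
# Share of all drug mentions by match type, in percent (from the abstract).
mention_share = {
    "exact match": 79.0,
    "misspelling": 14.0,
    "synonym": 6.4,
    "pill mark": 0.13,
}

# Mentions that an exact-string matcher would miss (assumed interpretation).
non_exact = sum(v for k, v in mention_share.items() if k != "exact match")

# The four rounded categories do not sum to exactly 100%.
total = sum(mention_share.values())

print(f"non-exact mentions: {non_exact:.2f}%")
print(f"sum of all categories: {total:.2f}%")
```

Under that reading the non-exact share comes to about 20.5%, consistent with the rounded figure in the abstract; the small shortfall of the total from 100% is expected given that each category is reported as a rounded average.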

17.
Clin Pharmacol Ther ; 2019 Sep 28.
Article in English | MEDLINE | ID: mdl-31562770

ABSTRACT

The 21st Century Cures Act passed by the United States Congress mandates the US Food and Drug Administration to develop guidance to evaluate the use of real-world evidence (RWE) to support the regulatory process. RWE has generated important medical discoveries, especially in areas where traditional clinical trials would be unethical or infeasible. However, RWE suffers from several issues that hinder its ability to provide proof of treatment efficacy at a level comparable to randomized controlled trials. In this review article, we summarized the advantages and limitations of RWE, identified the key opportunities for RWE, and pointed the way forward to maximize the potential of RWE for regulatory purposes.

18.
Semin Cutan Med Surg ; 38(1): E19-E24, 2019 Mar 01.
Article in English | MEDLINE | ID: mdl-31051019

ABSTRACT

Pharmacogenomics aims to associate human genetic variability with differences in drug phenotypes in order to tailor drug treatment to individual patients. The massive amount of genetic data generated from large cohorts of patients with variable drug phenotypes have led to advances in this field. Understanding the application of pharmacogenomics in dermatology could inform clinical practice and provide insight for future research. The Pharmacogenomics Knowledge Base and the Clinical Pharmacogenetics Implementation Consortium are among the resources to help clinicians and researchers navigate the many gene-drug associations that have already been discovered. The implementation of clinical pharmacogenomics within health care systems remains an area of ongoing development. This review provides an introduction to the field of pharmacogenomics and to current pharmacogenomics resources using examples of gene-drug associations relevant to the field of dermatology.


Subjects
Databases, Factual ; Pharmacogenetics ; Skin Diseases/drug therapy ; Dermatologic Agents/adverse effects ; Dermatologic Agents/therapeutic use ; Drug-Related Side Effects and Adverse Reactions/prevention & control ; Humans ; Precision Medicine

19.
Bioinformatics ; 35(9): 1503-1512, 2019 May 01.
Article in English | MEDLINE | ID: mdl-31051039

ABSTRACT

MOTIVATION: Accurate annotation of protein functions is fundamental for understanding molecular and cellular physiology. Data-driven methods hold promise for systematically deriving rules underlying the relationship between protein structure and function. However, the choice of protein structural representation is critical. Pre-defined biochemical features emphasize certain aspects of protein properties while ignoring others, and therefore may fail to capture critical information in complex protein sites. RESULTS: In this paper, we present a general framework that applies 3D convolutional neural networks (3DCNNs) to structure-based protein functional site detection. The framework can extract task-dependent features automatically from the raw atom distributions. We benchmarked our method against other methods and demonstrate better or comparable performance for site detection. Our deep 3DCNNs achieved an average recall of 0.955 at a precision threshold of 0.99 on PROSITE families, detected 98.89 and 92.88% of nitric oxide synthase and TRYPSIN-like enzyme sites in Catalytic Site Atlas, and showed good performance on challenging cases where sequence motifs are absent but a function is known to exist. Finally, we inspected the individual contributions of each atom to the classification decisions and show that our models successfully recapitulate known 3D features within protein functional sites. AVAILABILITY AND IMPLEMENTATION: The 3DCNN models described in this paper are available at https://simtk.org/projects/fscnn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

20.
Bioinformatics ; 35(21): 4504-4506, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31114840

ABSTRACT

SUMMARY: Limited efficacy and intolerable safety limit therapeutic development, and identification of potential liabilities earlier in development could significantly improve this process. Computational approaches which aggregate data from multiple sources and consider the drug's pathway effects could add to identification of these liabilities earlier. Such computational methods must be accessible to a variety of users beyond computational scientists, especially regulators and industry scientists, in order to impact the therapeutic development process. We have previously developed and published PathFX, an algorithm for identifying drug networks and phenotypes for understanding drug associations to safety and efficacy. Here we present a streamlined and easy-to-use PathFX web application that allows users to search for drug networks and associated phenotypes. We have also added visualization and phenotype clustering to improve functionality and interpretability of PathFXweb. AVAILABILITY AND IMPLEMENTATION: https://www.pathfxweb.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
