1.
Nat Commun ; 15(1): 2026, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38467600

ABSTRACT

Timely detection of Barrett's esophagus, the pre-malignant condition of esophageal adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic minimally invasive procedure, has been used for diagnosing intestinal metaplasia in Barrett's. However, it depends on a pathologist's assessment of two slides stained with H&E and the immunohistochemical biomarker TFF3. This resource-intensive clinical workflow limits large-scale screening in the at-risk population. To improve screening capacity, we propose a deep learning approach for detecting Barrett's from routinely stained H&E slides. The approach relies solely on diagnostic labels, eliminating the need for expensive localized expert annotations. We train and independently validate our approach on two clinical trial datasets, totaling 1866 patients. We achieve 91.4% and 87.3% AUROCs on discovery and external test datasets for the H&E model, comparable to the TFF3 model. Our proposed semi-automated clinical workflow can reduce pathologists' workload to 48% without sacrificing diagnostic performance, enabling pathologists to prioritize high-risk cases.


Subjects
Adenocarcinoma , Barrett Esophagus , Deep Learning , Esophageal Neoplasms , Humans , Barrett Esophagus/diagnosis , Barrett Esophagus/pathology , Esophageal Neoplasms/diagnosis , Esophageal Neoplasms/pathology , Adenocarcinoma/diagnosis , Adenocarcinoma/pathology , Metaplasia
2.
Nat Commun ; 13(1): 1161, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-35246539

ABSTRACT

Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have a confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This work advocates for a data-driven approach to prioritising samples for re-annotation, which we term "active label cleaning". We propose to rank instances according to estimated label correctness and labelling difficulty of each sample, and introduce a simulation framework to evaluate relabelling efficacy. Our experiments on natural images and on a specifically devised medical imaging benchmark show that cleaning noisy labels mitigates their negative impact on model training, evaluation, and selection. Crucially, the proposed approach enables correcting labels up to 4× more effectively than typical random selection in realistic conditions, making better use of experts' valuable time for improving dataset quality.


Subjects
Diagnostic Imaging , Machine Learning , Benchmarking , Data Curation , Delivery of Health Care
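
The abstract of entry 2 describes ranking samples for re-annotation by estimated label correctness and labelling difficulty, without giving a formula. A minimal sketch of one plausible scoring rule follows. It is an assumption for illustration, not taken from the abstract: the cross-entropy between the given label and a model's predicted class probabilities (high when the label looks wrong) minus the entropy of those probabilities (high when the sample is ambiguous). The names `relabel_priority` and `rank_for_relabelling` are hypothetical.

```python
import math


def relabel_priority(given_label, posterior, eps=1e-12):
    """Assumed score: cross-entropy(given label, posterior) - entropy(posterior).

    A high score means the model confidently disagrees with the given label.
    """
    cross_entropy = -math.log(max(posterior[given_label], eps))
    entropy = -sum(p * math.log(p) for p in posterior if p > 0.0)
    return cross_entropy - entropy


def rank_for_relabelling(labels, posteriors):
    """Return sample indices ordered from most to least urgent to re-annotate."""
    scores = [relabel_priority(y, p) for y, p in zip(labels, posteriors)]
    return sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)


# Toy data: three samples, two classes (illustrative values only).
labels = [0, 1, 0]
posteriors = [
    [0.95, 0.05],  # model is confident and agrees with the label
    [0.90, 0.10],  # model is confident and disagrees with the label
    [0.50, 0.50],  # model is undecided
]
order = rank_for_relabelling(labels, posteriors)
```

Under this assumed rule, the confidently contradicted sample is reviewed first and the confidently confirmed sample last, so a fixed annotation budget is spent where a label error is most likely.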
3.
JAMA Netw Open ; 3(11): e2027426, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33252691

ABSTRACT

Importance: Personalized radiotherapy planning depends on high-quality delineation of target tumors and surrounding organs at risk (OARs). This process puts additional time burdens on oncologists and introduces variability among both experts and institutions. Objective: To explore clinically acceptable autocontouring solutions that can be integrated into existing workflows and used in different domains of radiotherapy. Design, Setting, and Participants: This quality improvement study used a multicenter imaging data set comprising 519 pelvic and 242 head and neck computed tomography (CT) scans from 8 distinct clinical sites and patients diagnosed either with prostate or head and neck cancer. The scans were acquired as part of treatment dose planning from patients who received intensity-modulated radiation therapy between October 2013 and February 2020. Fifteen different OARs were manually annotated by expert readers and radiation oncologists. The models were trained on a subset of the data set to automatically delineate OARs and evaluated on both internal and external data sets. Data analysis was conducted October 2019 to September 2020. Main Outcomes and Measures: The autocontouring solution was evaluated on external data sets, and its accuracy was quantified with volumetric agreement and surface distance measures. Models were benchmarked against expert annotations in an interobserver variability (IOV) study. Clinical utility was evaluated by measuring time spent on manual corrections and annotations from scratch. Results: A total of 519 participants' (519 [100%] men; 390 [75%] aged 62-75 years) pelvic CT images and 242 participants' (184 [76%] men; 194 [80%] aged 50-73 years) head and neck CT images were included. 
The models achieved levels of clinical accuracy within the bounds of expert IOV for 13 of 15 structures (eg, left femur, κ = 0.982; brainstem, κ = 0.806) and performed consistently well across both external and internal data sets (eg, mean [SD] Dice score for left femur, internal vs external data sets: 98.52% [0.50] vs 98.04% [1.02]; P = .04). The correction time of autogenerated contours on 10 head and neck and 10 prostate scans was measured as a mean of 4.98 (95% CI, 4.44-5.52) min/scan and 3.40 (95% CI, 1.60-5.20) min/scan, respectively, to ensure clinically accepted accuracy. Manual segmentation of the head and neck took a mean 86.75 (95% CI, 75.21-92.29) min/scan for an expert reader and 73.25 (95% CI, 68.68-77.82) min/scan for a radiation oncologist. The autogenerated contours represented a 93% reduction in time. Conclusions and Relevance: In this study, the models achieved levels of clinical accuracy within expert IOV while reducing manual contouring time and performing consistently well across previously unseen heterogeneous data sets. With the availability of open-source libraries and reliable performance, this creates significant opportunities for the transformation of radiation treatment planning.


Subjects
Deep Learning/statistics & numerical data , Head and Neck Neoplasms/radiotherapy , Prostatic Neoplasms/radiotherapy , Radiotherapy, Image-Guided/instrumentation , Aged , Head and Neck Neoplasms/diagnostic imaging , Humans , Male , Middle Aged , Neural Networks, Computer , Observer Variation , Organs at Risk/radiation effects , Prostatic Neoplasms/diagnostic imaging , Quality Improvement/standards , Radiotherapy, Image-Guided/methods , Radiotherapy, Intensity-Modulated/methods , Reproducibility of Results , Tomography, X-Ray Computed/methods
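
The abstract of entry 3 reports volumetric agreement as a Dice score. As a rough illustration of how that measure is computed, the sketch below treats each contour as a set of voxel coordinates and returns twice the overlap divided by the sum of the two volumes. The function name `dice_score` and the toy masks are hypothetical; the study's own evaluation code is not shown in the abstract.

```python
def dice_score(predicted_voxels, reference_voxels):
    """Dice coefficient between two segmentations given as voxel collections.

    Returns 1.0 for identical masks and 0.0 for disjoint ones.
    """
    predicted = set(predicted_voxels)
    reference = set(reference_voxels)
    if not predicted and not reference:
        return 1.0  # two empty masks agree perfectly by convention
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))


# Toy 2D example: an automatic contour and a manual contour (illustrative only).
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual_contour = {(0, 1), (1, 1), (1, 2)}
score = dice_score(auto_contour, manual_contour)  # 2 * 2 / (4 + 3)
```

Multiplying the result by 100 gives the percentage form used in the abstract.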
4.
OMICS ; 8(2): 176-88, 2004.
Article in English | MEDLINE | ID: mdl-15268775

ABSTRACT

In recent years, graphical models have become an increasingly important tool for the structural analysis of genome-wide expression profiles at the systems level. Here we present a new graphical modelling technique, which is based on decomposable graphical models, and apply it to a set of gene expression profiles from acute lymphoblastic leukemia (ALL). The new method explains probabilistic dependencies of expression levels in terms of the concerted action of underlying genetic functional modules, which are represented as so-called "cliques" in the graph. In addition, the method uses continuous-valued (instead of discretized) expression levels, and makes no particular assumption about their probability distribution. We show that the method successfully groups members of known functional modules to cliques. Our method allows the evaluation of the importance of genes for global cellular functions based on both link count and the clique membership count.


Subjects
Gene Expression Profiling , Models, Theoretical , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Genome, Human , Humans , Oligonucleotide Array Sequence Analysis
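
The abstract of entry 4 scores gene importance by link count and clique membership count. The sketch below shows how those two counts could be read off a graph once it is available. It assumes an undirected gene graph given as an adjacency dictionary and enumerates maximal cliques with the basic Bron-Kerbosch recursion; this is a generic stand-in, not the decomposable-model fitting procedure of the paper, and the gene names are made up.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for vertex in list(candidates):
            neighbours = adjacency[vertex]
            expand(current | {vertex}, candidates & neighbours, excluded & neighbours)
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(set(), set(adjacency), set())
    return cliques


def gene_importance(adjacency):
    """Return (link count, clique membership count) per gene."""
    cliques = maximal_cliques(adjacency)
    return {
        gene: (len(neighbours), sum(1 for clique in cliques if gene in clique))
        for gene, neighbours in adjacency.items()
    }


# Hypothetical graph: A, B, C form one module; C also links to D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
cliques = maximal_cliques(graph)
importance = gene_importance(graph)
```

In this toy graph, gene C has the most links and is the only gene belonging to two cliques, so it would rank highest on both counts.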
5.
IEEE Trans Biomed Eng ; 50(3): 375-82, 2003 Mar.
Article in English | MEDLINE | ID: mdl-12669994

ABSTRACT

We describe a classification system for a novel imaging method for arthritic finger joints. The basis of this system is a laser imaging technique which is sensitive to the optical characteristics of finger joint tissue. From the laser images acquired at baseline and follow-up, finger joints can automatically be classified according to whether the inflammatory status has improved or worsened. To perform the classification task, various linear and kernel-based systems were implemented and their performances were compared. Based on the results presented in this paper, we conclude that the laser-based imaging permits a reliable classification of pathological finger joints, making it a sensitive method for detecting arthritic changes.


Subjects
Arthritis, Rheumatoid/classification , Arthritis, Rheumatoid/diagnosis , Finger Joint/pathology , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Lasers , Algorithms , Expert Systems , Humans , Observer Variation , Pattern Recognition, Automated , Reproducibility of Results , Sensitivity and Specificity
6.
Comb Chem High Throughput Screen ; 12(5): 453-68, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19519325

ABSTRACT

A large number of different machine learning methods can potentially be used for ligand-based virtual screening. In our contribution, we focus on three specific nonlinear methods, namely support vector regression, Gaussian process models, and decision trees. For each of these methods, we provide a short and intuitive introduction. In particular, we will also discuss how confidence estimates (error bars) can be obtained from these methods. We continue with important aspects for model building and evaluation, such as methodologies for model selection, evaluation, performance criteria, and how the quality of error bar estimates can be verified. Besides an introduction to the respective methods, we will also point to available implementations, and discuss important issues for the practical application.


Subjects
Artificial Intelligence , Drug Discovery , Models, Statistical , Algorithms , Computer Simulation , Decision Trees , Ligands , Models, Chemical , Normal Distribution , Quantitative Structure-Activity Relationship
7.
J Chem Inf Model ; 48(4): 785-96, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18327900

ABSTRACT

Metabolic stability is an important property of drug molecules that should, optimally, be taken into account early on in the drug design process. Along with numerous medium- or high-throughput assays being implemented in early drug discovery, a prediction tool for this property could be of high value. However, metabolic stability is inherently difficult to predict, and no commercial tools are available for this purpose. In this work, we present a machine learning approach to predicting metabolic stability that is tailored to compounds from the drug development process at Bayer Schering Pharma. For four different in vitro assays, we develop Bayesian classification models to predict the probability of a compound being metabolically stable. The chosen approach implicitly takes the "domain of applicability" into account. The developed models were validated on recent project data at Bayer Schering Pharma, showing that the predictions are highly accurate and the domain of applicability is estimated correctly. Furthermore, we evaluate the modeling method on a set of publicly available data.


Subjects
Probability , Algorithms , Bayes Theorem , Drug Design
8.
Mol Pharm ; 4(4): 524-38, 2007.
Article in English | MEDLINE | ID: mdl-17637064

ABSTRACT

Unfavorable lipophilicity and water solubility cause many drug failures; therefore, these properties have to be taken into account early on in lead discovery. Commercial tools for predicting lipophilicity usually have been trained on small and neutral molecules, and are thus often unable to accurately predict in-house data. Using a modern Bayesian machine learning algorithm, a Gaussian process model, this study constructs a log D7 model based on 14,556 drug discovery compounds of Bayer Schering Pharma. Performance is compared with support vector machines, decision trees, ridge regression, and four commercial tools. In a blind test on 7013 new measurements from the last months (including compounds from new projects), 81% were predicted correctly within 1 log unit, compared to only 44% achieved by commercial software. Additional evaluations using public data are presented. We consider error bars for each method (model-based error bars, ensemble-based, and distance-based approaches), and investigate how well they quantify the domain of applicability of each model.


Subjects
Artificial Intelligence , Lipids/chemistry , Models, Chemical , Pharmaceutical Preparations/chemistry , Algorithms , Bayes Theorem , Decision Trees , Models, Statistical , Molecular Structure , Reproducibility of Results
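
The abstract of entry 8 summarises accuracy as the share of compounds predicted within 1 log unit of the measured value. A minimal sketch of that metric follows; the function name `fraction_within` and the sample values are hypothetical and do not come from the study's data.

```python
def fraction_within(predicted, measured, tolerance=1.0):
    """Share of predictions whose absolute error is at most `tolerance` log units."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not predicted:
        raise ValueError("at least one compound is required")
    hits = sum(1 for p, m in zip(predicted, measured) if abs(p - m) <= tolerance)
    return hits / len(predicted)


# Made-up log D values for four compounds (illustrative only).
predicted_log_d = [1.2, 3.4, 0.5, 2.0]
measured_log_d = [1.0, 1.9, 0.9, 3.5]
accuracy = fraction_within(predicted_log_d, measured_log_d)
```

Here two of the four predictions fall within the tolerance; the abstract's 81% and 44% figures are the same kind of ratio computed over 7013 measurements.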
9.
J Chem Inf Model ; 47(2): 407-24, 2007.
Article in English | MEDLINE | ID: mdl-17243756

ABSTRACT

Accurate in silico models for predicting aqueous solubility are needed in drug design and discovery and many other areas of chemical research. We present a statistical modeling of aqueous solubility based on measured data, using a Gaussian Process nonlinear regression model (GPsol). We compare our results with those of 14 scientific studies and 6 commercial tools. This shows that the developed model achieves much higher accuracy than available commercial tools for the prediction of solubility of electrolytes. On top of the high accuracy, the proposed machine learning model also provides error bars for each individual prediction.


Subjects
Models, Chemical , Neural Networks, Computer , Computer Simulation , Electrolytes , Molecular Structure , Solubility
10.
J Comput Aided Mol Des ; 21(12): 651-64, 2007 Dec.
Article in English | MEDLINE | ID: mdl-18060505

ABSTRACT

We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble-based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and of how faithfully the individual error bars represent the actual prediction error.


Subjects
Artificial Intelligence , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship , Water/chemistry , Algorithms , Drug Design , Solubility
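
The abstract of entry 10 asks whether individual error bars faithfully represent the actual prediction error. One common way to check this, sketched below under the assumption that each error bar is a Gaussian standard deviation, is to compare the share of measurements falling within k standard deviations of the prediction against the share a Gaussian would imply. The function names and numbers are hypothetical; the abstract does not state which check the authors used.

```python
import math


def empirical_coverage(predicted, sigmas, measured, k=1.0):
    """Share of measurements lying within k predicted standard deviations."""
    inside = sum(
        1
        for mean, sigma, value in zip(predicted, sigmas, measured)
        if abs(value - mean) <= k * sigma
    )
    return inside / len(predicted)


def gaussian_coverage(k=1.0):
    """Share expected within k standard deviations if errors are Gaussian."""
    return math.erf(k / math.sqrt(2.0))


# Made-up solubility predictions with error bars (illustrative only).
predicted = [0.0, 1.0, 2.0, 3.0]
sigmas = [0.5, 0.5, 1.0, 1.0]
measured = [0.3, 1.8, 2.5, 3.2]

observed = empirical_coverage(predicted, sigmas, measured)
expected = gaussian_coverage()
```

An observed share well below the expected one suggests error bars that are too narrow, and a share well above it suggests bars that are too wide; with only four points, as here, the comparison is purely illustrative.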
11.
J Comput Aided Mol Des ; 21(9): 485-98, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17632688

ABSTRACT

We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble-based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and of how faithfully the individual error bars represent the actual prediction error.


Subjects
Artificial Intelligence , Models, Chemical , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship , Algorithms , Bayes Theorem , Models, Statistical , Molecular Structure , Solubility