Search | VHL Regional Portal

1.

Enabling large-scale screening of Barrett's esophagus using weakly supervised deep learning in histopathology.

Bouzid, Kenza; Sharma, Harshita; Killcoyne, Sarah; Castro, Daniel C; Schwaighofer, Anton; Ilse, Max; Salvatelli, Valentina; Oktay, Ozan; Murthy, Sumanth; Bordeaux, Lucas; Moore, Luiza; O'Donovan, Maria; Thieme, Anja; Nori, Aditya; Gehrung, Marcel; Alvarez-Valle, Javier.

Nat Commun ; 15(1): 2026, 2024 Mar 11.

Article in English | MEDLINE | ID: mdl-38467600

ABSTRACT

Timely detection of Barrett's esophagus, the pre-malignant condition of esophageal adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic minimally invasive procedure, has been used for diagnosing intestinal metaplasia in Barrett's. However, it depends on a pathologist's assessment of two slides stained with H&E and the immunohistochemical biomarker TFF3. This resource-intensive clinical workflow limits large-scale screening in the at-risk population. To improve screening capacity, we propose a deep learning approach for detecting Barrett's from routinely stained H&E slides. The approach solely relies on diagnostic labels, eliminating the need for expensive localized expert annotations. We train and independently validate our approach on two clinical trial datasets, totaling 1866 patients. We achieve 91.4% and 87.3% AUROCs on discovery and external test datasets for the H&E model, comparable to the TFF3 model. Our proposed semi-automated clinical workflow can reduce pathologists' workload to 48% without sacrificing diagnostic performance, enabling pathologists to prioritize high-risk cases.

Subject(s)

Adenocarcinoma , Barrett Esophagus , Deep Learning , Esophageal Neoplasms , Humans , Barrett Esophagus/diagnosis , Barrett Esophagus/pathology , Esophageal Neoplasms/diagnosis , Esophageal Neoplasms/pathology , Adenocarcinoma/diagnosis , Adenocarcinoma/pathology , Metaplasia

2.

Active label cleaning for improved dataset quality under resource constraints.

Bernhardt, Mélanie; Castro, Daniel C; Tanno, Ryutaro; Schwaighofer, Anton; Tezcan, Kerem C; Monteiro, Miguel; Bannur, Shruthi; Lungren, Matthew P; Nori, Aditya; Glocker, Ben; Alvarez-Valle, Javier; Oktay, Ozan.

Nat Commun ; 13(1): 1161, 2022 Mar 04.

Article in English | MEDLINE | ID: mdl-35246539

ABSTRACT

Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have a confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This work advocates for a data-driven approach to prioritising samples for re-annotation, which we term "active label cleaning". We propose to rank instances according to estimated label correctness and labelling difficulty of each sample, and introduce a simulation framework to evaluate relabelling efficacy. Our experiments on natural images and on a specifically-devised medical imaging benchmark show that cleaning noisy labels mitigates their negative impact on model training, evaluation, and selection. Crucially, the proposed approach enables correcting labels up to 4× more effectively than typical random selection in realistic conditions, making better use of experts' valuable time for improving dataset quality.

Subject(s)

Diagnostic Imaging , Machine Learning , Benchmarking , Data Curation , Delivery of Health Care

3.

Evaluation of Deep Learning to Augment Image-Guided Radiotherapy for Head and Neck and Prostate Cancers.

Oktay, Ozan; Nanavati, Jay; Schwaighofer, Anton; Carter, David; Bristow, Melissa; Tanno, Ryutaro; Jena, Rajesh; Barnett, Gill; Noble, David; Rimmer, Yvonne; Glocker, Ben; O'Hara, Kenton; Bishop, Christopher; Alvarez-Valle, Javier; Nori, Aditya.

JAMA Netw Open ; 3(11): e2027426, 2020 Nov 02.

Article in English | MEDLINE | ID: mdl-33252691

ABSTRACT

Importance: Personalized radiotherapy planning depends on high-quality delineation of target tumors and surrounding organs at risk (OARs). This process puts additional time burdens on oncologists and introduces variability among both experts and institutions. Objective: To explore clinically acceptable autocontouring solutions that can be integrated into existing workflows and used in different domains of radiotherapy. Design, Setting, and Participants: This quality improvement study used a multicenter imaging data set comprising 519 pelvic and 242 head and neck computed tomography (CT) scans from 8 distinct clinical sites and patients diagnosed either with prostate or head and neck cancer. The scans were acquired as part of treatment dose planning from patients who received intensity-modulated radiation therapy between October 2013 and February 2020. Fifteen different OARs were manually annotated by expert readers and radiation oncologists. The models were trained on a subset of the data set to automatically delineate OARs and evaluated on both internal and external data sets. Data analysis was conducted October 2019 to September 2020. Main Outcomes and Measures: The autocontouring solution was evaluated on external data sets, and its accuracy was quantified with volumetric agreement and surface distance measures. Models were benchmarked against expert annotations in an interobserver variability (IOV) study. Clinical utility was evaluated by measuring time spent on manual corrections and annotations from scratch. Results: A total of 519 participants' (519 [100%] men; 390 [75%] aged 62-75 years) pelvic CT images and 242 participants' (184 [76%] men; 194 [80%] aged 50-73 years) head and neck CT images were included. The models achieved levels of clinical accuracy within the bounds of expert IOV for 13 of 15 structures (eg, left femur, κ = 0.982; brainstem, κ = 0.806) and performed consistently well across both external and internal data sets (eg, mean [SD] Dice score for left femur, internal vs external data sets: 98.52% [0.50] vs 98.04% [1.02]; P = .04). The correction time of autogenerated contours on 10 head and neck and 10 prostate scans was measured as a mean of 4.98 (95% CI, 4.44-5.52) min/scan and 3.40 (95% CI, 1.60-5.20) min/scan, respectively, to ensure clinically accepted accuracy. Manual segmentation of the head and neck took a mean 86.75 (95% CI, 75.21-98.29) min/scan for an expert reader and 73.25 (95% CI, 68.68-77.82) min/scan for a radiation oncologist. The autogenerated contours represented a 93% reduction in time. Conclusions and Relevance: In this study, the models achieved levels of clinical accuracy within expert IOV while reducing manual contouring time and performing consistently well across previously unseen heterogeneous data sets. With the availability of open-source libraries and reliable performance, this creates significant opportunities for the transformation of radiation treatment planning.

Subject(s)

Deep Learning/statistics & numerical data , Head and Neck Neoplasms/radiotherapy , Prostatic Neoplasms/radiotherapy , Radiotherapy, Image-Guided/instrumentation , Aged , Head and Neck Neoplasms/diagnostic imaging , Humans , Male , Middle Aged , Neural Networks, Computer , Observer Variation , Organs at Risk/radiation effects , Prostatic Neoplasms/diagnostic imaging , Quality Improvement/standards , Radiotherapy, Image-Guided/methods , Radiotherapy, Intensity-Modulated/methods , Reproducibility of Results , Tomography, X-Ray Computed/methods

4.

How wrong can we get? A review of machine learning approaches and error bars.

Schwaighofer, Anton; Schroeter, Timon; Mika, Sebastian; Blanchard, Gilles.

Comb Chem High Throughput Screen ; 12(5): 453-68, 2009 Jun.

Article in English | MEDLINE | ID: mdl-19519325

ABSTRACT

A large number of different machine learning methods can potentially be used for ligand-based virtual screening. In our contribution, we focus on three specific nonlinear methods, namely support vector regression, Gaussian process models, and decision trees. For each of these methods, we provide a short and intuitive introduction. In particular, we will also discuss how confidence estimates (error bars) can be obtained from these methods. We continue with important aspects for model building and evaluation, such as methodologies for model selection, evaluation, performance criteria, and how the quality of error bar estimates can be verified. Besides an introduction to the respective methods, we will also point to available implementations, and discuss important issues for the practical application.

Subject(s)

Artificial Intelligence , Drug Discovery , Models, Statistical , Algorithms , Computer Simulation , Decision Trees , Ligands , Models, Chemical , Normal Distribution , Quantitative Structure-Activity Relationship

5.

A probabilistic approach to classifying metabolic stability.

Schwaighofer, Anton; Schroeter, Timon; Mika, Sebastian; Hansen, Katja; Ter Laak, Antonius; Lienau, Philip; Reichel, Andreas; Heinrich, Nikolaus; Müller, Klaus-Robert.

J Chem Inf Model ; 48(4): 785-96, 2008 Apr.

Article in English | MEDLINE | ID: mdl-18327900

ABSTRACT

Metabolic stability is an important property of drug molecules that should, optimally, be taken into account early on in the drug design process. Along with numerous medium- or high-throughput assays being implemented in early drug discovery, a prediction tool for this property could be of high value. However, metabolic stability is inherently difficult to predict, and no commercial tools are available for this purpose. In this work, we present a machine learning approach to predicting metabolic stability that is tailored to compounds from the drug development process at Bayer Schering Pharma. For four different in vitro assays, we develop Bayesian classification models to predict the probability of a compound being metabolically stable. The chosen approach implicitly takes the "domain of applicability" into account. The developed models were validated on recent project data at Bayer Schering Pharma, showing that the predictions are highly accurate and the domain of applicability is estimated correctly. Furthermore, we evaluate the modeling method on a set of publicly available data.

Subject(s)

Probability , Algorithms , Bayes Theorem , Drug Design

6.

Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert.

J Comput Aided Mol Des ; 21(12): 651-64, 2007 Dec.

Article in English | MEDLINE | ID: mdl-18060505

ABSTRACT

We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in terms of how faithfully the individual error bars represent the actual prediction error.

Subject(s)

Artificial Intelligence , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship , Water/chemistry , Algorithms , Drug Design , Solubility

7.

Machine learning models for lipophilicity and their domain of applicability.

Schroeter, Timon; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert.

Mol Pharm ; 4(4): 524-38, 2007.

Article in English | MEDLINE | ID: mdl-17637064

ABSTRACT

Unfavorable lipophilicity and water solubility cause many drug failures; therefore these properties have to be taken into account early on in lead discovery. Commercial tools for predicting lipophilicity usually have been trained on small and neutral molecules, and are thus often unable to accurately predict in-house data. Using a modern Bayesian machine learning algorithm, a Gaussian process model, this study constructs a log D7 model based on 14,556 drug discovery compounds of Bayer Schering Pharma. Performance is compared with support vector machines, decision trees, ridge regression, and four commercial tools. In a blind test on 7013 new measurements from the last months (including compounds from new projects) 81% were predicted correctly within 1 log unit, compared to only 44% achieved by commercial software. Additional evaluations using public data are presented. We consider error bars for each method (model based error bars, ensemble based, and distance based approaches), and investigate how well they quantify the domain of applicability of each model.

Subject(s)

Artificial Intelligence , Lipids/chemistry , Models, Chemical , Pharmaceutical Preparations/chemistry , Algorithms , Bayes Theorem , Decision Trees , Models, Statistical , Molecular Structure , Reproducibility of Results

8.

Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert.

J Comput Aided Mol Des ; 21(9): 485-98, 2007 Sep.

Article in English | MEDLINE | ID: mdl-17632688

ABSTRACT

We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in terms of how faithfully the individual error bars represent the actual prediction error.

Subject(s)

Artificial Intelligence , Models, Chemical , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship , Algorithms , Bayes Theorem , Models, Statistical , Molecular Structure , Solubility

9.

Predicting lipophilicity of drug-discovery molecules using Gaussian process models.

Schroeter, Timon S; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert.

ChemMedChem ; 2(9): 1265-7, 2007 Sep.

Article in English | MEDLINE | ID: mdl-17576646

Subject(s)

Drug Design , Models, Theoretical

10.

Accurate solubility prediction with error bars for electrolytes: a machine learning approach.

Schwaighofer, Anton; Schroeter, Timon; Mika, Sebastian; Laub, Julian; ter Laak, Antonius; Sülzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert.

J Chem Inf Model ; 47(2): 407-24, 2007.

Article in English | MEDLINE | ID: mdl-17243756

ABSTRACT

Accurate in silico models for predicting aqueous solubility are needed in drug design and discovery and many other areas of chemical research. We present a statistical model of aqueous solubility based on measured data, using a Gaussian Process nonlinear regression model (GPsol). We compare our results with those of 14 scientific studies and 6 commercial tools. This shows that the developed model achieves much higher accuracy than available commercial tools for the prediction of solubility of electrolytes. On top of the high accuracy, the proposed machine learning model also provides error bars for each individual prediction.

Subject(s)

Models, Chemical , Neural Networks, Computer , Computer Simulation , Electrolytes , Molecular Structure , Solubility

11.

Mining functional modules in genetic networks with decomposable graphical models.

Dejori, Mathäus; Schwaighofer, Anton; Tresp, Volker; Stetter, Martin.

OMICS ; 8(2): 176-88, 2004.

Article in English | MEDLINE | ID: mdl-15268775

ABSTRACT

In recent years, graphical models have become an increasingly important tool for the structural analysis of genome-wide expression profiles at the systems level. Here we present a new graphical modelling technique, which is based on decomposable graphical models, and apply it to a set of gene expression profiles from acute lymphoblastic leukemia (ALL). The new method explains probabilistic dependencies of expression levels in terms of the concerted action of underlying genetic functional modules, which are represented as so-called "cliques" in the graph. In addition, the method uses continuous-valued (instead of discretized) expression levels, and makes no particular assumption about their probability distribution. We show that the method successfully groups members of known functional modules to cliques. Our method allows the evaluation of the importance of genes for global cellular functions based on both link count and the clique membership count.

Subject(s)

Gene Expression Profiling , Models, Theoretical , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Genome, Human , Humans , Oligonucleotide Array Sequence Analysis

12.

Classification of rheumatoid joint inflammation based on laser imaging.

Schwaighofer, Anton; Tresp, Volker; Mayer, Peter; Krause, Andreas; Beuthan, Jürgen; Rost, Helmut; Metzger, Georg; Müller, Gerhard A; Scheel, Alexander K.

IEEE Trans Biomed Eng ; 50(3): 375-82, 2003 Mar.

Article in English | MEDLINE | ID: mdl-12669994

ABSTRACT

We describe a classification system for a novel imaging method for arthritic finger joints. The basis of this system is a laser imaging technique which is sensitive to the optical characteristics of finger joint tissue. From the laser images acquired at baseline and follow-up, finger joints can automatically be classified according to whether the inflammatory status has improved or worsened. To perform the classification task, various linear and kernel-based systems were implemented and their performances were compared. Based on the results presented in this paper, we conclude that the laser-based imaging permits a reliable classification of pathological finger joints, making it a sensitive method for detecting arthritic changes.

Subject(s)

Arthritis, Rheumatoid/classification , Arthritis, Rheumatoid/diagnosis , Finger Joint/pathology , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Lasers , Algorithms , Expert Systems , Humans , Observer Variation , Pattern Recognition, Automated , Reproducibility of Results , Sensitivity and Specificity
