Polar labeling: silver standard algorithm for training disease classifiers.

Wagholikar, Kavishwar B; Estiri, Hossein; Murphy, Marykate; Murphy, Shawn N

Affiliation

Wagholikar KB; Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02114, USA.
Estiri H; Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02114, USA.
Murphy M; Partners Healthcare, Somerville, MA 02145, USA.
Murphy SN; Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02114, USA.

Bioinformatics; 36(10): 3200-3206, 2020 May 1.

Article in English | MEDLINE | ID: mdl-32049335

ABSTRACT

MOTIVATION:

Expert-labeled data are essential to train phenotyping algorithms for cohort identification. However, expert labeling is time- and labor-intensive, and the costs remain prohibitive for scaling phenotyping to wider use cases.

RESULTS:

We present an approach, referred to as polar labeling (PL), to create a silver standard for training machine learning (ML) models for disease classification. We test the hypothesis that ML models trained on the silver standard, created by applying PL to unlabeled patient records, are comparable in performance to ML models trained on the gold standard, created by clinical experts through manual review of patient records. We perform experimental validation using health records of 38 023 patients spanning six diseases. Our results demonstrate the superior performance of the proposed approach.

AVAILABILITY AND IMPLEMENTATION:

We provide a Python implementation of the algorithm and the Python code developed for this study on GitHub.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.

Subjects

Algorithms; Machine Learning; Color; Humans

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Machine Learning | Study type: Guideline / Prognostic_studies | Limit: Humans | Language: En | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year of publication: 2020 | Document type: Article | Country of affiliation: United States