TAX-Corpus: Taxonomy based Annotations for Colonoscopy Evaluation.

Syed, Shorabuddin; Angel, Adam Jackson; Syeda, Hafsa Bareen; Jennings, Carole Franc; VanScoy, Joseph; Syed, Mahanazuddin; Greer, Melody; Bhattacharyya, Sudeepa; Al-Shukri, Shaymaa; Zozus, Meredith; Prior, Fred; Tharian, Benjamin

Affiliation

Syed S; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, U.S.A.
Angel AJ; Department of Internal Medicine, Washington University, U.S.A.
Syeda HB; Department of Neurology, University of Arkansas for Medical Sciences, U.S.A.
Jennings CF; Department of Internal Medicine, Tulane University, U.S.A.
VanScoy J; College of Medicine, University of Arkansas for Medical Sciences, U.S.A.
Syed M; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, U.S.A.
Greer M; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, U.S.A.
Bhattacharyya S; Department of Biological Sciences, Arkansas State University, U.S.A.
Al-Shukri S; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, U.S.A.
Zozus M; Department of Population Health Sciences, University of Texas Health Science Centre at San Antonio, U.S.A.
Prior F; Department of Biomedical Informatics, University of Arkansas for Medical Sciences, U.S.A.
Tharian B; Division of Gastroenterology and Hepatology, University of Arkansas for Medical Sciences, U.S.A.

Biomed Eng Syst Technol Int Jt Conf BIOSTEC Revis Sel Pap; 2022: 162-169, 2022 Feb.

Article in English | MEDLINE | ID: mdl-35300321

ABSTRACT

Colonoscopy plays a critical role in screening for colorectal carcinomas (CC). Unfortunately, the data related to this procedure are scattered across separate documents: colonoscopy, pathology, and radiology reports. The lack of integrated, standardized documentation impedes accurate reporting of quality metrics as well as clinical and translational research. Natural language processing (NLP) has been used as an alternative to manual data abstraction. The performance of machine learning (ML) based NLP solutions depends heavily on the accuracy of the annotated corpora they are trained on. Large annotated corpora are scarce because of data privacy laws and the cost and effort that annotation requires. In addition, manual annotation is error-prone, making the lack of high-quality annotated corpora the largest bottleneck in deploying ML solutions. The objective of this study is to identify clinical entities critical to colonoscopy quality and to build a high-quality annotated corpus using domain-specific taxonomies and standardized annotation guidelines. The annotated corpus can be used to train ML models for a variety of downstream tasks.

Keywords

Annotation; Clinical Corpus; Colonoscopy; Machine Learning; Natural Language Processing; Taxonomy

Full text: 1 | Database: MEDLINE | Study type: Guideline | Language: En | Publication year: 2022 | Document type: Article