Results 1 - 4 of 4
1.
J Speech Lang Hear Res ; : 1-14, 2024 Sep 26.
Article in English | MEDLINE | ID: mdl-39325951

ABSTRACT

PURPOSE: The Speech Accessibility Project (SAP) intends to facilitate research and development in automatic speech recognition (ASR) and other machine learning tasks for people with speech disabilities. The purpose of this article is to introduce this project as a resource for researchers, including baseline analysis of the first released data package. METHOD: The project aims to facilitate ASR research by collecting, curating, and distributing transcribed U.S. English speech from people with speech and/or language disabilities. Participants record speech from their place of residence by connecting their personal computer, cell phone, and assistive devices, if needed, to the SAP web portal. All samples are manually transcribed, and 30 per participant are annotated using differential diagnostic pattern dimensions. For purposes of ASR experiments, the participants have been randomly assigned to a training set, a development set for controlled testing of a trained ASR, and a test set to evaluate ASR error rate. RESULTS: The SAP 2023-10-05 Data Package contains the speech of 211 people with dysarthria as a correlate of Parkinson's disease, and the associated test set contains 42 additional speakers. A baseline ASR, with a word error rate of 3.4% for typical speakers, transcribes test speech with a word error rate of 36.3%. Fine-tuning reduces the word error rate to 23.7%. CONCLUSIONS: Preliminary findings suggest that a large corpus of dysarthric and dysphonic speech has the potential to significantly improve speech technology for people with disabilities. By providing these data to researchers, the SAP intends to significantly accelerate research into accessible speech technology. SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.27078079.
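The word error rates quoted above (3.4%, 36.3%, 23.7%) are the standard word-level edit distance between hypothesis and reference transcripts, normalized by reference length. A minimal sketch of the metric (not the project's actual scoring tool):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("the cat sat on the mat", "the cat sat on mat")` counts one deletion over six reference words.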

2.
IEEE Trans Biomed Eng ; 64(9): 2025-2041, 2017 09.
Article in English | MEDLINE | ID: mdl-28060703

ABSTRACT

OBJECTIVE: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. METHODS: In this paper, we address two major problems for surgical data analysis: first, the lack of uniform, shared datasets and benchmarks, and second, the lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: the hidden Markov model (HMM), a sparse HMM, the Markov semi-Markov conditional random field (CRF), and the skip-chain CRF; and two feature-based approaches that classify fixed segments: bag of spatiotemporal features and linear dynamical systems. RESULTS: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. CONCLUSION: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. SIGNIFICANCE: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.


Subject(s)
Clinical Competence/statistics & numerical data, Clinical Competence/standards, Gestures, Imaging, Three-Dimensional/statistics & numerical data, Imaging, Three-Dimensional/standards, Robotic Surgical Procedures/statistics & numerical data, Robotic Surgical Procedures/standards, Benchmarking/methods, Benchmarking/standards, Databases, Factual, Humans, Pattern Recognition, Automated/methods, United States
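The leave-one-user-out protocol from the results section holds out all trials by one operator at a time. A generic sketch of the split; the `(user_id, trial)` pair structure is an illustrative assumption, not the actual JIGSAWS loader:

```python
from collections import defaultdict

def leave_one_user_out(trials):
    """Yield (held_out_user, train, test) splits, holding out each user once.

    `trials` is a list of (user_id, trial_data) pairs; trial_data may be
    any object (e.g., a kinematic sequence with its gesture labels).
    """
    by_user = defaultdict(list)
    for user, data in trials:
        by_user[user].append(data)
    for held_out in by_user:
        test = by_user[held_out]
        train = [d for u, ds in by_user.items() if u != held_out for d in ds]
        yield held_out, train, test
```

Leave-one-super-trial-out works the same way with the super-trial index playing the role of the user ID.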
3.
Int J Comput Assist Radiol Surg ; 11(6): 1201-9, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27177760

ABSTRACT

PURPOSE: Surgical phase recognition using sensor data is challenging due to high variation in patient anatomy and surgeon-specific operating styles. Segmenting surgical procedures into constituent phases is of significant utility for resident training, education, self-review, and context-aware operating room technologies. Phase annotation is a highly labor-intensive task and would benefit greatly from automated solutions. METHODS: We propose a novel approach using system events (for example, activation of cautery tools) that are easily captured in most surgical procedures. Our method involves extracting event-based features over 90-s intervals and assigning a phase label to each interval. We explore three classification techniques: support vector machines, random forests, and temporal convolutional neural networks. Each of these models independently predicts a label for each time interval. We also examine segmental inference using an approach based on the semi-Markov conditional random field, which jointly performs phase segmentation and classification. Our method is evaluated on a data set of 24 robot-assisted hysterectomy procedures. RESULTS: Our framework is able to detect surgical phases with an accuracy of 74% using event-based features over a set of five different phases: ligation, dissection, colpotomy, cuff closure, and background. Precision and recall values for the cuff closure (precision: 83%, recall: 98%) and dissection (precision: 75%, recall: 88%) classes were higher than for other classes. The normalized Levenshtein distance between the predicted and ground-truth phase sequences was 25%. CONCLUSIONS: Our findings demonstrate that system event features are useful for automatically detecting surgical phases. Events contain phase information that cannot be obtained from motion data and that would require advanced computer vision algorithms to extract from a video. Many of these events are not specific to robotic surgery and can easily be recorded in non-robotic surgical modalities. In future work, we plan to combine information from system events, tool motion, and videos to automate phase detection in surgical procedures.


Subject(s)
Hysterectomy, Neural Networks, Computer, Robotic Surgical Procedures, Support Vector Machine, Task Performance and Analysis, Workflow, Algorithms, Female, Humans, Models, Anatomic, Models, Theoretical, Motion, Surgical Procedures, Operative
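The event-based featurization in the methods (counting system events within consecutive 90-s intervals, one feature vector per interval) can be sketched as below. The timestamp format and event names are illustrative assumptions, not the paper's actual event set:

```python
def event_features(events, horizon, event_types, width=90.0):
    """Count each event type within consecutive fixed-width intervals.

    `events`      -- list of (timestamp_seconds, event_type) pairs
    `horizon`     -- procedure duration in seconds
    `event_types` -- the event vocabulary, fixing the feature order
    Returns one count vector per interval, ready for an SVM or random forest.
    """
    n_bins = int(horizon // width) + 1
    index = {e: i for i, e in enumerate(event_types)}
    feats = [[0] * len(event_types) for _ in range(n_bins)]
    for t, e in events:
        b = int(t // width)
        if 0 <= b < n_bins and e in index:
            feats[b][index[e]] += 1
    return feats
```

A per-interval classifier then maps each count vector to one of the five phase labels; the segmental CRF variant instead scores candidate segmentations over these same features.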
4.
AMIA Jt Summits Transl Sci Proc ; 2013: 136-40, 2013.
Article in English | MEDLINE | ID: mdl-24303253

ABSTRACT

Intensive Care Units (ICUs) are chaotic places where hundreds of tasks are carried out by many different people. Timely and coordinated execution of these tasks is directly related to the quality of patient outcomes. An improved understanding of the current care process can aid in improving quality. Our goal is to build towards a system that automatically catalogs the various tasks being performed at the bedside. We propose a set of techniques using computer vision and machine learning to develop a system that passively senses the environment and identifies seven common actions, such as documenting, checking up on a patient, and performing a procedure. Preliminary evaluation of our system on 5.5 hours of data from the Pediatric ICU achieves an overall task recognition accuracy of 70%. Furthermore, we show how it can be used to summarize and visualize tasks. Our system provides a significant departure from current approaches used for quality improvement. With further improvement, we think that such a system could realistically be deployed in the ICU.
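The task summarization the authors mention could, in its simplest form, aggregate the classifier's per-interval action labels into time-per-task totals. The label names and interval length below are illustrative assumptions, not details from the paper:

```python
from collections import Counter

def task_summary(interval_labels, interval_seconds=60):
    """Total seconds spent on each observed bedside task.

    `interval_labels` is a sequence of predicted action labels, one per
    fixed-length video interval.
    """
    counts = Counter(interval_labels)
    return {task: n * interval_seconds for task, n in counts.items()}
```

The resulting totals can be plotted directly (e.g., as a bar chart of time per task) to visualize how care time is distributed.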
