1.
J Med Internet Res ; 24(3): e27210, 2022 03 23.
Article in English | MEDLINE | ID: mdl-35319481

ABSTRACT

BACKGROUND: Information in pathology reports is critical for cancer care. Natural language processing (NLP) systems used to extract information from pathology reports are often narrow in scope or require extensive tuning. Consequently, there is growing interest in automated deep learning approaches. A powerful new NLP algorithm, bidirectional encoder representations from transformers (BERT), was published in late 2018. BERT set new performance standards on tasks as diverse as question answering, named entity recognition, speech recognition, and more.

OBJECTIVE: The aim of this study is to develop a BERT-based system to automatically extract detailed tumor site and histology information from free-text oncological pathology reports.

METHODS: We pursued three specific aims: extract accurate tumor site and histology descriptions from free-text pathology reports, accommodate the diverse terminology used to indicate the same pathology, and provide accurate standardized tumor site and histology codes for use by downstream applications. We first trained a base language model to comprehend the technical language in pathology reports. This involved unsupervised learning on a training corpus of 275,605 electronic pathology reports from 164,531 unique patients that included 121 million words. Next, we trained a question-and-answer (Q&A) model that connects a Q&A layer to the base pathology language model to answer pathology questions. Our Q&A system was designed to search for the answers to two predefined questions in each pathology report: What organ contains the tumor? and What is the kind of tumor or carcinoma? This involved supervised training on 8197 pathology reports, each with ground truth answers to these 2 questions determined by certified tumor registrars. The data set included 214 tumor sites and 193 histologies. The tumor site and histology phrases extracted by the Q&A model were used to predict International Classification of Diseases for Oncology, Third Edition (ICD-O-3), site and histology codes. This involved fine-tuning two additional BERT models: one to predict site codes and another to predict histology codes. Our final system includes a network of 3 BERT-based models. We call this CancerBERT network (caBERTnet). We evaluated caBERTnet using a sequestered test data set of 2050 pathology reports with ground truth answers determined by certified tumor registrars.

RESULTS: caBERTnet's accuracies for predicting group-level site and histology codes were 93.53% (1895/2026) and 97.6% (1993/2042), respectively. The top 5 accuracies for predicting fine-grained ICD-O-3 site and histology codes with 5 or more samples each in the training data set were 92.95% (1794/1930) and 96.01% (1853/1930), respectively.

CONCLUSIONS: We have developed an NLP system that outperforms existing algorithms at predicting ICD-O-3 codes across an extensive range of tumor sites and histologies. Our new system could help reduce treatment delays, increase enrollment in clinical trials of new therapies, and improve patient outcomes.
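The accuracy figures reported in the abstract can be checked from the stated counts, and the "top 5 accuracy" metric can be illustrated with a minimal sketch. The function names and the toy ranked predictions below are assumptions made for illustration; they are not taken from the caBERTnet code, and the abstract does not state how the authors computed the metric beyond the counts given.

```python
# (correct, total, reported percentage) as stated in the abstract.
REPORTED = {
    "group-level site": (1895, 2026, 93.53),
    "group-level histology": (1993, 2042, 97.6),
    "top-5 fine-grained site": (1794, 1930, 92.95),
    "top-5 fine-grained histology": (1853, 1930, 96.01),
}


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage of correct predictions."""
    return 100.0 * correct / total


def top_k_accuracy(ranked_predictions, truths, k=5):
    """Fraction of cases whose true code is among the k highest-ranked guesses.

    ranked_predictions: one list of codes per report, best guess first.
    truths: the ground-truth code for each report.
    """
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_predictions, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # Recompute each reported percentage from its counts.
    for name, (correct, total, reported) in REPORTED.items():
        print(f"{name}: {accuracy_pct(correct, total):.2f}% (reported {reported}%)")

    # Toy data only (hypothetical model output, not from the study).
    ranked = [["C50.9", "C34.1"], ["C18.7", "C18.9"], ["C61.9"]]
    truths = ["C34.1", "C20.9", "C61.9"]
    print(top_k_accuracy(ranked, truths, k=2))
```

A top-k metric counts a prediction as correct when the true code appears anywhere in the k highest-ranked candidates, so it is always at least as high as strict (top-1) accuracy on the same data.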


Subjects
Natural Language Processing, Neoplasms, Algorithms, Humans, Language, Medical Oncology
2.
J Med Internet Res ; 23(11): e34493, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34751656

ABSTRACT

Data integration, the processes by which data are aggregated, combined, and made available for use, has been key to the development and growth of many technological solutions. In health care, we are experiencing a revolution in the use of sensors to collect data on patient behaviors and experiences. Yet, the potential of these data to transform health outcomes is being held back. Deficits in standards, lexicons, data rights, permissioning, and security have been well documented; less so the cultural adoption of sensor data integration as a priority for large-scale deployment and impact on patient lives. The use and reuse of trustworthy data to make better and faster decisions across drug development and care delivery will require an understanding of all stakeholder needs and best practices to ensure these needs are met. The Digital Medicine Society is launching a new multistakeholder Sensor Data Integration Tour of Duty to address these challenges and more, providing a clear direction on how sensor data can fulfill its potential to enhance patient lives.


Subjects
Data Collection, Delivery of Health Care, Humans, Technology
3.
Phys Rev Lett ; 114(9): 091603, 2015 Mar 06.
Article in English | MEDLINE | ID: mdl-25793798

ABSTRACT

We present prescriptions for obtaining the central charges, a and c, of a four-dimensional superconformal quantum field theory from the superconformal index. At infinite N, for holographic theories dual to Sasaki-Einstein 5-manifolds the prescriptions give the O(1) parts of the central charges. This allows us, among other things, to show the exact AdS/CFT matching of a and c for arbitrary toric quiver CFTs without adjoint matter that are dual to smooth Sasaki-Einstein 5-manifolds. In addition, we include evidence from nonholographic theories for the applicability of these results outside of a holographic setting and away from the large-N limit.
