Results 1 - 3 of 3
1.
Radiographics ; 41(5): 1446-1453, 2021.
Article in English | MEDLINE | ID: mdl-34469212

ABSTRACT

Natural language processing (NLP) is the subset of artificial intelligence focused on the computer interpretation of human language. It is an invaluable tool in the analysis, aggregation, and simplification of free text. It has already demonstrated significant potential in the analysis of radiology reports. There are abundant open-source libraries and tools available that facilitate its application to the benefit of radiology. Radiologists who understand its limitations and potential will be better positioned to evaluate NLP models, understand how they can improve clinical workflow, and facilitate research endeavors involving large amounts of human language. The advent of increasingly affordable and powerful computer processing, the large quantities of medical and radiologic data, and advances in machine learning algorithms have contributed to the large potential of NLP. In turn, radiology has significant potential to benefit from the ability of NLP to convert relatively standardized radiology reports to machine-readable data. NLP benefits from standardized reporting, but because of its ability to interpret free text by using context clues, NLP does not necessarily depend on it. An overview and practical approach to NLP is featured, with specific emphasis on its applications to radiology. A brief history of NLP, the strengths and challenges inherent to its use, and freely available resources and tools are covered to guide further exploration and study within the field. Particular attention is devoted to the recent development of the Word2Vec and BERT (Bidirectional Encoder Representations from Transformers) language models, which have exponentially increased the power and utility of NLP for a variety of applications. Online supplemental material is available for this article. ©RSNA, 2021.


Subjects
Natural Language Processing, Radiology, Artificial Intelligence, Humans, Machine Learning, Radiography
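The record above highlights NLP's ability to convert relatively standardized radiology reports into machine-readable data. A minimal, stdlib-only sketch of that idea is below; the section names and the sample report are hypothetical, and a real system would use a trained model rather than regular expressions:

```python
import re

def parse_report(text: str) -> dict:
    """Extract labeled sections from a semi-standardized radiology
    report into machine-readable fields (hypothetical section names)."""
    fields = {}
    # Match lines like "FINDINGS: ..." or "IMPRESSION: ..."
    for match in re.finditer(r"^(TECHNIQUE|FINDINGS|IMPRESSION):\s*(.+)$",
                             text, flags=re.MULTILINE):
        fields[match.group(1).lower()] = match.group(2).strip()
    return fields

report = """TECHNIQUE: PA and lateral chest radiograph.
FINDINGS: No focal consolidation. Heart size normal.
IMPRESSION: No acute cardiopulmonary abnormality."""

parsed = parse_report(report)
```

Rule-based extraction like this works only because the report is standardized; the Word2Vec- and BERT-style models discussed in the article are what allow NLP to handle genuinely free text.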
2.
Proc Conf Empir Methods Nat Lang Process ; 2023(4th New Frontier Summarization Workshop): 68-74, 2023 Dec.
Article in English | MEDLINE | ID: mdl-39315281

ABSTRACT

Selecting the "right" amount of information to include in a summary is a difficult task. A good summary should be detailed and entity-centric without being overly dense and hard to follow. To better understand this tradeoff, we solicit increasingly dense GPT-4 summaries with what we refer to as a "Chain of Density" (CoD) prompt. Specifically, GPT-4 generates an initial entity-sparse summary before iteratively incorporating missing salient entities without increasing the length. Summaries generated by CoD are more abstractive, exhibit more fusion, and have less of a lead bias than GPT-4 summaries generated by a vanilla prompt. We conduct a human preference study on 100 CNN DailyMail articles and find that humans prefer GPT-4 summaries that are more dense than those generated by a vanilla prompt and almost as dense as human-written summaries. Qualitative analysis supports the notion that there exists a tradeoff between informativeness and readability. 500 annotated CoD summaries, as well as an extra 5,000 unannotated summaries, are freely available on HuggingFace.

3.
Proc Conf Assoc Comput Linguist Meet ; 2023: 2680-2697, 2023 Jul.
Article in English | MEDLINE | ID: mdl-38770277

ABSTRACT

Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach. Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re-ranking that addresses these issues. We ground each candidate abstract on its own unique content plan and generate distinct plan-guided abstracts using a model's top beam. More concretely, a standard language model (a BART LM) auto-regressively generates elemental discourse unit (EDU) content plans with an extractive copy mechanism. The top K beams from the content plan generator are then used to guide a separate LM, which produces a single abstractive candidate for each distinct plan. We apply an existing re-ranker (BRIO) to abstractive candidates generated from our method, as well as baseline decoding methods. We show large relevance improvements over previously published methods on widely used single document news article corpora, with ROUGE-2 F1 gains of 0.88, 2.01, and 0.38 on CNN / Dailymail, NYT, and Xsum, respectively. A human evaluation on CNN / DM validates these results. Similarly, on 1k samples from CNN / DM, we show that prompting GPT-3 to follow EDU plans outperforms sampling-based methods by 1.05 ROUGE-2 F1 points. Code to generate and realize plans is available at https://github.com/griff4692/edu-sum.
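The two-step pipeline this abstract describes — one candidate per distinct content plan, then re-ranking — can be sketched with stub components. `toy_plans`, `toy_generate`, and `toy_score` below are hypothetical stand-ins for the paper's BART content-plan generator, plan-guided LM, and BRIO re-ranker:

```python
def rerank(candidates: list[str], score) -> str:
    """Return the highest-scoring candidate (stand-in for BRIO)."""
    return max(candidates, key=score)

def two_step_summarize(article: str, plan_beams, generate_from_plan,
                       score, k: int = 4) -> str:
    """Generate one candidate per distinct content plan, then rerank."""
    plans = plan_beams(article, k)  # top-K content plans (EDUs in the paper)
    candidates = [generate_from_plan(article, plan) for plan in plans]
    return rerank(candidates, score)

# Stub components (hypothetical; the real system uses BART + BRIO).
def toy_plans(article, k):
    # Treat each sentence prefix as a distinct "content plan".
    sents = article.split(". ")
    return [sents[:i + 1] for i in range(min(k, len(sents)))]

def toy_generate(article, plan):
    # Realize the plan verbatim; a real LM would paraphrase it.
    return ". ".join(plan)

def toy_score(summary):
    # Toy scorer preferring longer, more content-covering candidates.
    return len(summary.split())

best = two_step_summarize("A. B. C. D", toy_plans, toy_generate,
                          toy_score, k=3)
```

The key point from the abstract is that grounding each candidate in its own plan yields diverse, non-redundant candidates, so the re-ranker has genuinely different options to choose among, unlike beam search or nucleus sampling, whose candidates largely overlap.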
