1.
Proc Conf Empir Methods Nat Lang Process ; 2020: 3215-3226, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33364629

ABSTRACT

Automatic summarization research has traditionally focused on providing high quality general-purpose summaries of documents. However, there are many applications that require more specific summaries, such as supporting question answering or topic-based literature discovery. In this paper, we study the problem of conditional summarization in which content selection and surface realization are explicitly conditioned on an ad-hoc natural language question or topic description. Because of the difficulty in obtaining sufficient reference summaries to support arbitrary conditional summarization, we explore the use of multi-task fine-tuning (MTFT) on twenty-one natural language tasks to enable zero-shot conditional summarization on five tasks. We present four new summarization datasets, two novel "online" or adaptive task-mixing strategies, and report zero-shot performance using T5 and BART, demonstrating that MTFT can improve zero-shot summarization quality.

2.
Proc Int Conf Comput Ling ; 2020: 5640-5646, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33293900

ABSTRACT

Recent work has shown that pre-trained Transformers obtain remarkable performance on many natural language processing tasks, including automatic summarization. However, most work has focused on (relatively) data-rich single-document summarization settings. In this paper, we explore highly-abstractive multi-document summarization, where the summary is explicitly conditioned on a user-given topic statement or question. We compare the summarization quality produced by three state-of-the-art transformer-based models: BART, T5, and PEGASUS. We report the performance on four challenging summarization datasets: three from the general domain and one from consumer health in both zero-shot and few-shot learning settings. While prior work has shown significant differences in performance for these models on standard summarization tasks, our results indicate that with as few as 10 labeled examples, there is no statistically significant difference in summary quality, suggesting the need for more abstractive benchmark collections when determining state-of-the-art.
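
The abstract reports no statistically significant difference between models but does not name the test used. As an illustration only, the sketch below applies a paired approximate randomization test to per-summary quality scores from two systems. The function name, the choice of test, and the sample scores are assumptions, not details taken from the paper.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Two-sided paired randomization test on per-summary scores.

    Hypothetical helper: returns an estimated p-value for the null
    hypothesis that systems A and B produce equally good summaries.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(trials):
        # Randomly swap the system labels within each pair (flip the sign).
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total / n) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)


# Made-up per-summary scores for two systems on the same ten inputs.
system_a = [0.31, 0.28, 0.35, 0.30, 0.27, 0.33, 0.29, 0.32, 0.30, 0.34]
system_b = [0.30, 0.29, 0.34, 0.31, 0.27, 0.32, 0.30, 0.31, 0.30, 0.33]
p_value = paired_randomization_test(system_a, system_b)
print(f"estimated p-value: {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) would mean the observed gap between the two systems could plausibly arise from chance pairing alone.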

3.
AMIA Jt Summits Transl Sci Proc ; 2020: 561-568, 2020.
Article in English | MEDLINE | ID: mdl-32477678

ABSTRACT

Chemical entity recognition is essential for indexing scientific literature in the MEDLINE database at the National Library of Medicine. However, the tool currently used to suggest terms for indexing, the Medical Text Indexer, was not originally conceived as a chemical recognition tool. It has instead been adapted to the task via its use of MetaMap and the addition of in-house patterns and rules. In order to develop a tool more suitable for chemical recognition, we have created a collection of 200 MEDLINE titles and abstracts annotated with genes, proteins, inorganic and organic chemicals, as well as other biological molecules. We use this collection to evaluate eleven chemical entity recognition systems, where we seek to identify a tool that effectively recognizes chemical entities for indexing and also performs well on chemical recognition beyond the indexing task. We observe the highest performance with a SciBERT ensemble.

4.
AMIA Annu Symp Proc ; 2019: 727-734, 2019.
Article in English | MEDLINE | ID: mdl-32308868

ABSTRACT

MEDLINE is the National Library of Medicine's premier bibliographic database for biomedical literature. A highly valuable feature of the database is that each record is manually indexed with a controlled vocabulary called MeSH. Most MEDLINE journals are indexed cover-to-cover, but there are about 200 selectively indexed journals for which only articles related to biomedicine and life sciences are indexed. In recent years, the selection process has become an increasing burden for indexing staff, and this paper presents a machine learning based system that offers very significant time savings by semi-automating the task. At the core of the system is a high recall classifier for the identification of journal articles that are in-scope for MEDLINE. The system is shown to reduce the number of articles requiring manual review by 54%, equivalent to approximately 40,000 articles per year.
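
The abstract does not describe how the classifier's operating point is chosen. One common way to obtain a high-recall screen, shown here as a minimal sketch under that assumption, is to pick the highest score threshold that still meets a target recall on held-out labeled articles, then measure what share of articles falls below it and so skips manual review. The function names, scores, and labels are hypothetical.

```python
import math


def threshold_for_recall(scores, labels, target_recall):
    """Highest threshold at which recall on in-scope articles meets the target.

    scores: classifier scores, higher means more likely in-scope.
    labels: 1 for in-scope articles, 0 for out-of-scope ones.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one in-scope example")
    # Number of in-scope articles that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(positives)))
    return positives[needed - 1]


def review_reduction(scores, threshold):
    """Fraction of articles scoring below the threshold (no manual review)."""
    skipped = sum(1 for s in scores if s < threshold)
    return skipped / len(scores)


# Made-up validation data: eight articles, three of them in scope.
val_scores = [0.90, 0.80, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05]
val_labels = [1, 1, 0, 1, 0, 0, 0, 0]

cutoff = threshold_for_recall(val_scores, val_labels, target_recall=1.0)
saved = review_reduction(val_scores, cutoff)
print(f"threshold={cutoff}, share skipping manual review={saved:.0%}")
```

As a consistency check on the reported figures, a 54% reduction equal to roughly 40,000 articles per year implies about 40,000 / 0.54 ≈ 74,000 candidate articles per year from the selectively indexed journals.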


Subjects
Abstracting and Indexing , MEDLINE , Machine Learning , Neural Networks, Computer , Medical Subject Headings , National Library of Medicine (U.S.) , United States