Results 1 - 13 of 13
1.
BMC Health Serv Res ; 15: 48, 2015 Feb 01.
Article in English | MEDLINE | ID: mdl-25638151

ABSTRACT

BACKGROUND: An intermediate care hospital (ICH) was established in a municipality in Central Norway in 2007 to improve the coordination of services and follow-up among elderly and chronically ill patients after hospital discharge. The aim of this study was to compare health care utilization by elderly patients in a municipality with an ICH to that of elderly patients in a municipality without an ICH. METHODS: This study was a retrospective comparative cohort study of all hospitalized patients aged 60 years or older in two municipalities. The data were collected from the national register of hospital use from 2005 to 2012, and from the local general hospital and two primary health care service providers from 2008 to 2012 (approx. 1,250 patients per follow-up year). The data were analyzed using descriptive statistics and analysis of covariance (ANCOVA). RESULTS: The length of hospital stay decreased from the time the ICH was introduced and remained between 10% and 22% lower than the length of hospital stay in the comparative municipality for the next five years. No differences in the number of readmissions or admissions during one year follow-up after the index stay at the local general hospital or changes in primary health care utilization were observed. In the year after hospital discharge, the municipality with an ICH offered more hour-based care to elderly patients living at home (estimated mean = 234 [95% CI 215-252] versus 175 [95% CI 154-196] hours per person and year), while the comparative municipality had a higher utilization of long-term stays in nursing homes (estimated mean = 33.3 [95% CI 29.0-37.7] versus 21.9 [95% CI 18.0-25.7] days per person and year). CONCLUSIONS: This study indicates that the introduction of an ICH rapidly reduces the length of hospital stay without exposing patients to an increased health risk. 
The ICH appears to operate as an extension of the general hospital, with only a minor impact on the pattern of primary health care utilization.


Subject(s)
Hospitalization/statistics & numerical data , Intermediate Care Facilities/statistics & numerical data , Length of Stay/statistics & numerical data , Patient Acceptance of Health Care/statistics & numerical data , Primary Health Care/statistics & numerical data , Aged , Aged, 80 and over , Cohort Studies , Female , Hospitals, General/statistics & numerical data , Humans , Male , Middle Aged , Norway , Patient Discharge/statistics & numerical data , Retrospective Studies , Urban Population/statistics & numerical data
2.
BMC Bioinformatics ; 12 Suppl 8: S4, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151968

ABSTRACT

BACKGROUND: The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested. RESULTS: A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. 
Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task, leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. Document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of the correct gene and its identifier, such as contextual information to assist in disambiguation. DISCUSSION: The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.
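
The abstract measures task performance by precision and recall. As a minimal sketch of what that could look like for gene normalization, assuming the system output and the curator's gold standard are both available as sets of database identifiers (the function name and the identifiers below are hypothetical and are not taken from the BC-III evaluation):

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of gene identifiers."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
system_output = {"GENE:1", "GENE:2", "GENE:3", "GENE:4"}
curator_gold = {"GENE:2", "GENE:4", "GENE:5"}

precision, recall = precision_recall(system_output, curator_gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A long candidate list of the kind the users complained about lowers precision, and a list that lacks the relevant gene lowers recall.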


Subject(s)
Data Mining/methods , Genes , Animals , Computational Biology/methods , Periodicals as Topic , Plants/genetics , Plants/metabolism
3.
BMC Bioinformatics ; 12: 481, 2011 Dec 18.
Article in English | MEDLINE | ID: mdl-22177292

ABSTRACT

BACKGROUND: Bio-molecular event extraction from literature is recognized as an important task of bio text mining and, as such, many relevant systems have been developed and made available during the last decade. While such systems provide useful services individually, there is a need for a meta-service to enable comparison and ensemble of such services, offering optimal solutions for various purposes. RESULTS: We have integrated nine event extraction systems in the U-Compare framework, making them intercompatible and interoperable with other U-Compare components. The U-Compare event meta-service provides various meta-level features for comparison and ensemble of multiple event extraction systems. Experimental results show that the performance improvements achieved by the ensemble are significant. CONCLUSIONS: While individual event extraction systems themselves provide useful features for bio text mining, the U-Compare meta-service is expected to improve the accessibility to the individual systems, and to enable meta-level uses over multiple event extraction systems such as comparison and ensemble.
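
The abstract does not say how the ensemble is formed. One simple possibility, shown here only as a sketch under that assumption, is majority voting over the events proposed by each system. The function name, the event representation and the example events are hypothetical and do not reflect the U-Compare API.

```python
from collections import Counter


def majority_vote(system_outputs, min_votes=None):
    """Keep events proposed by at least `min_votes` systems.

    `system_outputs` is a list of sets, one per system. Each event is any
    hashable value, for example a (type, trigger, argument) tuple.
    """
    if min_votes is None:
        min_votes = len(system_outputs) // 2 + 1  # strict majority
    # Each system contributes at most one vote per event because its
    # output is a set.
    votes = Counter(event for output in system_outputs for event in output)
    return {event for event, count in votes.items() if count >= min_votes}


# Hypothetical outputs of three event extraction systems.
system_a = {("Binding", "binds", "P1"), ("Regulation", "induces", "P2")}
system_b = {("Binding", "binds", "P1")}
system_c = {
    ("Binding", "binds", "P1"),
    ("Regulation", "induces", "P2"),
    ("Expression", "expressed", "P3"),
}

ensemble = majority_vote([system_a, system_b, system_c])
print(sorted(ensemble))
```

Raising `min_votes` trades recall for precision, since fewer events survive but each is supported by more systems.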


Subject(s)
Data Mining , Computer Systems , Periodicals as Topic , Software
4.
Artif Intell Med ; 52(2): 107-14, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21652190

ABSTRACT

OBJECTIVE: We present a combined terminological resource for text mining over biomedical literature. The purpose of the resource is to allow the detection of mentions of specific biological entities in scientific publications, and their grounding to widely accepted identifiers. This is an essential process, useful in itself, and necessary as an intermediate step for almost every type of complex text mining application. METHODS: We discuss some of the properties of the terminology for this domain, in particular the degree of ambiguity, which constitutes a peculiar problem for text mining applications. Without a correct recognition and disambiguation of the domain entities no reliable results can be produced. RESULTS: We also discuss an application that makes use of the resulting terminological knowledge base. We annotate an existing corpus of sentences about protein interactions. The annotation consists of a normalization step that matches the terms in our resource with their actual representation in the corpus, and a disambiguation step that resolves the ambiguity of matched terms. CONCLUSION: In this paper we present a large terminological resource, compiled through the aggregation of a number of different manually curated sources. We discuss the lexical properties of such resources, specifically the degree of ambiguity of the terms, and we inspect the causes of such ambiguity, in particular for protein names. This information is of vital importance for the implementation of an efficient term normalization and grounding algorithm.


Subject(s)
Computational Biology/methods , Data Mining/methods , Databases, Bibliographic , Algorithms , Publications , Vocabulary, Controlled
5.
Bioinformatics ; 27(8): 1185-6, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21349873

ABSTRACT

Often, the most informative genes have to be selected from different gene sets, and several computer gene ranking algorithms have been developed to cope with the problem. To help researchers decide which algorithm to use, we developed the analysis of gene ranking algorithms (AGRA) system that offers a novel technique for comparing ranked lists of genes. The most important feature of AGRA is that no previous knowledge of gene ranking algorithms is needed for their comparison. Using the text mining system Finding Associated Concepts with Text Analysis (FACTA), AGRA defines what we call biomedical concept space (BCS) for each gene list and offers a comparison of the gene lists in six different BCS categories. The uploaded gene lists can be compared using two different methods. In the first method, the overlap between the BCSs of each pair of gene lists is calculated. The second method offers a text field where a specific biomedical concept can be entered. AGRA searches for this concept in each gene list's BCS, highlights the rank of the concept and offers a visual representation of concepts ranked above and below it. AVAILABILITY AND IMPLEMENTATION: Available at http://agra.fzv.uni-mb.si/, implemented in Java and running on the Glassfish server. CONTACT: simon.kocbek@uni-mb.si.
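
The first comparison method computes an overlap for each pair of gene lists. The abstract does not name the overlap measure, so the sketch below assumes the Jaccard index over sets of concepts. The function names, list names and concepts are hypothetical and are not part of AGRA.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(concept_spaces):
    """Map each pair of gene-list names to the overlap of their concept sets."""
    return {
        (x, y): jaccard(concept_spaces[x], concept_spaces[y])
        for x, y in combinations(sorted(concept_spaces), 2)
    }


# Hypothetical concept spaces for three uploaded gene lists.
spaces = {
    "list_a": {"apoptosis", "cell cycle", "p53"},
    "list_b": {"apoptosis", "p53", "insulin"},
    "list_c": {"insulin"},
}

overlaps = pairwise_overlap(spaces)
for pair, score in overlaps.items():
    print(pair, round(score, 2))
```

Sorting the names before pairing keeps the keys of the result in a stable order, which makes the output easier to compare across runs.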


Subject(s)
Algorithms , Genes , Data Mining , Software
6.
J Bioinform Comput Biol ; 8(5): 901-16, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20981894

ABSTRACT

Although there are several corpora with protein annotation, incompatibility between the annotations in different corpora remains a problem that hinders the progress of automatic recognition of protein names in biomedical literature. Here, we report on our efforts to find a solution to the incompatibility issue, and to improve the compatibility between two representative protein-annotated corpora: the GENIA corpus and the GENETAG corpus. In a comparative study, we improve our insight into the two corpora, and a series of experimental results show that most of the incompatibility can be removed.


Subject(s)
Data Mining , Proteins , Computational Biology , PubMed , Terminology as Topic
7.
Article in English | MEDLINE | ID: mdl-20671316

ABSTRACT

Currently, relation extraction (RE) and event extraction (EE) are the two main streams of biological information extraction. In 2009, the majority of these RE and EE research efforts were centered around the BioCreative II.5 Protein-Protein Interaction (PPI) challenge and the "BioNLP event extraction shared task." Although these challenges took somewhat different approaches, they share the same ultimate goal of extracting bio-knowledge from the literature. This paper compares the two challenge task definitions, and presents a unified system that was successfully applied in both these and several other PPI extraction task settings. The AkaneRE system has three parts: a core engine for RE, a pool of modules for specific solutions, and a configuration language to adapt the system to different tasks. The core engine is based on machine learning, using either Support Vector Machines or Statistical Classifiers and features extracted from given training data. The specific modules solve tasks like sentence boundary detection, tokenization, stemming, part-of-speech tagging, parsing, named entity recognition, generation of potential relations, generation of machine learning features for each relation, and finally, assignment of confidence scores and ranking of candidate relations. With these components, the AkaneRE system produces state-of-the-art results, and the system is freely available for academic purposes at http://www-tsujii.is.s.u-tokyo.ac.jp/satre/akane/.


Subject(s)
Computational Biology/methods , Data Mining/methods , Natural Language Processing , Protein Interaction Mapping/methods , Algorithms , Databases, Genetic , Information Storage and Retrieval
8.
J Bioinform Comput Biol ; 8(1): 131-46, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20183879

ABSTRACT

Biomedical Natural Language Processing (BioNLP) attempts to capture biomedical phenomena from texts by extracting relations between biomedical entities (e.g., proteins and genes). Traditionally, only binary relations have been extracted from large numbers of published papers. Recently, more complex relations (biomolecular events) have also been extracted. Such events may include several entities or other relations. To evaluate the performance of the text mining systems, several shared task challenges have been arranged for the BioNLP community. With a common and consistent task setting, the BioNLP'09 shared task evaluated complex biomolecular events such as binding and regulation. Finding these events automatically is important in order to improve biomedical event extraction systems. In the present paper, we propose an automatic event extraction system, which contains a model for complex events, by solving a classification problem with rich features. The main contributions of the present paper are: (1) the proposal of an effective bio-event detection method using machine learning, (2) provision of a high-performance event extraction system, and (3) the execution of a quantitative error analysis. The proposed complex (binding and regulation) event detector outperforms the best system from the BioNLP'09 shared task challenge.


Subject(s)
Computational Biology , Natural Language Processing , Artificial Intelligence , Databases, Factual/statistics & numerical data , Models, Statistical
9.
BMC Bioinformatics ; 10: 403, 2009 Dec 09.
Article in English | MEDLINE | ID: mdl-19995463

ABSTRACT

BACKGROUND: The number of corpora, collections of structured texts, has been increasing, as a result of the growing interest in the application of natural language processing methods to biological texts. Many named entity recognition (NER) systems have been developed based on these corpora. However, in the biomedical community, there is yet no general consensus regarding named entity annotation; thus, the resources are largely incompatible, and it is difficult to compare the performance of systems developed on resources that were divergently annotated. On the other hand, from a practical application perspective, it is desirable to utilize as many existing annotated resources as possible, because annotation is costly. Thus, it becomes a task of interest to integrate the heterogeneous annotations in these resources. RESULTS: We explore the potential sources of incompatibility among gene and protein annotations that were made for three common corpora: GENIA, GENETAG and AIMed. To show the inconsistency in the corpora annotations, we first tackle the incompatibility problem caused by corpus integration, and we quantitatively measure the effect of this incompatibility on protein mention recognition. We find that the F-score performance declines tremendously when training with integrated data, instead of training with pure data; in some cases, the performance drops nearly 12%. This degradation may be caused by the newly added heterogeneous annotations, and cannot be fixed without an understanding of the heterogeneities that exist among the corpora. Motivated by the result of this preliminary experiment, we further qualitatively analyze a number of possible sources for these differences, and investigate the factors that would explain the inconsistencies, by performing a series of well-designed experiments. 
Our analyses indicate that incompatibilities in the gene/protein annotations exist mainly in the following four areas: the boundary annotation conventions, the scope of the entities of interest, the distribution of annotated entities, and the ratio of overlap between annotated entities. We further suggest that almost all of the incompatibilities can be prevented by properly considering these four aspects. CONCLUSION: Our analysis covers the key similarities and dissimilarities that exist among the diverse gene/protein corpora. This paper serves to improve our understanding of the differences in the three studied corpora, which can then lead to a better understanding of the performance of protein recognizers that are based on the corpora.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Databases, Factual , Genes , Pattern Recognition, Automated , Proteins/genetics
10.
Int J Med Inform ; 78(12): e39-46, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19501018

ABSTRACT

Protein-protein interaction (PPI) extraction is an important and widely researched task in the biomedical natural language processing (BioNLP) field. Kernel-based machine learning methods have been used widely to extract PPI automatically, and several kernels focusing on different parts of sentence structure have been published for the PPI task. In this paper, we propose a method to combine kernels based on several syntactic parsers, in order to retrieve the widest possible range of important information from a given sentence. We evaluate the method using a support vector machine (SVM), and we achieve better results than other state-of-the-art PPI systems on four out of five corpora. Further, we analyze the compatibility of the five corpora from the viewpoint of PPI extraction, and we see that some of them have small incompatibilities, but they can still be combined with a little effort.


Subject(s)
Computational Biology/methods , Natural Language Processing , Protein Interaction Mapping/methods , Algorithms , Databases as Topic , Sequence Analysis, Protein
11.
Bioinformatics ; 25(3): 394-400, 2009 Feb 01.
Article in English | MEDLINE | ID: mdl-19073593

ABSTRACT

MOTIVATION: While text mining technologies for biomedical research have gained popularity as a way to take advantage of the explosive growth of information in text form in biomedical papers, selecting appropriate natural language processing (NLP) tools is still difficult for researchers who are not familiar with recent advances in NLP. This article provides a comparative evaluation of several state-of-the-art natural language parsers, focusing on the task of extracting protein-protein interaction (PPI) from biomedical papers. We measure how each parser, and its output representation, contributes to accuracy improvement when the parser is used as a component in a PPI system. RESULTS: All the parsers attained improvements in accuracy of PPI extraction. The levels of accuracy obtained with these different parsers vary slightly, while differences in parsing speed are larger. The best accuracy in this work was obtained when we combined Miyao and Tsujii's Enju parser and Charniak and Johnson's reranking parser, and the accuracy is better than the state-of-the-art results on the same data. AVAILABILITY: The PPI extraction system used in this work (AkanePPI) is available online at http://www-tsujii.is.s.u-tokyo.ac.jp/downloads/downloads.cgi. The evaluated parsers are also available online from each developer's site.


Subject(s)
Natural Language Processing , Protein Interaction Mapping/methods , Algorithms , Databases, Protein , Proteins/chemistry , Proteins/metabolism
12.
Genome Biol ; 9 Suppl 2: S6, 2008.
Article in English | MEDLINE | ID: mdl-18834497

ABSTRACT

We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human and machine readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.


Subject(s)
Biomedical Research/methods , Computational Biology/methods , Information Storage and Retrieval , Internet , Humans
13.
Pac Symp Biocomput ; : 616-27, 2008.
Article in English | MEDLINE | ID: mdl-18229720

ABSTRACT

Recently, several text mining programs have reached a near-practical level of performance. Some systems are already being used by biologists and database curators. However, it has also been recognized that current Natural Language Processing (NLP) and Text Mining (TM) technology is not easy to deploy, since research groups tend to develop systems that cater specifically to their own requirements. One of the major reasons for the difficulty of deployment of NLP/TM technology is that re-usability and interoperability of software tools are typically not considered during development. While some effort has been invested in making interoperable NLP/TM toolkits, the developers of end-to-end systems still often struggle to reuse NLP/TM tools, and often opt to develop similar programs from scratch instead. This is particularly the case in BioNLP, since the requirements of biologists are so diverse that NLP tools have to be adapted and re-organized in a much more extensive manner than was originally expected. Although generic frameworks like UIMA (Unstructured Information Management Architecture) provide promising ways to solve this problem, the solution that they provide is only partial. In order for truly interoperable toolkits to become a reality, we also need sharable type systems and a developer-friendly environment for software integration that includes functionality for systematic comparisons of available tools, a simple I/O interface, and visualization tools. In this paper, we describe such an environment that was developed based on UIMA, and we show its feasibility through our experience in developing a protein-protein interaction (PPI) extraction system.


Subject(s)
Computational Biology , Protein Interaction Mapping/statistics & numerical data , Information Storage and Retrieval , Natural Language Processing