Results 1 - 13 of 13
1.
PLoS One ; 19(7): e0305362, 2024.
Article in English | MEDLINE | ID: mdl-38976665

ABSTRACT

Disinformation in the medical field is a growing problem that carries a significant risk. Therefore, it is crucial to detect and combat it effectively. In this article, we provide three elements to aid in this fight: 1) a new framework that collects health-related articles from verification entities and facilitates their check-worthiness and fact-checking annotation at the sentence level; 2) a corpus generated using this framework, composed of 10335 sentences annotated in these two concepts and grouped into 327 articles, which we call KEANE (faKe nEws At seNtence lEvel); and 3) a new model for verifying fake news that combines specific identifiers of the medical domain with triplets subject-predicate-object, using Transformers and feedforward neural networks at the sentence level. This model predicts the fact-checking of sentences and evaluates the veracity of the entire article. After training this model on our corpus, we achieved remarkable results in the binary classification of sentences (check-worthiness F1: 0.749, fact-checking F1: 0.698) and in the final classification of complete articles (F1: 0.703). We also tested its performance against another public dataset and found that it performed better than most systems evaluated on that dataset. Moreover, the corpus we provide differs from other existing corpora in its duality of sentence-article annotation, which can provide an additional level of justification of the prediction of truth or untruth made by the model.
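The F1 scores quoted in the abstract above are the harmonic mean of precision and recall. A minimal sketch of that computation for binary labels follows; the function name and the sample labels are illustrative assumptions and are not taken from the paper or its corpus.

```python
def f1_score(y_true, y_pred):
    """Return the F1 score of binary predictions (1 = positive class)."""
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No correct positive prediction: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence-level labels (1 = check-worthy, 0 = not check-worthy).
gold = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
score = f1_score(gold, predicted)
```

With these made-up labels, precision and recall are both 2/3, so the score is also 2/3.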


Subjects
Disinformation , Humans , Neural Networks, Computer , Natural Language Processing , Deception
2.
J Biomed Inform ; 138: 104279, 2023 02.
Article in English | MEDLINE | ID: mdl-36610608

ABSTRACT

BACKGROUND AND OBJECTIVES: Named Entity Recognition (NER) and Relation Extraction (RE) are two of the most studied tasks in biomedical Natural Language Processing (NLP). The detection of specific terms and entities and the relationships between them are key aspects for the development of more complex automatic systems in the biomedical field. In this work, we explore transfer learning techniques for incorporating information about negation into systems performing NER and RE. The main purpose of this research is to analyse to what extent the successful detection of negated entities in separate tasks helps in the detection of biomedical entities and their relationships. METHODS: Three neural architectures are proposed in this work, all of them mainly based on Bidirectional Long Short-Term Memory (Bi-LSTM) networks and Conditional Random Fields (CRFs). While the first architecture is devoted to detecting triggers and scopes of negated entities in any domain, two specific models are developed for performing isolated NER tasks and joint NER and RE tasks in the biomedical domain. Then, weights related to negation detection learned by the first architecture are incorporated into those last models. Two different languages, Spanish and English, are taken into account in the experiments. RESULTS: Performance of the biomedical models is analysed both when the weights of the neural networks are randomly initialized, and when weights from the negation detection model are incorporated into them. Improvements of around 3.5% of F-Measure in the English language and more than 7% in the Spanish language are achieved in the NER task, while the NER+RE task increases F-Measure scores by more than 13% for the NER submodel and around 2% for the RE submodel. CONCLUSIONS: The obtained results allow us to conclude that negation-based transfer learning techniques are appropriate for performing biomedical NER and RE tasks. 
These results highlight the importance of detecting negation for improving the identification of biomedical entities and their relationships. The explored techniques show robustness by maintaining consistent results and improvements across different tasks and languages.


Subjects
Language , Neural Networks, Computer , Natural Language Processing , Machine Learning
3.
Sci Rep ; 12(1): 18208, 2022 10 28.
Article in English | MEDLINE | ID: mdl-36307506

ABSTRACT

Acquired immunodeficiency syndrome (AIDS) is still one of the main health problems worldwide. It is therefore essential to keep making progress in improving the prognosis and quality of life of affected patients. One way to advance along this pathway is to uncover connections between other disorders associated with HIV/AIDS, so that they can be anticipated and possibly mitigated. We propose to achieve this by using Association Rules (ARs). They allow us to represent the dependencies between a number of diseases and other specific diseases. However, classical techniques systematically generate every AR meeting some minimal conditions on data frequency, hence generating a vast number of uninteresting ARs, which need to be filtered out. The lack of manually annotated ARs has favored unsupervised filtering techniques, even though they produce limited results. In this paper, we propose a semi-supervised system, able to identify relevant ARs among HIV-related diseases with a minimal amount of annotated training data. Our system has been able to extract a good number of relationships between HIV-related diseases that have been previously detected in the literature but are scattered and often little known. Furthermore, a number of plausible new relationships have shown up which deserve further investigation by qualified medical experts.


Subjects
Acquired Immunodeficiency Syndrome , HIV Infections , Humans , Quality of Life , Machine Learning
4.
BMC Med Inform Decis Mak ; 22(1): 20, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35073885

ABSTRACT

BACKGROUND: Association Rules are one of the main ways to represent structural patterns underlying raw data. They represent dependencies between sets of observations contained in the data. The associations established by these rules are very useful in the medical domain, for example in the predictive health field. Classic algorithms for association rule mining give rise to huge numbers of possible rules that should be filtered in order to select those most likely to be true. Most of the proposed techniques for these tasks are unsupervised. However, the accuracy provided by unsupervised systems is limited. Conversely, resorting to annotated data for training supervised systems is expensive and time-consuming. The purpose of this research is to design a new semi-supervised algorithm that performs like supervised algorithms but uses an affordable amount of training data. METHODS: In this work we propose a new semi-supervised data mining model that combines unsupervised techniques (Fisher's exact test) with limited supervision. Starting with a small seed of annotated data, the model improves on the results (F-measure) obtained using a fully supervised system (standard supervised ML algorithms). The idea is based on utilising the agreement between the predictions of the supervised system and those of the unsupervised techniques in a series of iterative steps. RESULTS: The new semi-supervised ML algorithm improves on the results of supervised algorithms, measured with the F-measure, in the task of mining medical association rules, while training with an affordable amount of manually annotated data. CONCLUSIONS: Using a small amount of annotated data (which is easily achievable) leads to results similar to those of a supervised system. The proposal may be an important step for the practical development of techniques for mining association rules and generating new valuable scientific medical knowledge.
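The abstract above names Fisher's exact test as the unsupervised component for filtering candidate rules. The sketch below shows only that component: a one-sided test on the 2x2 contingency table of a candidate rule A -> B, written with the standard library. The table layout, the function names, the 0.05 threshold and the counts are assumptions for illustration; the paper's iterative agreement procedure is not reproduced here.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for a 2x2 table.

    Assumed layout for a candidate rule A -> B:
                   B present   B absent
      A present        a           b
      A absent         c           d
    Returns P(X >= a) under the hypergeometric null of independence.
    """
    row1 = a + b          # records where A is present
    col1 = a + c          # records where B is present
    n = a + b + c + d     # all records
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def is_significant(p_value, alpha=0.05):
    """Keep a rule only if its p-value falls below the (assumed) threshold."""
    return p_value < alpha


# Hypothetical counts: A and B co-occur in 3 records, A alone in 1,
# B alone in 1, and neither in 3.
p = fisher_exact_greater(3, 1, 1, 3)
keep_rule = is_significant(p)
```

For these counts the p-value is 17/70, so the hypothetical rule would not pass a 0.05 threshold.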


Subjects
Algorithms , Supervised Machine Learning , Data Mining/methods , Humans
5.
Artif Intell Med ; 121: 102177, 2021 11.
Article in English | MEDLINE | ID: mdl-34763812

ABSTRACT

BACKGROUND AND OBJECTIVES: The 10th version of the International Classification of Diseases (ICD-10) codification system has been widely adopted by the health systems of many countries, including Spain. However, manual code assignment of Electronic Health Records (EHR) is a complex and time-consuming task that requires a great amount of specialised human resources. Therefore, several machine learning approaches are being proposed to assist in the assignment task. In this work we present an alternative system for automatically recommending ICD-10 codes to be assigned to EHRs. METHODS: Our proposal is based on characterising ICD-10 codes by a set of keyphrases that represent them. These keyphrases include not only those that have literally appeared in some EHR with the considered ICD-10 codes assigned, but also others that have been obtained by a statistical process able to capture expressions that have led the annotators to assign the code. RESULTS: The result is an information model that allows codes to be recommended efficiently for a new EHR based on its textual content. We explore an approach that proves to be competitive with other state-of-the-art approaches and can be combined with them to optimise results. CONCLUSIONS: In addition to its effectiveness, the recommendations of this method are easily interpretable, since the phrases in an EHR that lead to the recommendation of an ICD-10 code are known. Moreover, the keyphrases associated with each ICD-10 code can be a valuable additional source of information for other approaches, such as machine learning techniques.


Subjects
Electronic Health Records , International Classification of Diseases , Humans , Machine Learning , Research Design , Workforce
6.
Comput Methods Programs Biomed ; 164: 121-129, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30195420

ABSTRACT

BACKGROUND AND OBJECTIVE: There is a huge number of rare diseases, many of which are associated with significant disabilities. It is paramount to know in advance how the disease will evolve in order to limit and prevent the appearance of disabilities and to prepare the patient to manage future difficulties. Rare disease associations are making an effort to collect this information manually, but it is a long process. A lot of information about the consequences of rare diseases is published in scientific papers, and our goal is to automatically extract from them the disabilities associated with diseases. METHODS: This work presents a new corpus of abstracts from scientific papers related to rare diseases, which has been manually annotated with disabilities. This corpus makes it possible to train machine and deep learning systems that can automatically process other papers, thus extracting new information about the relations between rare diseases and disabilities. The corpus is also annotated with negation and speculation when they affect disabilities. The corpus has been made publicly accessible. RESULTS: We have devised some experiments using deep learning techniques to show the usefulness of the developed corpus. Specifically, we have designed a long short-term memory based architecture for disability identification, as well as a convolutional neural network for detecting their relationships to diseases. The systems designed do not need any preprocessing of the data, only low-dimensional vectors representing the words. CONCLUSIONS: The developed corpus will make it possible to train systems to identify disabilities in biomedical documents, which current annotation systems are not able to detect. The system could also be trained to detect relationships between them and diseases, as well as negation and speculation, which can change the meaning of the language.
The deep learning models designed for identifying disabilities and their relationships to diseases in new documents show that the corpus allows obtaining an F-measure of around 81% for the disability recognition and 75% for relation extraction.


Subjects
Disabled Persons/statistics & numerical data , Neural Networks, Computer , Rare Diseases/etiology , Data Mining , Databases, Factual/statistics & numerical data , Deep Learning , Humans , Natural Language Processing , Semantics
7.
Artif Intell Med ; 87: 9-19, 2018 05.
Article in English | MEDLINE | ID: mdl-29573845

ABSTRACT

Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph, and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining an accuracy improvement of more than 10% in some cases, while requiring only minimal external resources.
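The abstract above performs disambiguation with personalized PageRank over a co-occurrence graph. A minimal sketch of the idea follows, using plain power iteration with the teleport distribution concentrated on the context concepts. The toy graph, the concept names and the damping value are assumptions for illustration; the paper's graph is built from PubMed abstracts and is not reproduced here.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power-iteration personalized PageRank on an adjacency-list graph.

    graph: dict mapping every node to a list of its neighbours.
    seeds: set of nodes (all present in graph) that receive the teleport mass.
    """
    nodes = list(graph)
    # Teleport distribution: uniform over the seed (context) nodes.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: hand its mass back to the seed nodes.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport[m]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical co-occurrence graph: the ambiguous term "cold" has two
# candidate senses, and the context mentions "virus" and "fever".
graph = {
    "virus": ["common_cold", "fever"],
    "fever": ["common_cold", "virus"],
    "common_cold": ["virus", "fever"],
    "cold_temperature": ["weather"],
    "weather": ["cold_temperature"],
}
rank = personalized_pagerank(graph, seeds={"virus", "fever"})
best_sense = max(["common_cold", "cold_temperature"], key=rank.get)
```

Because the random walk restarts at the context concepts, the candidate sense that is better connected to them accumulates more rank and is selected.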


Subjects
Knowledge Bases , Natural Language Processing , Semantics , Algorithms , Datasets as Topic , PubMed , Unified Medical Language System
8.
Acta pediátr. hondu ; 8(2): 819-828, oct. 2017-mar. 2018. ilus
Article in Spanish | LILACS | ID: biblio-1015029

ABSTRACT

Kawasaki disease (KD), an acute vasculitis of unknown etiology, occurs predominantly during childhood. Initial manifestations are high fever, mucocutaneous inflammation, and cervical lymphadenopathy; it can cause coronary artery aneurysms, depressed myocardial contractility and heart failure, myocardial infarction, arrhythmias, and significant morbidity and mortality. Its diagnosis is clinical. Classic KD is diagnosed with fever lasting more than 5 days and at least 4 of the following clinical features: bilateral conjunctival injection, changes in the lips and oral cavity, cervical adenopathy, changes in the extremities, and polymorphous rash. If few clinical findings are present but coronary artery abnormalities are found on echocardiography, the diagnosis can be established. Atypical KD is suspected when there is fever for at least 5 days with two or three of the main symptoms; it can occasionally present as acute abdomen, aseptic meningitis, or pleuritis. The goal of treatment is to avoid systemic inflammation and to prevent thrombosis in aneurysms that have developed. Immunoglobulin (IG) is the cornerstone of treatment; it is started within the first 10 days after the onset of fever (2 g/kg, single dose), with aspirin (80-100 mg/kg per day orally) administered in combination with IG as initial treatment for 4 to 6 weeks. It is important to know the clinical diagnostic criteria in order to detect the disease and thus avoid the vascular complications that threaten the patient's life. This review describes its epidemiology, pathophysiology, clinical manifestations, treatment, and complications...(AU)


Subjects
Humans , Child , Vasculitis/complications , Mucocutaneous Lymph Node Syndrome/diagnosis , Lymphadenopathy , Heart Failure
9.
J Biomed Inform ; 64: 320-332, 2016 12.
Article in English | MEDLINE | ID: mdl-27815227

ABSTRACT

Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multilinguality can help to solve the problem of ambiguity, and the conditions required for a system to improve the results obtained by monolingual approaches. Also, we analyze the best ways to generate those useful multilingual resources, and study different languages and sources of knowledge. The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is evaluated on a test dataset frequently used in biomedicine. We can conclude that multilingual resources are able to provide a clear improvement of more than 7% compared to monolingual approaches, for graphs built from a small number of documents. Also, empirical results show that automatically translated resources are a useful source of information for this particular task.


Subjects
Data Mining , Natural Language Processing , Algorithms , Humans , Knowledge Bases , Unified Medical Language System
10.
PLoS One ; 7(8): e43694, 2012.
Article in English | MEDLINE | ID: mdl-22937081

ABSTRACT

The size and complexity of actual networked systems hinder access to global knowledge of their structure. This fact pushes the problem of navigation to suboptimal solutions, one of them being the extraction of a coherent map of the topology on which navigation takes place. In this paper, we present a Markov chain based algorithm to tag networked terms according only to their topological features. The resulting tagging is used to compute similarity between terms, providing a map of the networked information. This map supports local-based navigation techniques driven by similarity. We assess the efficiency of the resulting paths by comparing their length to that of the shortest path. Additionally, we claim that the path steps towards the destination are semantically coherent. To illustrate the algorithm's performance we provide some results from the Simple English Wikipedia, which amounts to several thousand pages. The simplest greedy strategy yields an average success rate of over 80%. Furthermore, the resulting content-coherent paths most often have a cost between one- and threefold compared to shortest-path lengths.
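The greedy, similarity-driven navigation described in the abstract above can be sketched as follows: at each step the walker moves to the unvisited neighbour whose tag vector is most similar to the target's, and the resulting path is compared with the breadth-first shortest path. The tag vectors here are hand-written placeholders, not output of the paper's Markov chain tagging, and cosine similarity is an assumed choice of similarity measure.

```python
from collections import deque
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_path(graph, tags, start, target, max_steps=50):
    """Walk from start towards target using only local similarity information."""
    path, visited, current = [start], {start}, start
    while current != target and len(path) <= max_steps:
        candidates = [n for n in graph[current] if n not in visited]
        if not candidates:
            return None  # dead end: the greedy strategy fails
        current = max(candidates, key=lambda n: cosine(tags[n], tags[target]))
        visited.add(current)
        path.append(current)
    return path if current == target else None


def shortest_path_length(graph, start, target):
    """Number of edges on a shortest path, found by breadth-first search."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for n in graph[node]:
            if n not in seen:
                seen.add(n)
                queue.append((n, dist + 1))
    return None


# Hypothetical five-page network with placeholder two-dimensional tags.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
tags = {"A": (1.0, 0.0), "B": (0.9, 0.1), "C": (0.5, 0.5),
        "D": (0.2, 0.8), "E": (0.0, 1.0)}
path = greedy_path(graph, tags, "A", "E")
cost_ratio = (len(path) - 1) / shortest_path_length(graph, "A", "E")
```

The cost ratio corresponds to the path-length comparison mentioned in the abstract; in this toy network the greedy path happens to be a shortest path.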


Subjects
Algorithms , Information Storage and Retrieval , Databases, Factual , Markov Chains , Neural Networks, Computer
11.
Phys Rev E Stat Nonlin Soft Matter Phys ; 84(4 Pt 2): 046108, 2011 Oct.
Article in English | MEDLINE | ID: mdl-22181228

ABSTRACT

The mesoscopic structure of complex networks has proven a powerful level of description to understand the linchpins of the system represented by the network. Nevertheless, the mapping of a series of relationships between elements, in terms of a graph, is sometimes not straightforward. Given that all the information we would extract using complex network tools depends on this initial graph, it is mandatory to preprocess the data to build it in the most accurate manner. Here we propose a procedure to build a network, attending only to statistically significant relations between constituents. We use a paradigmatic example of word associations to show the development of our approach. Analyzing the modular structure of the obtained network, we are able to disentangle categorical relations, disambiguating words with success that is comparable to the best algorithms designed to the same end.

12.
PLoS One ; 5(11): e13749, 2010 Nov 12.
Article in English | MEDLINE | ID: mdl-21103058

ABSTRACT

BACKGROUND: The evolutionary origin of cooperation among unrelated individuals remains a key unsolved issue across several disciplines. Prominent among the several mechanisms proposed to explain how cooperation can emerge is the existence of a population structure that determines the interactions among individuals. Many models have explored analytically and by simulation the effects of such a structure, particularly in the framework of the Prisoner's Dilemma, but the results of these models largely depend on details such as the type of spatial structure or the evolutionary dynamics. Therefore, experimental work suitably designed to address this question is needed to probe these issues. METHODS AND FINDINGS: We have designed an experiment to test the emergence of cooperation when humans play Prisoner's Dilemma on a network whose size is comparable to that of simulations. We find that the cooperation level declines to an asymptotic state with low but nonzero cooperation. Regarding players' behavior, we observe that the population is heterogeneous, consisting of a high percentage of defectors, a smaller one of cooperators, and a large group that shares features of the conditional cooperators of public goods games. We propose an agent-based model based on the coexistence of these different strategies that is in good agreement with all the experimental observations. CONCLUSIONS: In our large experimental setup, cooperation was not promoted by the existence of a lattice beyond a residual level (around 20%) typical of public goods experiments. Our findings also indicate that both heterogeneity and a "moody" conditional cooperation strategy, in which the probability of cooperating also depends on the player's previous action, are required to understand the outcome of the experiment. These results could impact the way game theory on graphs is used to model human interactions in structured groups.
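The "moody" conditional cooperation strategy mentioned in the abstract above makes the probability of cooperating depend on both the neighbours' behavior and the player's own previous action. The sketch below is one possible reading of that idea, assuming a linear dependence on the fraction of cooperating neighbours; the functional form and every parameter value are illustrative assumptions, since the abstract gives neither.

```python
import random

# Hypothetical (intercept, slope) pairs keyed by the player's previous action.
# The abstract reports no parameter values; these are placeholders.
MOODY_PARAMS = {"C": (0.4, 0.4), "D": (0.2, 0.0)}


def cooperation_probability(previous_action, cooperating_fraction,
                            params=MOODY_PARAMS):
    """Probability of cooperating, clipped to the interval [0, 1]."""
    intercept, slope = params[previous_action]
    return min(1.0, max(0.0, intercept + slope * cooperating_fraction))


def next_action(previous_action, neighbour_actions, rng):
    """Draw the next action ("C" or "D") of a moody conditional cooperator."""
    fraction = neighbour_actions.count("C") / len(neighbour_actions)
    p = cooperation_probability(previous_action, fraction)
    return "C" if rng.random() < p else "D"


# A player who cooperated last round, with two of four neighbours cooperating.
rng = random.Random(0)
action = next_action("C", ["C", "D", "C", "D"], rng)
```

A plain conditional cooperator would use the same parameters regardless of `previous_action`; keying the parameters on the previous action is what makes the strategy "moody".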


Subjects
Cooperative Behavior , Interpersonal Relations , Social Behavior , Games, Experimental , Humans , Models, Psychological , Surveys and Questionnaires