Results 1 - 20 of 22
1.
J Biomed Inform ; 91: 103120, 2019 03.
Article in English | MEDLINE | ID: mdl-30753949

ABSTRACT

Concept extraction is an important step in clinical natural language processing. Once extracted, the use of concepts can improve the accuracy and generalization of downstream systems. We present a new unsupervised system for the extraction of concepts from clinical text. The system creates representations of concepts from the Unified Medical Language System (UMLS®) by combining natural language descriptions of concepts with word representations, and composing these into higher-order concept vectors. These concept vectors are then used to assign labels to candidate phrases which are extracted using a syntactic chunker. Our approach scores an exact F-score of 0.32 and an inexact F-score of 0.45 on the well-known i2b2-2010 challenge corpus, outperforming the only other unsupervised concept extraction method. As our approach relies only on word representations and a chunker, it is completely unsupervised. As such, it can be applied to languages and corpora for which we do not have prior annotations. All our code is open-source and can be found at www.github.com/clips/conch.


Subject(s)
Semantics; Unified Medical Language System; Unsupervised Machine Learning
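The labeling step described in the abstract above can be illustrated with a minimal sketch. It assumes that concept vectors are built by averaging word vectors and that candidate phrases are labeled by cosine similarity; the abstract does not state the composition function, so both choices are assumptions, and the toy vectors, labels, and function names are hypothetical. The authors' actual implementation is in their repository.

```python
import math

# Toy 3-dimensional word vectors with made-up values (assumption);
# a real system would load pretrained word representations.
WORD_VECTORS = {
    "chest": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "aspirin": [0.1, 0.9, 0.0],
    "tablet": [0.2, 0.8, 0.1],
}


def compose(words):
    """Average the vectors of the known words (mean composition is an assumption)."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical concept descriptions, one per label.
CONCEPT_VECTORS = {
    "problem": compose(["chest", "pain"]),
    "treatment": compose(["aspirin", "tablet"]),
}


def label_phrase(phrase):
    """Return the label of the most similar concept vector, or None if no word is known."""
    vec = compose(phrase.lower().split())
    if vec is None:
        return None
    return max(CONCEPT_VECTORS, key=lambda label: cosine(vec, CONCEPT_VECTORS[label]))
```

In this sketch, a candidate phrase produced by a chunker would be passed to `label_phrase`; no annotated training data is involved, which is what makes the approach unsupervised.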
2.
J Biomed Inform ; 84: 103-113, 2018 08.
Article in English | MEDLINE | ID: mdl-29966746

ABSTRACT

We have three contributions in this work: 1. We explore the utility of a stacked denoising autoencoder and a paragraph vector model to learn task-independent dense patient representations directly from clinical notes. To analyze if these representations are transferable across tasks, we evaluate them in multiple supervised setups to predict patient mortality, primary diagnostic and procedural category, and gender. We compare their performance with sparse representations obtained from a bag-of-words model. We observe that the learned generalized representations significantly outperform the sparse representations when we have few positive instances to learn from, and there is an absence of strong lexical features. 2. We compare the model performance of the feature set constructed from a bag of words to that obtained from medical concepts. In the latter case, concepts represent problems, treatments, and tests. We find that concept identification does not improve the classification performance. 3. We propose novel techniques to facilitate model interpretability. To understand and interpret the representations, we explore the best encoded features within the patient representations obtained from the autoencoder model. Further, we calculate feature sensitivity across two networks to identify the most significant input features for different classification tasks when we use these pretrained representations as the supervised input. We successfully extract the most influential features for the pipeline using this technique.


Subject(s)
Medical Informatics/methods; Medical Records; Pattern Recognition, Automated; Algorithms; Databases, Factual; Female; Humans; Language; Machine Learning; Male; Models, Statistical; Mortality; Natural Language Processing; Neural Networks, Computer; ROC Curve; Reproducibility of Results; Semantics; Software
3.
J Biomed Inform ; 74: 92-103, 2017 10.
Article in English | MEDLINE | ID: mdl-28919106

ABSTRACT

A multitude of information sources is present in the electronic health record (EHR), each of which can contain clues to automatically assign diagnosis and procedure codes. These sources, however, show information overlap and quality differences, which complicates the retrieval of these clues. Through feature selection, a denser representation with a consistent quality and less information overlap can be obtained. We introduce and compare coverage-based feature selection methods, based on confidence and information gain. These approaches were evaluated over a range of medical specialties, with seven different medical specialties for ICD-9-CM code prediction (six at the Antwerp University Hospital and one in the MIMIC-III dataset) and two different medical specialties for ICD-10-CM code prediction. Using confidence coverage to integrate all sources in an EHR shows a consistent improvement in F-measure (49.83% for diagnosis codes on average), both compared with the baseline (44.25% for diagnosis codes on average) and with using the best standalone source (44.41% for diagnosis codes on average). Confidence coverage creates a concise patient stay representation independent of a rigid framework such as UMLS, and contains easily interpretable features. Confidence coverage has several advantages over a baseline setup. In our baseline setup, feature selection was limited to a filter removing features with fewer than five total occurrences in the training set. Prediction results improved consistently when using multiple heterogeneous sources to predict clinical codes, while reducing the number of features and the processing time.


Subject(s)
Electronic Health Records; International Classification of Diseases; Algorithms; Humans
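The baseline filter mentioned in the abstract above (dropping features that occur fewer than five times in total in the training set) is simple enough to sketch. This is only an illustration of that baseline, not of the coverage-based methods themselves; the function name and the representation of a patient stay as a list of feature strings are assumptions.

```python
from collections import Counter


def filter_rare_features(training_docs, min_count=5):
    """Keep only features whose total count over the training set is at least min_count.

    training_docs: list of documents, each a list of feature strings (assumed format).
    Returns the filtered documents and the set of retained features.
    """
    totals = Counter()
    for doc in training_docs:
        totals.update(doc)  # counts every occurrence, not just document frequency
    kept = {feature for feature, count in totals.items() if count >= min_count}
    filtered = [[feature for feature in doc if feature in kept] for doc in training_docs]
    return filtered, kept
```

The retained feature set would be computed on the training data only and then applied unchanged to test data, so that the filter does not see test-set counts.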
4.
J Biomed Inform ; 75S: S112-S119, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28602906

ABSTRACT

The CEGS N-GRID 2016 Shared Task (Filannino et al., 2017) in Clinical Natural Language Processing introduces the assignment of a severity score to a psychiatric symptom, based on a psychiatric intake report. We present a method that employs the inherent interview-like structure of the report to extract relevant information from the report and generate a representation. The representation consists of a restricted set of psychiatric concepts (and the context they occur in), identified using medical concepts defined in UMLS that are directly related to the psychiatric diagnoses present in the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV) ontology. Random Forests provides a generalization of the extracted, case-specific features in our representation. The best variant presented here scored an inverse mean absolute error (MAE) of 80.64%. A concise concept-based representation, paired with identification of concept certainty and scope (family, patient), shows a robust performance on the task.


Subject(s)
Mental Disorders/psychology; Adult; Algorithms; Humans; Machine Learning; Natural Language Processing; Severity of Illness Index
5.
J Biomed Inform ; 69: 118-127, 2017 05.
Article in English | MEDLINE | ID: mdl-28400312

ABSTRACT

Clinical codes are used for public reporting purposes, are fundamental to determining public financing for hospitals, and form the basis for reimbursement claims to insurance providers. They are assigned to a patient stay to reflect the diagnosis and performed procedures during that stay. This paper aims to enrich algorithms for automated clinical coding by taking a data-driven approach and by using unsupervised and semi-supervised techniques for the extraction of multi-word expressions that convey a generalisable medical meaning (referred to as concepts). Several methods for extracting concepts from text are compared, two of which are constructed from a large unannotated corpus of clinical free text. A distributional semantic model (in this case, the word2vec skip-gram model) is used to generalize over concepts and retrieve relations between them. These methods are validated on three sets of patient stay data, in the disease areas of urology, cardiology, and gastroenterology. The datasets are in Dutch, which introduces a limitation on available concept definitions from expert-based ontologies (e.g. UMLS). The results show that when expert-based knowledge in ontologies is unavailable, concepts derived from raw clinical texts are a reliable alternative. Both concepts derived from raw clinical texts and concepts derived from expert-created dictionaries outperform a bag-of-words approach in clinical code assignment. Adding features based on tokens that appear in a semantically similar context has a positive influence for predicting diagnostic codes. Furthermore, the experiments indicate that a distributional semantics model can find relations between semantically related concepts in texts but also introduces erroneous and redundant relations, which can undermine clinical coding performance.


Subject(s)
Clinical Coding; Knowledge Bases; Natural Language Processing; Semantics; Algorithms; Humans; Language; Netherlands
6.
Lang Speech ; 56(Pt 3): 309-28, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24416959

ABSTRACT

Memory-based language processing (MBLP) is an approach to language processing based on exemplar storage during learning and analogical reasoning during processing. From a cognitive perspective, the approach is attractive as a model for human language processing because it does not make any assumptions about the way abstractions are shaped, nor any a priori distinction between regular and exceptional exemplars, allowing it to explain fluidity of linguistic categories, and both regularization and irregularization in processing. Schema-like behaviour and the emergence of categories can be explained in MBLP as by-products of analogical reasoning over exemplars in memory. We focus on the reliance of MBLP on local (versus global) estimation, which is a relatively poorly understood but unique characteristic that separates the memory-based approach from globally abstracting approaches in how the model deals with redundancy and parsimony. We compare our model to related analogy-based methods, as well as to example-based frameworks that assume some systemic form of abstraction.


Subject(s)
Language; Memory; Humans; Learning; Linguistics; Models, Theoretical
7.
Front Artif Intell ; 6: 986890, 2023.
Article in English | MEDLINE | ID: mdl-37275533

ABSTRACT

Introduction: We examine the profiles of hate speech authors in a multilingual dataset of Facebook reactions to news posts discussing topics related to migrants and the LGBT+ community. The included languages are English, Dutch, Slovenian, and Croatian. Methods: First, all utterances were manually annotated as hateful or acceptable speech. Next, we used binary logistic regression to inspect how the production of hateful comments is impacted by authors' profiles (i.e., their age, gender, and language). Results: Our results corroborate previous findings: in all four languages, men produce more hateful comments than women, and people produce more hate speech as they grow older. But our findings also add important nuance to previously attested tendencies: specific age and gender dynamics vary slightly in different languages or cultures, suggesting that distinct (e.g., socio-political) realities are at play. Discussion: Finally, we discuss why author demographics are important in the study of hate speech: the profiles of prototypical "haters" can be used for hate speech detection, for raising awareness of, and for counter-initiatives against, the spread of (online) hatred.

8.
BMJ Open ; 13(2): e066367, 2023 02 10.
Article in English | MEDLINE | ID: mdl-36764726

ABSTRACT

BACKGROUND: Pregnant women, foetuses and infants are at risk of infectious disease-related complications. Maternal vaccination is a strategy developed to better protect pregnant women and their offspring against infectious disease-related morbidity and mortality. Vaccines against influenza, pertussis and recently also COVID-19 are widely recommended for pregnant women. Yet, there is still a significant amount of hesitation towards maternal vaccination policies. Furthermore, contradictory messages circulating on social media impact vaccine confidence. OBJECTIVES: This scoping review aims to reveal how COVID-19 and COVID-19 vaccination impacted vaccine confidence in pregnant and lactating women. Additionally, this review studied the role social media plays in creating opinions towards vaccination in these target groups. ELIGIBILITY CRITERIA: Articles published between 23 November 2018 and 18 July 2022 that are linked to the objectives of this review were included. Reviews, articles not focusing on the target group, abstracts, and articles describing outcomes of COVID-19 infection/COVID-19 vaccination were excluded. SOURCES OF EVIDENCE: The PubMed database was searched to select articles. Search terms used were linked to pregnancy, lactation, vaccination, vaccine hesitancy, COVID-19 and social media. CHARTING METHODS: Included articles were abstracted and synthesised by one reviewer. Verification was done by a second reviewer. Disagreements were addressed through discussion between reviewers and other researchers. RESULTS: Pregnant and lactating women are generally less likely to accept a COVID-19 vaccine compared with non-pregnant and non-nursing women. The main reason to refuse maternal vaccination is safety concerns. A positive link was detected between COVID-19 vaccine willingness and acceptance of other vaccines during pregnancy. The internet and social media are identified as important information sources for maternal vaccination.
DISCUSSION AND CONCLUSION: Vaccine hesitancy in pregnant and lactating women remains an important issue, expressing the need for effective interventions to increase vaccine confidence and coverage. The role social media plays in vaccine uptake remains unclear.


Subject(s)
COVID-19; Communicable Diseases; Social Media; Pregnancy; Female; Humans; COVID-19 Vaccines; Lactation; Pandemics/prevention & control; COVID-19/epidemiology; COVID-19/prevention & control; Pregnant Women; Vaccination
9.
JMIR Form Res ; 7: e41148, 2023 May 08.
Article in English | MEDLINE | ID: mdl-37074978

ABSTRACT

BACKGROUND: Chatbots are increasingly used to support COVID-19 vaccination programs. Their persuasiveness may depend on the conversation-related context. OBJECTIVE: This study aims to investigate the moderating role of the conversation quality and chatbot expertise cues in the effects of expressing empathy/autonomy support using COVID-19 vaccination chatbots. METHODS: This experiment with 196 Dutch-speaking adults living in Belgium, who engaged in a conversation with a chatbot providing vaccination information, used a 2 (empathy/autonomy support expression: present vs absent) × 2 (chatbot expertise cues: expert endorser vs layperson endorser) between-subject design. Chatbot conversation quality was assessed through actual conversation logs. Perceived user autonomy (PUA), chatbot patronage intention (CPI), and vaccination intention shift (VIS) were measured after the conversation, coded from 1 to 5 (PUA, CPI) and from -5 to 5 (VIS). RESULTS: There was a negative interaction effect of chatbot empathy/autonomy support expression and conversation fallback (CF; the percentage of chatbot answers "I do not understand" in a conversation) on PUA (PROCESS macro, model 1, B=-3.358, SE 1.235, t(186)=2.718, P=.007). Specifically, empathy/autonomy support expression had a more negative effect on PUA when the CF was higher (conditional effect of empathy/autonomy support expression at the CF level of +1SD: B=-0.405, SE 0.158, t(186)=2.564, P=.011; conditional effects nonsignificant for the mean level: B=-0.103, SE 0.113, t(186)=0.914, P=.36; conditional effects nonsignificant for the -1SD level: B=0.031, SE 0.123, t(186)=0.252, P=.80).
Moreover, an indirect effect of empathy/autonomy support expression on CPI via PUA was more negative when CF was higher (PROCESS macro, model 7, 5000 bootstrap samples, moderated mediation index=-3.676, BootSE 1.614, 95% CI -6.697 to -0.102; conditional indirect effect at the CF level of +1SD: B=-0.443, BootSE 0.202, 95% CI -0.809 to -0.005; conditional indirect effects nonsignificant for the mean level: B=-0.113, BootSE 0.124, 95% CI -0.346 to 0.137; conditional indirect effects nonsignificant for the -1SD level: B=0.034, BootSE 0.132, 95% CI -0.224 to 0.305). Indirect effects of empathy/autonomy support expression on VIS via PUA were marginally more negative when CF was higher. No effects of chatbot expertise cues were found. CONCLUSIONS: The findings suggest that expressing empathy/autonomy support using a chatbot may harm its evaluation and persuasiveness when the chatbot fails to answer its users' questions. The paper adds to the literature on vaccination chatbots by exploring the conditional effects of chatbot empathy/autonomy support expression. The results will guide policy makers and chatbot developers dealing with vaccination promotion in designing the way chatbots express their empathy and support for user autonomy.

10.
JMIR Med Inform ; 10(4): e37771, 2022 Apr 27.
Article in English | MEDLINE | ID: mdl-35442903

ABSTRACT

BACKGROUND: Electronic medical records have opened opportunities to analyze clinical practice at large scale. Structured registries and coding procedures such as the International Classification of Primary Care further improved these procedures. However, a large part of the information about the state of the patient and the doctors' observations is still entered in free text fields. The main function of those fields is to report the doctor's line of thought, to remind the doctor and colleagues of follow-up actions, and to be accountable for clinical decisions. These fields contain rich information that can be complementary to that in coded fields, and until now, they have been hardly used for analysis. OBJECTIVE: This study aims to develop a prediction model to convert the free text information on COVID-19-related symptoms from out-of-hours care electronic medical records into usable symptom-based data that can be analyzed at large scale. METHODS: The design was a feasibility study in which we examined the content of the raw data, steps and methods for modelling, as well as the precision and accuracy of the models. A data prediction model for 27 preidentified COVID-19-relevant symptoms was developed for a data set derived from the database of primary-care out-of-hours consultations in Flanders. A multiclass, multilabel categorization classifier was developed. We tested two approaches, which were (1) a classical machine learning-based text categorization approach, Binary Relevance, and (2) a deep neural network learning approach with BERTje, including a domain-adapted version. Ethical approval was acquired through the Institutional Review Board of the Institute of Tropical Medicine and the ethics committee of the University Hospital of Antwerpen (ref 20/50/693). RESULTS: The sample set comprised 3957 fields. After cleaning, 2313 could be used for the experiments. Of the 2313 fields, 85% (n=1966) were used to train the model, and 15% (n=347) for testing.
The normal BERTje model performed the best on the data. It reached a weighted F1 score of 0.70 and an exact match ratio or accuracy score of 0.38, indicating the instances for which the model has identified all correct codes. The other models achieved respectable results as well, ranging from 0.59 to 0.70 weighted F1. The Binary Relevance method performed the best on the data without a frequency threshold. As for the individual codes, the domain-adapted version of BERTje performs better on several of the less common objective codes, while BERTje reaches higher F1 scores for the least common labels especially, and for most other codes in general. CONCLUSIONS: The artificial intelligence model BERTje can reliably predict COVID-19-related information from medical records using text mining from the free text fields generated in primary care settings. This feasibility study invites researchers to examine further possibilities to use primary care routine data.
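The two evaluation measures reported in the abstract above, the exact match ratio and the weighted F1 score, can be sketched for a multilabel setting as follows. This is a generic illustration, not the study's evaluation code: the abstract does not say how the scores were computed, so the support-weighted averaging of per-label F1 is an assumption, and the symptom labels in the usage example are made up.

```python
def exact_match_ratio(true_sets, pred_sets):
    """Fraction of instances whose predicted label set equals the true label set exactly."""
    matches = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return matches / len(true_sets)


def weighted_f1(true_sets, pred_sets):
    """Per-label F1, averaged with weights equal to each label's number of true instances."""
    labels = set().union(*true_sets, *pred_sets)
    pairs = list(zip(true_sets, pred_sets))
    weighted_sum = 0.0
    total_support = 0
    for label in labels:
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        support = tp + fn
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Hypothetical gold and predicted symptom sets for three free-text fields.
gold = [{"fever", "cough"}, {"cough"}, {"fatigue"}]
predicted = [{"fever", "cough"}, {"cough", "fever"}, set()]
print(exact_match_ratio(gold, predicted))
print(weighted_f1(gold, predicted))
```

The exact match ratio is the stricter of the two, because a single missing or extra label makes the whole instance count as wrong; this is consistent with the abstract reporting a much lower exact match ratio (0.38) than weighted F1 (0.70).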

11.
Front Artif Intell ; 4: 738278, 2021.
Article in English | MEDLINE | ID: mdl-34527942

ABSTRACT

The present study examines how teenagers adapt their language use to that of their conversation partner (i.e., the linguistic phenomenon of accommodation) in interactions with peers (intragenerational communication) and with older interlocutors (intergenerational communication). We analyze a large corpus of Flemish teenagers' conversations on Facebook Messenger and WhatsApp, which appear to be highly peer-oriented. With Poisson models, we examine whether the teenage participants adjust their writing style to older interlocutors. The same trend emerges for three sets of prototypical markers of the informal online genre: teenagers insert significantly fewer of these markers when interacting with older interlocutors, thus matching their interlocutors' style and increasing linguistic similarity. Finally, the analyses reveal subtle differences in accommodation patterns for the distinct linguistic variables with respect to the impact of the teenagers' sociodemographic profiles and their interlocutors' age.

12.
PLoS One ; 14(2): e0212134, 2019.
Article in English | MEDLINE | ID: mdl-30811448

ABSTRACT

We introduce a novel machine learning approach for investigating speech processing with cochlear implants (CIs), prostheses used to replace a damaged inner ear. Concretely, we use a simple perceptron and a deep convolutional network to classify speech spectrograms that are modified to approximate CI-delivered speech. Implant-delivered signals suffer from reduced spectral resolution, chiefly due to a small number of frequency channels and a phenomenon called channel interaction. The latter involves the spread of information from neighboring channels to similar populations of neurons and can be modeled by linearly combining adjacent channels. We find that early during training, this input modification degrades performance if the networks are first pre-trained on high-resolution speech (with a larger number of channels and without added channel interaction). This suggests that the spectral degradation caused by channel interaction alters the signal to conflict with perceptual expectations acquired from high-resolution speech. We thus predict that a reduction of channel interaction will accelerate learning in CI users who are implanted after having adapted to high-resolution speech during normal hearing. (The code for replicating our experiments is available online: https://github.com/clips/SimulatingCochlearImplants).


Subject(s)
Cochlear Implants; Deep Learning; Models, Theoretical; Speech Perception
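The abstract above states that channel interaction can be modeled by linearly combining adjacent channels. A minimal sketch of one such combination is given below. It assumes that one spectrogram frame is a list of per-channel energies and that each channel is mixed only with its immediate neighbours; the `spread` weight and the function name are hypothetical, and the authors' repository remains the reference for their actual model.

```python
def add_channel_interaction(frame, spread=0.25):
    """Mix each channel of one spectrogram frame with its immediate neighbours.

    `spread` is a hypothetical weight given to each neighbour; the centre
    channel keeps the remaining weight, so each output is a convex combination.
    """
    n = len(frame)
    mixed = []
    for i in range(n):
        neighbours = [frame[j] for j in (i - 1, i + 1) if 0 <= j < n]
        centre_weight = 1.0 - spread * len(neighbours)
        mixed.append(centre_weight * frame[i] + spread * sum(neighbours))
    return mixed
```

Applied to a frame with energy in a single channel, this smears part of that energy into the neighbouring channels, which is one simple way to reduce the effective spectral resolution of the input. Setting `spread=0.0` leaves the frame unchanged.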
13.
Front Psychol ; 10: 80, 2019.
Article in English | MEDLINE | ID: mdl-30761044

ABSTRACT

One of the tasks faced by young children is the segmentation of a continuous stream of speech into discrete linguistic units. Early in development, syllables emerge as perceptual primitives, and the wholesale storage of syllable chunks is one possible strategy for bootstrapping the segmentation process. Here, we investigate what types of chunks children store. Our method involves selecting syllabified utterances from corpora of child-directed speech, which we vary according to (a) their length in syllables, (b) the mutual predictability of their syllables, and (c) their frequency. We then use the number of utterances within which words are contained to predict the time course of word learning, arguing that utterances which perform well at this task are also more likely to be stored, by young children, as undersegmented chunks. Our results show that short utterances are best-suited for predicting when children acquire the words contained within them, although the effect is rather small. Beyond this, we also find that short utterances are the most likely to correspond to words. Together, the two findings suggest that children may not store many complete utterances as undersegmented chunks, with most of the units that children store as hypothesized words corresponding to actual words. However, dovetailing with an item-based account of language acquisition, when children do store undersegmented chunks, these are likely to be short sequences, not frequent or internally predictable multi-word chunks. We end by discussing implications for work on formulaic multi-word sequences.

14.
PLoS One ; 13(12): e0209449, 2018.
Article in English | MEDLINE | ID: mdl-30592738

ABSTRACT

This paper analyzes distributional properties that facilitate the categorization of words into lexical categories. First, word-context co-occurrence counts were collected using corpora of transcribed English child-directed speech. Then, an unsupervised k-nearest neighbor algorithm was used to categorize words into lexical categories. The categorization outcome was regressed over three main distributional predictors computed for each word, including frequency, contextual diversity, and average conditional probability given all the co-occurring contexts. Results show that both contextual diversity and frequency have a positive effect while the average conditional probability has a negative effect. This indicates that words are easier to categorize in the face of uncertainty: categorization works best for words which are frequent, diverse, and hard to predict given the co-occurring contexts. This shows how, in order for the learner to see an opportunity to form a category, there needs to be a certain degree of uncertainty in the co-occurrence pattern.


Subject(s)
Language Development; Learning/physiology; Models, Psychological; Speech/physiology; Uncertainty; Algorithms; Child; Humans
15.
PLoS One ; 13(10): e0203794, 2018.
Article in English | MEDLINE | ID: mdl-30296299

ABSTRACT

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute the most for the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.


Subject(s)
Cyberbullying/psychology; Internet; Semantics; Social Media; Crime Victims/psychology; Humans; Language; Support Vector Machine
16.
Front Psychol ; 8: 555, 2017.
Article in English | MEDLINE | ID: mdl-28450842

ABSTRACT

Previous studies have suggested that children and adults form cognitive representations of co-occurring word sequences. We propose (1) that the formation of such multi-word unit (MWU) representations precedes and facilitates the formation of single-word representations in children and thus benefits word learning, and (2) that MWU representations facilitate adult word recognition and thus benefit lexical processing. Using a modified version of an existing computational model (McCauley and Christiansen, 2014), we extract MWUs from a corpus of child-directed speech (CDS) and a corpus of conversations among adults. We then correlate the number of MWUs within which each word appears with (1) age of first production and (2) adult reaction times on a word recognition task. In doing so, we take care to control for the effect of word frequency, as frequent words will naturally tend to occur in many MWUs. We also compare results to a baseline model which randomly groups words into sequences, and find that MWUs have a unique facilitatory effect on both response variables, suggesting that they benefit word learning in children and word recognition in adults. The effect is strongest on age of first production, implying that MWUs are comparatively more important for word learning than for adult lexical processing. We discuss possible underlying mechanisms and formulate testable predictions.

17.
J Am Med Inform Assoc ; 23(e1): e11-9, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26316458

ABSTRACT

OBJECTIVE: Enormous amounts of healthcare data are becoming increasingly accessible through the large-scale adoption of electronic health records. In this work, structured and unstructured (textual) data are combined to assign clinical diagnostic and procedural codes (specifically ICD-9-CM) to patient stays. We investigate whether integrating these heterogeneous data types improves prediction strength compared to using the data types in isolation. METHODS: Two separate data integration approaches were evaluated. Early data integration combines features of several sources within a single model, and late data integration learns a separate model per data source and combines these predictions with a meta-learner. This is evaluated on data sources and clinical codes from a broad set of medical specialties. RESULTS: When compared with the best individual prediction source, late data integration leads to improvements in predictive power (eg, overall F-measure increased from 30.6% to 38.3% for International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic codes), while early data integration is less consistent. The predictive strength strongly differs between medical specialties, both for ICD-9-CM diagnostic and procedural codes. DISCUSSION: Structured data provides complementary information to unstructured data (and vice versa) for predicting ICD-9-CM codes. This can be captured most effectively by the proposed late data integration approach. CONCLUSIONS: We demonstrated that models using multiple electronic health record data sources systematically outperform models using data sources in isolation in the task of predicting ICD-9-CM codes over a broad range of medical specialties.


Subject(s)
Clinical Coding/methods; Electronic Health Records/organization & administration; International Classification of Diseases; Data Mining; Datasets as Topic; Humans; Machine Learning
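The difference between the early and late data integration described in the abstract above can be shown structurally with a small sketch. It is deliberately simplified: the per-source models are passed in as plain functions, and a fixed linear combination stands in for the meta-learner, whose weights would normally be fitted on held-out predictions. All names, weights, and the 0.5 decision threshold are hypothetical.

```python
def early_integration(sources):
    """Early integration: merge the features of all sources into one feature space.

    sources: dict mapping a source name to a dict of feature values (assumed format).
    A single model would then be trained on the merged features.
    """
    merged = {}
    for name, features in sources.items():
        for key, value in features.items():
            merged[f"{name}:{key}"] = value
    return merged


def late_integration(sources, base_models, meta_weights, bias=0.0, threshold=0.5):
    """Late integration: score each source with its own model, then combine the scores.

    The linear combination is a stand-in for a trained meta-learner (assumption).
    Returns the final decision for one clinical code and the per-source scores.
    """
    scores = {name: base_models[name](features) for name, features in sources.items()}
    combined = bias + sum(meta_weights[name] * score for name, score in scores.items())
    return combined > threshold, scores
```

In the late variant each source keeps its own model, so a weak or noisy source can be down-weighted by the combiner instead of diluting a shared feature space.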
18.
Artif Intell Med ; 26(1-2): 87-107, 2002.
Article in English | MEDLINE | ID: mdl-12234719

ABSTRACT

Present-day healthcare witnesses a growing demand for coordination of patient care. Coordination is needed especially in those cases in which hospitals have structured healthcare into specialty-oriented units, while a substantial portion of patient care is not limited to single units. From a logistic point of view, this multi-disciplinary patient care creates a tension between controlling the hospital's units, and the need for a control of the patient flow between units. A possible solution is the creation of new units in which different specialties work together for specific groups of patients. A first step in this solution is to identify the salient patient groups in need of multi-disciplinary care. Grouping techniques seem to offer a solution. However, most grouping approaches in medicine are driven by a search for pathophysiological homogeneity. In this paper, we present an alternative logistic-driven grouping approach. The starting point of our approach is a database with medical cases for 3,603 patients with peripheral arterial vascular (PAV) diseases. For these medical cases, six basic logistic variables (such as the number of visits to different specialists) are selected. Using these logistic variables, clustering techniques are used to group the medical cases in logistically homogeneous groups. In our approach, the quality of the resulting grouping is not measured by statistical significance, but by (i) the usefulness of the grouping for the creation of new multi-disciplinary units; (ii) how well patients can be selected for treatment in the new units. Given a priori knowledge of a patient (e.g. age, diagnosis), machine learning techniques are employed to induce rules that can be used for the selection of the patients eligible for treatment in the new units. In the paper, we describe the results of the above-proposed methodology for patients with PAV diseases. Two groupings and the accompanying classification rule sets are presented.
One grouping is based on all the logistic variables, and another grouping is based on two latent factors found by applying factor analysis. On the basis of the experimental results, we can conclude that it is possible to search for medical logistic homogeneous groups (i) that can be characterized by rules based on the aggregated logistic variables; (ii) for which we can formulate rules to predict to which cluster new patients belong.


Subject(s)
Databases, Factual; Logistic Models; Patient Care Planning; Patient Care Team; Peripheral Vascular Diseases/therapy; Factor Analysis, Statistical; Humans; Interprofessional Relations
19.
Biomed Inform Insights ; 5(Suppl. 1): 61-9, 2012.
Article in English | MEDLINE | ID: mdl-22879761

ABSTRACT

We present a system to automatically identify emotion-carrying sentences in suicide notes and to detect the specific fine-grained emotion conveyed. With this system, we competed in Track 2 of the 2011 Medical NLP Challenge, where the task was to distinguish between fifteen emotion labels, from guilt, sorrow, and hopelessness to hopefulness and happiness. Since a sentence can be annotated with multiple emotions, we designed a thresholding approach that enables assigning multiple labels to a single instance. We rely on the probability estimates returned by an SVM classifier and experimentally set thresholds on these probabilities. Emotion labels are assigned only if their probability exceeds a certain threshold and if the probability of the sentence being emotion-free is low enough. We show the advantages of this thresholding approach by comparing it to a naïve system that assigns only the most probable label to each test sentence, and to a system trained on emotion-carrying sentences only.
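The thresholding rule described in the abstract above lends itself to a short sketch. It assumes the classifier's probability estimates are already available as a dictionary per sentence; the threshold values, the `no_emotion` label name, and the use of one shared label threshold are assumptions, since the abstract only says the thresholds were set experimentally.

```python
def assign_emotions(probabilities, label_threshold=0.3, neutral_threshold=0.5,
                    neutral_label="no_emotion"):
    """Assign every emotion whose probability exceeds label_threshold,
    but only if the sentence is unlikely to be emotion-free.

    probabilities: dict mapping label names to probability estimates (assumed format).
    Both thresholds are hypothetical placeholder values.
    """
    if probabilities.get(neutral_label, 0.0) >= neutral_threshold:
        return []
    return sorted(
        label
        for label, prob in probabilities.items()
        if label != neutral_label and prob > label_threshold
    )


def most_probable_label(probabilities):
    """Naive comparison system: always return the single most probable label."""
    return max(probabilities, key=probabilities.get)
```

Unlike the naive comparison function, `assign_emotions` can return several labels for one sentence or none at all, which is the property the abstract highlights.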

20.
Genome Biol ; 12(6): R57, 2011 Jun 22.
Article in English | MEDLINE | ID: mdl-21696594

ABSTRACT

We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be.


Subject(s)
Computational Biology; Data Mining; Software; Databases, Factual; Genome-Wide Association Study; Humans; Schizophrenia/genetics; Systems Integration