ABSTRACT
Dementia affects cognitive functions of adults, including memory, language, and behaviour. Standard diagnostic biomarkers such as MRI are costly, whilst neuropsychological tests suffer from sensitivity issues in detecting dementia onset. The analysis of speech and language has emerged as a promising and non-intrusive technology to diagnose and monitor dementia. Currently, most work in this direction ignores the multi-modal nature of human communication and interactive aspects of everyday conversational interaction. Moreover, most studies ignore changes in cognitive status over time due to the lack of consistent longitudinal data. Here we introduce a novel fine-grained longitudinal multi-modal corpus collected in a natural setting from healthy controls and people with dementia over two phases, each spanning 28 sessions. The corpus consists of spoken conversations, a subset of which are transcribed, as well as typed and written thoughts and associated extra-linguistic information such as pen strokes and keystrokes. We present the data collection process and describe the corpus in detail. Furthermore, we establish baselines for capturing longitudinal changes in language across different modalities for two cohorts, healthy controls and people with dementia, outlining future research directions enabled by the corpus.
ABSTRACT
The importance of incorporating Natural Language Processing (NLP) methods in clinical informatics research has been increasingly recognized in recent years, and has led to transformative advances. Typically, clinical NLP systems are developed and evaluated on word, sentence, or document level annotations that model specific attributes and features, such as document content (e.g., patient status, or report type), document section types (e.g., current medications, past medical history, or discharge summary), named entities and concepts (e.g., diagnoses, symptoms, or treatments) or semantic attributes (e.g., negation, severity, or temporality). From a clinical perspective, on the other hand, research studies are typically modelled and evaluated on a patient- or population-level, such as predicting how a patient group might respond to specific treatments or patient monitoring over time. While some NLP tasks consider predictions at the individual or group user level, these tasks still constitute a minority. Owing to the discrepancy between the scientific objectives of the two fields, and because of differences in methodological evaluation priorities, there is no clear alignment between these evaluation approaches. Here we provide a broad summary and outline of the challenging issues involved in defining appropriate intrinsic and extrinsic evaluation methods for NLP research that is to be used for clinical outcomes research, and vice versa. A particular focus is placed on mental health research, an area still relatively understudied by the clinical NLP research community, but where NLP methods are of notable relevance. Recent advances in clinical NLP method development have been significant, but we propose that more emphasis needs to be placed on rigorous evaluation for the field to advance further. To enable this, we provide actionable suggestions, including a minimal protocol that could be used when reporting clinical NLP method development and its evaluation.
Subject(s)
Electronic Health Records , Medical Informatics/methods , Mental Health Services/organization & administration , Natural Language Processing , Semantics , Algorithms , Data Collection/methods , Humans , Medical Informatics/trends , Mental Disorders/therapy , Outcome Assessment (Health Care) , Reproducibility of Results
ABSTRACT
Sentiment lexicons and word embeddings constitute well-established sources of information for sentiment analysis in online social media. Although their effectiveness has been demonstrated in state-of-the-art sentiment analysis and related tasks in the English language, such publicly available resources are much less developed and evaluated for the Greek language. In this paper, we tackle the problems arising when analyzing text in such an under-resourced language. We present and make publicly available a rich set of such resources, ranging from a manually annotated lexicon, to semi-supervised word embedding vectors and annotated datasets for different tasks. Our experiments using different algorithms and parameters on our resources show promising results over standard baselines; on average, we achieve a 24.9% relative improvement in F-score on the cross-domain sentiment analysis task when training the same algorithms with our resources, compared to training them on more traditional feature sources, such as n-grams. Importantly, while our resources were built with the primary focus on the cross-domain sentiment analysis task, they also show promising results in related tasks, such as emotion analysis and sarcasm detection.
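To make the cross-domain set-up concrete, the following is a minimal sketch (not the authors' code): it compares an n-gram baseline against a model enriched with lexicon-derived polarity features, of the kind the resources above provide. The tiny corpus, labels and toy lexicon are invented stand-ins for the Greek datasets; embedding features would be appended in the same way.

```python
# Sketch: cross-domain sentiment with an n-gram baseline vs lexicon features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
import numpy as np

train_texts = ["great phone battery", "awful screen", "lovely camera", "terrible support"]
train_y = [1, 0, 1, 0]                      # source domain (e.g. product reviews)
test_texts = ["lovely hotel staff", "awful room"]
test_y = [1, 0]                             # target domain (e.g. hotel reviews)

lexicon = {"great": 1.0, "lovely": 0.8, "awful": -1.0, "terrible": -0.9}  # toy polarity lexicon

def lexicon_features(texts):
    # Mean polarity of words found in the lexicon; 0 if none are found.
    feats = []
    for t in texts:
        scores = [lexicon[w] for w in t.split() if w in lexicon]
        feats.append([np.mean(scores) if scores else 0.0])
    return np.array(feats)

# Baseline: n-grams only.
vec = TfidfVectorizer(ngram_range=(1, 2))
Xtr, Xte = vec.fit_transform(train_texts), vec.transform(test_texts)
f1_ngram = f1_score(test_y, LogisticRegression().fit(Xtr, train_y).predict(Xte))

# Resource-based: n-grams plus lexicon polarity (embeddings would be stacked similarly).
Xtr2 = np.hstack([Xtr.toarray(), lexicon_features(train_texts)])
Xte2 = np.hstack([Xte.toarray(), lexicon_features(test_texts)])
f1_rich = f1_score(test_y, LogisticRegression().fit(Xtr2, train_y).predict(Xte2))

print(f"relative F1 improvement: {(f1_rich - f1_ngram) / max(f1_ngram, 1e-9):.1%}")
```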
ABSTRACT
Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall; in particular, the full integration of relevant information from the scientific literature remains an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full text parsing, to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research.
Subject(s)
Data Mining , Linguistics , Signal Transduction
ABSTRACT
BACKGROUND: Artificial intelligence (AI) systems for automated chest x-ray interpretation hold promise for standardising reporting and reducing delays in health systems with shortages of trained radiologists. Yet, there are few freely accessible AI systems trained on large datasets for practitioners to use with their own data with a view to accelerating clinical deployment of AI systems in radiology. We aimed to contribute an AI system for comprehensive chest x-ray abnormality detection. METHODS: In this retrospective cohort study, we developed open-source neural networks, X-Raydar and X-Raydar-NLP, for classifying common chest x-ray findings from images and their free-text reports. Our networks were developed using data from six UK hospitals from three National Health Service (NHS) Trusts (University Hospitals Coventry and Warwickshire NHS Trust, University Hospitals Birmingham NHS Foundation Trust, and University Hospitals Leicester NHS Trust) collectively contributing 2 513 546 chest x-ray studies taken from a 13-year period (2006-19), which yielded 1 940 508 usable free-text radiological reports written by the contemporary assessing radiologist (collectively referred to as the "historic reporters") and 1 896 034 frontal images. Chest x-rays were labelled using a taxonomy of 37 findings by a custom-trained natural language processing (NLP) algorithm, X-Raydar-NLP, from the original free-text reports. X-Raydar-NLP was trained on 23 230 manually annotated reports and tested on 4551 reports from all hospitals. 1 694 921 labelled images from the training set and 89 238 from the validation set were then used to train a multi-label image classifier. Our algorithms were evaluated on three retrospective datasets: a set of exams sampled randomly from the full NHS dataset reported during clinical practice and annotated using NLP (n=103 328); a consensus set sampled from all six hospitals annotated by three expert radiologists (two independent annotators for each image and a third consultant to facilitate disagreement resolution) under research conditions (n=1427); and an independent dataset, MIMIC-CXR, consisting of NLP-annotated exams (n=252 374). FINDINGS: X-Raydar achieved a mean AUC of 0·919 (SD 0·039) on the auto-labelled set, 0·864 (0·102) on the consensus set, and 0·842 (0·074) on the MIMIC-CXR test set, demonstrating similar performance to the historic clinical radiologist reporters, as assessed on the consensus set, for multiple clinically important findings, including pneumothorax, parenchymal opacification, and parenchymal mass or nodules. On the consensus set, X-Raydar significantly outperformed the historic reporters' balanced accuracy on 27 of 37 findings, was non-inferior on nine, and inferior on one finding, resulting in an average improvement of 13·3% (SD 13·1) to 0·763 (0·110), including a mean 5·6% (13·2) improvement in critical findings to 0·826 (0·119). INTERPRETATION: Our study shows that automated classification of chest x-rays under a comprehensive taxonomy can achieve performance levels similar to those of the historic reporters and exhibit robust generalisation to external data. The open-sourced neural networks can serve as foundation models for further research and are freely available to the research community. FUNDING: Wellcome Trust.
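As an illustration of the evaluation protocol rather than the published X-Raydar code, the sketch below computes one AUC per finding for a multi-label classifier and summarises them as a mean with SD, mirroring how the results above are reported. The findings list is a small subset of the 37-label taxonomy, and the labels and scores are simulated.

```python
# Sketch: per-finding AUC evaluation for a multi-label chest x-ray classifier.
import numpy as np
from sklearn.metrics import roc_auc_score

findings = ["pneumothorax", "parenchymal opacification", "mass/nodule"]  # subset of 37
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(1000, len(findings)))                  # placeholder NLP labels
y_score = np.clip(y_true + rng.normal(0, 0.8, y_true.shape), 0, 1)       # placeholder model scores

aucs = [roc_auc_score(y_true[:, i], y_score[:, i]) for i in range(len(findings))]
for name, auc in zip(findings, aucs):
    print(f"{name}: AUC {auc:.3f}")
print(f"mean AUC {np.mean(aucs):.3f} (SD {np.std(aucs):.3f})")
```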
Subject(s)
Artificial Intelligence , Computer-Assisted Image Interpretation , Neural Networks (Computer) , Humans , Retrospective Studies , X-Rays
ABSTRACT
OBJECTIVES: The Social media, Smartphone use and Self-Harm (3S-YP) study is a prospective observational cohort study to investigate the mechanisms underpinning associations between social media and smartphone use and self-harm in a clinical youth sample. We present here a comprehensive description of the cohort from baseline data and an overview of data available from baseline and follow-up assessments. METHODS: Young people aged 13-25 years were recruited from a mental health trust in England and followed up for 6 months. Self-report data were collected at baseline and monthly during follow-up and linked with electronic health records (EHR) and user-generated data. FINDINGS: A total of 362 young people enrolled and provided baseline questionnaire data. Most participants had a history of self-harm according to clinical (n = 295, 81.5%) and broader definitions (n = 296, 81.8%). At baseline, there were high levels of current moderate/severe anxiety (n = 244; 67.4%), depression (n = 255; 70.4%) and sleep disturbance (n = 171; 47.2%). Over half used social media and smartphones after midnight on weekdays (n = 197, 54.4%; n = 215, 59.4%) and weekends (n = 241, 66.6%; n = 263, 72.7%), and half met the cut-off for problematic smartphone use (n = 177; 48.9%). Of the cohort, we have questionnaire data at month 6 from 230 (63.5%), EHR data from 345 (95.3%), social media data from 110 (30.4%) and smartphone data from 48 (13.3%). CONCLUSION: The 3S-YP study is the first prospective study to investigate the impact of digital technology on mental health in a clinical youth sample using novel data linkages. Baseline findings indicate self-harm, anxiety, depression, sleep disturbance and digital technology overuse are prevalent among clinical youth. Future analyses will explore associations between outcomes and exposures over time and compare self-report with user-generated data in this cohort.
Subject(s)
Self-Injurious Behavior , Smartphone , Social Media , Humans , Adolescent , Self-Injurious Behavior/epidemiology , Self-Injurious Behavior/psychology , Male , Female , Prospective Studies , Young Adult , Adult , Mental Health Services , Anxiety/epidemiology , Surveys and Questionnaires , Depression/epidemiology , Self Report , England/epidemiology , Cohort Studies
ABSTRACT
MOTIVATION: Scholarly biomedical publications report on the findings of a research investigation. Scientists use a well-established discourse structure to relate their work to the state of the art, express their own motivation and hypotheses and report on their methods, results and conclusions. In previous work, we have proposed ways to explicitly annotate the structure of scientific investigations in scholarly publications. Here we present the means to facilitate automatic access to the scientific discourse of articles by automating the recognition of 11 categories at the sentence level, which we call Core Scientific Concepts (CoreSCs). These include: Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. CoreSCs provide the structure and context to all statements and relations within an article and their automatic recognition can greatly facilitate biomedical information extraction by characterizing the different types of facts, hypotheses and evidence available in a scientific publication. RESULTS: We have trained and compared machine learning classifiers (support vector machines and conditional random fields) on a corpus of 265 full articles in biochemistry and chemistry to automatically recognize CoreSCs. We have evaluated our automatic classifications against a manually annotated gold standard, and have achieved promising accuracies with 'Experiment', 'Background' and 'Model' being the categories with the highest F1-scores (76%, 62% and 53%, respectively). We have analysed the task of CoreSC annotation both from a sentence classification as well as sequence labelling perspective and we present a detailed feature evaluation. The most discriminative features are local sentence features such as unigrams, bigrams and grammatical dependencies while features encoding the document structure, such as section headings, also play an important role for some of the categories. We discuss the usefulness of automatically generated CoreSCs in two biomedical applications as well as work in progress. AVAILABILITY: A web-based tool for the automatic annotation of articles with CoreSCs and corresponding documentation is available online at http://www.sapientaproject.com/software. http://www.sapientaproject.com also contains detailed information pertaining to CoreSC annotation and links to annotation guidelines as well as a corpus of manually annotated articles, which served as our training data. CONTACT: liakata@ebi.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
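A minimal sketch of the sentence-classification view of CoreSC recognition follows, assuming an SVM over n-gram features with the section heading prepended as a crude document-structure feature (the paper's full feature set also includes grammatical dependencies and a sequence-labelling variant). The example sentences and training setup are invented for illustration.

```python
# Sketch: CoreSC recognition cast as sentence classification with an SVM.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Each instance: sentence text with its section heading prepended as a feature.
sentences = [
    "SEC=Methods We incubated the samples at 37 degrees for two hours .",
    "SEC=Results The yield increased threefold under these conditions .",
]
labels = ["Experiment", "Result"]  # 2 of the 11 CoreSC categories

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(sentences, labels)
print(clf.predict(["SEC=Results Activity was highest at pH 7 ."]))
```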
Subject(s)
Artificial Intelligence , Automated Pattern Recognition/methods , Periodicals as Topic/classification , Support Vector Machine , Algorithms , Internet , Software
ABSTRACT
INTRODUCTION: Young people are the most frequent users of social media and smartphones and there has been increasing speculation about the potential negative impacts of their use on mental health. This has coincided with a sharp increase in the levels of self-harm in young people. To date, studies researching this potential association are predominantly cross-sectional and reliant on self-report data, which precludes the ability to objectively analyse behaviour over time. This study is one of the first attempts to explore temporal patterns of real-world usage prior to self-harm, to identify whether there are usage patterns associated with an increased risk. METHODS AND ANALYSIS: To study the mechanisms by which social media and smartphone use underpin self-harm in a clinical sample of young people, the Social media, Smartphone use and Self-harm in Young People (3S-YP) study uses a prospective, observational study design. Up to 600 young people aged 13-25 years old from secondary mental health services will be recruited and followed for up to 6 months. Primary analysis will compare real-world data in the 7 days leading up to a participant- or clinician-recorded self-harm episode, to categorise patterns of problematic usage. Secondary analyses will explore potential mediating effects of anxiety, depression, sleep disturbance, loneliness and bullying. ETHICS AND DISSEMINATION: This study was approved by the National Research Ethics Service, London - Riverside, as well as by the Joint Research and Development Office of the Institute of Psychiatry, Psychology and Neuroscience and South London and Maudsley NHS Foundation Trust (SLaM), and the SLaM Clinical Research Interactive Search (CRIS) Oversight Committee. The findings from this study will be disseminated through peer-reviewed scientific journals, conferences, websites, social media and stakeholder engagement activities. TRIAL REGISTRATION NUMBER: NCT04601220.
Subject(s)
Self-Injurious Behavior , Social Media , Humans , Adolescent , Young Adult , Adult , Smartphone , Prospective Studies , Cross-Sectional Studies , Self-Injurious Behavior/epidemiology , Self-Injurious Behavior/psychology , Observational Studies as Topic
ABSTRACT
Social media usage impacts upon the mental health and wellbeing of young people, yet there is not enough evidence to determine who is affected, how, and to what extent. While it has widened and strengthened communication networks for many, the dangers posed to at-risk youth are serious. Social media data offers unique insights into the minute details of a user's online life. Timely consented access to data could offer many opportunities to transform understanding of its effects on mental wellbeing in different contexts. However, limited data access by researchers is preventing such advances from being made. Our multidisciplinary authorship includes a lived experience adviser, academic and practicing psychiatrists, and academic psychologists, as well as computational, statistical, and qualitative researchers. In this Perspective article, we propose a framework to support secure and confidential access to social media platform data for research to make progress toward better public mental health.
ABSTRACT
BACKGROUND: Many practical tasks in biomedicine require accessing specific types of information in scientific literature; e.g. information about the results or conclusions of the study in question. Several schemes have been developed to characterize such information in scientific journal articles. For example, a simple section-based scheme assigns individual sentences in abstracts under sections such as Objective, Methods, Results and Conclusions. Some schemes of textual information structure have proved useful for biomedical text mining (BIO-TM) tasks (e.g. automatic summarization). However, user-centered evaluation in the context of real-life tasks has been lacking. METHODS: We take three schemes of different type and granularity--those based on section names, Argumentative Zones (AZ) and Core Scientific Concepts (CoreSC)--and evaluate their usefulness for a real-life task which focuses on biomedical abstracts: Cancer Risk Assessment (CRA). We annotate a corpus of CRA abstracts according to each scheme, develop classifiers for automatic identification of the schemes in abstracts, and evaluate both the manual and automatic classifications directly as well as in the context of CRA. RESULTS: Our results show that for each scheme, the majority of categories appear in abstracts, although two of the schemes (AZ and CoreSC) were developed originally for full journal articles. All the schemes can be identified in abstracts relatively reliably using machine learning. Moreover, when cancer risk assessors are presented with scheme annotated abstracts, they find relevant information significantly faster than when presented with unannotated abstracts, even when the annotations are produced using an automatic classifier. Interestingly, in this user-based evaluation the coarse-grained scheme based on section names proved nearly as useful for CRA as the finest-grained CoreSC scheme. CONCLUSIONS: We have shown that existing schemes aimed at capturing information structure of scientific documents can be applied to biomedical abstracts and can be identified in them automatically with an accuracy which is high enough to benefit a real-life task in biomedicine.
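The user-based evaluation above can be illustrated with a paired comparison of search times on annotated versus unannotated abstracts. The sketch below uses a Wilcoxon signed-rank test as one plausible choice of paired test (an assumption for illustration, not necessarily the paper's analysis), and the timing values are invented.

```python
# Sketch: paired comparison of time-to-relevant-information per abstract.
from scipy.stats import wilcoxon

# Seconds to find relevant information in the same abstracts, per condition.
unannotated = [95, 120, 80, 150, 110, 130]
coresc_annotated = [70, 85, 60, 120, 90, 100]

stat, p = wilcoxon(unannotated, coresc_annotated)
print(f"Wilcoxon W={stat}, p={p:.3f}")  # smaller times with annotation
```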
Subject(s)
Artificial Intelligence , Data Mining , Automated Data Processing/methods , Neoplasms , Abstracting and Indexing/classification , Computational Biology/methods , Humans , Risk Assessment
ABSTRACT
Metabolite fingerprinting of Arabidopsis (Arabidopsis thaliana) mutants with known or predicted metabolic lesions was performed by ¹H-nuclear magnetic resonance, Fourier transform infrared, and flow injection electrospray-mass spectrometry. Fingerprinting enabled processing of five times more plants than conventional chromatographic profiling and was competitive for discriminating mutants, other than those affected in only low-abundance metabolites. Despite their rapidity and complexity, fingerprints yielded metabolomic insights (e.g. that effects of single lesions were usually not confined to individual pathways). Among fingerprint techniques, ¹H-nuclear magnetic resonance discriminated the most mutant phenotypes from the wild type and Fourier transform infrared discriminated the fewest. To maximize information from fingerprints, data analysis was crucial. One-third of distinctive phenotypes might have been overlooked had data models been confined to principal component analysis score plots. Among several methods tested, machine learning (ML) algorithms, namely support vector machine or random forest (RF) classifiers, were unsurpassed for phenotype discrimination. Support vector machines were often the best performing classifiers, but RFs yielded some particularly informative measures. First, RFs estimated margins between mutant phenotypes, whose relations could then be visualized by Sammon mapping or hierarchical clustering. Second, RFs provided importance scores for the features within fingerprints that discriminated mutants. These scores correlated with analysis of variance F values (as did Kruskal-Wallis tests, true- and false-positive measures, mutual information, and the Relief feature selection algorithm). ML classifiers, as models trained on one data set to predict another, were ideal for focused metabolomic queries, such as the distinctiveness and consistency of mutant phenotypes. Accessible software for use of ML in plant physiology is highlighted.
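The random forest analysis described above can be sketched as follows: fit an RF to fingerprint features, read off its importance scores, and check their correlation with ANOVA F values. The simulated spectra, group sizes and planted signal are placeholders, not the study's data.

```python
# Sketch: RF feature importances on fingerprint data vs ANOVA F values.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 200))            # 60 plants x 200 fingerprint bins
y = np.repeat([0, 1, 2], 20)              # wild type + two mutant phenotypes
X[y == 1, :5] += 1.5                      # planted signal in a few bins

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
F, _ = f_classif(X, y)                    # ANOVA F value per fingerprint bin
rho, _ = spearmanr(rf.feature_importances_, F)
print(f"top bins by RF importance: {np.argsort(rf.feature_importances_)[::-1][:5]}")
print(f"Spearman rho between RF importance and ANOVA F: {rho:.2f}")
```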
Subject(s)
Arabidopsis/metabolism , Artificial Intelligence , Metabolomics , Algorithms , Cluster Analysis , Magnetic Resonance Spectroscopy , Mass Spectrometry , Phenotype , Principal Component Analysis , Fourier Transform Infrared Spectroscopy
ABSTRACT
Recent work has suggested that disorganised speech might be a powerful predictor of later psychotic illness in clinical high risk subjects. To that end, several automated measures to quantify disorganisation of transcribed speech have been proposed. However, it remains unclear which measures are most strongly associated with psychosis, how different measures are related to each other and what the best strategies are to collect speech data from participants. Here, we assessed whether twelve automated Natural Language Processing markers could differentiate transcribed speech excerpts from subjects at clinical high risk for psychosis, first episode psychosis patients and healthy control subjects (total N = 54). In line with previous work, several measures showed significant differences between groups, including semantic coherence, speech graph connectivity and a measure of whether speech was on-topic, the latter of which outperformed the related measure of tangentiality. Most NLP measures examined were only weakly related to each other, suggesting they provide complementary information. Finally, we compared the ability of transcribed speech generated using different tasks to differentiate the groups. Speech generated from picture descriptions of the Thematic Apperception Test and a story re-telling task outperformed free speech, suggesting that choice of speech generation method may be an important consideration. Overall, quantitative speech markers represent a promising direction for future clinical applications.
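As a sketch of one such automated marker, semantic coherence can be computed as the mean cosine similarity between embeddings of consecutive sentences in a transcript; lower values would indicate more disorganised speech under this marker. The pseudo-embedding below is a placeholder for a real sentence encoder.

```python
# Sketch: semantic coherence as mean consecutive-sentence cosine similarity.
import numpy as np

def embed(sentence: str) -> np.ndarray:
    # Toy pseudo-embedding (stable within a run); swap in a real sentence encoder.
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    v = rng.normal(size=100)
    return v / np.linalg.norm(v)

def semantic_coherence(sentences: list[str]) -> float:
    # Mean cosine similarity of consecutive sentence pairs (unit vectors, so
    # the dot product is the cosine); lower = more disorganised speech.
    sims = [float(embed(a) @ embed(b)) for a, b in zip(sentences, sentences[1:])]
    return float(np.mean(sims))

print(semantic_coherence(["I went to the shop.", "I bought some bread.", "The moon is loud."]))
```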
Subject(s)
Natural Language Processing , Psychotic Disorders , Biomarkers , Cognition , Humans , Psychotic Disorders/diagnosis , Speech
ABSTRACT
In this article we describe our experiences with computational text analysis involving rich social and cultural concepts. We hope to achieve three primary goals. First, we aim to shed light on thorny issues not always at the forefront of discussions about computational text analysis methods. Second, we hope to provide a set of key questions that can guide work in this area. Our guidance is based on our own experiences and is therefore inherently imperfect. Still, given our diversity of disciplinary backgrounds and research practices, we hope to capture a range of ideas and identify commonalities that resonate with many. This leads to our final goal: to help promote interdisciplinary collaborations. Interdisciplinary insights and partnerships are essential for realizing the full potential of any computational text analysis involving social and cultural concepts, and the more we bridge these divides, the more fruitful we believe our work will be.
ABSTRACT
How does scientific research affect the world around us? Being able to answer this question is of great importance in order to appropriately channel efforts and resources in science. The impact of scientists in academia is currently measured by citation-based metrics such as h-index, i-index and citation counts. These academic metrics aim to represent the dissemination of knowledge among scientists rather than the impact of the research on the wider world. In this work we are interested in measuring scientific impact beyond academia, on the economy, society, health and legislation (comprehensive impact). Indeed, scientists are asked to demonstrate evidence of such comprehensive impact by authoring case studies in the context of the Research Excellence Framework (REF). We first investigate the extent to which existing citation-based metrics can be indicative of comprehensive impact. We have collected all REF impact case studies from 2014 and we have linked these to papers in citation networks that we constructed and derived from CiteSeerX, arXiv and PubMed Central using a number of text processing and information retrieval techniques. We have demonstrated that existing citation-based metrics for impact measurement do not correlate well with REF impact results. We also consider metrics of online attention surrounding scientific works, such as those provided by the Altmetric API. We argue that in order to be able to evaluate wider non-academic impact we need to mine information from a much wider set of resources, including social media posts, press releases, news articles and political debates stemming from academic work. We also provide our data as a free and reusable collection for further analysis, including the PubMed citation network and the correspondence between REF case studies, grant applications and the academic literature.
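For concreteness, the sketch below computes one of the citation-based metrics discussed (the h-index) and correlates it with external impact scores using Spearman's rank correlation, the kind of comparison reported above. The citation counts and REF-style scores are hypothetical.

```python
# Sketch: h-index computation and its rank correlation with impact scores.
from scipy.stats import spearmanr

def h_index(citations: list[int]) -> int:
    # Largest h such that h papers have at least h citations each.
    cites = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(cites, start=1) if c >= i)

groups = {"A": [25, 8, 5, 3, 3], "B": [50, 2, 1], "C": [9, 9, 9, 9]}
h = {k: h_index(v) for k, v in groups.items()}
ref_scores = {"A": 2.0, "B": 3.5, "C": 1.0}       # hypothetical REF-style scores
rho, p = spearmanr([h[k] for k in groups], [ref_scores[k] for k in groups])
print(h, f"Spearman rho={rho:.2f}")
```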
Subject(s)
Achievement , Biomedical Research/standards , Journal Impact Factor , Statistical Models , Publishing/statistics & numerical data , Humans , Science , Social Media
ABSTRACT
The number of people affected by mental illness is on the increase, and with it the burden on health and social care services, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients' own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of 'in the moment' daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions.
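The following is a minimal sketch of the classification task, not the study's architecture: a small feed-forward network over TF-IDF features that flags mental illness-related posts, which could be extended with a second classifier over the 11 disorder themes. The posts and labels are invented placeholders.

```python
# Sketch: small neural text classifier for mental illness-related posts.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

posts = [
    "I have been feeling hopeless and can't get out of bed",
    "Looking for tips on sourdough starters",
    "My anxiety spikes whenever I leave the house",
    "Best budget laptops for students?",
]
labels = [1, 0, 1, 0]  # 1 = mental illness-related post

clf = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
clf.fit(posts, labels)
print(clf.predict(["feeling panicky in crowds lately"]))
```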
ABSTRACT
As breaking news unfolds people increasingly rely on social media to stay abreast of the latest updates. The use of social media in such situations comes with the caveat that new information being released piecemeal may encourage rumours, many of which remain unverified long after their point of release. Little is known, however, about the dynamics of the life cycle of a social media rumour. In this paper we present a methodology that has enabled us to collect, identify and annotate a dataset of 330 rumour threads (4,842 tweets) associated with 9 newsworthy events. We analyse this dataset to understand how users spread, support, or deny rumours that are later proven true or false, by distinguishing two levels of status in a rumour life cycle, i.e., before and after its veracity status is resolved. The identification of rumours associated with each event, as well as the tweet that resolved each rumour as true or false, was performed by journalist members of the research team who tracked the events in real time. Our study shows that rumours that are ultimately proven true tend to be resolved faster than those that turn out to be false. Whilst one can readily see users denying rumours once they have been debunked, users appear to be less capable of distinguishing true from false rumours when their veracity remains in question. In fact, we show that the prevalent tendency for users is to support every unverified rumour. We also analyse the role of different types of users, finding that highly reputable users such as news organisations endeavour to post well-grounded statements, which appear to be certain and accompanied by evidence. Nevertheless, these often prove to be unverified pieces of information that give rise to false rumours. Our study reinforces the need for developing robust machine learning techniques that can provide assistance in real time for assessing the veracity of rumours. The findings of our study provide useful insights for achieving this aim.
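One of the analyses above, comparing how quickly true and false rumours are resolved, can be sketched as follows. The thread records are invented; in practice they would come from the released dataset of 330 annotated rumour threads.

```python
# Sketch: median time-to-resolution for rumours later proven true vs false.
import pandas as pd

threads = pd.DataFrame({
    "rumour_id": [1, 2, 3, 4],
    "veracity": ["true", "false", "true", "false"],
    "first_tweet": pd.to_datetime(["2014-08-09 12:00", "2014-08-09 12:10",
                                   "2014-08-09 13:00", "2014-08-09 13:30"]),
    "resolving_tweet": pd.to_datetime(["2014-08-09 13:00", "2014-08-09 18:10",
                                       "2014-08-09 14:30", "2014-08-10 01:30"]),
})
threads["hours_to_resolution"] = (
    threads["resolving_tweet"] - threads["first_tweet"]
).dt.total_seconds() / 3600
print(threads.groupby("veracity")["hours_to_resolution"].median())
```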
Subject(s)
Communication , Social Media , Denial (Psychology) , Social Support
ABSTRACT
Out-of-date or incomplete drug product labeling information may increase the risk of otherwise preventable adverse drug events. In recognition of these concerns, the United States Food and Drug Administration (FDA) requires drug product labels to include specific information. Unfortunately, several studies have found that drug product labeling fails to keep current with the scientific literature. We present a novel approach to addressing this issue. The primary goal of this novel approach is to better meet the information needs of persons who consult the drug product label for information on a drug's efficacy, effectiveness, and safety. Using FDA product label regulations as a guide, the approach links drug claims present in drug information sources available on the Semantic Web with specific product label sections. Here we report on pilot work that establishes the baseline performance characteristics of a proof-of-concept system implementing the novel approach. Claims from three drug information sources were linked to the Clinical Studies, Drug Interactions, and Clinical Pharmacology sections of the labels for drug products that contain one of 29 psychotropic drugs. The resulting Linked Data set maps 409 efficacy/effectiveness study results, 784 drug-drug interactions, and 112 metabolic pathway assertions derived from three clinically-oriented drug information sources (ClinicalTrials.gov, the National Drug File - Reference Terminology, and the Drug Interaction Knowledge Base) to the sections of 1,102 product labels. Proof-of-concept web pages were created for all 1,102 drug product labels that demonstrate one possible approach to presenting information that dynamically enhances drug product labeling. We found that approximately one in five efficacy/effectiveness claims were relevant to the Clinical Studies section of a psychotropic drug product, with most relevant claims providing new information. We also identified several cases where all of the drug-drug interaction claims linked to the Drug Interactions section for a drug were potentially novel. The baseline performance characteristics of the proof-of-concept will enable further technical and user-centered research on robust methods for scaling the approach to the many thousands of product labels currently on the market.
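The linking idea can be sketched in RDF as below, where a drug-drug interaction claim is connected to the Drug Interactions section of a product label. The namespace and property names are hypothetical, not the project's actual schema.

```python
# Sketch: representing a drug claim and its link to a label section in RDF.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/drug-label/")   # hypothetical namespace
g = Graph()
claim = URIRef(EX["claim/ddi-001"])
g.add((claim, EX.claimType, Literal("drug-drug interaction")))
g.add((claim, EX.source, Literal("Drug Interaction Knowledge Base")))
g.add((claim, EX.linkedToSection, EX["label/escitalopram#DrugInteractions"]))

for s, p, o in g:
    print(s, p, o)
```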
ABSTRACT
We describe our approach for creating a system able to detect emotions in suicide notes. Motivated by the sparse and imbalanced data as well as the complex annotation scheme, we have considered three hybrid approaches for distinguishing between the different categories. Each of the three approaches combines machine learning with manually derived rules, where the latter target very sparse emotion categories. The first approach considers the task as single label multi-class classification, where an SVM and a CRF classifier are trained to recognise fifteen different categories and their results are combined. Our second approach trains individual binary classifiers (SVM and CRF) for each of the fifteen sentence categories and returns the union of their predictions as the final result. Finally, our third approach is a combination of binary and multi-class classifiers (SVM and CRF) trained on different subsets of the training data. We considered a number of different feature configurations. All three systems were tested on 300 unseen messages. Our second system had the best performance of the three, yielding an F1 score of 45.6% and a Precision of 60.1%, whereas our best Recall (43.6%) was obtained using the third system.
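A minimal sketch of the second (best-performing) configuration follows: one binary classifier per category, with the union of positive predictions as the final label set. Plain SVMs stand in for the paper's SVM+CRF combination, and the sentences and category subset are invented.

```python
# Sketch: per-category binary classifiers whose positive predictions are unioned.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sentences = ["I am so sorry for the pain I caused", "Please look after the dog",
             "I love you all", "I cannot go on like this"]
# One binary label vector per category (toy subset of the fifteen categories).
labels = {"guilt": [1, 0, 0, 0], "instructions": [0, 1, 0, 0], "love": [0, 0, 1, 0]}

vec = TfidfVectorizer()
X = vec.fit_transform(sentences)
classifiers = {cat: LinearSVC().fit(X, y) for cat, y in labels.items()}

def predict_union(text: str) -> set[str]:
    # Final result: union of the categories whose classifier fires.
    x = vec.transform([text])
    return {cat for cat, clf in classifiers.items() if clf.predict(x)[0] == 1}

print(predict_union("I am sorry, please feed the dog"))
```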
ABSTRACT
The reuse of scientific knowledge obtained from one investigation in another investigation is basic to the advance of science. Scientific investigations should therefore be recorded in ways that promote the reuse of the knowledge they generate. The use of logical formalisms to describe scientific knowledge has potential advantages in facilitating such reuse. Here, we propose a formal framework for using logical formalisms to promote reuse. We demonstrate the utility of this framework by using it in a worked example from biology: demonstrating cycles of investigation formalization [F] and reuse [R] to generate new knowledge. We first used logic to formally describe a Robot scientist investigation into yeast (Saccharomyces cerevisiae) functional genomics [f₁]. With Robot scientists, unlike human scientists, the production of comprehensive metadata about their investigations is a natural by-product of the way they work. We then demonstrated how this formalism enabled the reuse of the research in investigating yeast phenotypes [r₁ = R(f₁)]. This investigation found that the removal of non-essential enzymes generally resulted in enhanced growth. The phenotype investigation was then formally described using the same logical formalism as the functional genomics investigation [f₂ = F(r₁)]. We then demonstrated how this formalism enabled the reuse of the phenotype investigation to investigate yeast systems-biology modelling [r₂ = R(f₂)]. This investigation found that yeast flux-balance analysis models fail to predict the observed changes in growth. Finally, the systems biology investigation was formalized for reuse in future investigations [f₃ = F(r₂)]. These cycles of reuse are a model for the general reuse of scientific knowledge.
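The formalization/reuse cycle can be sketched programmatically: F maps an investigation record to a formal description and R derives a new investigation from one, so that f₁, r₁ = R(f₁), f₂ = F(r₁), and so on. The record structure below is invented for illustration; the paper uses a logical formalism rather than Python objects.

```python
# Sketch: the F (formalize) / R (reuse) cycle over investigation records.
from dataclasses import dataclass

@dataclass
class Investigation:
    name: str
    findings: str

@dataclass
class Formalisation:
    statements: list[str]

def F(inv: Investigation) -> Formalisation:
    # Formalize: encode the investigation's findings as logical statements.
    return Formalisation([f"finding({inv.name!r}, {inv.findings!r})"])

def R(form: Formalisation) -> Investigation:
    # Reuse: derive a follow-up investigation from the formal statements.
    return Investigation(name="follow-up", findings=f"builds on {form.statements[0]}")

f1 = F(Investigation("yeast functional genomics", "gene-function assignments"))
r1 = R(f1)          # phenotype investigation reusing f1
f2 = F(r1)          # formalized in turn, and so on: r2 = R(f2), f3 = F(r2)
print(f2.statements)
```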