ABSTRACT
This paper contributes to the effort to leverage unstructured medical notes for structured clinical decision making. In particular, we present a pipeline for clinical information extraction from medical notes related to preterm birth, and discuss its main challenges as well as its potential for clinical practice. A large collection of medical notes, created by staff during hospitalizations of patients who were at risk of delivering preterm, was gathered and analyzed. Based on an annotated collection of notes, we trained and evaluated information extraction components to discover clinical entities such as symptoms, events, anatomical sites and procedures, as well as attributes linked to these clinical entities. In a retrospective study, we show that these extracted entities and attributes, combined with structured information from electronic health records, are highly informative for clinical decision support models trained to predict whether delivery is likely to occur within specific time windows.
Subjects
Premature Birth; Data Mining; Electronic Health Records; Female; Humans; Infant, Newborn; Pregnancy; Premature Birth/epidemiology; Retrospective Studies
ABSTRACT
BACKGROUND: Headache disorders are an important health burden with a large health-economic impact worldwide. Current treatment and follow-up processes are often archaic, creating opportunities for computer-aided and decision support systems to increase their efficiency. Existing systems are mostly completely data-driven, and their underlying models are black boxes, which undermines interpretability and transparency, both key requirements for deployment in a clinical setting. METHODS: In this paper, a decision support system is proposed, composed of three components: (i) a cross-platform mobile application that captures the data required from patients to formulate a diagnosis, (ii) an automated diagnosis support module that generates an interpretable decision tree, based on data semantically annotated with expert knowledge, to support physicians in formulating the correct diagnosis, and (iii) a web application that lets the physician efficiently interpret captured data and learned insights by means of visualizations. RESULTS: We show that decision tree induction techniques achieve competitive accuracy rates, compared to other black- and white-box techniques, on a publicly available dataset referred to as migbase. Migbase contains aggregated information on headache attacks from 849 patients. Each sample is labeled with one of three possible primary headache disorders. We demonstrate that balancing the dataset using prior expert knowledge reduces the classification error by more than 10%, a statistically significant improvement (p ≤ 0.05). Furthermore, we achieve high accuracy rates with features extracted using the Weisfeiler-Lehman kernel, which is completely unsupervised. This makes it a well-suited approach to a potential cold start problem. CONCLUSION: Decision trees are a well-suited candidate for the automated diagnosis support module. They achieve predictive performance competitive with other techniques on the migbase dataset and are, foremost, completely interpretable. Moreover, the incorporation of prior knowledge increases both the predictive performance and the transparency of the resulting predictive model on the studied dataset.
Subjects
Decision Support Systems, Clinical; Headache Disorders/diagnosis; Decision Trees; Expert Systems; Follow-Up Studies; Humans; Software
ABSTRACT
Information extracted from electrohysterography recordings could prove to be a useful additional source of information for estimating the risk of preterm birth. Recently, a large number of studies have reported near-perfect results in distinguishing between recordings of patients who will deliver at term or preterm using a public resource called the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic due to a methodological flaw. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this biases the results using two artificial datasets, and we reproduce the results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies as related studies, to provide a realistic view of these methodologies' generalization capabilities. We make our research reproducible by providing all the code under an open license.
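The flaw described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes plain random over-sampling (duplicating minority samples) on a synthetic one-feature noise dataset, and all function names are hypothetical. It counts how many test samples also appear verbatim in the training set under each ordering of the two steps.

```python
import random

def make_data(n_major=90, n_minor=10, seed=0):
    """Synthetic noise data (an assumption): one random feature, binary label."""
    rng = random.Random(seed)
    data = [(rng.random(), 0) for _ in range(n_major)]
    data += [(rng.random(), 1) for _ in range(n_minor)]
    return data

def oversample(data, rng):
    """Random over-sampling: duplicate samples of smaller classes until balanced."""
    by_class = {}
    for sample in data:
        by_class.setdefault(sample[1], []).append(sample)
    target = max(len(samples) for samples in by_class.values())
    out = list(data)
    for samples in by_class.values():
        out += [rng.choice(samples) for _ in range(target - len(samples))]
    return out

def split(data, rng, test_frac=0.3):
    """Shuffle and partition into (train, test)."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

def leaked(train, test):
    """Number of test samples that also occur verbatim in the training set."""
    train_set = set(train)
    return sum(1 for sample in test if sample in train_set)

def flawed_pipeline(data, seed=1):
    """Over-sample first, then split: duplicates can land on both sides."""
    rng = random.Random(seed)
    return split(oversample(data, rng), rng)

def correct_pipeline(data, seed=1):
    """Split first, then over-sample the training set only."""
    rng = random.Random(seed)
    train, test = split(data, rng)
    return oversample(train, rng), test

if __name__ == "__main__":
    data = make_data()
    print("leaked test samples (flawed): ", leaked(*flawed_pipeline(data)))
    print("leaked test samples (correct):", leaked(*correct_pipeline(data)))
```

In the flawed ordering, copies of the same minority sample end up in both partitions, so a model can score well on the test set by memorizing training points. Interpolating over-samplers leak in a similar way, because synthetic points are built from samples that later land in the test set.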
Subjects
Premature Birth; Databases, Factual; Female; Humans; Infant, Newborn; Pregnancy
ABSTRACT
Plasticizers and other plastics additives have been extensively used as ingredients of plastics and are, as a result, easily released into the aquatic environment through various physical diffusion processes. In this context, a dedicated method was developed for the simultaneous quantification of 27 known and a virtually unlimited number of unknown alkylphenols, Bisphenol A and phthalates in 2 aquatic matrices, i.e. sea- and freshwater. To this end, a novel instrumental HESI-UHPLC-HRMS (heated electro-spray ionization ultra-high performance liquid chromatographic high resolution mass spectrometric) method was devised for the simultaneous analysis of 7 phenols (i.e. 6 alkylphenols and Bisphenol A) and 20 phthalates within 10 min. Thereafter, a solid-phase extraction protocol was statistically (95% confidence interval, p > 0.05) optimized based on experimental designs. The method was proven fit-for-purpose through a successful validation at environmentally relevant nanomolar concentrations. Analytical precautions were taken to suppress in-house contamination and thereby minimize false-positive results. The method demonstrated an excellent analytical performance across all known plasticizers and plastics additives for sea- and freshwater, revealing good linearity (R² > 0.99, n = 39), stable recoveries (98.5-105.8%), satisfactory repeatability (RSD < 8%, n = 54) and reproducibility (RSD < 10%, n = 36). Subsequently, a novel analytical strategy was devised for the tentative identification of unknown plasticizers and plastics additives using specific in-house determined fragments incorporated in a Python code. The applicability of the analytical platform was demonstrated by measuring 24 seawater samples. Interestingly, 16 out of 27 known plasticizers, plastics additives and primary metabolites could be quantified, while the untargeted analysis uncovered 1042 compounds, of which 5% (n = 46) could be assigned a plasticizer-plastics additive chemical identity, providing evidence for the severe plastic contamination status of our marine environment.
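The abstract does not describe the Python code used for the untargeted step. As a rough sketch of the general idea only, fragment-based screening can be expressed as matching observed fragment m/z values against a list of diagnostic fragments within a mass tolerance in parts per million. The fragment table, the 5 ppm tolerance and all names below are assumptions for illustration; the single entry is the commonly reported protonated phthalic anhydride ion (C8H5O3+, m/z 149.0233), not one of the study's in-house fragments.

```python
# Hypothetical diagnostic fragment table (assumption, not from the study).
DIAGNOSTIC_FRAGMENTS = {
    "phthalate (C8H5O3+)": 149.0233,
}

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def match_fragments(observed_mzs, fragments=DIAGNOSTIC_FRAGMENTS, tol_ppm=5.0):
    """Return (observed m/z, fragment name) pairs within the ppm tolerance."""
    hits = []
    for mz in observed_mzs:
        for name, reference in fragments.items():
            if abs(ppm_error(mz, reference)) <= tol_ppm:
                hits.append((mz, name))
    return hits

if __name__ == "__main__":
    # Two made-up observed fragment masses, for illustration only.
    print(match_fragments([149.0235, 150.0000]))
```

A compound whose fragmentation spectrum contains one or more such hits would only be a tentative candidate; confirming an identity would still require a reference standard.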
ABSTRACT
Bone age is an essential measure of skeletal maturity in children with growth disorders. It is typically assessed by a trained physician using radiographs of the hand and a reference model. However, the reference models are reported to leave room for interpretation, leading to large inter-observer and intra-observer variation. In this work, we explore a novel method for automated bone age assessment to assist physicians with their estimation. It combines deep learning with Gaussian process regression. With this combination, the sensitivity of the deep learning model to rotations and flips of the input images can be exploited to increase overall predictive performance compared to using the deep learning network alone. We validate our approach retrospectively on a set of 12611 radiographs of patients between 0 and 19 years of age.
Subjects
Age Determination by Skeleton; Deep Learning; Hand Bones/diagnostic imaging; Image Interpretation, Computer-Assisted; Adolescent; Child; Child, Preschool; Female; Humans; Infant; Infant, Newborn; Normal Distribution; Observer Variation; Radiography; Retrospective Studies; Young Adult
ABSTRACT
Physiological signals have been shown to be reliable indicators of stress in laboratory studies, yet large-scale ambulatory validation is lacking. We present a large-scale cross-sectional study for ambulatory stress detection in 1002 subjects, comprising the subjects' demographics, baseline psychological information, and five consecutive days of free-living physiological and contextual measurements collected through wearable devices and smartphones. This dataset represents a healthy population and shows associations between wearable physiological signals and self-reported daily-life stress. Using a data-driven approach, we identified digital phenotypes characterized by self-reported poor health indicators and high depression, anxiety and stress scores that are associated with blunted physiological responses to stress. These results emphasize the need for large-scale collections of multi-sensor data to build personalized stress models for precision medicine.