Results 1 - 9 of 9
1.
Bioinformatics ; 37(17): 2780-2781, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33515233

ABSTRACT

SUMMARY: Unsupervised machine learning provides tools for researchers to uncover latent patterns in large-scale data, based on calculated distances between observations. Methods to visualize high-dimensional data based on these distances can elucidate subtypes and interactions within multi-dimensional and high-throughput data. However, researchers can select from a vast number of distance metrics and visualizations, each with its own strengths and weaknesses. The Mercator R package facilitates selection of a biologically meaningful distance from 10 metrics, together appropriate for binary, categorical and continuous data, and visualization with 5 standard and high-dimensional graphics tools. Mercator provides a user-friendly pipeline for informaticians or biologists to perform unsupervised analyses, from exploratory pattern recognition to production of publication-quality graphics. AVAILABILITY AND IMPLEMENTATION: Mercator is freely available at the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/Mercator/index.html).

2.
BMC Ophthalmol ; 22(1): 166, 2022 Apr 13.
Article in English | MEDLINE | ID: mdl-35418088

ABSTRACT

BACKGROUND: To examine the potential utility of five multifocal pupillographic objective perimetry (mfPOP) protocols in the assessment of early diabetic retinopathy (DR) and generalised diabetes-related tissue injury in subjects with type 1 diabetes (T1D). METHODS: Twenty-five T1D subjects (age 41.8 ± 12.1 (SD) years, 13 male) with either no DR (n = 13) or non-proliferative DR (n = 12), and 23 age- and gender-matched control subjects (age 39.7 ± 12.9 years, 9 male) were examined by mfPOP using five different stimulus methods differing in visual field eccentricity (central 30° and 60°) and colour (blue, yellow or green test-stimuli presented on, respectively, a blue, yellow or red background), each assessing 44 test-locations per eye. In the T1D subjects, we assessed 16 metabolic status and diabetes complications variables. These were summarised as three principal component analysis (PCA) factors. DR severity was assessed using Early Treatment of Diabetic Retinopathy Study (ETDRS) scores. Area under the curve (AUC) from receiver operating characteristic analyses quantified the diagnostic power of mfPOP response sensitivity and delay deviations for differentiating: (i) T1D subjects from control subjects, (ii) T1D subjects according to three levels of the identified PCA-factors from control subjects, and (iii) T1D subjects with non-proliferative DR from those without. RESULTS: The two largest PCA-factors describing the T1D subjects were associated with metabolic variables (e.g. body mass index, HbA1c), and tissue-injury variables (e.g. serum creatinine, vibration perception). Linear models showed that mfPOP per-region response delays were more strongly associated than sensitivities with the metabolic PCA-factor and ETDRS scores. Combined mfPOP amplitude and delay measures produced AUCs of 90.4 ± 8.9% (mean ± SE) for discriminating T1D subjects with DR from control subjects, and of 85.9 ± 8.8% for discriminating T1D subjects with DR from those without. The yellow and green stimuli performed better than blue on most measures. CONCLUSIONS/INTERPRETATION: In T1D subjects, mfPOP testing was able to identify localised visual field functional abnormalities (retinal/neural reflex) in the absence or presence of mild DR. mfPOP responses were also associated with T1D metabolic status, but less so with early stages of non-ophthalmic diabetes complications.


Subjects
Diabetes Mellitus, Type 1 ; Diabetic Retinopathy ; Adult ; Diabetes Mellitus, Type 1/complications ; Diabetic Retinopathy/diagnosis ; Female ; Humans ; Male ; Middle Aged ; Pupil/physiology ; Visual Field Tests/methods ; Visual Fields
3.
Chem Biodivers ; 19(11): e202200657, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36216587

ABSTRACT

We present a novel model of time-series analysis to learn from electronic health record (EHR) data when infection occurred in the intensive care unit (ICU) by translating methods from proteomics and Bayesian statistics. Using 48,536 patients hospitalized in an ICU, we describe each hospital course as an 'alphabet' of 23 physician actions ('events') in temporal order. We analyze these as k-mers of length 3-12 events and apply a Bayesian model of (cumulative) relative risk (RR). The log2-transformed RR (median=0.248, mean=0.226) supported the conclusion that the events selected were individually associated with increased risk of infection. Selecting from all possible cutoffs of maximum gain (MG), MG>0.0244 predicts administration of antibiotics with PPV 82.0%, NPV 44.4%, and AUC 0.706. Our approach holds value for retrospective analysis of other clinical syndromes for which time-of-onset is critical to analysis but poorly marked in EHRs, including delirium and decompensation.
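
As an illustration of the k-mer idea described in this abstract, the following minimal Python sketch counts contiguous runs of 3 to 12 events in one ordered hospital course and computes a plain log2 relative risk. It is not the authors' Bayesian model: the function names `event_kmers` and `log2_relative_risk`, the event letters, and the counts in the usage line are hypothetical, and the risk calculation is the ordinary unadjusted ratio.

```python
from collections import Counter
from math import log2
from typing import Sequence


def event_kmers(events: Sequence[str], k_min: int = 3, k_max: int = 12) -> Counter:
    """Count every contiguous run of k events, for k_min <= k <= k_max."""
    counts: Counter = Counter()
    for k in range(k_min, k_max + 1):
        # A course shorter than k yields no k-mers (empty range).
        for start in range(len(events) - k + 1):
            counts[tuple(events[start:start + k])] += 1
    return counts


def log2_relative_risk(exposed_cases: int, exposed_total: int,
                       unexposed_cases: int, unexposed_total: int) -> float:
    """Unadjusted log2 relative risk (an assumption, not the Bayesian estimate)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return log2(risk_exposed / risk_unexposed)


# Hypothetical course of four coded physician actions.
kmers = event_kmers(["A", "B", "C", "D"], k_min=3, k_max=4)
```

A positive log2 value means the k-mer is seen more often in courses with the outcome than in courses without it.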


Subjects
Electronic Health Records ; Intensive Care Units ; Humans ; Retrospective Studies ; Bayes Theorem
4.
BMC Bioinformatics ; 22(1): 100, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33648439

ABSTRACT

BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers. RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database. CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.


Subjects
Hematologic Diseases ; Karyotyping ; Neoplasms ; Chromosome Aberrations ; Humans ; Karyotype
5.
J Biomed Inform ; 118: 103788, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33862229

ABSTRACT

INTRODUCTION: Clustering analyses in clinical contexts hold promise to improve the understanding of patient phenotype and disease course in chronic and acute clinical medicine. However, work remains to ensure that solutions are rigorous, valid, and reproducible. In this paper, we evaluate best practices for dissimilarity matrix calculation and clustering on mixed-type, clinical data. METHODS: We simulate clinical data to represent problems in clinical trials, cohort studies, and EHR data, including single-type datasets (binary, continuous, categorical) and 4 data mixtures. We test 5 single distance metrics (Jaccard, Hamming, Gower, Manhattan, Euclidean) and 3 mixed distance metrics (DAISY, Supersom, and Mercator) with 3 clustering algorithms (hierarchical (HC), k-medoids, self-organizing maps (SOM)). We quantitatively and visually validate by Adjusted Rand Index (ARI) and silhouette width (SW). We applied our best methods to two real-world data sets: (1) 21 features collected on 247 patients with chronic lymphocytic leukemia, and (2) 40 features collected on 6000 patients admitted to an intensive care unit. RESULTS: HC outperformed k-medoids and SOM by ARI across data types. DAISY produced the highest mean ARI for mixed data types for all mixtures except unbalanced mixtures dominated by continuous data. Compared to other methods, DAISY with HC uncovered superior, separable clusters in both real-world data sets. DISCUSSION: Selecting an appropriate mixed-type metric allows the investigator to obtain optimal separation of patient clusters and get maximum use of their data. Superior metrics for mixed-type data handle multiple data types using multiple, type-focused distances. Better subclassification of disease opens avenues for targeted treatments, precision medicine, clinical decision support, and improved patient outcomes.
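
The Adjusted Rand Index used for validation in this abstract compares a clustering against reference labels while correcting for chance agreement. The sketch below is a plain-Python rendering of the standard pair-counting formula, not the authors' code; the function name `adjusted_rand_index` and the choice to return 1.0 for degenerate inputs are assumptions made for illustration.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(labels_true: Sequence[Hashable],
                        labels_pred: Sequence[Hashable]) -> float:
    """Adjusted Rand Index from the contingency table of two labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # assumption: no pairs to disagree on

    # Pairs of observations that share a cell, a row, or a column.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2          # agreement of a perfect match
    if maximum == expected:
        return 1.0  # assumption: both labelings are trivial (e.g. one cluster each)
    return (sum_cells - expected) / (maximum - expected)


# Cluster names do not matter, only the grouping does.
same_grouping = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
```

Values near 1 indicate that the clustering recovers the reference groups, values near 0 indicate chance-level agreement, and negative values indicate less agreement than chance.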


Subjects
Leukemia, Lymphocytic, Chronic, B-Cell ; Algorithms ; Cluster Analysis ; Computer Simulation ; Humans
6.
BMC Med Inform Decis Mak ; 21(1): 97, 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33750375

ABSTRACT

BACKGROUND: In the intensive care unit (ICU), delirium is a common, acute, confusional state associated with high risk for short- and long-term morbidity and mortality. Machine learning (ML) has promise to address research priorities and improve delirium outcomes. However, due to clinical and billing conventions, delirium is often inconsistently or incompletely labeled in electronic health record (EHR) datasets. Here, we identify clinical actions abstracted from clinical guidelines in EHR data that indicate risk of delirium among ICU patients. We develop a novel prediction model to label patients with delirium based on a large data set and assess model performance. METHODS: EHR data on 48,451 admissions from 2001 to 2012, available through the Medical Information Mart for Intensive Care-III database (MIMIC-III), were used to identify features to develop our prediction models. Five binary ML classification models (Logistic Regression; Classification and Regression Trees; Random Forests; Naïve Bayes; and Support Vector Machines) were fit and ranked by Area Under the Curve (AUC) scores. We compared our best model with two models previously proposed in the literature for goodness of fit, precision, and through biological validation. RESULTS: Our best performing model with threshold reclassification for predicting delirium was based on a multiple logistic regression using the 31 clinical actions (AUC 0.83). Our model outperformed other proposed models by biological validation on clinically meaningful, delirium-associated outcomes. CONCLUSIONS: Hurdles in identifying accurate labels in large-scale datasets limit clinical applications of ML in delirium. We developed a novel labeling model for delirium in the ICU using a large, public data set. By using guideline-directed clinical actions independent from risk factors, treatments, and outcomes as model predictors, our classifier could be used as a delirium label for future clinically targeted models.
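
Ranking classifiers by AUC, as described in this abstract, rests on one quantity: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes that quantity directly by comparing every positive/negative pair. It is illustrative only; the function name `roc_auc`, the 0/1 label coding, and the example scores are assumptions, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    correctly_ranked = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correctly_ranked += 1.0
            elif p == n:
                correctly_ranked += 0.5
    return correctly_ranked / (len(positives) * len(negatives))


# Hypothetical predicted probabilities from two candidate models on the same cases.
labels = [0, 0, 1, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
ranking = sorted(model_scores, key=lambda name: roc_auc(labels, model_scores[name]), reverse=True)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation of the two classes.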


Subjects
Delirium ; Intensive Care Units ; Bayes Theorem ; Delirium/diagnosis ; Electronic Health Records ; Humans ; Machine Learning
7.
J Am Med Inform Assoc ; 27(7): 1019-1027, 2020 Jul 01.
Article in English | MEDLINE | ID: mdl-32483590

ABSTRACT

OBJECTIVE: Unsupervised machine learning approaches hold promise for large-scale clinical data. However, the heterogeneity of clinical data raises new methodological challenges in feature selection, choosing a distance metric that captures biological meaning, and visualization. We hypothesized that clustering could discover prognostic groups from patients with chronic lymphocytic leukemia, a disease that provides biological validation through well-understood outcomes. METHODS: To address this challenge, we applied k-medoids clustering with 10 distance metrics to 2 experiments ("A" and "B") with mixed clinical features collapsed to binary vectors and visualized with both multidimensional scaling and t-distributed stochastic neighbor embedding. To assess prognostic utility, we performed survival analysis using a Cox proportional hazards model, log-rank test, and Kaplan-Meier curves. RESULTS: In both experiments, survival analysis revealed a statistically significant association between clusters and survival outcomes (A: overall survival, P = .0164; B: time from diagnosis to treatment, P = .0039). Multidimensional scaling separated clusters along a gradient mirroring the order of overall survival. Longer survival was associated with mutated immunoglobulin heavy-chain variable region gene (IGHV) status, absent ZAP70 expression, female sex, and younger age. CONCLUSIONS: This approach to mixed-type data handling and selection of distance metric captured well-understood, binary, prognostic markers in chronic lymphocytic leukemia (sex, IGHV mutation status, ZAP70 expression status) with high fidelity.


Subjects
Immunoglobulin Heavy Chains/genetics ; Leukemia, Lymphocytic, Chronic, B-Cell/mortality ; Mutation ; Unsupervised Machine Learning ; ZAP-70 Protein-Tyrosine Kinase/metabolism ; Adult ; Aged ; Aged, 80 and over ; Female ; Humans ; Kaplan-Meier Estimate ; Leukemia, Lymphocytic, Chronic, B-Cell/immunology ; Leukemia, Lymphocytic, Chronic, B-Cell/metabolism ; Male ; Middle Aged ; Prognosis ; Proportional Hazards Models
8.
Cancer Genet ; 248-249: 34-38, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33059160

ABSTRACT

Karyotyping, the practice of visually examining and recording chromosomal abnormalities, is commonly used to diagnose diseases of genetic origin, including cancers. Karyotypes are recorded as text written in the International System for Human Cytogenetic Nomenclature (ISCN). Downstream analysis of karyotypes is conducted manually, due to the visual nature of analysis and the linguistic structure of the ISCN. The ISCN has not been computer-readable and, as such, prevents the full potential of these genomic data from being realized. In response, we developed CytoGPS, a platform to analyze large volumes of cytogenetic data using a Loss-Gain-Fusion model that converts the human-readable ISCN karyotypes into a machine-readable binary format. As proof of principle, we applied CytoGPS to cytogenetic data from the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer, a National Cancer Institute hosted database of over 69,000 karyotypes of human cancers. Using the Jaccard coefficient to determine similarity between karyotypes structured as binary vectors, we were able to identify novel patterns from 4,968 Mitelman CML karyotypes, such as the co-occurrence of trisomy 19 and 21. The CytoGPS platform unlocks the potential for large-scale, comparative analysis of cytogenetic data. This methodological platform is freely available at CytoGPS.org.
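
The Jaccard coefficient mentioned in this abstract measures overlap between two binary vectors as the number of positions set in both divided by the number of positions set in either. The sketch below shows the calculation on two short, made-up vectors. It does not reproduce the CytoGPS Loss-Gain-Fusion encoding; the function name `jaccard_similarity`, the example vectors, and the convention of returning 1.0 when both vectors are all zeros are assumptions.

```python
from typing import Sequence


def jaccard_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard coefficient of two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    in_both = sum(1 for x, y in zip(a, b) if x and y)
    in_either = sum(1 for x, y in zip(a, b) if x or y)
    if in_either == 0:
        return 1.0  # assumption: two all-zero vectors are treated as identical
    return in_both / in_either


# Hypothetical binary abnormality vectors for two karyotypes.
karyotype_1 = [1, 1, 0, 0, 1]
karyotype_2 = [1, 0, 1, 0, 1]
similarity = jaccard_similarity(karyotype_1, karyotype_2)  # 2 shared / 4 present in either
```

Because shared zeros are ignored, the measure suits sparse binary data in which most positions are absent in most samples.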


Subjects
Algorithms ; Chromosome Aberrations ; Chromosomes, Human ; Databases, Factual ; Karyotyping/methods ; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics ; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology ; Cytogenetic Analysis ; Humans ; Prognosis
9.
Curr Infect Dis Rep ; 21(11): 41, 2019 Oct 19.
Article in English | MEDLINE | ID: mdl-31630276

ABSTRACT

PURPOSE OF REVIEW: Novel technologies, such as high-definition cameras, encryption software, electronic stethoscopes, microfluidic diagnostic systems, and widely available broadband Internet have expanded the potential for telemedicine. This narrative review presents current and future uses of telemedicine in the prevention, diagnosis, treatment, stewardship, and management of infectious disease. RECENT FINDINGS: Beginning in the 1990s, early approaches to telemedicine in infectious disease focused largely on treatment of HIV/AIDS, hepatitis C, and tuberculosis. However, recent innovations allow for targeting of additional diseases and in increasingly remote settings. Telemedicine allows virtual visits between patients in the home and remote providers, permitting outpatient management of complex conditions, such as post-surgical site monitoring, and non-urgent infectious maladies, such as uncomplicated urinary tract infection. Remote provider education by videoconference and integrated clinical decision support tools create avenues to improve inpatient care, including antimicrobial stewardship. Technological strides from miniaturization of diagnostic tests to robotic telepresence physical exams improve access to infectious disease care in isolated and infrastructure-poor environments, from cargo ships to other resource-limited settings. Telemedicine in the field of infectious disease is rapidly expanding in clinical, technological, geographical, and human capacity. Recent innovations narrow gaps in access to care for populations traditionally underserved, stigmatized, isolated by remote geography, or lacking technological infrastructure. Current and future approaches will transform inpatient, outpatient, and remote care.
