ABSTRACT
Opal is the first published example of a full-stack platform infrastructure for implementation science designed for machine learning (ML) in anesthesia, and it addresses the problem of leveraging ML for clinical decision support. Users interact with a secure online Opal web application to select a desired operating room (OR) case cohort for data extraction, visualize datasets with built-in graphing techniques, and run in-client ML or extract data for external use. Opal was used to obtain data from 29,004 unique OR cases from a single academic institution for pre-operative prediction of post-operative acute kidney injury (AKI) based on creatinine KDIGO criteria, using predictors that included pre-operative demographics, past medical history, medications, and flowsheet information. To demonstrate utility with unsupervised learning, Opal was also used to extract intra-operative flowsheet data from 2995 unique OR cases, and patients were clustered using principal component analysis (PCA) and k-means clustering. A gradient boosting machine model was developed using an 80/20 train-to-test ratio and yielded an area under the receiver operating characteristic curve (ROC-AUC) of 0.85 with 95% CI [0.80-0.90]. At the default probability decision threshold of 0.5, the model sensitivity was 0.9 and the specificity was 0.8. K-means clustering was performed to partition the cases into two clusters and to generate hypotheses about potential groups of outcomes related to intraoperative vitals. Opal's design has created streamlined ML functionality for researchers and clinicians in the perioperative setting and opens the door for many future clinical applications, including data mining, clinical simulation, high-frequency prediction, and quality improvement.
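The evaluation metrics reported above (ROC-AUC, and sensitivity and specificity at a 0.5 decision threshold) can be illustrated with a minimal standard-library sketch. This is not Opal's code: the function names and the toy labels and scores are hypothetical, and the AUC is computed with the pairwise-ranking definition rather than any particular library routine.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Call a case positive when score >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical held-out labels (1 = AKI) and predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

auc = roc_auc(toy_labels, toy_scores)
sens, spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"ROC-AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Moving the threshold away from 0.5 trades sensitivity against specificity, while the ROC-AUC stays the same because it depends only on how the scores are ranked.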
Subjects
Anesthesia, Clinical Decision Support Systems, Creatinine, Humans, Implementation Science, Machine Learning
ABSTRACT
Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10^-6) and DHODH (p-value: 7.1 × 10^-6) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10^-5). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10^-5), as were expression of ACAP1 (p-value: 1.9 × 10^-5) and LRRC25 (p-value: 5.2 × 10^-5). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.
Subjects
Breast Neoplasms/genetics, Carrier Proteins/genetics, GTPase-Activating Proteins/genetics, Genetic Predisposition to Disease, Membrane Proteins/genetics, Quantitative Trait Loci/genetics, Breast/metabolism, Breast/pathology, Breast Neoplasms/blood, Breast Neoplasms/pathology, Carrier Proteins/blood, Endonucleases/blood, Endonucleases/genetics, Ethnicity, Female, GTPase-Activating Proteins/blood, Genome-Wide Association Study, Humans, Membrane Proteins/blood, Single Nucleotide Polymorphism, Risk Factors, Transcriptome/genetics
ABSTRACT
Purpose To develop and validate a deep learning algorithm that predicts the final diagnosis of Alzheimer disease (AD), mild cognitive impairment, or neither at fluorine 18 (18F) fluorodeoxyglucose (FDG) PET of the brain and compare its performance to that of radiologic readers. Materials and Methods Prospective 18F-FDG PET brain images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (2109 imaging studies from 2005 to 2017, 1002 patients) and retrospective independent test set (40 imaging studies from 2006 to 2016, 40 patients) were collected. Final clinical diagnosis at follow-up was recorded. Convolutional neural network of InceptionV3 architecture was trained on 90% of ADNI data set and tested on the remaining 10%, as well as the independent test set, with performance compared to radiologic readers. Model was analyzed with sensitivity, specificity, receiver operating characteristic (ROC), saliency map, and t-distributed stochastic neighbor embedding. Results The algorithm achieved area under the ROC curve of 0.98 (95% confidence interval: 0.94, 1.00) when evaluated on predicting the final clinical diagnosis of AD in the independent test set (82% specificity at 100% sensitivity), an average of 75.8 months prior to the final diagnosis, which in ROC space outperformed reader performance (57% [four of seven] sensitivity, 91% [30 of 33] specificity; P < .05). Saliency map demonstrated attention to known areas of interest but with focus on the entire brain. Conclusion By using fluorine 18 fluorodeoxyglucose PET of the brain, a deep learning algorithm developed for early prediction of Alzheimer disease achieved 82% specificity at 100% sensitivity, an average of 75.8 months prior to the final diagnosis. © RSNA, 2018 Online supplemental material is available for this article. See also the editorial by Larvie in this issue.
Subjects
Alzheimer Disease/diagnostic imaging, Deep Learning, Computer-Assisted Image Interpretation/methods, Positron-Emission Tomography/methods, Aged, Aged 80 and over, Algorithms, Cognitive Dysfunction/diagnostic imaging, Female, Fluorodeoxyglucose F18/therapeutic use, Humans, Male, Middle Aged, Retrospective Studies, Sensitivity and Specificity
ABSTRACT
Applying state-of-the-art machine learning techniques to medical images requires thorough selection and normalization of input data. One such step in digital mammography screening for breast cancer is the labeling and removal of special diagnostic views, in which diagnostic tools or magnification are applied to assist in the assessment of suspicious initial findings. Because a common task in medical informatics is the prediction of disease and its stage, these special diagnostic views, which are enriched only among the cohort of diseased cases, will bias machine learning disease predictions. To automate this process, we develop a machine learning pipeline that uses both DICOM headers and images to predict such views automatically, allowing for their removal and the generation of unbiased datasets. We achieve an AUC of 99.72% in predicting special mammogram views when combining both types of models. Finally, we apply these models to clean up a dataset of about 772,000 images with an expected sensitivity of 99.0%. The pipeline presented in this paper can be applied to other datasets to obtain high-quality image sets suitable for training disease detection algorithms.
Subjects
Breast Neoplasms/diagnostic imaging, Machine Learning, Mammography/classification, Mammography/methods, Automation, Datasets as Topic, Female, Humans, Radiology Information Systems, Sensitivity and Specificity
ABSTRACT
Breast cancer is a leading cause of cancer death among women in the USA. Screening mammography is effective in reducing mortality, but has a high rate of unnecessary recalls and biopsies. While deep learning can be applied to mammography, large-scale labeled datasets, which are difficult to obtain, are required. We aim to remove many barriers of dataset development by automatically harvesting data from existing clinical records using a hybrid framework combining traditional NLP and IBM Watson. An expert reviewer manually annotated 3521 breast pathology reports with one of four outcomes: left positive, right positive, bilateral positive, negative. Traditional NLP techniques using seven different machine learning classifiers were compared to IBM Watson's automated natural language classifier. Techniques were evaluated using precision, recall, and F-measure. Logistic regression outperformed all other traditional machine learning classifiers and was used for subsequent comparisons. Both traditional NLP and Watson's NLC performed well for cases under 1024 characters with weighted average F-measures above 0.96 across all classes. Performance of traditional NLP was lower for cases over 1024 characters with an F-measure of 0.83. We demonstrate a hybrid framework using traditional NLP techniques combined with IBM Watson to annotate over 10,000 breast pathology reports for development of a large-scale database to be used for deep learning in mammography. Our work shows that traditional NLP and IBM Watson perform extremely well for cases under 1024 characters and can accelerate the rate of data annotation.
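The evaluation measures named above (precision, recall, F-measure, and a weighted average F-measure across classes) can be sketched as follows. This is an illustrative standard-library example, not the study's code: the function names and the toy report labels are hypothetical, and the weighting by class support is an assumption about how the weighted average was formed.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def per_class_prf(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, F-measure)} for every label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (
            2 * precision * recall / (precision + recall) if precision + recall else 0.0
        )
        scores[label] = (precision, recall, f_measure)
    return scores


def weighted_f_measure(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Average the per-class F-measures, weighting each class by its true count."""
    support = Counter(y_true)
    scores = per_class_prf(y_true, y_pred)
    return sum(scores[label][2] * n for label, n in support.items()) / len(y_true)


# Hypothetical annotations for eight pathology reports (three of the four outcomes).
true_labels = ["negative"] * 4 + ["left positive"] * 2 + ["right positive"] * 2
pred_labels = [
    "negative", "negative", "negative", "left positive",
    "left positive", "left positive", "right positive", "negative",
]

class_scores = per_class_prf(true_labels, pred_labels)
overall = weighted_f_measure(true_labels, pred_labels)
print(class_scores)
print(f"weighted F-measure = {overall:.4f}")
```

Weighting by support keeps a rare class, such as bilateral positive, from dominating the summary, at the cost of hiding poor performance on that class, so the per-class scores are worth reporting alongside the average.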
Subjects
Breast Neoplasms/diagnostic imaging, Deep Learning/statistics & numerical data, Electronic Health Records/statistics & numerical data, Computer-Assisted Image Interpretation/methods, Mammography/methods, Breast/diagnostic imaging, Factual Databases, Female, Humans, Middle Aged
ABSTRACT
BACKGROUND: Ehlers-Danlos syndrome (EDS) is a rare inherited connective tissue disorder that primarily affects skin, joints, muscle, and blood cells. The current study aimed to find the mutation causing EDS type VII C, also known as "dermatosparaxis," in one family. METHODS: Through systematic querying of the electronic medical records (EMRs) of over 80,000 individuals, we recently identified an EDS family with a pattern indicating autosomal dominant inheritance. The family consented to genomic analysis of their de-identified data. After a negative screen for known mutations, we performed whole genome sequencing on the male proband, his affected father, and his unaffected mother. We filtered the list of non-synonymous variants for those shared by the affected individuals. RESULTS: The analysis of non-synonymous variants led to the identification of a novel mutation (p.Gly421Ser) in the ADAMTSL2 gene in the affected individuals. Sanger sequencing confirmed the mutation. CONCLUSION: Our work is significant not only because it sheds new light on the pathophysiology of EDS for the affected family and the field at large, but also because it demonstrates the utility of unbiased large-scale clinical recruitment in deciphering the genetic etiology of rare Mendelian diseases. With unbiased large-scale clinical recruitment we strive to sequence as many rare Mendelian diseases as possible, and this work in EDS serves as a successful proof of concept to that effect.
Subjects
ADAM Proteins/genetics, Data Mining/methods, Genetic Databases, Ehlers-Danlos Syndrome/genetics, Genetic Variation/genetics, ADAMTS Proteins, Child, Ehlers-Danlos Syndrome/diagnosis, Female, Humans, Male, Pedigree
ABSTRACT
We examined the burden of large, rare, copy-number variants (CNVs) in 192 individuals with renal hypodysplasia (RHD) and replicated findings in 330 RHD cases from two independent cohorts. CNV distribution was significantly skewed toward larger gene-disrupting events in RHD cases compared to 4,733 ethnicity-matched controls (p = 4.8 × 10^-11). This excess was attributable to known and novel (i.e., not present in any database or in the literature) genomic disorders. Altogether, 55/522 (10.5%) RHD cases harbored 34 distinct known genomic disorders, which were detected in only 0.2% of 13,839 population controls (p = 1.2 × 10^-58). Another 32 (6.1%) RHD cases harbored large gene-disrupting CNVs that were absent from or extremely rare in the 13,839 population controls, identifying 38 potential novel or rare genomic disorders for this trait. Deletions at the HNF1B locus and the DiGeorge/velocardiofacial locus were most frequent. However, the majority of disorders were detected in a single individual. Genomic disorders were detected in 22.5% of individuals with multiple malformations and 14.5% of individuals with isolated urinary-tract defects; 14 individuals harbored two or more diagnostic or rare CNVs. Strikingly, the majority of the known CNV disorders detected in the RHD cohort have previous associations with developmental delay or neuropsychiatric diseases. Up to 16.6% of individuals with kidney malformations had a molecular diagnosis attributable to a copy-number disorder, suggesting kidney malformations as a sentinel manifestation of pathogenic genomic imbalances. A search for pathogenic CNVs should be considered in this population for the diagnosis of their specific genomic disorders and for the evaluation of the potential for developmental delay.
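The proportions quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the "up to 16.6%" figure is the sum of the cases with known genomic disorders (55) and the cases with novel or rare gene-disrupting CNVs (32); the abstract does not state this explicitly, so treat that reading as an inference. The variable names are illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


discovery_cases = 192      # initial RHD cohort
replication_cases = 330    # two independent replication cohorts
rhd_cases = discovery_cases + replication_cases  # 522 in total

known_disorder_cases = 55  # cases with a known genomic disorder
novel_cnv_cases = 32       # cases with a novel or rare gene-disrupting CNV

known_pct = percent(known_disorder_cases, rhd_cases)
novel_pct = percent(novel_cnv_cases, rhd_cases)
# Assumed reading: the combined diagnostic yield is the sum of both groups.
combined_pct = percent(known_disorder_cases + novel_cnv_cases, rhd_cases)

print(f"known: {known_pct:.1f}%  novel/rare: {novel_pct:.1f}%  combined: {combined_pct:.2f}%")
```

Under that assumption the combined yield is 87 of 522 cases, exactly one in six, which matches the reported 16.6% up to rounding.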
Subjects
DNA Copy Number Variations, Kidney Diseases/congenital, Kidney Diseases/genetics, Case-Control Studies, Chromosome Aberrations, Genetic Association Studies, Genotype, Humans, Molecular Sequence Annotation
ABSTRACT
OBJECTIVES: Tetralogy of Fallot (TOF) is the most common cyanotic congenital heart defect in the United States. We aimed to identify genetic variations associated with TOF using meta-analysis of publicly available digital samples to spotlight targets for prevention, screening, and treatment strategies. METHODS: We used the Search Tag Analyze Resource for Gene Expression Omnibus (STARGEO) platform to identify 39 TOF and 19 non-TOF right ventricle tissue samples from microarray data and identified upregulated and downregulated genes. Associated gene expression data were analyzed using ingenuity pathway analysis and restricted to genes with a statistically significant (p < .05) difference and an absolute experimental log ratio >0.1 between disease and control samples. RESULTS: Our analysis identified 220 genes whose expression profiles were significantly altered in TOF vs. non-TOF samples. The most striking differences identified in gene expression included genes FBXO32, PTGES, MYL12a, and NR2F2. Some top associated canonical pathways included adrenergic signaling, estrogen receptor signaling, and the role of NFAT in cardiac hypertrophy. In general, genes involved in adaptive, defensive, and reparative cardiovascular responses showed altered expression in TOF vs. non-TOF samples. CONCLUSIONS: We introduced the interpretation of open "big data" using the STARGEO platform to define robust genomic signatures of congenital heart disease pathology of TOF. Overall, our meta-analysis results indicated increased metabolism, inflammation, and altered gene expression in TOF patients. Estrogen receptor signaling and the role of NFAT in cardiac hypertrophy represent unique pathways upregulated in TOF patients and are potential targets for future pharmacologic treatments.
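The gene filter described in the methods (p < .05 and an absolute log ratio above 0.1) can be sketched as below. This is not the STARGEO or pathway-analysis code: the `GeneResult` record, the function names, and the placeholder gene identifiers and values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class GeneResult:
    gene: str         # placeholder identifier, not a real gene symbol
    log_ratio: float  # disease-vs-control log ratio (positive = upregulated)
    p_value: float


def filter_significant(
    results: Sequence[GeneResult],
    p_cutoff: float = 0.05,
    min_abs_log_ratio: float = 0.1,
) -> List[GeneResult]:
    """Keep genes that pass both the p-value and the effect-size threshold."""
    return [
        r for r in results
        if r.p_value < p_cutoff and abs(r.log_ratio) > min_abs_log_ratio
    ]


def split_by_direction(results: Sequence[GeneResult]) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene names."""
    up = [r.gene for r in results if r.log_ratio > 0]
    down = [r.gene for r in results if r.log_ratio < 0]
    return up, down


# Hypothetical meta-analysis output for four genes.
toy_results = [
    GeneResult("GENE_A", 0.42, 0.001),   # passes both thresholds, upregulated
    GeneResult("GENE_B", -0.35, 0.020),  # passes both thresholds, downregulated
    GeneResult("GENE_C", 0.05, 0.010),   # significant but effect too small
    GeneResult("GENE_D", -0.80, 0.300),  # large effect but not significant
]

kept = filter_significant(toy_results)
upregulated, downregulated = split_by_direction(kept)
print(upregulated, downregulated)
```

Requiring both conditions excludes genes whose difference is statistically detectable but negligible in size, as well as large but noisy differences.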
Subjects
Congenital Heart Defects, Tetralogy of Fallot, Humans, United States, Tetralogy of Fallot/genetics, Cardiomegaly, Estrogen Receptors/genetics, Gene Expression
ABSTRACT
We executed a genome-wide association scan for age-related macular degeneration (AMD) in 2,157 cases and 1,150 controls. Our results validate AMD susceptibility loci near CFH (P < 10^-75), ARMS2 (P < 10^-59), C2/CFB (P < 10^-20), C3 (P < 10^-9), and CFI (P < 10^-6). We compared our top findings with the Tufts/Massachusetts General Hospital genome-wide association study of advanced AMD (821 cases, 1,709 controls) and genotyped 30 promising markers in additional individuals (up to 7,749 cases and 4,625 controls). With these data, we identified a susceptibility locus near TIMP3 (overall P = 1.1 × 10^-11), a metalloproteinase involved in degradation of the extracellular matrix and previously implicated in early-onset maculopathy. In addition, our data revealed strong association signals with alleles at two loci (LIPC, P = 1.3 × 10^-7; CETP, P = 7.4 × 10^-7) that were previously associated with high-density lipoprotein cholesterol (HDL-c) levels in blood. Consistent with the hypothesis that HDL metabolism is associated with AMD pathogenesis, we also observed association with AMD of HDL-c-associated alleles near LPL (P = 3.0 × 10^-3) and ABCA1 (P = 5.6 × 10^-4). Multilocus analysis including all susceptibility loci showed that 329 of 331 individuals (99%) with the highest-risk genotypes were cases, and 85% of these had advanced AMD. Our studies extend the catalog of AMD-associated loci, help identify individuals at high risk of disease, and provide clues about underlying cellular pathways that should eventually lead to new therapies.
Subjects
Genetic Predisposition to Disease, HDL Lipoproteins/metabolism, Macular Degeneration/genetics, Tissue Inhibitor of Metalloproteinase-3/genetics, Alleles, Case-Control Studies, Chromosome Mapping, Complement Factor I/genetics, Genetic Variation, Genome-Wide Association Study, Genotype, Humans, Single Nucleotide Polymorphism, Regression Analysis, Risk, Tissue Inhibitor of Metalloproteinase-3/physiology
ABSTRACT
STUDY DESIGN: A retrospective study at a single academic institution. OBJECTIVE: The purpose of this study is to use machine learning to predict hospital length of stay (LOS) and discharge disposition following adult elective spine surgery, and to compare performance metrics of machine learning models to the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) prediction calculator. SUMMARY OF BACKGROUND DATA: Data on a total of 3678 adult patients undergoing elective spine surgery between 2014 and 2019 were acquired from the electronic health record. METHODS: Patients were divided into three stratified cohorts: cervical degenerative, lumbar degenerative, and adult spinal deformity groups. Predictive variables included demographics, body mass index, surgical region, surgical invasiveness, surgical approach, and comorbidities. Regression, classification trees, and least absolute shrinkage and selection operator (LASSO) were used to build predictive models. Validation of the models was conducted on 16% of patients (N = 587), using area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and correlation. Patient data were manually entered into the ACS NSQIP online risk calculator to compare performance. Outcome variables were discharge disposition (home vs. rehabilitation) and LOS (days). RESULTS: Of 3678 patients analyzed, 51.4% were male (n = 1890) and 48.6% were female (n = 1788). The average LOS was 3.66 days. In all, 78% were discharged home and 22% were discharged to rehabilitation. Compared with NSQIP (Pearson R² = 0.16), the predictions of the Poisson regression (R² = 0.29) and LASSO (R² = 0.29) models were significantly more correlated with observed LOS (P = 0.025 and 0.004, respectively). Of the models generated to predict discharge location, logistic regression yielded an AUROC of 0.79, which was statistically equivalent to the AUROC of 0.75 for NSQIP (P = 0.135). CONCLUSION: The predictive models developed in this study can enable accurate preoperative estimation of LOS and risk of rehabilitation discharge for adult patients undergoing elective spine surgery. The demonstrated models exhibited better performance than NSQIP for prediction of LOS and equivalent performance to NSQIP for prediction of discharge location.
Subjects
Postoperative Complications, Quality Improvement, Adult, United States, Humans, Retrospective Studies, Postoperative Complications/surgery, Elective Surgical Procedures, Spine/surgery, Length of Stay, Risk Assessment
ABSTRACT
The etiology of monoclonal gammopathy of undetermined significance (MGUS) and multiple myeloma (MM) is still obscure, as are the processes that enable the progression of MGUS to MM. Understanding the unique vs. shared transcriptomes can potentially elucidate why individuals develop one or the other. Furthermore, highlighting key pathways and genes involved in the pathogenesis of MM or the progression of MGUS to MM may allow the discovery of novel drug targets and therapies. We employed the STARGEO platform to perform three separate meta-analyses comparing MGUS and MM transcriptomes. For these analyses we tagged (1) 101 MGUS patient plasma cells from bone marrow samples and 64 plasma cells from healthy controls; (2) 383 MM patient CD138+ cells from bone marrow, with the 101 MGUS samples from the first analysis as controls; and (3) 517 MM patient peripheral blood samples and 97 peripheral blood samples from healthy controls. We then used Ingenuity Pathway Analysis (IPA) to analyze the unique genomic signatures within and across these samples. Our study identified genes that may have unique roles in MGUS (GADD45RA and COMMD3), as well as newly identified signaling pathways (EIF2, JAK/STAT, and MYC) and gene activity (NRG3, RBFOX2, and PARP15) in MGUS that have previously been shown to be involved in MM, suggesting a spectrum of molecular overlap. On the other hand, genes such as DUSP4, RN14, and LAMP5, which are differentially upregulated in MM, may be seen as tipping the scales from benignity to malignancy and could serve as drug targets or novel biomarkers for risk of progression. Furthermore, our analysis of MM identified newly associated gene/pathway activity such as inhibition of Wnt signaling and defective B cell development. Finally, IPA suggests that the multifactorial, oncogenic qualities of IFNγ signaling in MM may be a unifying pathway for these diverse mechanisms, which prompts the need for further studies.
ABSTRACT
PURPOSE: To investigate whether there is an association between known age-related macular degeneration genetic risk variants in the CFH, ARMS2, and HTRA1 genes and response to anti-vascular endothelial growth factor (VEGF) (ranibizumab or bevacizumab) treatment for wet age-related macular degeneration. METHODS: A retrospective review of 150 patients with documented wet age-related macular degeneration based on clinical examination and fluorescein angiogram was performed. Patients received anti-VEGF therapy with ranibizumab and/or bevacizumab. Patients were genotyped for the single-nucleotide polymorphisms rs1061170, rs10490924, rs3750848, rs3793917, rs11200638, and rs932275 and for the indel del443ins54 spanning the CFH, ARMS2, and HTRA1 genes. RESULTS: There were 57 patients who were characterized as negative responders to anti-VEGF therapy, and 93 patients who were characterized as positive responders. There was no significant difference in mean baseline visual acuity between the groups. Negative responders were followed for a mean duration of 24.0 months, while positive responders were followed for a mean duration of 22.0 months. Although the frequency of the at-risk alleles was higher in the positive responders than in the negative responders, this difference did not reach statistical significance. Additionally, there was no significant association between genotype and the number of injections or absolute change in visual acuity in either group of responders. CONCLUSION: In our patient cohort, there was no statistically significant association between response to anti-VEGF therapy and genotype in either the positive-responder or the negative-responder group. Larger studies with more power are necessary to further determine whether a pharmacogenetic association exists between wet age-related macular degeneration and anti-VEGF therapy.
Subjects
Angiogenesis Inhibitors/administration & dosage, Humanized Monoclonal Antibodies/administration & dosage, Macular Degeneration/drug therapy, Macular Degeneration/genetics, Vascular Endothelial Growth Factor A/antagonists & inhibitors, Aged, Aged 80 and over, Bevacizumab, Complement Factor H/genetics, Drug Administration Schedule, High-Temperature Requirement A Serine Peptidase 1, Humans, Macular Degeneration/physiopathology, Single Nucleotide Polymorphism/genetics, Proteins/genetics, Ranibizumab, Retrospective Studies, Risk Factors, Serine Endopeptidases/genetics, Treatment Outcome, Visual Acuity/physiology
ABSTRACT
The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high-density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprising 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11-q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3 × 10^-5). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3 × 10^-4). It is notable, however, that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3 × 10^-39), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
Subjects
Autistic Disorder/genetics, Exons, Gene Dosage, Genetic Predisposition to Disease, Genome-Wide Association Study, Adolescent, Calcium-Binding Proteins, Case-Control Studies, Neuronal Cell Adhesion Molecules, Child, Preschool Child, Cohort Studies, Female, Gene Duplication, Humans, Male, Nerve Tissue Proteins/genetics, Neural Cell Adhesion Molecules, Pedigree, Sequence Deletion, Ubiquitin-Protein Ligases/genetics, Young Adult
ABSTRACT
In healthcare, artificial intelligence (AI) technologies have the potential to create significant value by improving time-sensitive outcomes while lowering error rates for each patient. Diagnostic images, clinical notes, and reports are increasingly generated and stored in electronic medical records. These heterogeneous data are inherently complex and pose challenges for analytics and reuse, necessitating novel ways to store, manage, process, and reuse big data. This presents an urgent need to develop new, scalable, and expandable AI infrastructure and analytical methods that can enable healthcare providers to access knowledge for individual patients, yielding better decisions and outcomes. In this review article, we briefly discuss the nature of data in breast cancer studies and the role of AI in generating "smart data" that offer actionable information supporting better decisions in personalized medicine for individual patients. In our view, the biggest challenge is to create a system that makes data robust and smart for healthcare providers and patients, leading to more effective clinical decision-making, improved health outcomes, and, ultimately, better management of healthcare outcomes and costs. We highlight some of the challenges in using breast cancer data and propose the need for an AI-driven environment to address them. We illustrate our vision with practical use cases and discuss a path for empowering the study of breast cancer databases with the application of AI, along with future directions.
Subjects
Artificial Intelligence, Breast Neoplasms, Breast Neoplasms/diagnosis, Breast Neoplasms/therapy, Delivery of Health Care, Female, Humans, Psychological Power, Precision Medicine
ABSTRACT
The study of precision medicine that measures the effects of social, cultural, and environmental influences on health is essential to improve health outcomes. Race is a social concept used historically to divide, track, and control populations and to reinforce social hierarchies. Beyond genetics, race is also a surrogate for other socioeconomic factors affecting patient outcomes. Our data analytics study analyzes the Electronic Medical Record (EMR) to compare the diagnosis and treatment of Coronary Artery Disease (CAD) across patients of different races. We found no racial discrepancies at the University of California San Francisco Medical Centers. This study opens several new hypotheses for further research in this crucial field.
Subjects
Coronary Artery Disease, Electronic Health Records, Coronary Artery Disease/diagnosis, Coronary Artery Disease/therapy, Data Science, Humans, Precision Medicine, Socioeconomic Factors
ABSTRACT
The development of an ontology facilitates the organization of the variety of concepts used to describe different terms in different resources. The proposed ontology will facilitate the study of cardiothoracic surgical education and data analytics in electronic medical records (EMR) with the standard vocabulary.
Subjects
Biological Ontologies, Data Science, Electronic Health Records, Vocabulary
ABSTRACT
Background: Women continue to have worse Coronary Artery Disease (CAD) outcomes than men. The causes of this discrepancy have yet to be fully elucidated. The main objective of this study is to detect gender discrepancies in the diagnosis and treatment of CAD. Methods: We used data analytics to risk-stratify ~32,000 patients with CAD out of the 960,129 patients treated at the UCSF Medical Center over an 8-year period. We implemented a multidimensional data analytics framework to trace patients from admission through treatment to create a path of events. Events are any medications or noninvasive and invasive procedures. The time between events for a similar set of paths was calculated. Then, the average waiting time for each step of the treatment was calculated. Finally, we applied statistical analysis to determine differences in time between diagnosis and treatment steps for men and women. Results: There is a significant difference between genders in the time from first admission to diagnostic cardiac catheterization (p-value = 0.000119), while the difference in time from diagnostic cardiac catheterization to CABG is not statistically significant. Conclusion: Women had a significantly longer interval between their first physician encounter indicative of CAD and their first diagnostic cardiac catheterization compared to men. Avoiding this delay in diagnosis may provide more timely treatment and a better outcome for patients at risk. Finally, we conclude by discussing the impact of the study on improving patient care with early detection and managing individual patients at risk of rapid progression of CAD.
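The path-of-events idea in the methods can be sketched with the standard library: compute the days between two events on each patient's path, then compare the waiting times of two groups. Everything here is illustrative. The event names, dates, and waiting times are made up, and the exact permutation test is a stand-in chosen for the example, since the abstract does not say which statistical test was used.

```python
from datetime import date
from itertools import combinations
from statistics import mean
from typing import List, Sequence, Tuple

Path = List[Tuple[str, date]]  # one patient's ordered (event name, date) pairs


def days_between(path: Path, start_event: str, end_event: str) -> int:
    """Days from the first start_event to the first end_event on or after it."""
    start = min(d for name, d in path if name == start_event)
    end = min(d for name, d in path if name == end_event and d >= start)
    return (end - start).days


def exact_permutation_p(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Two-sided exact permutation p-value for the difference in mean waiting time."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b, total = len(group_a), len(group_b), sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        extreme += diff >= observed - 1e-12  # tolerance for float rounding
        count += 1
    return extreme / count


# Hypothetical path for one patient.
example_path: Path = [
    ("admission", date(2020, 1, 6)),
    ("medication", date(2020, 1, 9)),
    ("cardiac_catheterization", date(2020, 1, 20)),
    ("CABG", date(2020, 2, 3)),
]
wait_to_cath = days_between(example_path, "admission", "cardiac_catheterization")

# Hypothetical admission-to-catheterization waits (days) for two small groups.
waits_group_1 = [10, 12, 14, 16]
waits_group_2 = [4, 6, 8, 9]
p_value = exact_permutation_p(waits_group_1, waits_group_2)
print(wait_to_cath, round(p_value, 4))
```

Enumerating every split is only practical for very small groups; at the scale of the study, a sampled permutation test or a standard two-sample test would be the realistic choice.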
RESUMO
Early detection plays a key role in improving outcomes for Coronary Artery Disease (CAD). We used a big data analytics platform on ~32,000 patients to trace each patient from the first encounter to CAD treatment. Among patients younger than 60, the time from the first encounter to Coronary Artery Bypass Grafting differs significantly by gender (p-value = 0.03). Recognizing this difference can improve outcomes by avoiding delays in treatment.
Assuntos
Doença da Artéria Coronariana , Ponte de Artéria Coronária/efeitos adversos , Doença da Artéria Coronariana/diagnóstico , Doença da Artéria Coronariana/cirurgia , Ciência de Dados , Registros Eletrônicos de Saúde , Feminino , Humanos , Fatores de Risco , Tempo para o Tratamento , Resultado do Tratamento
RESUMO
Our objective was to develop deep learning models with chest radiograph data to predict healthcare costs and classify top-50% spenders. 21,872 frontal chest radiographs were retrospectively collected from 19,524 patients with at least 1-year spending data. Among the patients, 11,003 patients had 3 years of cost data, and 1678 patients had 5 years of cost data. Model performance was measured with the area under the receiver operating characteristic curve (ROC-AUC) for classification of top-50% spenders and Spearman ρ for prediction of healthcare cost. The best model predicting 1-year (N = 21,872) expenditure achieved ROC-AUC of 0.806 [95% CI 0.793-0.819] for top-50% spender classification and ρ of 0.561 [0.536-0.586] for regression. Similarly, for predicting 3-year (N = 12,395) expenditure, the model achieved ROC-AUC of 0.771 [0.750-0.794] and ρ of 0.524 [0.489-0.559]; for predicting 5-year (N = 1779) expenditure, ROC-AUC of 0.729 [0.667-0.729] and ρ of 0.424 [0.324-0.529]. Our deep learning model demonstrated the feasibility of predicting healthcare expenditure as well as classifying top-50% healthcare spenders at 1, 3, and 5 years, implying the feasibility of combining deep learning with information-rich imaging data to uncover hidden associations that may elude physicians. Such a model can be a starting point for accurate budgeting in healthcare reimbursement models.
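The two evaluation metrics named above can be sketched in plain Python. This is an illustrative sketch only: the cost and score values are invented, the median split used to label top-50% spenders is an assumption, and none of it reproduces the study's models or data. ROC-AUC is computed through its rank-sum (Mann-Whitney) form, and Spearman ρ as the Pearson correlation of average ranks.

```python
from statistics import mean, median


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(labels, scores):
    """ROC-AUC from the rank sum of the positive class (Mann-Whitney U)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, label in zip(ranks, labels) if label)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical 1-year costs and model scores for six patients.
costs = [1200.0, 300.0, 5600.0, 800.0, 150.0, 9400.0]
scores = [0.40, 0.35, 0.80, 0.45, 0.10, 0.90]

# Assumed labeling rule: a top-50% spender has a cost above the cohort median.
cutoff = median(costs)
top_half = [int(c > cutoff) for c in costs]

auc = roc_auc(top_half, scores)
rho = spearman_rho(costs, scores)
print(auc, rho)
```

The rank-sum form gives the probability that a randomly chosen top-50% spender receives a higher score than a randomly chosen patient from the bottom half, which is what ROC-AUC measures.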
Assuntos
Aprendizado Profundo , Atenção à Saúde , Humanos , Projetos Piloto , Curva ROC , Radiografia , Estudos Retrospectivos
RESUMO
BACKGROUND: Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease in the United States and globally. The currently understood model of pathogenesis consists of a 'multiple hit' hypothesis in which environmental and genetic factors contribute to hepatic inflammation and injury. AIM: To examine the genetic expression of NAFLD and non-alcoholic steatohepatitis (NASH) tissue samples to identify common pathways that contribute to NAFLD and NASH pathogenesis. METHODS: We employed the Search Tag Analyze Resource for Gene Expression Omnibus platform to search the National Center for Biotechnology Information Gene Expression Omnibus to elucidate NAFLD and NASH pathology. For NAFLD, we conducted a meta-analysis of data from 58 NAFLD liver biopsies and 60 healthy liver biopsies; for NASH, we analyzed 187 NASH liver biopsies and 154 healthy liver biopsies. RESULTS: Our results from the NAFLD analysis reinforce the role of altered metabolism, inflammation, and cell survival in pathogenesis and support recently described contributors to disease activity, such as altered androgen and long non-coding RNA activity. The top upstream regulator was found to be sterol regulatory element binding transcription factor 1 (SREBF1), a transcription factor involved in lipid homeostasis. Downstream of SREBF1, we observed upregulation of CXCL10, HMGCR, HMGCS1, fatty acid binding protein 5, and paternally expressed imprinted gene 10, and downregulation of sex hormone-binding globulin and insulin-like growth factor 1. These molecular changes reflect low-grade inflammation secondary to accumulation of fatty acids in the liver. Our results from the NASH analysis emphasized the role of cholesterol in pathogenesis. Top canonical pathways, disease networks, and disease functions were related to cholesterol synthesis, lipid metabolism, adipogenesis, and metabolic disease.
Top upstream regulators included the pro-inflammatory cytokines tumor necrosis factor and IL1B, as well as PDGF BB and beta-estradiol. Inhibition of beta-estradiol was shown to be related to derangement of several downstream cellular processes, including metabolism, extracellular matrix deposition, and tumor suppression. Lastly, we found triciribine (an AKT inhibitor) and ZSTK-474 (a PI3K inhibitor) as potential drugs that targeted the differential gene expression in our dataset. CONCLUSION: In this study we describe several molecular processes that may correlate with NAFLD and its progression. We also identified triciribine and ZSTK-474 as potential therapies.