Pesquisa | BVS Integralidade em Saúde

1.

Developing a novel causal inference algorithm for personalized biomedical causal graph learning using meta machine learning.

Wu, Hang; Shi, Wenqi; Wang, May D.

BMC Med Inform Decis Mak ; 24(1): 137, 2024 May 27.

Artigo em Inglês | MEDLINE | ID: mdl-38802809

RESUMO

BACKGROUND: Modeling causality through graphs, referred to as causal graph learning, offers an appropriate description of the dynamics of causality. The majority of current machine learning models in clinical decision support systems only predict associations between variables, whereas causal graph learning models causality dynamics through graphs. However, building personalized causal graphs for each individual is challenging due to the limited amount of data available for each patient. METHOD: In this study, we present a new algorithmic framework using meta-learning for learning personalized causal graphs in biomedicine. Our framework extracts common patterns from multiple patient graphs and applies this information to develop individualized graphs. In multi-task causal graph learning, the proposed optimized initial guess of shared commonality enables the rapid adoption of knowledge to new tasks for efficient causal graph learning. RESULTS: Experiments on one real-world biomedical causal graph learning benchmark data and four synthetic benchmarks show that our algorithm outperformed the baseline methods. Our algorithm can better understand the underlying patterns in the data, leading to more accurate predictions of the causal graph. Specifically, we reduce the structural hamming distance by 50-75%, indicating an improvement in graph prediction accuracy. Additionally, the false discovery rate is decreased by 20-30%, demonstrating that our algorithm made fewer incorrect predictions compared to the baseline algorithms. CONCLUSION: To the best of our knowledge, this is the first study to demonstrate the effectiveness of meta-learning in personalized causal graph learning and cause inference modeling for biomedicine. In addition, the proposed algorithm can also be generalized to transnational research areas where integrated analysis is necessary for various distributions of datasets, including different clinical institutions.

Assuntos

Algoritmos , Aprendizado de Máquina , Humanos , Causalidade

2.

Explainable synthetic image generation to improve risk assessment of rare pediatric heart transplant rejection.

Giuste, Felipe O; Sequeira, Ryan; Keerthipati, Vikranth; Lais, Peter; Mirzazadeh, Ali; Mohseni, Arshawn; Zhu, Yuanda; Shi, Wenqi; Marteau, Benoit; Zhong, Yishan; Tong, Li; Das, Bibhuti; Shehata, Bahig; Deshpande, Shriprasad; Wang, May D.

J Biomed Inform ; 139: 104303, 2023 03.

Artigo em Inglês | MEDLINE | ID: mdl-36736449

RESUMO

Expert microscopic analysis of cells obtained from frequent heart biopsies is vital for early detection of pediatric heart transplant rejection to prevent heart failure. Detection of this rare condition is prone to low levels of expert agreement due to the difficulty of identifying subtle rejection signs within biopsy samples. The rarity of pediatric heart transplant rejection also means that very few gold-standard images are available for developing machine learning models. To solve this urgent clinical challenge, we developed a deep learning model to automatically quantify rejection risk within digital images of biopsied tissue using an explainable synthetic data augmentation approach. We developed this explainable AI framework to illustrate how our progressive and inspirational generative adversarial network models distinguish between normal tissue images and those containing cellular rejection signs. To quantify biopsy-level rejection risk, we first detect local rejection features using a binary image classifier trained with expert-annotated and synthetic examples. We converted these local predictions into a biopsy-wide rejection score via an interpretable histogram-based approach. Our model significantly improves upon prior works with the same dataset with an area under the receiver operating curve (AUROC) of 98.84% for the local rejection detection task and 95.56% for the biopsy-rejection prediction task. A biopsy-level sensitivity of 83.33% makes our approach suitable for early screening of biopsies to prioritize expert analysis. Our framework provides a solution to rare medical imaging challenges currently limited by small datasets.

Assuntos

Insuficiência Cardíaca , Transplante de Coração , Humanos , Criança , Diagnóstico por Imagem , Aprendizado de Máquina , Medição de Risco , Complicações Pós-Operatórias

3.

Coronavirus Disease 2019 Temperature Trajectories Correlate With Hyperinflammatory and Hypercoagulable Subphenotypes.

Bhavani, Sivasubramanium V; Verhoef, Philip A; Maier, Cheryl L; Robichaux, Chad; Parker, William F; Holder, Andre; Kamaleswaran, Rishikesan; Wang, May D; Churpek, Matthew M; Coopersmith, Craig M.

Crit Care Med ; 50(2): 212-223, 2022 02 01.

Artigo em Inglês | MEDLINE | ID: mdl-35100194

RESUMO

OBJECTIVES: Body temperature trajectories of infected patients are associated with specific immune profiles and survival. We determined the association between temperature trajectories and distinct manifestations of coronavirus disease 2019. DESIGN: Retrospective observational study. SETTING: Four hospitals within an academic healthcare system from March 2020 to February 2021. PATIENTS: All adult patients hospitalized with coronavirus disease 2019. INTERVENTIONS: Using a validated group-based trajectory model, we classified patients into four previously defined temperature trajectory subphenotypes using oral temperature measurements from the first 72 hours of hospitalization. Clinical characteristics, biomarkers, and outcomes were compared between subphenotypes. MEASUREMENTS AND MAIN RESULTS: The 5,903 hospitalized coronavirus disease 2019 patients were classified into four subphenotypes: hyperthermic slow resolvers (n = 1,452, 25%), hyperthermic fast resolvers (1,469, 25%), normothermics (2,126, 36%), and hypothermics (856, 15%). Hypothermics had abnormal coagulation markers, with the highest d-dimer and fibrin monomers (p < 0.001) and the highest prevalence of cerebrovascular accidents (10%, p = 0.001). The prevalence of venous thromboembolism was significantly different between subphenotypes (p = 0.005), with the highest rate in hypothermics (8.5%) and lowest in hyperthermic slow resolvers (5.1%). Hyperthermic slow resolvers had abnormal inflammatory markers, with the highest C-reactive protein, ferritin, and interleukin-6 (p < 0.001). Hyperthermic slow resolvers had increased odds of mechanical ventilation, vasopressors, and 30-day inpatient mortality (odds ratio, 1.58; 95% CI, 1.13-2.19) compared with hyperthermic fast resolvers. Over the course of the pandemic, we observed a drastic decrease in the prevalence of hyperthermic slow resolvers, from representing 53% of admissions in March 2020 to less than 15% by 2021. We found that dexamethasone use was associated with significant reduction in probability of hyperthermic slow resolvers membership (27% reduction; 95% CI, 23-31%; p < 0.001). CONCLUSIONS: Hypothermics had abnormal coagulation markers, suggesting a hypercoagulable subphenotype. Hyperthermic slow resolvers had elevated inflammatory markers and the highest odds of mortality, suggesting a hyperinflammatory subphenotype. Future work should investigate whether temperature subphenotypes benefit from targeted antithrombotic and anti-inflammatory strategies.

Assuntos

Temperatura Corporal , COVID-19/patologia , Hipertermia/patologia , Hipotermia/patologia , Fenótipo , Centros Médicos Acadêmicos , Idoso , Anti-Inflamatórios/uso terapêutico , Biomarcadores/sangue , Coagulação Sanguínea , Estudos de Coortes , Dexametasona/uso terapêutico , Feminino , Humanos , Inflamação , Masculino , Pessoa de Meia-Idade , Escores de Disfunção Orgânica , Estudos Retrospectivos , SARS-CoV-2

4.

Integrating multi-omics data by learning modality invariant representations for improved prediction of overall survival of cancer.

Tong, Li; Wu, Hang; Wang, May D.

Methods ; 189: 74-85, 2021 05.

Artigo em Inglês | MEDLINE | ID: mdl-32763377

RESUMO

Breast and ovarian cancers are the second and the fifth leading causes of cancer death among women. Predicting the overall survival of breast and ovarian cancer patients can facilitate the therapeutics evaluation and treatment decision making. Multi-scale multi-omics data such as gene expression, DNA methylation, miRNA expression, and copy number variations can provide insights on personalized survival. However, how to effectively integrate multi-omics data remains a challenging task. In this paper, we develop multi-omics integration methods to improve the prediction of overall survival for breast cancer and ovarian cancer patients. Because multi-omics data for the same patient jointly impact the survival of cancer patients, features from different -omics modality are related and can be modeled by either association or causal relationship (e.g., pathways). By extracting these relationships among modalities, we can get rid of the irrelevant information from high-throughput multi-omics data. However, it is infeasible to use the Brute Force method to capture all possible multi-omics interactions. Thus, we use deep neural networks with novel divergence-based consensus regularization to capture multi-omics interactions implicitly by extracting modality-invariant representations. In comparing the concatenation-based integration networks with our new divergence-based consensus networks, the breast cancer overall survival C-index is improved from 0.655±0.062 to 0.671±0.046 when combing DNA methylation and miRNA expression, and from 0.627±0.062 to 0.667±0.073 when combing miRNA expression and copy number variations. In summary, our novel deep consensus neural network has successfully improved the prediction of overall survival for breast cancer and ovarian cancer patients by implicitly learning the multi-omics interactions.

Assuntos

Neoplasias da Mama/genética , Metilação de DNA , Regulação Neoplásica da Expressão Gênica , Genômica/métodos , Redes Neurais de Computação , Neoplasias Ovarianas/genética , Neoplasias da Mama/mortalidade , Biologia Computacional , Variações do Número de Cópias de DNA , Epigenômica , Feminino , Humanos , Aprendizado de Máquina , Neoplasias Ovarianas/mortalidade

5.

Did COVID-19 Affect Time to Presentation in the Setting of Pediatric Testicular Torsion?

Littman, Annalise R; Janssen, Karmon M; Tong, Li; Wu, Hang; Wang, May D; Blum, Emily; Kirsch, Andrew J.

Pediatr Emerg Care ; 37(2): 123-125, 2021 Feb 01.

Artigo em Inglês | MEDLINE | ID: mdl-33512891

RESUMO

OBJECTIVES: To determine if boys with acute testicular torsion, a surgical emergency requiring prompt diagnosis and treatment to optimize salvage of the testicle, delayed presentation to a medical facility and experienced an extended duration of symptoms (DoS), and secondarily, a higher rate of orchiectomy, during the coronavirus disease 2019 (COVID-19) pandemic. METHODS: Single-center, descriptive retrospective chart review of boys presenting with acute testicular torsion from March 15, to May 4, 2020 ("during COVID-19" or group 2), as well as for the same time window in the 5-year period from 2015 to 2019 ("pre-COVID-19" or group 1). RESULTS: A total of 78 boys met inclusion criteria, group 1 (n = 57) and group 2 (n = 21). The mean age was 12.86 ± 2.63 (group 1) and 12.86 ± 2.13 (group 2). Mean DoS before presentation at a medical facility was 23.2 ± 35.0 hours in group 1 compared with 21.3 ± 29.7 hours in group 2 (P < 0.37). When DoS was broken down into acute (<24 hours) versus delayed (≥24 hours), 41 (71.9%) of 57 boys in group 1 and 16 (76.2%) of 21 boys in group 2 presented within less than 24 hours of symptom onset (P < 0.78). There was no difference in rate of orchiectomy between group 1 and group 2 (44.7% vs 25%, P < 0.17), respectively. CONCLUSIONS: Boys with acute testicular torsion in our catchment area did not delay presentation to a medical facility from March 15, to May 4, 2020, and did not subsequently undergo a higher rate of orchiectomy.

Assuntos

COVID-19/epidemiologia , Torção do Cordão Espermático/cirurgia , Adolescente , Criança , Serviço Hospitalar de Emergência , Humanos , Masculino , Orquiectomia/estatística & dados numéricos , Estudos Retrospectivos , SARS-CoV-2 , Torção do Cordão Espermático/diagnóstico , Torção do Cordão Espermático/epidemiologia , Testículo/cirurgia , Fatores de Tempo , Tempo para o Tratamento

6.

Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis.

Tong, Li; Mitchel, Jonathan; Chatlin, Kevin; Wang, May D.

BMC Med Inform Decis Mak ; 20(1): 225, 2020 09 15.

Artigo em Inglês | MEDLINE | ID: mdl-32933515

RESUMO

BACKGROUND: Breast cancer is the most prevalent and among the most deadly cancers in females. Patients with breast cancer have highly variable survival lengths, indicating a need to identify prognostic biomarkers for personalized diagnosis and treatment. With the development of new technologies such as next-generation sequencing, multi-omics information are becoming available for a more thorough evaluation of a patient's condition. In this study, we aim to improve breast cancer overall survival prediction by integrating multi-omics data (e.g., gene expression, DNA methylation, miRNA expression, and copy number variations (CNVs)). METHODS: Motivated by multi-view learning, we propose a novel strategy to integrate multi-omics data for breast cancer survival prediction by applying complementary and consensus principles. The complementary principle assumes each -omics data contains modality-unique information. To preserve such information, we develop a concatenation autoencoder (ConcatAE) that concatenates the hidden features learned from each modality for integration. The consensus principle assumes that the disagreements among modalities upper bound the model errors. To get rid of the noises or discrepancies among modalities, we develop a cross-modality autoencoder (CrossAE) to maximize the agreement among modalities to achieve a modality-invariant representation. We first validate the effectiveness of our proposed models on the MNIST simulated data. We then apply these models to the TCCA breast cancer multi-omics data for overall survival prediction. RESULTS: For breast cancer overall survival prediction, the integration of DNA methylation and miRNA expression achieves the best overall performance of 0.641 ± 0.031 with ConcatAE, and 0.63 ± 0.081 with CrossAE. Both strategies outperform baseline single-modality models using only DNA methylation (0.583 ± 0.058) or miRNA expression (0.616 ± 0.057). CONCLUSIONS: In conclusion, we achieve improved overall survival prediction performance by utilizing either the complementary or consensus information among multi-omics data. The proposed ConcatAE and CrossAE models can inspire future deep representation-based multi-omics integration techniques. We believe these novel multi-omics integration models can benefit the personalized diagnosis and treatment of breast cancer patients.

Assuntos

Neoplasias da Mama , Aprendizado Profundo , MicroRNAs , Neoplasias da Mama/genética , Neoplasias da Mama/terapia , Variações do Número de Cópias de DNA , Feminino , Genômica , Humanos , Análise de Sobrevida

7.

LncADeep: an ab initio lncRNA identification and functional annotation tool based on deep learning.

Yang, Cheng; Yang, Longshu; Zhou, Man; Xie, Haoling; Zhang, Chengjiu; Wang, May D; Zhu, Huaiqiu.

Bioinformatics ; 34(22): 3825-3834, 2018 11 15.

Artigo em Inglês | MEDLINE | ID: mdl-29850816

RESUMO

Motivation: To characterize long non-coding RNAs (lncRNAs), both identifying and functionally annotating them are essential to be addressed. Moreover, a comprehensive construction for lncRNA annotation is desired to facilitate the research in the field. Results: We present LncADeep, a novel lncRNA identification and functional annotation tool. For lncRNA identification, LncADeep integrates intrinsic and homology features into a deep belief network and constructs models targeting both full- and partial-length transcripts. For functional annotation, LncADeep predicts a lncRNA's interacting proteins based on deep neural networks, using both sequence and structure information. Furthermore, LncADeep integrates KEGG and Reactome pathway enrichment analysis and functional module detection with the predicted interacting proteins, and provides the enriched pathways and functional modules as functional annotations for lncRNAs. Test results show that LncADeep outperforms state-of-the-art tools, both for lncRNA identification and lncRNA-protein interaction prediction, and then presents a functional interpretation. We expect that LncADeep can contribute to identifying and annotating novel lncRNAs. Availability and implementation: LncADeep is freely available for academic use at http://cqb.pku.edu.cn/ZhuLab/lncadeep/ and https://github.com/cyang235/LncADeep/. Supplementary information: Supplementary data are available at Bioinformatics online.

Assuntos

Aprendizado Profundo , RNA Longo não Codificante/genética , Anotação de Sequência Molecular , Redes Neurais de Computação

8.

Reverse engineering biomolecular systems using -omic data: challenges, progress and opportunities.

Quo, Chang F; Kaddi, Chanchala; Phan, John H; Zollanvari, Amin; Xu, Mingqing; Wang, May D; Alterovitz, Gil.

Brief Bioinform ; 13(4): 430-45, 2012 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-22833495

RESUMO

Recent advances in high-throughput biotechnologies have led to the rapid growing research interest in reverse engineering of biomolecular systems (REBMS). 'Data-driven' approaches, i.e. data mining, can be used to extract patterns from large volumes of biochemical data at molecular-level resolution while 'design-driven' approaches, i.e. systems modeling, can be used to simulate emergent system properties. Consequently, both data- and design-driven approaches applied to -omic data may lead to novel insights in reverse engineering biological systems that could not be expected before using low-throughput platforms. However, there exist several challenges in this fast growing field of reverse engineering biomolecular systems: (i) to integrate heterogeneous biochemical data for data mining, (ii) to combine top-down and bottom-up approaches for systems modeling and (iii) to validate system models experimentally. In addition to reviewing progress made by the community and opportunities encountered in addressing these challenges, we explore the emerging field of synthetic biology, which is an exciting approach to validate and analyze theoretical system models directly through experimental synthesis, i.e. analysis-by-synthesis. The ultimate goal is to address the present and future challenges in reverse engineering biomolecular systems (REBMS) using integrated workflow of data mining, systems modeling and synthetic biology.

Assuntos

Mineração de Dados/métodos , Biologia de Sistemas , Bioengenharia/métodos , Biotecnologia

9.

Integrating Multi-Omics Data With EHR for Precision Medicine Using Advanced Artificial Intelligence.

Tong, Li; Shi, Wenqi; Isgut, Monica; Zhong, Yishan; Lais, Peter; Gloster, Logan; Sun, Jimin; Swain, Aniketh; Giuste, Felipe; Wang, May D.

IEEE Rev Biomed Eng ; 17: 80-97, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-37824325

RESUMO

With the recent advancement of novel biomedical technologies such as high-throughput sequencing and wearable devices, multi-modal biomedical data ranging from multi-omics molecular data to real-time continuous bio-signals are generated at an unprecedented speed and scale every day. For the first time, these multi-modal biomedical data are able to make precision medicine close to a reality. However, due to data volume and the complexity, making good use of these multi-modal biomedical data requires major effort. Researchers and clinicians are actively developing artificial intelligence (AI) approaches for data-driven knowledge discovery and causal inference using a variety of biomedical data modalities. These AI-based approaches have demonstrated promising results in various biomedical and healthcare applications. In this review paper, we summarize the state-of-the-art AI models for integrating multi-omics data and electronic health records (EHRs) for precision medicine. We discuss the challenges and opportunities in integrating multi-omics data with EHRs and future directions. We hope this review can inspire future research and developing in integrating multi-omics data with EHRs for precision medicine.

Assuntos

Inteligência Artificial , Multiômica , Humanos , Medicina de Precisão , Registros Eletrônicos de Saúde , Atenção à Saúde

10.

A fast least-squares algorithm for population inference.

Parry, R Mitchell; Wang, May D.

BMC Bioinformatics ; 14: 28, 2013 Jan 23.

Artigo em Inglês | MEDLINE | ID: mdl-23343408

RESUMO

BACKGROUND: Population inference is an important problem in genetics used to remove population stratification in genome-wide association studies and to detect migration patterns or shared ancestry. An individual's genotype can be modeled as a probabilistic function of ancestral population memberships, Q, and the allele frequencies in those populations, P. The parameters, P and Q, of this binomial likelihood model can be inferred using slow sampling methods such as Markov Chain Monte Carlo methods or faster gradient based approaches such as sequential quadratic programming. This paper proposes a least-squares simplification of the binomial likelihood model motivated by a Euclidean interpretation of the genotype feature space. This results in a faster algorithm that easily incorporates the degree of admixture within the sample of individuals and improves estimates without requiring trial-and-error tuning. RESULTS: We show that the expected value of the least-squares solution across all possible genotype datasets is equal to the true solution when part of the problem has been solved, and that the variance of the solution approaches zero as its size increases. The Least-squares algorithm performs nearly as well as Admixture for these theoretical scenarios. We compare least-squares, Admixture, and FRAPPE for a variety of problem sizes and difficulties. For particularly hard problems with a large number of populations, small number of samples, or greater degree of admixture, least-squares performs better than the other methods. On simulated mixtures of real population allele frequencies from the HapMap project, Admixture estimates sparsely mixed individuals better than Least-squares. The least-squares approach, however, performs within 1.5% of the Admixture error. On individual genotypes from the HapMap project, Admixture and least-squares perform qualitatively similarly and within 1.2% of each other. Significantly, the least-squares approach nearly always converges 1.5- to 6-times faster. CONCLUSIONS: The computational advantage of the least-squares approach along with its good estimation performance warrants further research, especially for very large datasets. As problem sizes increase, the difference in estimation performance between all algorithms decreases. In addition, when prior information is known, the least-squares approach easily incorporates the expected degree of admixture to improve the estimate.

Assuntos

Algoritmos , Técnicas de Genotipagem , Frequência do Gene , Genética Populacional/métodos , Estudo de Associação Genômica Ampla , Genótipo , Projeto HapMap , Humanos , Análise dos Mínimos Quadrados , Funções Verossimilhança , Cadeias de Markov , Modelos Estatísticos , Método de Monte Carlo

11.

Assessing the impact of human genome annotation choice on RNA-seq expression estimates.

Wu, Po-Yen; Phan, John H; Wang, May D.

BMC Bioinformatics ; 14 Suppl 11: S8, 2013.

Artigo em Inglês | MEDLINE | ID: mdl-24564364

RESUMO

BACKGROUND: Genome annotation is a crucial component of RNA-seq data analysis. Much effort has been devoted to producing an accurate and rational annotation of the human genome. An annotated genome provides a comprehensive catalogue of genomic functional elements. Currently, at least six human genome annotations are publicly available, including AceView Genes, Ensembl Genes, H-InvDB Genes, RefSeq Genes, UCSC Known Genes, and Vega Genes. Characteristics of these annotations differ because of variations in annotation strategies and information sources. When performing RNA-seq data analysis, researchers need to choose a genome annotation. However, the effect of genome annotation choice on downstream RNA-seq expression estimates is still unclear. This study (1) investigates the effect of different genome annotations on RNA-seq quantification and (2) provides guidelines for choosing a genome annotation based on research focus. RESULTS: We define the complexity of human genome annotations in terms of the number of genes, isoforms, and exons. This definition facilitates an investigation of potential relationships between complexity and variations in RNA-seq quantification. We apply several evaluation metrics to demonstrate the impact of genome annotation choice on RNA-seq expression estimates. In the mapping stage, the least complex genome annotation, RefSeq Genes, appears to have the highest percentage of uniquely mapped short sequence reads. In the quantification stage, RefSeq Genes results in the most stable expression estimates in terms of the average coefficient of variation over all genes. Stable expression estimates in the quantification stage translate to accurate statistics for detecting differentially expressed genes. We observe that RefSeq Genes produces the most accurate fold-change measures with respect to a ground truth of RT-qPCR gene expression estimates. CONCLUSIONS: Based on the observed variations in the mapping, quantification, and differential expression calling stages, we demonstrate that the selection of human genome annotation results in different gene expression estimates. When conducting research that emphasizes reproducible and robust gene expression estimates, a less complex genome annotation may be preferred. However, simpler genome annotations may limit opportunities for identifying or characterizing novel transcriptional or regulatory mechanisms. When conducting research that aims to be more exploratory, a more complex genome annotation may be preferred.

Assuntos

Genoma Humano , Sequenciamento de Nucleotídeos em Larga Escala/métodos , RNA/genética , Análise de Sequência de RNA/métodos , Éxons , Genômica/métodos , Humanos , Isoformas de Proteínas/genética

12.

Histological image classification using biologically interpretable shape-based features.

Kothari, Sonal; Phan, John H; Young, Andrew N; Wang, May D.

BMC Med Imaging ; 13: 9, 2013 Mar 13.

Artigo em Inglês | MEDLINE | ID: mdl-23497380

RESUMO

BACKGROUND: Automatic cancer diagnostic systems based on histological image classification are important for improving therapeutic decisions. Previous studies propose textural and morphological features for such systems. These features capture patterns in histological images that are useful for both cancer grading and subtyping. However, because many of these features lack a clear biological interpretation, pathologists may be reluctant to adopt these features for clinical diagnosis. METHODS: We examine the utility of biologically interpretable shape-based features for classification of histological renal tumor images. Using Fourier shape descriptors, we extract shape-based features that capture the distribution of stain-enhanced cellular and tissue structures in each image and evaluate these features using a multi-class prediction model. We compare the predictive performance of the shape-based diagnostic model to that of traditional models, i.e., using textural, morphological and topological features. RESULTS: The shape-based model, with an average accuracy of 77%, outperforms or complements traditional models. We identify the most informative shapes for each renal tumor subtype from the top-selected features. Results suggest that these shapes are not only accurate diagnostic features, but also correlate with known biological characteristics of renal tumors. CONCLUSIONS: Shape-based analysis of histological renal tumor images accurately classifies disease subtypes and reveals biologically insightful discriminatory features. This method for shape-based analysis can be extended to other histological datasets to aid pathologists in diagnostic and therapeutic decisions.

Assuntos

Algoritmos , Inteligência Artificial , Biópsia/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Neoplasias/patologia , Reconhecimento Automatizado de Padrão/métodos , Humanos , Reprodutibilidade dos Testes , Sensibilidade e Especificidade

13.

An integrative approach for the large-scale identification of human genome kinases regulating cancer metastasis.

Zhang, Hanshuo; Wu, Po-Yen; Ma, Ming; Ye, Yanzhen; Hao, Yang; Yang, Junyu; Yin, Shenyi; Sun, Changhong; Phan, John H; Wang, May D; Xi, Jianzhong Jeff.

Nanomedicine ; 9(6): 732-6, 2013 Aug.

Artigo em Inglês | MEDLINE | ID: mdl-23751374

RESUMO

Kinases become one of important groups of drug targets. To identify more kinases being potential for cancer therapy, we developed an integrative approach for the large-scale screen of functional genes capable of regulating the main traits of cancer metastasis. We first employed self-assembled cell microarray to screen functional genes that regulate cancer cell migration using a human genome kinase siRNA library. We identified 81 genes capable of significantly regulating cancer cell migration. Following with invasion assays and bio-informatics analysis, we discovered that 16 genes with differentially expression in cancer samples can regulate both cell migration and invasion, among which 10 genes have been well known to play critical roles in the cancer development. The remaining 6 genes were experimentally validated to have the capacities of regulating cell proliferation, apoptosis and anoikis activities besides cell motility. Together, these findings provide a new insight into the therapeutic use of human kinases. FROM THE CLINICAL EDITOR: This team of authors have utilized a self-assembled cell microarray to screen genes that regulate cancer cell migration using a human genome siRNA library of kinases. They validated previously known genes and identified novel ones that may serve as therapeutic targets.

Assuntos

Metástase Neoplásica , Neoplasias/enzimologia , Fosfotransferases/isolamento & purificação , Apoptose/genética , Movimento Celular/genética , Proliferação de Células , Biologia Computacional , Genoma Humano , Células HeLa , Humanos , Invasividade Neoplásica/genética , Neoplasias/patologia , Fosfotransferases/genética , Fosfotransferases/metabolismo , RNA Interferente Pequeno , Análise Serial de Tecidos

14.

Systematic Review of Advanced AI Methods for Improving Healthcare Data Quality in Post COVID-19 Era.

Isgut, Monica; Gloster, Logan; Choi, Katherine; Venugopalan, Janani; Wang, May D.

IEEE Rev Biomed Eng ; 16: 53-69, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-36269930

RESUMO

At the beginning of the COVID-19 pandemic, there was significant hype about the potential impact of artificial intelligence (AI) tools in combatting COVID-19 on diagnosis, prognosis, or surveillance. However, AI tools have not yet been widely successful. One of the key reason is the COVID-19 pandemic has demanded faster real-time development of AI-driven clinical and health support tools, including rapid data collection, algorithm development, validation, and deployment. However, there was not enough time for proper data quality control. Learning from the hard lessons in COVID-19, we summarize the important health data quality challenges during COVID-19 pandemic such as lack of data standardization, missing data, tabulation errors, and noise and artifact. Then we conduct a systematic investigation of computational methods that address these issues, including emerging novel advanced AI data quality control methods that achieve better data quality outcomes and, in some cases, simplify or automate the data cleaning process. We hope this article can assist healthcare community to improve health data quality going forward with novel AI development.

Assuntos

Inteligência Artificial , COVID-19 , Humanos , Confiabilidade dos Dados , Pandemias , Algoritmos

15.

Improving explainable AI with patch perturbation-based evaluation pipeline: a COVID-19 X-ray image analysis case study.

Sun, Jimin; Shi, Wenqi; Giuste, Felipe O; Vaghani, Yog S; Tang, Lingzi; Wang, May D.

Sci Rep ; 13(1): 19488, 2023 11 09.

Artigo em Inglês | MEDLINE | ID: mdl-37945586

RESUMO

Recent advances in artificial intelligence (AI) have sparked interest in developing explainable AI (XAI) methods for clinical decision support systems, especially in translational research. Although using XAI methods may enhance trust in black-box models, evaluating their effectiveness has been challenging, primarily due to the absence of human (expert) intervention, additional annotations, and automated strategies. In order to conduct a thorough assessment, we propose a patch perturbation-based approach to automatically evaluate the quality of explanations in medical imaging analysis. To eliminate the need for human efforts in conventional evaluation methods, our approach executes poisoning attacks during model retraining by generating both static and dynamic triggers. We then propose a comprehensive set of evaluation metrics during the model inference stage to facilitate the evaluation from multiple perspectives, covering a wide range of correctness, completeness, consistency, and complexity. In addition, we include an extensive case study to showcase the proposed evaluation strategy by applying widely-used XAI methods on COVID-19 X-ray imaging classification tasks, as well as a thorough review of existing XAI methods in medical imaging analysis with evaluation availability. The proposed patch perturbation-based workflow offers model developers an automated and generalizable evaluation strategy to identify potential pitfalls and optimize their proposed explainable solutions, while also aiding end-users in comparing and selecting appropriate XAI methods that meet specific clinical needs in real-world clinical research and practice.

Assuntos

COVID-19 , Sistemas de Apoio a Decisões Clínicas , Humanos , Inteligência Artificial , COVID-19/diagnóstico por imagem , Raios X , Benchmarking

16.

Explainable Artificial Intelligence Methods in Combating Pandemics: A Systematic Review.

Giuste, Felipe; Shi, Wenqi; Zhu, Yuanda; Naren, Tarun; Isgut, Monica; Sha, Ying; Tong, Li; Gupte, Mitali; Wang, May D.

IEEE Rev Biomed Eng ; 16: 5-21, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-35737637

RESUMO

Despite the myriad peer-reviewed papers demonstrating novel Artificial Intelligence (AI)-based solutions to COVID-19 challenges during the pandemic, few have made a significant clinical impact, especially in diagnosis and disease precision staging. One major cause for such low impact is the lack of model transparency, significantly limiting the AI adoption in real clinical practice. To solve this problem, AI models need to be explained to users. Thus, we have conducted a comprehensive study of Explainable Artificial Intelligence (XAI) using PRISMA technology. Our findings suggest that XAI can improve model performance, instill trust in the users, and assist users in decision-making. In this systematic review, we introduce common XAI techniques and their utility with specific examples of their application. We discuss the evaluation of XAI results because it is an important step for maximizing the value of AI-based clinical decision support systems. Additionally, we present the traditional, modern, and advanced XAI models to demonstrate the evolution of novel techniques. Finally, we provide a best practice guideline that developers can refer to during the model experimentation. We also offer potential solutions with specific examples for common challenges in AI model experimentation. This comprehensive review, hopefully, can promote AI adoption in biomedicine and healthcare.

Assuntos

Inteligência Artificial , COVID-19 , Humanos , Pandemias , Atenção à Saúde

17.

Novel methods for elucidating modality importance in multimodal electrophysiology classifiers.

Ellis, Charles A; Sendi, Mohammad S E; Zhang, Rongen; Carbajal, Darwin A; Wang, May D; Miller, Robyn L; Calhoun, Vince D.

Front Neuroinform ; 17: 1123376, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-37006636

RESUMO

Introduction: Multimodal classification is increasingly common in electrophysiology studies. Many studies use deep learning classifiers with raw time-series data, which makes explainability difficult, and has resulted in relatively few studies applying explainability methods. This is concerning because explainability is vital to the development and implementation of clinical classifiers. As such, new multimodal explainability methods are needed. Methods: In this study, we train a convolutional neural network for automated sleep stage classification with electroencephalogram (EEG), electrooculogram, and electromyogram data. We then present a global explainability approach that is uniquely adapted for electrophysiology analysis and compare it to an existing approach. We present the first two local multimodal explainability approaches. We look for subject-level differences in the local explanations that are obscured by global methods and look for relationships between the explanations and clinical and demographic variables in a novel analysis. Results: We find a high level of agreement between methods. We find that EEG is globally the most important modality for most sleep stages and that subject-level differences in importance arise in local explanations that are not captured in global explanations. We further show that sex, followed by medication and age, had significant effects upon the patterns learned by the classifier. Discussion: Our novel methods enhance explainability for the growing field of multimodal electrophysiology classification, provide avenues for the advancement of personalized medicine, yield unique insights into the effects of demographic and clinical variables upon classifiers, and help pave the way for the implementation of multimodal electrophysiology clinical classifiers.

18.

Early and fair COVID-19 outcome risk assessment using robust feature selection.

Giuste, Felipe O; He, Lawrence; Lais, Peter; Shi, Wenqi; Zhu, Yuanda; Hornback, Andrew; Tsai, Chiche; Isgut, Monica; Anderson, Blake; Wang, May D.

Sci Rep ; 13(1): 18981, 2023 11 03.

Artigo em Inglês | MEDLINE | ID: mdl-37923795

RESUMO

Personalized medicine plays an important role in treatment optimization for COVID-19 patient management. Early treatment in patients at high risk of severe complications is vital to prevent death and ventilator use. Predicting COVID-19 clinical outcomes using machine learning may provide a fast and data-driven solution for optimizing patient care by estimating the need for early treatment. In addition, it is essential to accurately predict risk across demographic groups, particularly those underrepresented in existing models. Unfortunately, there is a lack of studies demonstrating the equitable performance of machine learning models across patient demographics. To overcome this existing limitation, we generate a robust machine learning model to predict patient-specific risk of death or ventilator use in COVID-19 positive patients using features available at the time of diagnosis. We establish the value of our solution across patient demographics, including gender and race. In addition, we improve clinical trust in our automated predictions by generating interpretable patient clustering, patient-level clinical feature importance, and global clinical feature importance within our large real-world COVID-19 positive patient dataset. We achieved 89.38% area under receiver operating curve (AUROC) performance for severe outcomes prediction and our robust feature ranking approach identified the presence of dementia as a key indicator for worse patient outcomes. We also demonstrated that our deep-learning clustering approach outperforms traditional clustering in separating patients by severity of outcome based on mutual information performance. Finally, we developed an application for automated and fair patient risk assessment with minimal manual data entry using existing data exchange standards.

Assuntos

COVID-19 , Humanos , Medição de Risco , Avaliação de Resultados em Cuidados de Saúde , Prognóstico , Aprendizado de Máquina , Estudos Retrospectivos

19.

Win percentage: a novel measure for assessing the suitability of machine classifiers for biological problems.

Parry, R Mitchell; Phan, John H; Wang, May D.

BMC Bioinformatics ; 13 Suppl 3: S7, 2012 Mar 21.

Artigo em Inglês | MEDLINE | ID: mdl-22536905

RESUMO

BACKGROUND: Selecting an appropriate classifier for a particular biological application poses a difficult problem for researchers and practitioners alike. In particular, choosing a classifier depends heavily on the features selected. For high-throughput biomedical datasets, feature selection is often a preprocessing step that gives an unfair advantage to the classifiers built with the same modeling assumptions. In this paper, we seek classifiers that are suitable to a particular problem independent of feature selection. We propose a novel measure, called "win percentage", for assessing the suitability of machine classifiers to a particular problem. We define win percentage as the probability a classifier will perform better than its peers on a finite random sample of feature sets, giving each classifier equal opportunity to find suitable features. RESULTS: First, we illustrate the difficulty in evaluating classifiers after feature selection. We show that several classifiers can each perform statistically significantly better than their peers given the right feature set among the top 0.001% of all feature sets. We illustrate the utility of win percentage using synthetic data, and evaluate six classifiers in analyzing eight microarray datasets representing three diseases: breast cancer, multiple myeloma, and neuroblastoma. After initially using all Gaussian gene-pairs, we show that precise estimates of win percentage (within 1%) can be achieved using a smaller random sample of all feature pairs. We show that for these data no single classifier can be considered the best without knowing the feature set. Instead, win percentage captures the non-zero probability that each classifier will outperform its peers based on an empirical estimate of performance. CONCLUSIONS: Fundamentally, we illustrate that the selection of the most suitable classifier (i.e., one that is more likely to perform better than its peers) not only depends on the dataset and application but also on the thoroughness of feature selection. In particular, win percentage provides a single measurement that could assist users in eliminating or selecting classifiers for their particular application.

Assuntos

Algoritmos , Análise de Sequência com Séries de Oligonucleotídeos , Neoplasias da Mama/diagnóstico , Neoplasias da Mama/genética , Humanos , Método de Monte Carlo , Mieloma Múltiplo/diagnóstico , Mieloma Múltiplo/genética , Neuroblastoma/diagnóstico , Neuroblastoma/genética , Distribuição Normal

20.

Nanotechnology applications in surgical oncology.

Singhal, Sunil; Nie, Shuming; Wang, May D.

Annu Rev Med ; 61: 359-73, 2010.

Artigo em Inglês | MEDLINE | ID: mdl-20059343

RESUMO

Surgery is currently the most effective and widely used procedure in treating human cancers, and the single most important predictor of patient survival is a complete surgical resection. Major opportunities exist to develop new and innovative technologies that could help the surgeon to delineate tumor margins, to identify residual tumor cells and micrometastases, and to determine if the tumor has been completely removed. Here we discuss recent advances in nanotechnology and optical instrumentation, and how these advances can be integrated for applications in surgical oncology. A fundamental rationale is that nanometer-sized particles such as quantum dots and colloidal gold have functional and structural properties that are not available from either discrete molecules or bulk materials. When conjugated with targeting ligands such as monoclonal antibodies, peptides, or small molecules, these nanoparticles can be used to target malignant tumor cells and tumor microenvironments with high specificity and affinity. In the "mesoscopic" size range of 10-100 nm, nanoparticles also have large surface areas for conjugating to multiple diagnostic and therapeutic agents, opening new possibilities in integrated cancer imaging and therapy.

Assuntos

Nanotecnologia , Neoplasias/diagnóstico , Neoplasias/cirurgia , Animais , Coloide de Ouro , Humanos , Nanopartículas Metálicas , Camundongos , Pontos Quânticos , Análise Espectral Raman

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

Detalhe da pesquisa