1.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35788823

ABSTRACT

Predicting drug-target interaction is crucial for drug discovery as well as drug repurposing. Machine learning is commonly used for the drug-target affinity (DTA) problem. However, machine learning models face the cold-start problem, where model performance drops when predicting the interaction of a novel drug or target. Previous works try to solve the cold-start problem by learning the drug or target representation using unsupervised learning. While the drug or target representation can be learned in an unsupervised manner, it still lacks the interaction information, which is critical in drug-target interaction. To incorporate the interaction information into the drug and protein interaction, we proposed using transfer learning from chemical-chemical interaction (CCI) and protein-protein interaction (PPI) tasks to the drug-target interaction task. The representation learned by CCI and PPI tasks can be transferred smoothly to the DTA task due to the similar nature of the tasks. The results on the DTA datasets show that our proposed method has advantages compared to other pre-training methods in the DTA task.


Subjects
Drug Development; Machine Learning; Drug Discovery/methods; Drug Repositioning
2.
BMC Med Inform Decis Mak ; 24(1): 274, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39334279

ABSTRACT

BACKGROUND: In the age of big data, linked social and administrative health data in combination with machine learning (ML) is being increasingly used to improve prediction in chronic disease, e.g., cardiovascular diseases (CVD). In this study we aimed to apply ML methods on extensive national-level health and social administrative datasets to assess the utility of these for predicting future diabetes complications, including by ethnicity. METHODS: Five ML models were used to predict CVD events among all people with known diabetes in the population of New Zealand, utilizing nationwide individual-level administrative data. RESULTS: The Xgboost ML model had the best predictive power for predicting CVD events three years into the future among the population with diabetes (N = 145,600). The optimization procedure also found limited improvement in prediction by ethnicity (using area under the receiver operating curve, [AUC]). The results indicated no trade-off between model predictive performance and equity gap of prediction by ethnicity (that is improving model prediction and reducing performance gaps by ethnicity can be achieved simultaneously). The list of variables of importance was different among different models/ethnic groups, for example: age, deprivation (neighborhood-level), having had a hospitalization event, and the number of years living with diabetes. DISCUSSION AND CONCLUSIONS: We provide further evidence that ML with administrative health data can be used for meaningful future prediction of health outcomes. As such, it could be utilized to inform health planning and healthcare resource allocation for diabetes management and the prevention of CVD events. Our results may suggest limited scope for developing prediction models by ethnic group and that the major ways to reduce inequitable health outcomes is probably via improved delivery of prevention and management to those groups with diabetes at highest need.


Subjects
Diabetes Complications; Health Status Disparities; Machine Learning; Adult; Aged; Female; Humans; Male; Middle Aged; Cardiovascular Diseases/ethnology; Diabetes Complications/ethnology; Diabetes Mellitus/ethnology; Ethnicity; New Zealand; Risk Assessment
3.
BMC Genomics ; 21(Suppl 4): 256, 2020 Jul 20.
Article in English | MEDLINE | ID: mdl-32689932

ABSTRACT

BACKGROUND: Technological advances in next-generation sequencing (NGS) and chromatographic assays [e.g., liquid chromatography mass spectrometry (LC-MS)] have made it possible to identify thousands of microbe and metabolite species, and to measure their relative abundance. In this paper, we propose a sparse neural encoder-decoder network to predict metabolite abundances from microbe abundances. RESULTS: Using paired data from a cohort of inflammatory bowel disease (IBD) patients, we show that our neural encoder-decoder model outperforms linear univariate and multivariate methods in terms of accuracy, sparsity, and stability. Importantly, we show that our neural encoder-decoder model is not simply a black box designed to maximize predictive accuracy. Rather, the network's hidden layer (i.e., the latent space, comprised only of sparsely weighted microbe counts) actually captures key microbe-metabolite relationships that are themselves clinically meaningful. Although this hidden layer is learned without any knowledge of the patient's diagnosis, we show that the learned latent features are structured in a way that predicts IBD and treatment status with high accuracy. CONCLUSIONS: By imposing a non-negative weights constraint, the network becomes a directed graph where each downstream node is interpretable as the additive combination of the upstream nodes. Here, the middle layer comprises distinct microbe-metabolite axes that relate key microbial biomarkers with metabolite biomarkers. By pre-processing the microbiome and metabolome data using compositional data analysis methods, we ensure that our proposed multi-omics workflow will generalize to any pair of -omics data. To the best of our knowledge, this work is the first application of neural encoder-decoders for the interpretable integration of multi-omics biological data.


Subjects
Gastrointestinal Microbiome; Inflammatory Bowel Diseases/metabolism; Inflammatory Bowel Diseases/microbiology; Metabolome; Neural Networks, Computer; Humans; Models, Statistical
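
Illustrative sketch for entry 3. The abstract mentions compositional pre-processing and a non-negative weights constraint but does not name the exact transform or layer design. The centered log-ratio (CLR) transform, the pseudocount value, the function names, and the example counts below are all assumptions chosen for illustration, not the authors' implementation.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's abundance vector."""
    # The pseudocount avoids log(0); 0.5 is an arbitrary illustrative value.
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def nonneg_layer(inputs, weights):
    """One linear layer whose weights must be non-negative, so each
    output is an additive combination of its inputs."""
    if any(w < 0 for row in weights for w in row):
        raise ValueError("weights must be non-negative")
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

microbes = [120, 30, 0, 850]  # hypothetical microbe counts for one patient
z = clr(microbes)
hidden = nonneg_layer(z, [[0.5, 0.0, 0.0, 0.5],
                          [0.0, 1.0, 0.0, 0.0]])
```

CLR values sum to zero within a sample, which removes the arbitrary sequencing depth. With non-negative weights, each hidden unit can be read as a weighted sum of the microbes feeding it.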
4.
J Biomed Inform ; 69: 218-229, 2017 05.
Article in English | MEDLINE | ID: mdl-28410981

ABSTRACT

Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, stored in electronic medical records are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illness states and predicts future medical outcomes. At the data level, DeepCare represents care episodes as vectors and models patient health state trajectories by the memory of historical records. Built on Long Short-Term Memory (LSTM), DeepCare introduces methods to handle irregularly timed events by moderating the forgetting and consolidation of memory. DeepCare also explicitly models medical interventions that change the course of illness and shape future medical risk. Moving up to the health state level, historical and present health states are then aggregated through multiscale temporal pooling, before passing through a neural network that estimates future outcomes. We demonstrate the efficacy of DeepCare for disease progression modeling, intervention recommendation, and future risk prediction. On two important cohorts with heavy social and economic burden - diabetes and mental health - the results show improved prediction accuracy.


Subjects
Delivery of Health Care; Electronic Health Records; Neural Networks, Computer; Disease Progression; Health Status; Humans
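
Illustrative sketch for entry 4. The abstract says DeepCare handles irregularly timed events by moderating the forgetting of memory, without giving formulas. The scalar update below shows one way a forget gate could be scaled down as the gap between visits grows. The decay function, the function names, and the scalar simplification are assumptions, not the published model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def time_decay(delta_days):
    """Hypothetical monotone decay: 1.0 at a zero gap, shrinking as the gap grows."""
    return 1.0 / math.log(math.e + delta_days)

def update_memory(c_prev, candidate, forget_logit, input_logit, delta_days):
    """One scalar memory update with a time-moderated forget gate."""
    # A longer gap since the previous episode keeps less of the old memory.
    f = time_decay(delta_days) * sigmoid(forget_logit)
    i = sigmoid(input_logit)
    return f * c_prev + i * candidate

recent = update_memory(1.0, 0.0, forget_logit=2.0, input_logit=0.0, delta_days=1)
stale = update_memory(1.0, 0.0, forget_logit=2.0, input_logit=0.0, delta_days=365)
```

Here `recent` retains more of the previous memory than `stale`, because the second update follows a one-year gap.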
5.
J Med Internet Res ; 18(12): e323, 2016 12 16.
Article in English | MEDLINE | ID: mdl-27986644

ABSTRACT

BACKGROUND: As more and more researchers are turning to big data for new opportunities of biomedical discoveries, machine learning models, as the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to the inherent complexity of machine learning methods, they are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs. OBJECTIVE: To attain a set of guidelines on the use of machine learning predictive models within clinical settings to make sure the models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence. METHODS: A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians were interviewed, using an iterative process in accordance with the Delphi method. RESULTS: The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models. CONCLUSIONS: A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.


Subjects
Biomedical Research/methods; Data Interpretation, Statistical; Machine Learning; Biomedical Research/standards; Humans; Interdisciplinary Studies; Models, Biological
6.
J Biomed Inform ; 54: 96-105, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25661261

ABSTRACT

Electronic medical record (EMR) offers promises for novel analytics. However, manual feature engineering from EMR is labor intensive because EMR is complex - it contains temporal, mixed-type and multimodal data packed in irregular episodes. We present a computational framework to harness EMR with minimal human supervision via restricted Boltzmann machine (RBM). The framework derives a new representation of medical objects by embedding them in a low-dimensional vector space. This new representation facilitates algebraic and statistical manipulations such as projection onto 2D plane (thereby offering intuitive visualization), object grouping (hence enabling automated phenotyping), and risk stratification. To enhance model interpretability, we introduced two constraints into model parameters: (a) nonnegative coefficients, and (b) structural smoothness. These result in a novel model called eNRBM (EMR-driven nonnegative RBM). We demonstrate the capability of the eNRBM on a cohort of 7578 mental health patients under suicide risk assessment. The derived representation not only shows clinically meaningful feature grouping but also facilitates short-term risk stratification. The F-scores, 0.21 for moderate-risk and 0.36 for high-risk, are significantly higher than those obtained by clinicians and competitive with the results obtained by support vector machines.


Subjects
Electronic Health Records; Medical Informatics/methods; Models, Statistical; Risk Assessment/methods; Female; Humans; Male; Markov Chains; Mental Disorders/epidemiology; Neoplasms/epidemiology; Neoplasms/therapy; Neural Networks, Computer; Suicide/statistics & numerical data; Support Vector Machine
7.
BMC Bioinformatics ; 15: 425, 2014 Dec 30.
Article in English | MEDLINE | ID: mdl-25547173

ABSTRACT

BACKGROUND: Feature engineering is a time-consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features to baselines generated from the Elixhauser comorbidities. RESULTS: Hospital medical records were transformed to event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model was logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, and 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set, using socio-demographic information and medical records, outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities, over 20 settings (5 prediction horizons over 4 diseases). In particular, for 30-day prediction, the AUCs are: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). CONCLUSIONS: The advantages of auto-extracted standard features from complex medical records, in a disease- and task-agnostic manner, were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have potential to form the foundation of complex automated analytic tasks.


Subjects
Diabetes Mellitus/etiology; Mental Disorders/etiology; Pneumonia/etiology; Pulmonary Disease, Chronic Obstructive/etiology; Risk Assessment; Software; Aged; Area Under Curve; Comorbidity; Databases, Factual; Female; Hospitals; Humans; Logistic Models; Male; Models, Theoretical
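
Illustrative sketch for entry 7. The abstract names logistic regression with elastic net regularization but gives neither the software nor the penalty parameterization. The function below evaluates one common form of the objective: mean log-loss plus a mix of L1 and squared L2 penalties. The names `lam` and `alpha`, their default values, and the toy data are assumptions.

```python
import math

def elastic_net_logistic_loss(weights, bias, X, y, lam=0.1, alpha=0.5):
    """Mean log-loss plus lam * (alpha * L1 + (1 - alpha) / 2 * squared L2)."""
    total = 0.0
    for row, label in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, row))
        # Loss is log(1 + exp(-z)) for label 1 and log(1 + exp(z)) for label 0.
        margin = z if label == 1 else -z
        total += math.log1p(math.exp(-margin))
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return total / len(X) + lam * (alpha * l1 + (1 - alpha) / 2 * l2)

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # hypothetical features
y = [1, 0, 1, 0]                                      # hypothetical readmission labels
penalized = elastic_net_logistic_loss([1.0, -2.0], 0.0, X, y)
unpenalized = elastic_net_logistic_loss([1.0, -2.0], 0.0, X, y, lam=0.0)
```

The L1 term pushes coefficients of uninformative features to exactly zero, which suits large auto-extracted feature sets, while the L2 term stabilizes estimates for correlated features.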
8.
BMC Psychiatry ; 14: 76, 2014 Mar 14.
Article in English | MEDLINE | ID: mdl-24628849

ABSTRACT

BACKGROUND: To date, our ability to accurately identify patients at high risk of suicidal behaviour, and thus to target interventions, has been fairly limited. This study examined a large pool of factors that are potentially associated with suicide risk from the comprehensive electronic medical record (EMR) to derive a predictive model for 1-6 month risk. METHODS: 7,399 patients undergoing suicide risk assessment were followed up for 180 days. The dataset was divided into derivation and validation cohorts of 4,911 and 2,488 patients, respectively. Clinicians used an 18-point checklist of known risk factors to divide patients into low, medium, or high risk. Their predictive ability was compared with a risk stratification model derived from the EMR data. The model was based on the continuation-ratio ordinal regression method coupled with lasso (which stands for least absolute shrinkage and selection operator). RESULTS: In the year prior to suicide assessment, 66.8% of patients attended the emergency department (ED) and 41.8% had at least one hospital admission. Administrative and demographic data, along with information on prior self-harm episodes, as well as mental and physical health diagnoses, were predictive of high-risk suicidal behaviour. Clinicians using the 18-point checklist were relatively poor in predicting patients at high risk in 3 months (AUC 0.58, 95% CIs: 0.50 - 0.66). The EMR-derived model was superior (AUC 0.79, 95% CIs: 0.72 - 0.84). At specificity of 0.72 (95% CIs: 0.70-0.73) the EMR model had sensitivity of 0.70 (95% CIs: 0.56-0.83). CONCLUSION: Predictive models applied to data from the EMR could improve risk stratification of patients presenting with potential suicidal behaviour. The predictive factors include known risks for suicide, but also other information relating to general health and health service utilisation.


Subjects
Electronic Health Records/statistics & numerical data; Suicide Prevention; Suicide/statistics & numerical data; Adolescent; Adult; Aged; Australia/epidemiology; Emergency Service, Hospital/statistics & numerical data; Female; Humans; Male; Medical History Taking/statistics & numerical data; Middle Aged; Models, Statistical; Prognosis; Retrospective Studies; Risk Assessment/statistics & numerical data; Risk Factors; Suicidal Ideation; Suicide/psychology; Young Adult
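
Illustrative sketch for entry 8. The abstract reports AUC, sensitivity and specificity for the clinician checklist and the EMR model. The functions below show how these three quantities are computed from risk scores and observed outcomes. The scores, labels and threshold are made-up toy values, not study data.

```python
def auc(scores, labels):
    """Probability that a random positive case outranks a random negative case
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold):
    """Flag a case as high risk when its score is at or above the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]  # hypothetical model risk scores
labels = [1, 1, 1, 0, 0, 0]               # hypothetical observed outcomes
model_auc = auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
```

AUC summarizes ranking quality across all thresholds, whereas sensitivity and specificity describe a single operating point, which is why the abstract reports sensitivity at a stated specificity.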
9.
Aust Health Rev ; 38(4): 377-82, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25001433

ABSTRACT

OBJECTIVE: Readmission rates are high following acute myocardial infarction (AMI), but risk stratification has proved difficult because known risk factors are only weakly predictive. In the present study, we applied hospital data to identify the risk of unplanned admission following AMI hospitalisations. METHODS: The study included 1660 consecutive AMI admissions. Predictive models were derived from 1107 randomly selected records and tested on the remaining 553 records. The electronic medical record (EMR) model was compared with a seven-factor predictive score known as the HOSPITAL score and a model derived from Elixhauser comorbidities. All models were evaluated for the ability to identify patients at high risk of 30-day ischaemic heart disease readmission and those at risk of all-cause readmission within 12 months following the initial AMI hospitalisation. RESULTS: The EMR model has higher discrimination than other models in predicting ischaemic heart disease readmissions (area under the curve (AUC) 0.78; 95% confidence interval (CI) 0.71-0.85 for 30-day readmission). The positive predictive value was significantly higher with the EMR model, which identifies cohorts that were up to threefold more likely to be readmitted. Factors associated with readmission included emergency department attendances, cardiac diagnoses and procedures, renal impairment and electrolyte disturbances. The EMR model also performed better than other models (AUC 0.72; 95% CI 0.66-0.78), and with greater positive predictive value, in identifying 12-month risk of all-cause readmission. CONCLUSIONS: Routine hospital data can help identify patients at high risk of readmission following AMI. This could lead to decreased readmission rates by identifying patients suitable for targeted clinical interventions.


Subjects
Myocardial Infarction; Patient Readmission/statistics & numerical data; Adult; Aged; Aged, 80 and over; Databases, Factual; Electronic Health Records; Female; Humans; Logistic Models; Male; Middle Aged; Retrospective Studies; Tertiary Care Centers; Victoria; Young Adult
10.
Article in English | MEDLINE | ID: mdl-39361461

ABSTRACT

Typically developing infants, between the corrected age of 9-20 weeks, produce fidgety movements. These movements can be identified with the General Movement Assessment, but their identification requires trained professionals to conduct the assessment from video recordings. Since trained professionals are expensive and their demand may be higher than their availability, computer vision-based solutions have been developed to assist practitioners. However, most solutions to date treat the problem as a direct mapping from video to infant status, without modeling fidgety movements throughout the video. To address that, we propose to directly model infants' short movements and classify them as fidgety or non-fidgety. In this way, we model the explanatory factor behind the infant's status and improve model interpretability. The issue with our proposal is that labels for an infant's short movements are not available, which prevents us from training such a model. We overcome this issue with active learning. Active learning is a framework that minimizes the amount of labeled data required to train a model, by only labeling examples that are considered "informative" to the model. The assumption is that a model trained on informative examples reaches a higher performance level than a model trained with randomly selected examples. We validate our framework by modeling the movements of infants' hips on two representative cohorts: typically developing and at-risk infants. Our results show that active learning is suitable for our problem and that it works adequately even when the models are trained with labels provided by a novice annotator.
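
Illustrative sketch for entry 10. The abstract describes labeling only examples considered "informative" but does not state the selection criterion. Uncertainty sampling, assumed here, is one common choice: request labels for the clips whose predicted probability of being fidgety is closest to 0.5. The function name, probabilities and batch size are hypothetical.

```python
def select_most_uncertain(probabilities, already_labeled, batch_size):
    """Return indices of unlabeled examples whose predicted probability
    is closest to 0.5, most uncertain first."""
    candidates = [i for i in range(len(probabilities)) if i not in already_labeled]
    candidates.sort(key=lambda i: abs(probabilities[i] - 0.5))
    return candidates[:batch_size]

# Hypothetical model outputs for five short movement clips.
probs = [0.95, 0.52, 0.10, 0.45, 0.70]
picked = select_most_uncertain(probs, already_labeled={1}, batch_size=2)
```

In a full loop, the annotator would label the `picked` clips, the model would be retrained on the enlarged labeled set, and the selection would be repeated until the labeling budget runs out.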

11.
IEEE/ACM Trans Comput Biol Bioinform ; 20(2): 1020-1029, 2023.
Article in English | MEDLINE | ID: mdl-35820003

ABSTRACT

Many high-performance DTA deep learning models have been proposed, but they are mostly black-box and thus lack human interpretability. Explainable AI (XAI) can make DTA models more trustworthy, and allows to distill biological knowledge from the models. Counterfactual explanation is one popular approach to explaining the behaviour of a deep neural network, which works by systematically answering the question "How would the model output change if the inputs were changed in this way?". We propose a multi-agent reinforcement learning framework, Multi-Agent Counterfactual Drug-target binding Affinity (MACDA), to generate counterfactual explanations for the drug-protein complex. Our proposed framework provides human-interpretable counterfactual instances while optimizing both the input drug and target for counterfactual generation at the same time. We benchmark the proposed MACDA framework using the Davis and PDBBind dataset and find that our framework produces more parsimonious explanations with no loss in explanation validity, as measured by encoding similarity. We then present a case study involving ABL1 and Nilotinib to demonstrate how MACDA can explain the behaviour of a DTA model in the underlying substructure interaction between inputs in its prediction, revealing mechanisms that align with prior domain knowledge.


Subjects
Benchmarking; Neural Networks, Computer; Humans; Drug Development
12.
IEEE J Biomed Health Inform ; 27(10): 5042-5053, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37498761

ABSTRACT

Fidgety movements occur in infants between the age of 9 to 20 weeks post-term, and their absence is a strong indicator that an infant has cerebral palsy. Prechtl's General Movement Assessment method evaluates whether an infant has fidgety movements, but requires a trained expert to conduct it. Timely evaluation facilitates early interventions, and thus computer-based methods have been developed to aid domain experts. However, current solutions rely on complex models or high-dimensional representations of the data, which hinder their interpretability and generalization ability. To address that, we propose [Formula: see text], a method that detects fidgety movements and uses them towards an assessment of the quality of an infant's general movements. [Formula: see text] is true to the domain expert process, more accurate, and highly interpretable due to its fine-grained scoring system. The main idea behind [Formula: see text] is to specify signal properties of fidgety movements that are measurable and quantifiable. In particular, we measure the movement direction variability of joints of interest, for movements of small amplitude in short video segments. [Formula: see text] also comprises a strategy to reduce those measurements to a single score that quantifies the quality of an infant's general movements; the strategy is a direct translation of the qualitative procedure domain experts use to assess infants. This brings [Formula: see text] closer to the process a domain expert applies to decide whether an infant produced enough fidgety movements. We evaluated [Formula: see text] on the largest clinical dataset reported, where it proved to be interpretable and more accurate than many methods published to date.
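
Illustrative sketch for entry 12. The abstract says the method measures the movement direction variability of joints for small-amplitude movements in short video segments, but the exact measure is not given. The function below assumes circular variance of frame-to-frame displacement directions, restricted to displacements under an amplitude cap. The function name, the cap, and the toy tracks are hypothetical.

```python
import math

def direction_variability(points, max_amplitude):
    """Circular variance (0 to 1) of movement directions over small displacements
    of one joint, given its (x, y) position in consecutive frames."""
    cos_sum = sin_sum = 0.0
    count = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        amplitude = math.hypot(dx, dy)
        if 0.0 < amplitude <= max_amplitude:  # keep only small, non-zero moves
            cos_sum += dx / amplitude
            sin_sum += dy / amplitude
            count += 1
    if count == 0:
        return 0.0
    # Mean resultant length is 1 when all directions agree, 0 when they cancel.
    return 1.0 - math.hypot(cos_sum, sin_sum) / count

steady = direction_variability([(0, 0), (1, 0), (2, 0), (3, 0)], max_amplitude=2.0)
jitter = direction_variability([(0, 0), (1, 0), (0, 0), (1, 0), (0, 0)], max_amplitude=2.0)
```

A joint drifting steadily in one direction scores near 0, while small movements that keep changing direction score near 1, which matches the intuition of fidgety motion under these assumptions.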

13.
BMJ Open ; 13(4): e066249, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37116996

ABSTRACT

INTRODUCTION: Meta-analytical evidence confirms a range of interventions, including mindfulness, physical activity and sleep hygiene, can reduce psychological distress in university students. However, it is unclear which intervention is most effective. Artificial intelligence (AI)-driven adaptive trials may be an efficient method to determine what works best and for whom. The primary purpose of the study is to rank the effectiveness of mindfulness, physical activity, sleep hygiene and an active control on reducing distress, using a multiarm contextual bandit-based AI-adaptive trial method. Furthermore, the study will explore which interventions have the largest effect for students with different levels of baseline distress severity. METHODS AND ANALYSIS: The Vibe Up study is a pragmatically oriented, decentralised AI-adaptive group sequential randomised controlled trial comparing the effectiveness of one of three brief, 2-week digital self-guided interventions (mindfulness, physical activity or sleep hygiene) or active control (ecological momentary assessment) in reducing self-reported psychological distress in Australian university students. The adaptive trial methodology involves up to 12 sequential mini-trials that allow for the optimisation of allocation ratios. The primary outcome is change in psychological distress (Depression, Anxiety and Stress Scale, 21-item version, DASS-21 total score) from preintervention to postintervention. Secondary outcomes include change in physical activity, sleep quality and mindfulness from preintervention to postintervention. Planned contrasts will compare the four groups (ie, the three intervention and control) using self-reported psychological distress at prespecified time points for interim analyses. The study aims to determine the best performing intervention, as well as ranking of other interventions. 
ETHICS AND DISSEMINATION: Ethical approval was sought and obtained from the UNSW Sydney Human Research Ethics Committee (HREC A, HC200466). A trial protocol adhering to the requirements of the Guideline for Good Clinical Practice was prepared for and approved by the Sponsor, UNSW Sydney (Protocol number: HC200466_CTP). TRIAL REGISTRATION NUMBER: ACTRN12621001223820.


Subjects
Mindfulness; Psychological Distress; Humans; Universities; Artificial Intelligence; Australia; Mindfulness/methods; Students/psychology; Stress, Psychological/prevention & control; Stress, Psychological/psychology; Randomized Controlled Trials as Topic
14.
Int J Data Sci Anal ; : 1-16, 2022 Nov 18.
Article in English | MEDLINE | ID: mdl-36440369

ABSTRACT

Discovering new medicines is the hallmark of the human endeavor to live a better and longer life. Yet the pace of discovery has slowed down as we need to venture into more wildly unexplored biomedical space to find one that matches today's high standard. Modern AI-enabled by powerful computing, large biomedical databases, and breakthroughs in deep learning offers a new hope to break this loop as AI is rapidly maturing, ready to make a huge impact in the area. In this paper, we review recent advances in AI methodologies that aim to crack this challenge. We organize the vast and rapidly growing literature on AI for drug discovery into three relatively stable sub-areas: (a) representation learning over molecular sequences and geometric graphs; (b) data-driven reasoning where we predict molecular properties and their binding, optimize existing compounds, generate de novo molecules, and plan the synthesis of target molecules; and (c) knowledge-based reasoning where we discuss the construction and reasoning over biomedical knowledge graphs. We will also identify open challenges and chart possible research directions for the years to come.

15.
Article in English | MEDLINE | ID: mdl-34197324

ABSTRACT

Predicting the interaction between a compound and a target is crucial for rapid drug repurposing. Deep learning has been successfully applied to the drug-target affinity (DTA) problem. However, previous deep learning-based methods ignore modeling the direct interactions between drug and protein residues. This would lead to inaccurate learning of target representation, which may change due to the drug binding effects. In addition, previous DTA methods learn protein representation solely based on a small number of protein sequences in DTA datasets while neglecting the use of proteins outside of the DTA datasets. We propose GEFA (Graph Early Fusion Affinity), a novel graph-in-graph neural network with an attention mechanism to address the changes in target representation because of the binding effects. Specifically, a drug is modeled as a graph of atoms, which then serves as a node in a larger graph of the residues-drug complex. The resulting model is an expressive deep nested graph neural network. We also use pre-trained protein representation powered by the recent effort of learning contextualized protein representation. The experiments are conducted under different settings to evaluate scenarios such as novel drugs or targets. The results demonstrate the effectiveness of the pre-trained protein embedding and the advantages of our GEFA in modeling the nested graph for drug-target interaction.


Subjects
Drug Development; Neural Networks, Computer; Amino Acid Sequence; Drug Development/methods; Drug Repositioning; Proteins/chemistry
16.
IEEE J Biomed Health Inform ; 25(10): 3911-3920, 2021 10.
Article in English | MEDLINE | ID: mdl-33956636

ABSTRACT

The absence or abnormality of fidgety movements of joints or limbs is strongly indicative of cerebral palsy in infants. Developing computer-based methods for assessing infant movements in videos is pivotal for improved cerebral palsy screening. Most existing methods use appearance-based features and are thus sensitive to strong but irrelevant signals caused by background clutter or a moving camera. Moreover, these features are computed over the whole frame, thus they measure gross whole body movements rather than specific joint/limb motion. Addressing these challenges, we develop and validate a new method for fidgety movement assessment from consumer-grade videos using human poses extracted from short clips. Human poses capture only relevant motion profiles of joints and limbs and are thus free from irrelevant appearance artifacts. The dynamics and coordination between joints are modeled using spatio-temporal graph convolutional networks. Frames and body parts that contain discriminative information about fidgety movements are selected through a spatio-temporal attention mechanism. We validate the proposed model on the cerebral palsy screening task using a real-life consumer-grade video dataset collected at an Australian hospital through the Cerebral Palsy Alliance, Australia. Our experiments show that the proposed method achieves the ROC-AUC score of 81.87%, significantly outperforming existing competing methods with better interpretability.


Subjects
Cerebral Palsy; Movement; Australia; Cerebral Palsy/diagnosis; Humans; Infant
17.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2841-2847, 2021.
Article in English | MEDLINE | ID: mdl-33909569

ABSTRACT

The classification of clinical samples based on gene expression data is an important part of precision medicine. In this manuscript, we show how transforming gene expression data into a set of personalized (sample-specific) networks can allow us to harness existing graph-based methods to improve classifier performance. Existing approaches to personalized gene networks have the limitation that they depend on other samples in the data and must get re-computed whenever a new sample is introduced. Here, we propose a novel method, called Personalized Annotation-based Networks (PAN), that avoids this limitation by using curated annotation databases to transform gene expression data into a graph. Unlike competing methods, PANs are calculated for each sample independent of the population, making it a more efficient way to obtain single-sample networks. Using three breast cancer datasets as a case study, we show that PAN classifiers not only predict cancer relapse better than gene features alone, but also outperform PPI (protein-protein interactions) and population-level graph-based classifiers. This work demonstrates the practical advantages of graph-based classification for high-dimensional genomic data, while offering a new approach to making sample-specific networks. Supplementary information: PAN and the baselines are implemented in Python. Source code and data are available at https://github.com/thinng/PAN.


Subjects
Breast Neoplasms , Genomics/methods , Molecular Sequence Annotation/methods , Neoplasm Recurrence, Local , Precision Medicine/methods , Algorithms , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Databases, Genetic , Female , Humans , Neoplasm Recurrence, Local/diagnosis , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/metabolism , Neoplasm Recurrence, Local/pathology , Protein Interaction Maps/genetics , Software , Transcriptome/genetics
18.
Front Neurol ; 12: 670379, 2021.
Article in English | MEDLINE | ID: mdl-34646226

ABSTRACT

Aim: To use available electronic administrative records to identify data reliability, predict discharge destination, and identify risk factors associated with specific outcomes following hospital admission with stroke, compared to stroke-specific clinical factors, using machine learning techniques. Method: The study included 2,531 patients with at least one admission with a confirmed diagnosis of stroke, collected from a regional hospital in Australia between 2009 and 2013. Using machine learning techniques (penalized regression with Lasso), patients with an index admission between June 2009 and July 2012 were used to derive predictive models, and patients with an index admission between July 2012 and June 2013 were used for validation. Three stroke types [intracerebral hemorrhage (ICH), ischemic stroke, transient ischemic attack (TIA)] and five comparison outcome settings were considered. Our electronic administrative record-based predictive model was compared with a predictive model composed of "baseline" clinical features more specific to stroke, such as age, gender, smoking habits, co-morbidities (high cholesterol, hypertension, atrial fibrillation, and ischemic heart disease), types of imaging done (CT scan, MRI, etc.), and occurrence of in-hospital pneumonia. Risk factors associated with the likelihood of negative outcomes were identified. Results: The data were highly reliable at predicting discharge to rehabilitation and all other outcomes vs. death for ICH (AUC 0.85 and 0.825, respectively), all discharge outcomes except home vs. rehabilitation for ischemic stroke, and discharge home vs. others and home vs. rehabilitation for TIA (AUC 0.948 and 0.873, respectively). In the machine learning models, electronic health record data appeared to provide improved prediction of outcomes over stroke-specific clinical factors. Common risk factors associated with a negative impact on expected outcomes appeared clinically intuitive, and included older age groups, prior ventilatory support, urinary incontinence, need for imaging, and need for allied health input. Conclusion: Electronic administrative records from this cohort produced reliable outcome prediction and identified clinically appropriate factors negatively impacting most outcome variables following hospital admission with stroke. This presents a means of future identification of modifiable factors associated with patient discharge destination. It may also aid in patient selection for certain interventions and in better patient and clinician education regarding expected discharge outcomes.
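
The abstract above describes a temporal derivation/validation split by index admission date and reports performance as AUC. As a rough illustration only, the sketch below shows how such a split and a rank-based AUC could be computed in plain Python. The record fields (`index_admission`, `outcome`, `score`), the toy values, and the exact cutoff of 1 July 2012 are assumptions made for this example; the abstract does not give the cutoff day, and the Lasso model fitting itself is not shown.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records into derivation (before cutoff) and validation (on/after cutoff)."""
    derivation = [r for r in records if r["index_admission"] < cutoff]
    validation = [r for r in records if r["index_admission"] >= cutoff]
    return derivation, validation


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random
    negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy records (hypothetical): 'score' stands in for a fitted model's predicted risk.
records = [
    {"index_admission": date(2010, 3, 1), "outcome": 1, "score": 0.9},
    {"index_admission": date(2011, 8, 15), "outcome": 0, "score": 0.2},
    {"index_admission": date(2012, 9, 1), "outcome": 1, "score": 0.8},
    {"index_admission": date(2012, 11, 20), "outcome": 0, "score": 0.3},
    {"index_admission": date(2013, 2, 5), "outcome": 1, "score": 0.3},
    {"index_admission": date(2013, 5, 30), "outcome": 0, "score": 0.1},
]

# Assumed cutoff; the source only says "July 2012".
derivation, validation = temporal_split(records, cutoff=date(2012, 7, 1))
auc = roc_auc([r["outcome"] for r in validation], [r["score"] for r in validation])
```

Splitting by date rather than at random means the validation patients are all admitted after the derivation patients, which is closer to how a model would be used prospectively.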

19.
BMC Med Genomics ; 13(Suppl 3): 20, 2020 02 24.
Article in English | MEDLINE | ID: mdl-32093737

ABSTRACT

BACKGROUND: Breast cancer is a collection of multiple tissue pathologies, each with a distinct molecular signature that correlates with patient prognosis and response to therapy. Accurately differentiating between breast cancer sub-types is an important part of clinical decision-making. Although this problem has been addressed using machine learning methods in the past, there remains unexplained heterogeneity within the established sub-types that cannot be resolved by the commonly used classification algorithms. METHODS: In this paper, we propose a novel deep learning architecture, called DeepTRIAGE (Deep learning for the TRactable Individualised Analysis of Gene Expression), which uses an attention mechanism to obtain personalised biomarker scores that describe how important each gene is in predicting the cancer sub-type for each sample. We then perform a principal component analysis of these biomarker scores to visualise the sample heterogeneity, and use a linear model to test whether the major principal axes associate with known clinical phenotypes. RESULTS: Our model not only classifies cancer sub-types with good accuracy, but simultaneously assigns each patient their own set of interpretable and individualised biomarker scores. These personalised scores describe how important each feature is in the classification of any patient, and can be analysed post-hoc to generate new hypotheses about latent heterogeneity. CONCLUSIONS: We apply the DeepTRIAGE framework to classify the gene expression signatures of luminal A and luminal B breast cancer sub-types, and illustrate its use for genes as well as the GO and KEGG gene sets. Using DeepTRIAGE, we calculate personalised biomarker scores that describe the most important features for classifying an individual patient as luminal A or luminal B. In doing so, DeepTRIAGE simultaneously reveals heterogeneity within the luminal A biomarker scores that significantly associate with tumour stage, placing all luminal samples along a continuum of severity.
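
To make the idea of per-sample importance scores more concrete, the following is a minimal sketch of softmax attention over genes, in which each sample receives its own set of weights that sum to one. It is not the DeepTRIAGE architecture, which the abstract does not specify in detail: the gene names, the `gene_weights` vector (which in a real model would be learned), and the simple weight-times-expression logit are all assumptions made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def personalised_scores(expression, gene_weights):
    """Attention weights over genes for ONE sample.

    The logit for each gene is (weight * expression), so the same weights
    yield different importance scores for different samples.
    """
    logits = [w * x for w, x in zip(gene_weights, expression)]
    return softmax(logits)


# Hypothetical gene names, weights, and expression values.
genes = ["GENE_A", "GENE_B", "GENE_C"]
gene_weights = [1.0, 0.5, -1.0]  # assumed; would be learned during training
samples = {
    "patient_1": [2.0, 0.0, 1.0],
    "patient_2": [0.0, 4.0, 0.0],
}

scores = {pid: personalised_scores(x, gene_weights) for pid, x in samples.items()}

# The highest-scoring gene can differ between patients.
top_gene = {
    pid: genes[max(range(len(genes)), key=s.__getitem__)]
    for pid, s in scores.items()
}
```

Because the scores form one vector per patient, they can be stacked into a patients-by-genes matrix and passed to a standard dimensionality-reduction step such as the principal component analysis mentioned in the abstract.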


Subjects
Biomarkers, Tumor/analysis , Breast Neoplasms/classification , Deep Learning , Breast Neoplasms/genetics , Female , Humans , Kinetochores , Models, Biological , RNA, Neoplasm , RNA-Seq , Transcriptome
20.
Transl Psychiatry ; 10(1): 162, 2020 05 24.
Article in English | MEDLINE | ID: mdl-32448868

ABSTRACT

Precision psychiatry is attracting increasing attention as a recognized priority. One of its goals is to develop tools capable of objectively aiding a clinically informed psychiatric diagnosis. Cognitive, inflammatory and immunological factors are altered in both bipolar disorder (BD) and schizophrenia (SZ). However, most of these alterations do not respect diagnostic boundaries from a phenomenological perspective and vary greatly between individuals with the same phenotypic diagnosis; consequently, none so far has proven able to reliably aid the differential diagnosis of BD and SZ. We developed a probabilistic multi-domain data integration model consisting of immune and inflammatory biomarkers in peripheral blood and cognitive biomarkers, using machine learning to predict diagnosis of BD and SZ. A total of 416 participants were included, with 323, 372, and 279 subjects in the blood, cognition and combined biomarker analyses, respectively. Our multi-domain model performance for the BD vs. control (sensitivity 80% and specificity 71%) and SZ vs. control (sensitivity 84% and specificity 81%) pairs was high in general; however, the model had only moderate performance for the differential diagnosis of BD and SZ (sensitivity 71% and specificity 73%). In conclusion, our results show that the diagnosis of BD and of SZ, and the differential diagnosis of BD and SZ, can be predicted with possible clinical utility by a computational machine learning algorithm employing blood and cognitive biomarkers, and that their integration in a multi-domain model outperforms algorithms based on only one domain. Independent studies are needed to validate these findings.
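
The performance figures in this abstract are given as sensitivity and specificity. For readers who want the definitions spelled out, the short sketch below computes both from binary labels and predictions. The function name and the toy labels are made up for this example and are unrelated to the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of true cases that are detected.
    specificity = TN / (TN + FP): share of non-cases correctly ruled out.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical): 5 cases and 5 controls.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a two-diagnosis comparison such as BD vs. SZ, which class is treated as "positive" is a convention, so the two numbers swap roles if the labels are flipped.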


Subjects
Bipolar Disorder , Psychiatry , Schizophrenia , Biomarkers , Bipolar Disorder/diagnosis , Cognition , Humans , Machine Learning , Neuropsychological Tests , Schizophrenia/diagnosis