ABSTRACT
BACKGROUND: Psoriasis is associated with elevated risk of heart attack and increased accumulation of subclinical noncalcified coronary burden by coronary computed tomography angiography (CCTA). Machine learning algorithms have been shown to effectively analyze well-characterized data sets. OBJECTIVE: In this study, we used machine learning algorithms to determine the top predictors of noncalcified coronary burden by CCTA in psoriasis. METHODS: The analysis included 263 consecutive patients with 63 available variables from the Psoriasis Atherosclerosis Cardiometabolic Initiative. The random forest algorithm was used to determine the top predictors of noncalcified coronary burden by CCTA. We evaluated our results using linear regression models. RESULTS: Using the random forest algorithm, we found that the top 10 predictors of noncalcified coronary burden were body mass index, visceral adiposity, total adiposity, apolipoprotein A1, high-density lipoprotein, erythrocyte sedimentation rate, subcutaneous adiposity, small low-density lipoprotein particle, cholesterol efflux capacity and the absolute granulocyte count. Linear regression of noncalcified coronary burden yielded results consistent with our machine learning output. LIMITATION: We were unable to provide external validation and did not study cardiovascular events. CONCLUSION: Machine learning methods identified the top predictors of noncalcified coronary burden in psoriasis. These factors were related to obesity, dyslipidemia, and inflammation, showing that these are important targets when treating comorbidities in psoriasis.
Subjects
Coronary Artery Disease/epidemiology , Machine Learning , Psoriasis/complications , Adult , Comorbidity , Coronary Artery Disease/blood , Coronary Artery Disease/diagnosis , Coronary Artery Disease/immunology , Coronary Vessels/diagnostic imaging , Dyslipidemias/blood , Dyslipidemias/epidemiology , Dyslipidemias/immunology , Female , Humans , Inflammation/blood , Inflammation/epidemiology , Inflammation/immunology , Male , Middle Aged , Obesity/blood , Obesity/epidemiology , Obesity/immunology , Prospective Studies , Psoriasis/blood , Psoriasis/epidemiology , Psoriasis/immunology , Risk Assessment/methods , Risk Factors , Tomography, X-Ray Computed
ABSTRACT
BACKGROUND: Molecular simulations are used to provide insight into protein structure and dynamics, and have the potential to provide important context when predicting the impact of sequence variation on protein function. In addition to understanding molecular mechanisms and interactions on the atomic scale, translational applications of those approaches include drug screening, development of novel molecular therapies, and targeted treatment planning. Supporting the continued development of these applications, we have developed the SNP2SIM workflow that generates reproducible molecular dynamics and molecular docking simulations for downstream functional variant analysis. The Python workflow utilizes molecular dynamics software (NAMD (Phillips et al., J Comput Chem 26(16):1781-802, 2005), VMD (Humphrey et al., J Mol Graph 14(1):33-8, 27-8, 1996)) to generate variant specific scaffolds for simulated small molecule docking (AutoDock Vina (Trott and Olson, J Comput Chem 31(2):455-61, 2010)). RESULTS: SNP2SIM is composed of three independent modules that can be used sequentially to generate the variant scaffolds of missense protein variants from the wildtype protein structure. The workflow first generates the mutant structure and configuration files required to execute molecular dynamics simulations of solvated protein variant structures. The resulting trajectories are clustered based on the structural diversity of residues involved in ligand binding to produce one or more variant scaffolds of the protein structure. Finally, these unique structural conformations are bound to small molecule ligand libraries to predict variant induced changes to drug binding relative to the wildtype protein structure. CONCLUSIONS: SNP2SIM provides a platform to apply molecular simulation based functional analysis of sequence variation in the protein targets of small molecule therapies. 
In addition to simplifying the simulation of variant specific drug interactions, the workflow enables large scale computational mutagenesis by controlling the parameterization of molecular simulations across multiple users or distributed computing infrastructures. This enables the parallelization of the computationally intensive molecular simulations to be aggregated for downstream functional analysis, and facilitates comparing various simulation options, such as the specific residues used to define structural variant clusters. The Python scripts that implement the SNP2SIM workflow are available (SNP2SIM Repository, https://github.com/mccoymd/SNP2SIM, accessed February 2019), and individual SNP2SIM modules are available as apps on the Seven Bridges Cancer Genomics Cloud (Lau et al., Cancer Res 77(21):e3-e6, 2017; Cancer Genomics Cloud, www.cancergenomicscloud.org, accessed November 2018).
Subjects
Molecular Docking Simulation/methods , Mutant Proteins/chemistry , Humans , Ligands , Molecular Dynamics Simulation , Mutation, Missense , Protein Conformation , Software , Workflow
ABSTRACT
Cellular molecules interact with one another in a structured manner, defining a regulatory network topology that describes cellular mechanisms. Genetic mutations alter these networks' pathways, generating complex disorders such as autism spectrum disorder (ASD). Boolean models have assisted in understanding biological system dynamics since Kauffman's 1969 discovery, and various analytical tools for regulatory networks have been developed. This study used the SPIDDOR R package, a Boolean model-based method, to examine the protein-protein interaction network of four ASD patients that was created in our previous publication. The aim is to examine how the patients' genetic variations in INTS6L, USP9X, RSK4, FGF5, FLNA, SUMF1, and IDS affect the convergence of mTOR and Wnt cell signaling. The Boolean network analysis revealed abnormal activation levels of essential proteins such as β-catenin, MTORC1, RPS6, eIF4E, Cadherin, and SMAD. These proteins affect gene expression, translation, cell adhesion, shape, and migration. Patients 1 and 2 showed consistent patterns of increased β-catenin activity and decreased MTORC1, RPS6, and eIF4E activity. However, patient 2 had an independent decrease in Cadherin and SMAD activity due to the FLNA mutation. Patients 3 and 4 showed abnormal activation of the mTOR pathway, which includes the MTORC1, RPS6, and eIF4E genes. The shared mTOR pathway behavior in these patients is explained by a shared mutation in two closely related proteins (SUMF1 and IDS). Diverse activities in β-catenin, MTORC1, RPS6, eIF4E, Cadherin, and SMAD contributed to the reported phenotype in these individuals. Furthermore, the analysis revealed potential therapeutic options that could be suggested for these individuals.
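As a minimal sketch of how a Boolean network model of this kind is evaluated, the Python snippet below synchronously updates a toy three-node network and compares a wild-type run with a loss-of-function run. The node names and update rules are hypothetical illustrations chosen for this example; they are not the rules of the published ASD network, and the code does not use the SPIDDOR package.

```python
from typing import Callable, Dict, Optional

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]

# Hypothetical toy rules (NOT the published network):
# an upstream signal activates a kinase, and the kinase represses a target.
TOY_RULES: Rules = {
    "signal": lambda s: s["signal"],      # input node keeps its value
    "kinase": lambda s: s["signal"],      # kinase follows the signal
    "target": lambda s: not s["kinase"],  # target is repressed by the kinase
}


def step(state: State, rules: Rules, knockout: Optional[str] = None) -> State:
    """Apply every rule to the same old state (synchronous update)."""
    new_state = {node: rule(state) for node, rule in rules.items()}
    if knockout is not None:
        new_state[knockout] = False  # a knocked-out node is always OFF
    return new_state


def simulate(state: State, rules: Rules, steps: int,
             knockout: Optional[str] = None) -> State:
    """Iterate the network for a fixed number of steps; return the final state."""
    if knockout is not None:
        state = {**state, knockout: False}
    for _ in range(steps):
        state = step(state, rules, knockout)
    return state


initial = {"signal": True, "kinase": False, "target": True}
wild_type = simulate(initial, TOY_RULES, steps=5)
mutant = simulate(initial, TOY_RULES, steps=5, knockout="kinase")
print(wild_type["target"], mutant["target"])  # prints: False True
```

In this toy case the knockout removes the repression of the target, so the target stays active in the mutant run. Comparing node activity between perturbed and unperturbed runs is the general idea behind the patient-specific comparisons described above.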
ABSTRACT
Background: Ovarian cancer (OC) is the most lethal gynecological cancer in the United States. Among the different types of OC, serous ovarian cancer (SOC) stands out as the most prevalent. Transcriptomics techniques generate extensive gene expression data, yet only a few of these genes are relevant to clinical diagnosis. Methods: Methods for feature selection (FS) address the challenges of high dimensionality in extensive datasets. This study proposes a computational framework that applies FS techniques to identify genes highly associated with platinum-based chemotherapy response in SOC patients. Using SOC datasets from the Gene Expression Omnibus (GEO) database, the LASSO and varSelRF FS methods were employed. Machine learning classification algorithms such as random forest (RF) and support vector machine (SVM) were also used to evaluate the performance of the models. Results: The proposed framework identified biomarker panels with 9 and 10 genes that are highly correlated with platinum-paclitaxel and platinum-only response in SOC patients, respectively. The predictive models were trained using the identified gene signatures, and an accuracy above 90% was achieved. Conclusions: In this study, we propose that applying multiple feature selection methods not only effectively reduces the number of identified biomarkers, enhancing their biological relevance, but also corroborates the efficacy of drug response prediction models in cancer treatment.
ABSTRACT
Text mining methods are being developed to assimilate the continually expanding volume of biomedical textual materials. Understanding protein-protein interaction (PPI) deficits would assist in explaining the genesis of diseases. In this study, we designed an automated system to extract PPIs from the biomedical literature. The system uses a deep learning sentence classification model built on a pretrained word embedding and a BiLSTM recurrent neural network with additional layers, a conditional random field (CRF) named entity recognition (NER) model, and a shortest-dependency path (SDP) model using the SpaCy library in Python. The automated system ensures that it targets sentences that contain PPIs and not just proteins mentioned in the context of disease discovery or other contexts. Our first model achieved 13% greater precision on the AIMed/BioInfer benchmark corpus than the previous state-of-the-art BiLSTM neural network models. The NER model presented in this study achieved 98% precision on the AIMed/BioInfer corpus, higher than previous models. To facilitate the production of an accurate representation of the PPI network, processes were developed to systematically map the protein interactions in the texts. Overall, evaluating our system on 6027 abstracts pertaining to seven proteins associated with Autism Spectrum Disorder completed the manually curated PPI network for these proteins. For complex diseases, these networks would assist in understanding how PPI deficits contribute to disease development while also emphasizing the influence of interactions on protein function and biological processes.
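The shortest-dependency-path idea can be illustrated with a short, hedged Python sketch: treat the dependency parse of a sentence as an undirected graph and run a breadth-first search between the two protein mentions. The edge list below is written by hand for a hypothetical sentence ("PROT1 directly binds PROT2 in neurons"); in the system described above the edges would come from a dependency parser, and the function name is an assumption made for this example.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def shortest_dependency_path(edges: List[Tuple[str, str]],
                             source: str,
                             target: str) -> Optional[List[str]]:
    """Breadth-first search over an undirected graph of (head, child) edges."""
    graph: Dict[str, List[str]] = {}
    for head, child in edges:
        graph.setdefault(head, []).append(child)
        graph.setdefault(child, []).append(head)
    if source not in graph or target not in graph:
        return None

    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        token = queue.popleft()
        if token == target:
            # Walk the predecessor links back to the source.
            path: List[str] = []
            current: Optional[str] = token
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in graph[token]:
            if neighbor not in previous:
                previous[neighbor] = token
                queue.append(neighbor)
    return None


# Hand-written (head, child) edges for the hypothetical example sentence.
toy_edges = [
    ("binds", "PROT1"),
    ("binds", "directly"),
    ("binds", "PROT2"),
    ("binds", "in"),
    ("in", "neurons"),
]
sdp = shortest_dependency_path(toy_edges, "PROT1", "PROT2")
print(sdp)  # prints: ['PROT1', 'binds', 'PROT2']
```

The path keeps the verb that links the two mentions and drops modifiers such as "directly" and "in neurons", which is why paths of this kind are a compact input for deciding whether a sentence states an interaction.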
ABSTRACT
Introduction: FOLFOX and FOLFIRI chemotherapy are considered standard first-line treatment options for colorectal cancer (CRC). However, the criteria for selecting the appropriate treatment have not been thoroughly analyzed. Methods: A newly developed machine learning model was applied to several gene expression datasets from the public Gene Expression Omnibus (GEO) repository to identify molecular signatures predictive of the efficacy of 5-FU based combination chemotherapy (FOLFOX and FOLFIRI) in patients with CRC. The model was trained using 5-fold cross validation and multiple feature selection methods, including LASSO and VarSelRF. Random forest and support vector machine classifiers were applied to evaluate the performance of the models. Results and Discussion: For the CRC GEO dataset samples from patients who received either FOLFOX or FOLFIRI, more than 90% of the validation and test sets were correctly classified (accuracy), with specificity and sensitivity ranging from 85% to 95%. In the datasets used from the GEO database, 28.6% of patients who did not respond to the therapy they received are predicted to benefit from the alternative treatment. Analysis of the gene signature suggests mechanistic differences between colorectal cancers that respond and those that do not respond to FOLFOX and FOLFIRI. Application of this machine learning approach could lead to improvements in treatment outcomes for patients with CRC and other cancers after additional appropriate clinical validation.
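For readers unfamiliar with the metrics quoted above, the following Python sketch shows how accuracy, sensitivity, and specificity are derived from true and predicted binary labels (1 = responder, 0 = non-responder). The label vectors are made-up illustrative values, not data from the study, and the function name is an assumption for this example.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Made-up labels for illustration only: 4 responders, 6 non-responders.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false negative, 5 true negatives, and 1 false positive, giving an accuracy of 0.8, a sensitivity of 0.75, and a specificity of 5/6.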
ABSTRACT
Calsequestrin Type 2 (CASQ2) is a high-capacity, low-affinity, Ca2+-binding protein expressed in the sarcoplasmic reticulum (SR) of the cardiac myocyte. Mutations in CASQ2 have been linked to the arrhythmia catecholaminergic polymorphic ventricular tachycardia (CPVT2), which occurs with acute emotional stress or exercise and can result in sudden cardiac death (SCD). CASQ2G112+5X is a 16 bp (339-354) deletion mutation in CASQ2 that prevents protein expression due to a premature stop codon. Understanding the subcellular mechanisms of CPVT2 is experimentally challenging because the occurrence of arrhythmia is rare. To gain insight into the characteristics of this rare disease, simulation studies using a local control stochastic computational model of the guinea pig ventricular myocyte investigated how the mutant CASQ2 may be responsible for the development of an arrhythmogenic episode under β-adrenergic stimulation or during the slowing of heart rate once β-adrenergic stimulation ceases. The computational model parameters were adjusted based upon recent experiments to explore the functional changes caused by the CASQ2 mutation. In the simulation studies under rapid pacing (6 Hz), electromechanically concordant cellular alternans appeared under β-adrenergic stimulation in the CPVT mutant but not in the wild type or in the non-β-stimulated mutant. Simulations in which pacing accelerated from slow to rapid and then returned to slow did not display alternans but did generate early afterdepolarizations (EADs) during the second period of slow pacing that followed the rapid pacing.
Subjects
Calsequestrin , Myocytes, Cardiac , Animals , Guinea Pigs , Myocytes, Cardiac/metabolism , Calsequestrin/genetics , Calsequestrin/metabolism , Mutation , Arrhythmias, Cardiac/genetics , Adrenergic Agents/metabolism
ABSTRACT
Calcium sparks are the elementary Ca2+ release events in excitation-contraction coupling that underlie the Ca2+ transient. The frequency-dependent contractile force generated by cardiac myocytes depends upon the characteristics of the Ca2+ transients. A stochastic computational local control model of a guinea pig ventricular cardiomyocyte was developed to gain insight into the mechanisms of the force-frequency relationship (FFR). This required the creation of a new three-state RyR2 model that reproduced the adaptive behavior of RyR2, in which the RyR2 channels transition into a different state when exposed to prolonged elevated subspace [Ca2+]. The model simulations agree with previous experimental and modeling studies on interval-force relations. Unlike previous common pool models, this local control model displayed stable action potential trains at 7 Hz. The duration and the amplitude of the [Ca2+]myo transients increase with pacing rate, consistent with the experiments. The [Ca2+]myo transient reaches its peak value at 4 Hz and decreases afterward, consistent with experimental force-frequency curves. The model predicts, in agreement with previous modeling studies of Jafri and co-workers, that diastolic sarcoplasmic reticulum [Ca2+] ([Ca2+]sr) and RyR2 adaptation increase with increasing stimulation frequency, producing rising, rather than falling, amplitude of the myoplasmic [Ca2+] transients. However, the local control model also suggests that the reduction of the L-type Ca2+ current with increasing pacing frequency, due to Ca2+-dependent inactivation, also plays a role in the negative slope of the FFR. In the simulations, the peak Ca2+ transient in the FFR correlated with the highest number of SR Ca2+ sparks, the larger average amplitude of those sparks, and the longer duration of the Ca2+ sparks.
Subjects
Myocytes, Cardiac , Ryanodine Receptor Calcium Release Channel , Guinea Pigs , Animals , Myocytes, Cardiac/metabolism , Ryanodine Receptor Calcium Release Channel/metabolism , Calcium/metabolism , Sarcoplasmic Reticulum/metabolism , Calcium Signaling/physiology
ABSTRACT
Cardiovascular disease is the leading cause of death worldwide, due in large part to arrhythmia. To understand how calcium dynamics play a role in arrhythmogenesis, normal and dysfunctional Ca2+ signaling at the subcellular, cellular, and tissue levels is examined in cardiac ventricular myocytes at high temporal and spatial resolution using multiscale computational modeling. Ca2+ sparks underlie normal excitation-contraction coupling. However, under pathological conditions, Ca2+ sparks can combine to form Ca2+ waves. These propagating elevations of [Ca2+]i can activate an inward Na+-Ca2+ exchanger current (INCX) that contributes to early after-depolarizations (EADs) and delayed after-depolarizations (DADs). However, how cellular currents lead to full depolarization of the myocardium and how they initiate extrasystoles is still not fully understood. This study explores how many myocytes must be entrained to initiate arrhythmogenic depolarizations in biophysically detailed computational models. The model presented here suggests that only a small number of myocytes must activate in order to trigger an arrhythmogenic propagating action potential. These conditions were examined in 1-D, 2-D, and 3-D, taking heart geometry into account. The depolarization of only a few hundred ventricular myocytes is required to trigger an ectopic depolarization. The number decreases under disease conditions such as heart failure. Furthermore, in geometrically restricted parts of the heart, such as the thin muscle strands found in the trabeculae and papillary muscle, the number of cells needed to trigger a propagating depolarization falls even further, to fewer than ten myocytes.
Subjects
Calcium Signaling , Excitation Contraction Coupling , Animals , Arrhythmias, Cardiac/metabolism , Calcium Signaling/physiology , Myocytes, Cardiac/metabolism , Rats , Sodium-Calcium Exchanger/metabolism
ABSTRACT
Protein phosphorylation is a post-translational modification that enables various cellular activities and plays essential roles in protein interactions. Phosphorylation is an important process for the replication of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). To shed more light on the effects of phosphorylation, we used an ensemble of neural networks to predict potential kinases that might phosphorylate SARS-CoV-2 nonstructural proteins (nsps) and molecular dynamics (MD) simulations to investigate the effects of phosphorylation on nsp structure, which could be a potential inhibitory target to attenuate viral replication. Eight target candidate sites were found as top-ranked phosphorylation sites of SARS-CoV-2. During the MD simulations, root-mean-square deviation (RMSD) analysis was used to measure conformational changes in each nsp. Root-mean-square fluctuation (RMSF) was employed to measure the fluctuation of each residue in the 36 systems considered, allowing us to evaluate the most flexible regions. These analyses show that there are significant structural deviations at the residues nsp1 THR 72, nsp2 THR 73, nsp3 SER 64, nsp4 SER 81, nsp4 SER 455, nsp5 SER 284, nsp6 THR 238, and nsp16 SER 132. The identified list of residues suggests how phosphorylation affects SARS-CoV-2 nsp function and stability. This research also suggests that kinase inhibitors could be a possible component for evaluating drug binding studies, which are crucial in therapeutic discovery research.
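The two trajectory measures named above can be sketched in a few lines of Python. This is an illustrative, assumption-laden example: frames are plain lists of (x, y, z) tuples, the frames are assumed to be already superimposed on the reference (no alignment step is performed), and the coordinates are toy values rather than simulation output. RMSD compares a whole frame with a reference structure, whereas RMSF measures how much each atom fluctuates around its own mean position over the trajectory.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root-mean-square deviation of one frame from a reference structure."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(atom, ref_atom))
        for atom, ref_atom in zip(frame, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rmsf(frames: Sequence[Frame]) -> List[float]:
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        mean_sq = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(mean_sq))
    return result


# Toy trajectory (assumed pre-aligned): atom 0 is fixed, atom 1 moves along x.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
frame_rmsd = rmsd(trajectory[2], trajectory[0])
atom_rmsf = rmsf(trajectory)
print(frame_rmsd, atom_rmsf)
```

In the toy data the fixed atom has an RMSF of zero and the moving atom an RMSF of sqrt(2/3), which mirrors how flexible residues stand out in a per-residue RMSF profile.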
Subjects
COVID-19 , SARS-CoV-2 , Humans , Molecular Dynamics Simulation , Viral Nonstructural Proteins/metabolism , Phosphorylation , Virus Replication
ABSTRACT
Alzheimer's disease, the most common form of dementia, currently has no cure. There are only temporary treatments that reduce symptoms and the progression of the disease. Alzheimer's disease is characterized by the prevalence of plaques of aggregated amyloid β (Aβ) peptide. Recent treatments to prevent plaque formation have provided little relief of disease symptoms. Although there have been numerous molecular simulation studies on the mechanisms of Aβ aggregation, the signaling role has been less studied. In this study, a total of over 38,000 simulated structures, generated from molecular dynamics (MD) simulations exploring different conformations of the Aβ42 mutant and wild-type peptides, were used to examine the relationship between Aβ torsion angles and disease measures. Unique methods characterized the data set and pinpointed residues that were associated with aggregation and others associated with signaling. Machine learning techniques were applied to characterize the molecular simulation data and classify how much each residue influenced the predicted variant of Alzheimer's disease. Orange3 data mining software provided the ability to use these techniques to generate tables and rank the data. The test and score module, coupled with the confusion matrix module, analyzed the data with calculations of specificity and sensitivity. These methods of evaluating frequency and rank allowed us to analyze and predict important residues associated with different phenotypic measures. This research has the potential to help understand which specific residues of Aβ should be targeted for drug development.
Subjects
Alzheimer Disease/genetics , Amyloid beta-Protein Precursor/genetics , Cerebral Amyloid Angiopathy/genetics , Peptide Fragments/genetics , Age of Onset , Aged , Alzheimer Disease/drug therapy , Alzheimer Disease/pathology , Amino Acids/genetics , Amyloid beta-Peptides/genetics , Amyloid beta-Peptides/metabolism , Cerebral Amyloid Angiopathy/drug therapy , Cerebral Amyloid Angiopathy/pathology , Data Mining , Female , Humans , Machine Learning , Male , Middle Aged , Molecular Dynamics Simulation , Peptide Fragments/metabolism
ABSTRACT
Cardiac alternans is characterized by alternating weak and strong beats of the heart. At the cellular level, this may appear as alternating long and short action potentials (APs) that occur in synchrony with alternating large and small calcium transients, respectively. Previous studies have suggested that alternans manifests itself through either a voltage-dependent mechanism based upon action potential restitution or a calcium-dependent mechanism based on refractoriness of calcium release. We use a novel model of cardiac excitation-contraction (EC) coupling in the rat ventricular myocyte that includes 20,000 calcium release units (CRUs), each with 49 ryanodine receptors (RyR2s) and 7 L-type calcium channels that are all stochastically gated. The model suggests that at the cellular level, in the case of alternans produced by rapid pacing, the mechanism requires a synergy of voltage- and calcium-dependent mechanisms. Rapid pacing reduces AP duration and magnitude, which reduces the number of L-type calcium channels activating individual CRUs during each AP and thus increases the population of CRUs that can be recruited stochastically. Elevated myoplasmic and sarcoplasmic reticulum (SR) calcium ([Ca2+]myo and [Ca2+]SR, respectively) increases ryanodine receptor open probability (Po) in the model used in this simulation, and this increases the probability of activating additional CRUs. A CRU that opens in one beat is less likely to open in the subsequent beat due to refractoriness caused by incomplete refilling of the junctional sarcoplasmic reticulum (jSR). Furthermore, the model includes estimates of changes in Na+ fluxes and [Na+]i and thus provides insight into how changes in electrical activity, [Na+]i, and sodium-calcium exchanger activity can modulate alternans. The model thus tracks critical elements that can account for rate-dependent changes in [Na+]i and [Ca2+]myo and how they contribute to the generation of Ca2+ signaling alternans in the heart.
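To make "stochastically gated" concrete, the Python sketch below simulates one cluster of independent two-state (closed/open) channels with a fixed-time-step Monte Carlo scheme. It is a deliberately simplified illustration and not the RyR2 or L-type channel scheme of the model described above: the rate constants and time step are hypothetical, the rates do not depend on calcium or voltage, and only the cluster size of 49 is taken from the abstract.

```python
import random
from typing import List


def simulate_cluster(n_channels: int, k_open: float, k_close: float,
                     dt: float, n_steps: int, seed: int = 0) -> List[int]:
    """Return the number of open channels after each time step.

    Each channel flips closed -> open with probability k_open * dt and
    open -> closed with probability k_close * dt in every step.
    """
    p_open = k_open * dt
    p_close = k_close * dt
    if not (0.0 <= p_open <= 1.0 and 0.0 <= p_close <= 1.0):
        raise ValueError("dt is too large for the given rates")
    rng = random.Random(seed)  # seeded for reproducibility
    is_open = [False] * n_channels
    open_counts = []
    for _ in range(n_steps):
        for i in range(n_channels):
            if is_open[i]:
                if rng.random() < p_close:
                    is_open[i] = False
            elif rng.random() < p_open:
                is_open[i] = True
        open_counts.append(sum(is_open))
    return open_counts


# Hypothetical rates (per ms) and time step (ms), chosen only for illustration.
counts = simulate_cluster(n_channels=49, k_open=0.1, k_close=0.5,
                          dt=0.1, n_steps=1000)
mean_open = sum(counts) / len(counts)
print(f"mean number of open channels: {mean_open:.2f}")
```

For these constant rates the long-run open fraction of each channel is k_open / (k_open + k_close) = 1/6, so the mean count fluctuates around roughly 8 of 49 channels. In a full model the opening rate would depend on local calcium, which is what allows one channel opening to recruit its neighbors.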
ABSTRACT
Calcium (Ca2+) plays a central role in the excitation and contraction of cardiac myocytes. Experiments have indicated that calcium release is stochastic and regulated locally, suggesting the possibility of spatially heterogeneous calcium levels in the cells. This spatial heterogeneity might be important in mediating different signaling pathways. Over more than 50 years of computational cell biology, computational models have advanced to incorporate more ionic currents, going from deterministic models to stochastic models. While periodic increases in cytoplasmic Ca2+ concentration drive cardiac contraction, aberrant Ca2+ release can underlie cardiac arrhythmia. However, the study of the spatial role of calcium ions has been limited due to the computational expense of using a three-dimensional stochastic computational model. In this paper, we introduce a three-dimensional stochastic computational model for rat ventricular myocytes at the whole-cell level that incorporates detailed calcium dynamics, with (1) non-uniform release site placement, (2) non-uniform membrane ionic currents and membrane buffers, (3) stochastic calcium-leak dynamics, and (4) non-junctional or rogue ryanodine receptors. The model simulates spark-induced spark activation and spark-induced Ca2+ wave initiation and propagation that occur under conditions of calcium overload at the closed-cell condition, but not when Ca2+ levels are normal. This is considered important since the presence of Ca2+ waves contributes to the activation of arrhythmogenic currents.
ABSTRACT
Biological processes are incredibly complex, integrating molecular signaling networks involved in multicellular communication and function and thus maintaining homeostasis. Dysfunction of these processes can result in the disruption of homeostasis, leading to the development of several disease processes, including atherosclerosis. We have significantly advanced our understanding of bioprocesses in atherosclerosis, and in doing so, we are beginning to appreciate the complexities, intricacies, and heterogeneity of atherosclerosis. We are also now better equipped to acquire, store, and process the vast amount of biological data needed to shed light on the biological circuitry involved. Such data can be analyzed within machine learning frameworks to better tease out such complex relationships. Indeed, there has been an increasing number of studies applying machine learning methods for patient risk stratification based on comorbidities, multi-modality image processing, and biomarker discovery pertaining to atherosclerotic plaque formation. Here, we focus on current applications of machine learning to provide insight into atherosclerotic plaque formation and better understand atherosclerotic plaque progression in patients with cardiovascular disease.