ABSTRACT
Diabetic nephropathy (DN), the leading cause of end-stage renal disease, has become a massive global health burden. Despite considerable efforts, the underlying mechanisms are not yet comprehensively understood. In this study, a systematic approach was used to identify the microRNA signature in DN and to propose novel drug targets (DTs). Using microarray profiling followed by qPCR confirmation, 13 and 6 differentially expressed (DE) microRNAs were identified in the kidney cortex and medulla, respectively. The microRNA-target interaction networks for each anatomical compartment were constructed and central nodes were identified. Moreover, enrichment analysis was performed to identify key signaling pathways. To develop a strategy for DT prediction, the human proteome was annotated with 65 biochemical characteristics and 23 network topology parameters. Furthermore, all proteins targeted by at least one FDA-approved drug were identified. Next, mGMDH-AFS, a high-performance machine learning algorithm capable of tolerating severe class imbalance, was developed to classify DT and non-DT proteins. The sensitivity, specificity, accuracy, and precision of the proposed method were 90%, 86%, 88%, and 89%, respectively. Moreover, it significantly outperformed the state-of-the-art (P-value ≤ 0.05) and showed very good diagnostic accuracy and high agreement between predicted and observed class labels. The cortex and medulla networks were then analyzed with this validated classifier to identify potential DTs. The high-ranking DT candidates include Egfr, Prkce, Clic5, Kit, and Agtr1a, the last of which is already a well-known target in DN. In conclusion, a combination of experimental and computational approaches was used to provide holistic insight into the disorder and to introduce novel therapeutic targets.
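The abstract does not say which centrality measure was used to find central nodes in the microRNA-target networks. As a minimal sketch, assuming plain degree centrality on an undirected network, the idea could look as follows. The node names and edges are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def top_nodes(edges, k=3):
    """Rank nodes by centrality (ties broken alphabetically) and keep k."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Hypothetical microRNA-target pairs, for illustration only.
EDGES = [
    ("miR-a", "gene1"),
    ("miR-a", "gene2"),
    ("miR-a", "gene3"),
    ("miR-b", "gene2"),
    ("miR-b", "gene4"),
]

if __name__ == "__main__":
    print(top_nodes(EDGES, k=3))
```

Other measures (betweenness, closeness) would rank nodes differently; degree is used here only because it is the simplest to state.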
Subject(s)
Diabetic Nephropathies/drug therapy , Machine Learning , Systems Biology , Algorithms , Animals , Chemistry, Pharmaceutical/methods , Cluster Analysis , Computational Biology/methods , Drug Design , Epigenesis, Genetic , Gene Expression Profiling/methods , Gene Regulatory Networks , Global Health , Humans , Kidney Cortex/drug effects , Kidney Medulla/drug effects , Linear Models , Male , Mice , Mice, Inbred DBA , MicroRNAs/genetics , Microarray Analysis , Oligonucleotide Array Sequence Analysis , Principal Component Analysis , Regression Analysis , Signal Transduction , Support Vector Machine
ABSTRACT
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was a disaster in 2020. Accurate and early diagnosis of COVID-19 is still essential for health policymaking. Reverse transcriptase-polymerase chain reaction (RT-PCR) has served as the operational gold standard for COVID-19 diagnosis. We aimed to design and implement a reliable COVID-19 diagnosis method that estimates the risk of infection from demographics, symptoms and signs, blood markers, and family history of diseases, and that agrees closely with the results obtained by RT-PCR and CT scan. Our study primarily used sample data from a 1-year hospital-based prospective COVID-19 open cohort, the Khorshid COVID Cohort (KCC) study. A sample of 634 patients with COVID-19 and 118 patients with pneumonia with similar characteristics whose RT-PCR and chest CT scan were negative (the control group) (dataset 1) was used to design the system and for internal validation. Two other online datasets, one of symptoms (dataset 2) and one of blood tests (dataset 3), were also analyzed. A combination of one-hot encoding, stability feature selection, over-sampling, and an ensemble classifier was used. Ten-fold stratified cross-validation was performed. In addition to gender and symptom duration, signs and symptoms, blood biomarkers, and comorbidities were selected. Performance indices of the cross-validated confusion matrix for dataset 1 were as follows: sensitivity of 96% [95% confidence interval (CI): 94-98], specificity of 95% [90-99], positive predictive value (PPV) of 99% [98-100], negative predictive value (NPV) of 82% [76-89], diagnostic odds ratio (DOR) of 496 [198-1,245], area under the ROC curve (AUC) of 0.96 [0.94-0.97], Matthews Correlation Coefficient (MCC) of 0.87 [0.85-0.88], accuracy of 96% [94-98], and Cohen's Kappa of 0.86 [0.81-0.91]. The proposed algorithm showed excellent diagnostic accuracy and class-labeling agreement, and fair discriminant power. The AUC on datasets 2 and 3 was 0.97 [0.96-0.98] and 0.92 [0.91-0.94], respectively. The most important features were white blood cell count, shortness of breath, and C-reactive protein for datasets 1, 2, and 3, respectively. The proposed algorithm is thus a promising COVID-19 diagnosis method that could serve as an adjunct to simple blood tests and symptom screening. However, RT-PCR and chest CT scan, used here as the gold standard, are not 100% accurate.
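The performance indices listed in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas. The counts in the demo are hypothetical and are not the study's confusion matrix, and the function name is only illustrative.

```python
import math


def diagnostic_metrics(tp, fn, fp, tn):
    """Standard indices from a 2x2 confusion matrix.

    Assumes every denominator is non-zero (for example, fp * fn > 0
    is needed for the diagnostic odds ratio).
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "dor": (tp * tn) / (fp * fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "kappa": (accuracy - expected) / (1 - expected),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=90, fn=10, fp=5, tn=95).items():
        print(f"{name}: {value:.3f}")
```

In a cross-validated setting such as the one described, the counts would be pooled over the held-out folds before the indices are computed.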
ABSTRACT
Background: Developing a simplified risk assessment model based on non-laboratory risk factors that determines cardiovascular risk as accurately as a laboratory-based one can be valuable, particularly in developing countries where resources are limited. Objective: To develop a simplified non-laboratory cardiovascular disease risk assessment chart based on a previously reported laboratory-based chart, and to perform internal and external validation and recalibration of both risk models in order to assess the performance of the risk scoring tools in another population. Methods: A 10-year non-laboratory-based risk prediction chart was developed for fatal and non-fatal CVD using Cox proportional hazards regression. Data from the Isfahan Cohort Study (ICS), a population-based study among 6504 adults aged ≥ 35 years who were followed up for at least ten years, were used for the non-laboratory-based model derivation. Participants were followed up until the occurrence of CVD events. Tehran Lipid and Glucose Study (TLGS) data were used to evaluate the external validity of both the non-laboratory and laboratory risk assessment models in a population other than the one used in model derivation. Results: The discrimination and calibration analysis of the non-laboratory model showed a Harrell's C of 0.73 (95% CI 0.71-0.74) and a Nam-D'Agostino χ² of 11.01 (p = 0.27), respectively. The non-laboratory model agreed with the laboratory one and classified high-risk and low-risk patients as accurately. Both the non-laboratory and laboratory risk prediction models showed good discrimination in the external validation, with Harrell's C of 0.77 (95% CI 0.75-0.78) and 0.78 (95% CI 0.76-0.79), respectively. Conclusions: Our simplified risk assessment model based on non-laboratory risk factors could determine cardiovascular risk as accurately as the laboratory-based one. This approach can provide a simple risk assessment tool where laboratory testing is unavailable, inconvenient, or costly.
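Harrell's C, the discrimination index reported here, is the fraction of comparable subject pairs in which the subject with the higher predicted risk has the earlier observed event. The following is a minimal sketch of that definition for right-censored data. The function name and the toy data are illustrative assumptions, and the quadratic loop is for clarity, not efficiency.

```python
def harrell_c(times, events, risks):
    """Concordance index for right-censored survival data.

    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed
            # event strictly before subject j's follow-up time.
            if i != j and events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, for illustration only.
    print(harrell_c([5, 3, 8, 2], [1, 1, 0, 1], [0.4, 0.6, 0.1, 0.9]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.73 to 0.78 should be read.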
Subject(s)
Cardiovascular Diseases , Adult , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Cohort Studies , Heart Disease Risk Factors , Humans , Iran , Laboratories , Risk Assessment , Risk Factors
ABSTRACT
Identifying the possible factors of psychiatric symptoms among children can reduce the risk of adverse psychosocial outcomes in adulthood. We designed a classification tool to examine the association between modifiable risk factors and psychiatric symptoms, defined based on the Persian version of the WHO-GSHS questionnaire, in a developing country. Ten thousand three hundred fifty students, aged 6-18 years, from all provinces of Iran participated in this study. We used feature discretization and encoding, stability selection, and the regularized group method of data handling (GMDH) to relate a priori specified factors (e.g., demographics, sleeping time, life satisfaction, and birth weight) to psychiatric symptoms. Self-rated health was the most critical feature. The selected modifiable factors were eating breakfast, screen time, and salty snacks for the depression symptom; physical activity and salty snacks for the worriedness symptom; and (abdominal) obesity, sweetened beverages, and sleep hours for mild-to-moderate emotional symptoms. The area under the ROC curve of the GMDH was 0.75 (95% CI 0.73-0.76) for the analyzed psychiatric symptoms using threefold cross-validation. It significantly outperformed the state-of-the-art (adjusted p < 0.05; McNemar's test). This study emphasized the association of psychiatric risk factors and the importance of modifiable nutrition and lifestyle factors. However, as this is a cross-sectional study, no causality can be inferred.
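The area under the ROC curve reported above equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that rank-based computation, with illustrative names and toy scores, is:

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Toy classifier scores, for illustration only.
    print(roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4]))
```

This pairwise form is equivalent to integrating the ROC curve with the trapezoidal rule, and it makes clear why 0.5 corresponds to chance-level ranking.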
Subject(s)
Mental Disorders/classification , Students/psychology , Adolescent , Child , Cross-Sectional Studies , Exercise/psychology , Feeding Behavior/psychology , Humans , Iran/epidemiology , Life Style , Mental Disorders/epidemiology , Obesity/psychology , ROC Curve , Risk Factors , Surveys and Questionnaires , Violence/psychology
ABSTRACT
Non-invasive determination of leaf nitrogen (N) and water contents is essential for ensuring the healthy growth of plants. However, most of the existing methods to measure them are expensive. In this paper, a low-cost, portable multispectral sensor system is proposed to determine N and water contents in leaves non-invasively. Four plant species (canola, corn, soybean, and wheat) are used as test plants to investigate the utility of the proposed device. The sensor system comprises two multispectral sensors, visible (VIS) and near-infrared (NIR), detecting reflectance at 12 wavelengths (six from each sensor). Two separate experiments, one for N and one for water, were performed in a controlled greenhouse environment. Spectral data were collected from 307 leaves (121 for the N experiment and 186 for the water experiment), and the rational quadratic Gaussian process regression (GPR) algorithm was applied to correlate the reflectance data with actual N and water content. With five-fold cross-validation, the N estimation showed a coefficient of determination (R²) of 63.91% for canola, 80.05% for corn, 82.29% for soybean, and 63.21% for wheat. For water content estimation, canola showed an R² of 18.02%, corn showed an R² of 68.41%, soybean showed an R² of 46.38%, and wheat showed an R² of 64.58%. The results reveal that the proposed low-cost sensor with an appropriate regression model can be used to determine N content. However, further investigation is needed to improve the water estimation results using the proposed device.
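To make the regression step concrete, the sketch below implements a zero-mean Gaussian process with a rational quadratic kernel, k(x, x') = σ² (1 + d² / (2αℓ²))^(-α), together with the coefficient of determination. It is a pure-Python illustration under assumed hyperparameters (length scale, α, variance, and noise are not given in the abstract), and the toy data are not reflectance measurements from the study.

```python
def rq_kernel(a, b, length=1.0, alpha=1.0, variance=1.0):
    """Rational quadratic kernel between two feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return variance * (1.0 + d2 / (2.0 * alpha * length ** 2)) ** (-alpha)


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def gpr_predict(x_train, y_train, x_test, noise=1e-6, **kernel_args):
    """Posterior mean of a zero-mean GP: k_*^T (K + noise * I)^-1 y."""
    n = len(x_train)
    gram = [
        [
            rq_kernel(x_train[i], x_train[j], **kernel_args)
            + (noise if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    weights = solve(gram, y_train)
    return [
        sum(rq_kernel(x, x_train[i], **kernel_args) * weights[i]
            for i in range(n))
        for x in x_test
    ]


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy one-feature data, for illustration only.
    x = [[0.0], [1.0], [2.0], [3.0]]
    y = [1.0, 2.0, 1.5, 3.0]
    print(gpr_predict(x, y, [[1.5]], noise=1e-3))
```

In practice the hyperparameters would be fitted to the data and R² would be computed on held-out folds, as in the five-fold cross-validation the abstract describes.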
Subject(s)
Biosensing Techniques/economics , Biosensing Techniques/instrumentation , Cost-Benefit Analysis , Crops, Agricultural/metabolism , Nitrogen/analysis , Optical Devices/economics , Plant Leaves/metabolism , Water/analysis , Light , Soil/chemistry
ABSTRACT
Despite progress in the understanding of neural codes, studies of cortico-muscular coupling still largely rely on the interferential electromyographic (EMG) signal or its rectification for the assessment of motor neuron pool behavior. This assessment is non-trivial and should be used with caution. Direct analysis of neural codes by decomposing the EMG, also known as neural decoding, is an alternative to EMG amplitude estimation. In this study, we propose a fully deterministic hybrid surface EMG (sEMG) decomposition approach, a.k.a. guided source separation (GSS), that combines the advantages of both template-based and Blind Source Separation (BSS) decomposition approaches to identify motor unit (MU) firing patterns. We use the single-pass density-based clustering algorithm to identify possible cluster representatives in different sEMG channels. These cluster representatives are then used as initial points of a modified gradient Convolution Kernel Compensation (gCKC) algorithm. Afterwards, we use the Kalman filter to reduce the impact of noise and increase the convergence rate of MU filter identification by gCKC. Moreover, we designed an adaptive soft-thresholding method to identify MU firing times from the estimated MU spike trains. We tested the proposed algorithm on a set of synthetic sEMG signals with known MU firing patterns. A grid of 9 × 10 monopolar surface electrodes with 5-mm inter-electrode distances in both directions was simulated. Muscle excitation was set to 10, 30, and 50%. Colored zero-mean Gaussian noise with a signal-to-noise ratio (SNR) of 10, 20, and 30 dB was added to 16-s-long sEMG signals that were sampled at 4,096 Hz. Overall, 45 simulated signals were analyzed. Our decomposition approach was compared with the gCKC algorithm. Overall, with our algorithm, the average number of identified MUs and the average Rate-of-Agreement (RoA) were 16.41 ± 4.18 MUs and 84.00 ± 0.06%, respectively, whereas gCKC identified 12.10 ± 2.32 MUs with an average RoA of 90.78 ± 0.08%. Therefore, the proposed GSS method identified more MUs than gCKC, with comparable performance. Its performance depended on the signal quality but not on the signal complexity at different force levels. The proposed algorithm is a promising new offline tool in clinical neurophysiology.
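The abstract does not define the Rate-of-Agreement. As a minimal sketch, assuming a definition commonly used in the sEMG decomposition literature (matched firings divided by the sum of matched, missed, and spurious firings, with a small matching tolerance), it could be computed as below. The tolerance of 2 samples and the firing times are hypothetical.

```python
def rate_of_agreement(reference, estimated, tolerance=2):
    """RoA = matched / (matched + missed + spurious).

    reference, estimated : firing times in samples
    tolerance            : maximum offset (samples) for a match (assumed)
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = matched = 0
    # Two-pointer sweep: each firing is matched at most once.
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif ref[i] < est[j]:
            i += 1  # reference firing with no estimate: missed
        else:
            j += 1  # estimated firing with no reference: spurious
    missed = len(ref) - matched
    spurious = len(est) - matched
    total = matched + missed + spurious
    if total == 0:
        raise ValueError("no firings to compare")
    return matched / total


if __name__ == "__main__":
    # Hypothetical firing times, for illustration only.
    print(rate_of_agreement([100, 200, 300, 400], [101, 199, 305, 400, 450]))
```

Because both missed and spurious firings lower the score, an algorithm that identifies more MUs can still report a lower average RoA, which matches the trade-off described in the abstract.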
ABSTRACT
Dyslipidemia, a disorder of lipoprotein metabolism resulting in a high lipid profile, is an important modifiable risk factor for coronary heart diseases. It is associated with more than four million deaths worldwide per year. Half of the children with dyslipidemia have hyperlipidemia during adulthood, and its prediction and screening are thus critical. We designed a new dyslipidemia diagnosis system. A sample of 725 subjects (age 14.66 ± 2.61 years; 48% male; dyslipidemia prevalence of 42%) was selected by multistage random cluster sampling in Iran. Single nucleotide polymorphisms (rs1801177, rs708272, rs320, rs328, rs2066718, rs2230808, rs5880, rs5128, rs2893157, rs662799, and Apolipoprotein-E2/E3/E4), anthropometric and lifestyle attributes, and family history of diseases were analyzed. A framework for classifying mixed-type data in imbalanced datasets was proposed. It included internal feature mapping and selection, re-sampling, an optimized group method of data handling using convex and stochastic optimizations, a new cost function for imbalanced data, and an internal validation. Its performance was assessed using hold-out and 4-fold cross-validation. Four other classifiers, namely support vector machines, decision tree, multilayer perceptron neural network, and multiple logistic regression, were also used. The average sensitivity, specificity, precision, and accuracy of the proposed system were 93%, 94%, 94%, and 92%, respectively, in cross-validation. It significantly outperformed the other classifiers and also showed excellent agreement and high correlation with the gold standard. A non-invasive, economical version of the algorithm, suitable for low- and middle-income countries, was also implemented. It is thus a promising new tool for the prediction of dyslipidemia.
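With a 42% prevalence, each validation fold should contain roughly the same class ratio as the full sample. The abstract does not state whether the 4-fold split was stratified, so the sketch below is an assumption showing one simple way to build stratified folds with the standard library only. The function name, seed, and labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=4, seed=0):
    """Yield (train_indices, test_indices) with per-class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            index
            for fold_id in range(k)
            if fold_id != held_out
            for index in folds[fold_id]
        )
        yield train, test


if __name__ == "__main__":
    # Hypothetical labels: 8 positive and 12 negative subjects.
    demo_labels = [1] * 8 + [0] * 12
    for train, test in stratified_kfold(demo_labels, k=4):
        print(len(train), len(test), sum(demo_labels[i] for i in test))
```

Any re-sampling for class imbalance would normally be applied to the training indices only, so that the held-out fold keeps the original prevalence.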
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0189389.].
ABSTRACT
This study was designed to develop a risk assessment chart for the clinical management and prevention of cardiovascular disease (CVD) in the Iranian population, which is vital for developing national prevention programs. The Isfahan Cohort Study (ICS) is a population-based prospective study of 6504 Iranian adults ≥35 years old, followed up for ten years, from 2001 to 2010. Behavioral and cardiometabolic risk factors were examined every five years, while biennial follow-ups for the occurrence of events were performed by phone calls or by verbal autopsy. Among these participants, 5432 (2784 women, 51.3%) were CVD free at the baseline examination and had at least one follow-up. Cox proportional hazards regression was used to predict the risk of ischemic CVD events, including sudden cardiac death due to unstable angina, myocardial infarction, and stroke. Model fit statistics such as the area under the receiver operating characteristic curve (AUROC), the calibration chi-square, and the overall bias were used to assess the model performance. We also tested the Framingham model for comparison. Seven hundred and five CVD events occurred during 49452.8 person-years of follow-up. The event probabilities were calculated and presented color-coded on each gender-specific PARS chart. The AUROC and Harrell's C indices were 0.74 (95% CI, 0.72-0.76) and 0.73, respectively. In the calibration, the Nam-D'Agostino χ² was 10.82 (p = 0.29). The overall bias of the proposed model was 95.60%. The PARS model was also internally validated using cross-validation. An Android app and a Web-based risk assessment tool were also developed to have an impact on public health. In comparison, the refitted and recalibrated Framingham models estimated the CVD incidence with an overall bias of 149.60% and 128.23% for men, and 222.70% and 176.07% for women, respectively. In conclusion, the PARS risk assessment chart is a simple, accurate, and well-calibrated tool for predicting the 10-year risk of CVD occurrence in the Iranian population and can be used in developing national guidelines for CVD management.
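Risk charts derived from Cox regression usually convert the fitted model into an absolute risk with the standard formula risk(t) = 1 - S0(t)^exp(Σ βᵢ (xᵢ - x̄ᵢ)), where S0(t) is the baseline survival at the covariate means. The sketch below applies that formula. The coefficients, means, and baseline survival are hypothetical placeholders, not the PARS model's values.

```python
import math


def cox_absolute_risk(covariates, coefficients, means, baseline_survival):
    """Absolute risk at the horizon where baseline_survival was estimated.

    covariates        : {name: value} for one individual
    coefficients      : {name: log hazard ratio per unit}
    means             : {name: cohort mean used to centre the covariate}
    baseline_survival : S0(t) for a subject at the covariate means
    """
    linear_predictor = sum(
        coefficients[name] * (covariates[name] - means[name])
        for name in coefficients
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical parameters, for illustration only.
COEFFICIENTS = {"age": 0.05, "sbp": 0.01}
MEANS = {"age": 50.0, "sbp": 130.0}
BASELINE_SURVIVAL_10Y = 0.95

if __name__ == "__main__":
    person = {"age": 60.0, "sbp": 150.0}
    risk = cox_absolute_risk(
        person, COEFFICIENTS, MEANS, BASELINE_SURVIVAL_10Y
    )
    print(f"10-year risk: {risk:.3f}")
```

A color-coded chart is then a lookup table of this risk evaluated over a grid of risk-factor categories, and recalibration to another population typically replaces the means and the baseline survival while keeping the coefficients.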