ABSTRACT
The Water Quality Index (WQI) provides a comprehensive assessment of river systems; however, its calculation involves numerous water quality parameters, which makes sample collection and laboratory analysis costly. This study aimed to determine the key water parameters and the most reliable models, considering seasonal variations in the water environment, so as to maximize the precision of WQI prediction with a minimal set of water parameters. Ten statistical or machine learning models were developed to predict the WQI over four seasons using a water quality dataset collected in a coastal city adjacent to the Yellow Sea in China. Based on these models, the key water parameters were identified, and their variations were assessed by the Seasonal-Trend decomposition procedure based on Loess (STL). Results indicated that model performance generally improved as more input variables were added, except for the Self-Organizing Map (SOM). Tree-based ensemble methods such as Extreme Gradient Boosting (XGB) and Random Forest (RF) demonstrated the highest accuracy, particularly in winter. Nutrients (Ammonia Nitrogen (AN) and Total Phosphorus (TP)), Dissolved Oxygen (DO), and turbidity were identified as the key water parameters. Using these parameters, the prediction accuracy for the Medium and Low grades was perfect, while for the Good grade it exceeded 80% in spring and winter and dropped to around 70% in summer and autumn. Nutrient concentrations were higher at inland stations; however, it worsened at coastal stations, especially in summer. The study underscores the importance of reliable WQI prediction models in water quality assessment, especially when data are limited; such models are crucial for managing water resources effectively.
Subject(s)
Environmental Monitoring , Machine Learning , Seasons , Water Quality , Environmental Monitoring/methods , China , Cities , Water Pollutants, Chemical/analysis , Phosphorus/analysis , Nitrogen/analysis , Water Pollution, Chemical/statistics & numerical data , Rivers/chemistry
ABSTRACT
An increasing body of evidence suggests that acylphosphatase-2 (ACYP2) polymorphisms are correlated with an increased susceptibility to a range of malignancies. Nevertheless, its potential functions and molecular mechanisms in hepatocellular carcinoma (HCC), and whether it can act as a therapeutic target, remain uninvestigated. Herein, ACYP2 was found to be expressed at low levels in HCC and was negatively correlated with tumor size, tumor differentiation, microvascular invasion and the prognosis of HCC patients. Functional investigations revealed that overexpression of ACYP2 inhibited the proliferation and metastasis of HCC cells while promoting apoptosis; knockdown of ACYP2 had the exact opposite effect. Additionally, it was observed that ACYP2 was distributed in both the cytoplasm and nucleus of HCC cells. According to the mechanistic studies, the expression of potassium calcium-activated channel subfamily N member 4 (KCNN4) was negatively regulated by cytoplasmic ACYP2, resulting in the inhibition of K+ outflow and subsequent inactivation of the ERK pathway, which impeded the growth and metastasis of HCC. Furthermore, the activity of telomerase reverse transcriptase (TERT) was inhibited by nuclear ACYP2, leading to a reduction in telomere length and the consequent reversal of HCC cell immortalization. Additionally, a novel targeted nanotherapy strategy was developed wherein the pcDNA-ACYP2 vector was encapsulated within polyetherimide nanoparticles (PEI/NPs), which were subsequently coated with HCC cell membranes (namely pcDNA/PEI/NPs@M). These nanocomposites exhibited favorable safety and targeting characteristics; in both subcutaneous graft tumor models and orthotopic mouse models, they inhibited the progression of HCC by impeding TERT activity and the KCNN4/ERK pathway. In conclusion, our research identifies novel molecular mechanisms involving cytoplasmic and nuclear ACYP2 that inhibit the progression of HCC. Moreover, pcDNA/PEI/NPs@M represents a targeted therapeutic strategy for HCC that holds great promise.
Subject(s)
Carcinoma, Hepatocellular , Cell Proliferation , Intermediate-Conductance Calcium-Activated Potassium Channels , Liver Neoplasms , MAP Kinase Signaling System , Telomerase , Humans , Carcinoma, Hepatocellular/drug therapy , Carcinoma, Hepatocellular/pathology , Liver Neoplasms/drug therapy , Liver Neoplasms/pathology , Telomerase/metabolism , Telomerase/genetics , Animals , Cell Line, Tumor , Cell Proliferation/drug effects , Mice , Male , Intermediate-Conductance Calcium-Activated Potassium Channels/metabolism , Intermediate-Conductance Calcium-Activated Potassium Channels/antagonists & inhibitors , Intermediate-Conductance Calcium-Activated Potassium Channels/genetics , MAP Kinase Signaling System/drug effects , Mice, Nude , Apoptosis/drug effects , Female , Disease Progression , Mice, Inbred BALB C , Nanoparticles/chemistry , Middle Aged
ABSTRACT
The water quality index (WQI) is a widely used tool for comprehensive assessment of river environments. However, its calculation involves numerous water quality parameters, making sample collection and laboratory analysis time-consuming and costly. This study aimed to identify key water parameters and the most reliable prediction models that could provide maximum accuracy using minimal indicators. Water quality data from 2020 to 2023, comprising nine biophysical and chemical indicators, were collected from seventeen rivers in Yancheng and Nantong, two coastal cities in Jiangsu Province, China, adjacent to the Yellow Sea. Linear regression and seven machine learning models (Artificial Neural Network (ANN), Self-Organizing Maps (SOM), K-Nearest Neighbor (KNN), Support Vector Machines (SVM), Random Forest (RF), Extreme Gradient Boosting (XGB) and Stochastic Gradient Boosting (SGB)) were developed to predict WQI using different groups of input variables based on correlation analysis. The results indicated that water quality improved from 2020 to 2022 but deteriorated in 2023, with inland stations exhibiting better conditions than coastal ones, particularly in terms of turbidity and nutrients. The water environment was comparatively better in Nantong than in Yancheng, with mean WQI values of approximately 55.3-72.0 and 56.4-67.3, respectively. The classifications "Good" and "Medium" accounted for 80 % of the records, with no instances of "Excellent" and 2 % classified as "Bad". The performance of all prediction models, except for SOM, improved with the addition of input variables, achieving R² values higher than 0.99 in models such as SVM, RF, XGB, and SGB. The most reliable models were RF and XGB with key parameters of total phosphorus (TP), ammonia nitrogen (AN), and dissolved oxygen (DO) (R² = 0.98 and 0.91 for the training and testing phases, respectively) for predicting WQI values, and RF using TP and AN (accuracy higher than 85 %) for WQI grades. The prediction accuracy for "Medium" and "Low" water quality grades was highest at 90 %, followed by the "Good" level at 70 %. The model results could contribute to efficient water quality evaluation by identifying key water parameters and facilitating effective water quality management in river basins.
ABSTRACT
Sunspots play a crucial role in both weather forecasting and the monitoring of solar storms. In this work, we propose a novel combined model for sunspot prediction using improved gated recurrent units (GRU) guided by pinball loss for probabilistic forecasts. Specifically, we optimize the GRU parameters using the slime mould algorithm and employ a seasonal-trend decomposition procedure based on loess to tackle challenges related to sequence prediction, such as self-correlations and non-stationarity. To address prediction uncertainty, we replace the traditional ℓ2-norm loss with pinball loss. This modification extends the conventional GRU-based point forecasting to a probabilistic framework expressed as quantiles. We apply our proposed model to analyze a well-established historical sunspot dataset for both single- and multi-step ahead forecasting. The results demonstrate the effectiveness of our combined model in predicting sunspot values, surpassing the performance of other existing methods.
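As an illustration of the pinball loss mentioned in this abstract, the following is a minimal Python sketch of the standard quantile (pinball) loss. It is not the authors' implementation: the function name `pinball_loss`, its argument names, and the sample values are illustrative assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at quantile level tau in (0, 1).

    y_true and y_pred are equal-length sequences of numbers
    (hypothetical inputs chosen for illustration only).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff >= 0) is weighted by tau,
        # over-prediction (diff < 0) by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# Example: one under-prediction and one over-prediction, each by 1 unit.
loss_median = pinball_loss([1.0, 2.0], [0.0, 3.0], tau=0.5)
loss_upper = pinball_loss([1.0, 2.0], [0.0, 3.0], tau=0.9)
```

With tau = 0.5 the loss reduces to half the mean absolute error; a tau above 0.5 penalizes under-prediction more heavily than over-prediction, which is what lets a model trained on this loss estimate an upper quantile.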
ABSTRACT
The water body's suspended concentration reflects many coastal environmental indicators, which is important for predicting ecological hazards. The modeling of any concentration in water requires solving the settling-diffusion equation (SDE), and the values of several key input parameters therein (settling velocity ws, eddy diffusivity Ds, and erosion rates p(t)) directly determine the prediction performance. The time-consuming large-scale simulations would benefit if the parameter values could be estimated through available observations in the target sea area. The present work proposes a new optimization method for synchronously estimating the three parameters from limited concentration observations. First, an analytical solution to the one-dimensional vertical (1DV) SDE for suspended concentrations in an unsteady scenario is derived. Second, the near bottom suspended sediment concentration (SSC) profiles are measured with high-resolution observation. Third, the key parameters are optimized through the best fit of the measured SSC profiles and those modeled with the unsteady solution. Nonlinear least square fitting (NLSF) is introduced to judge the best fits automatically. The high-resolution concentration measurements in a specially-designed cylindrical tank experiment using the Yellow River Delta sediments test the proposed method. The method performs well in the initial period of turbulence generation when sediment resuspension is significant. It optimizes p(t), ws, and Ds with reasonable values and uniqueness of their combination. The proposed theory is a practical tool for quickly estimating key substance transport parameters from limited observations; it also has the potential to construct local parametric models to benefit the 3D modeling of coastal substance transport. Although the present work takes SSC as an example, it can be extended to any suspended particulate concentration in the water.
Subject(s)
Geologic Sediments , Water , Rivers , Water Movements , Environmental Monitoring/methods
ABSTRACT
Addressing the profound impact of Tapping Panel Dryness (TPD) on yield and quality in the global rubber industry, this study introduces a cutting-edge Otsu threshold segmentation technique, enhanced by Dung Beetle Optimization (DBO-Otsu). This innovative approach optimizes the segmentation threshold combination by accelerating convergence and diversifying search methodologies. Following initial segmentation, TPD severity levels are meticulously assessed using morphological characteristics, enabling precise determination of optimal thresholds for final segmentation. The efficacy of DBO-Otsu is rigorously evaluated against mainstream benchmarks like Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Feature Similarity Index (FSIM), and compared with six contemporary swarm intelligence algorithms. The findings reveal that DBO-Otsu substantially surpasses its counterparts in image segmentation quality and processing speed. Further empirical analysis on a dataset comprising TPD cases from level 1 to 5 underscores the algorithm's practical utility, achieving an impressive 80% accuracy in severity level identification and underscoring its potential for TPD image segmentation and recognition tasks.
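For context, the classical Otsu method that DBO-Otsu builds on can be sketched in a few lines of Python. This is only the textbook single-threshold version with an exhaustive search over grey levels; the Dung Beetle Optimization search over threshold combinations described in the abstract is not reproduced here. The function name `otsu_threshold` and the toy pixel data are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    pixels: iterable of integer grey values in [0, levels - 1].
    Pixels with value <= t form class 0, the rest form class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of grey values in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy bimodal "image": dark pixels at 10, bright pixels at 200.
toy = [10] * 50 + [200] * 50
t = otsu_threshold(toy)
```

Exhaustive search is affordable for a single threshold, but the number of candidate combinations grows rapidly when several thresholds are chosen jointly, which is the usual motivation for pairing Otsu's criterion with a swarm optimizer.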
Subject(s)
Hevea , Rubber , Algorithms , Image Processing, Computer-Assisted/methods
ABSTRACT
The detection of epidermal growth factor receptor (EGFR) mutation L858R in circulating tumor DNA (ctDNA) is beneficial for the clinical diagnosis and personalized therapy of non-small cell lung cancer (NSCLC). Herein, for the first time, the combination of the primer exchange reaction (PER) and clustered regularly interspaced short palindromic repeats (CRISPR) and its associated nuclease (Cas) 14a was used in electrochemical biosensor construction for the detection of ctDNA EGFR L858R. EGFR L858R, as the target, induced isothermal amplification by PER, and then the CRISPR/Cas14a system was activated; subsequently, the substrate ssDNA-MB was cleaved and electrons on the surface of the gold electrode were transferred, resulting in the fluctuation of the electrochemical redox signal on the electrode surface, whereas the electrochemical signal remained stable when EGFR L858R was absent. Therefore, the concentration of EGFR L858R can be quantified by electrochemical signal analysis. In this work, the detection limit was as low as 0.34 fM and the dynamic detection range was from 1 fM to 1 µM. The PER-CRISPR/Cas14a electrochemical biosensor greatly improved the analytical sensitivity. In addition, this platform exhibited excellent specificity, reproducibility and stability, and good recovery. This study provides an efficient and novel strategy for the detection of ctDNA EGFR L858R, which has great potential for application in the diagnosis and treatment of NSCLC.
Subject(s)
Biosensing Techniques , Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/diagnosis , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Reproducibility of Results , Biosensing Techniques/methods , ErbB Receptors/genetics
ABSTRACT
With the increasing number of sequenced species, phylogenetic profiling (PP) has become a powerful method to predict functional genes based on co-evolutionary information. However, its potential in plant genomics has not yet been fully explored. In this context, we combined the power of machine learning and PP to identify salt stress-related genes in a halophytic grass, Spartina alterniflora, using evolutionary information generated from 365 plant species. Our results showed that the genes highly co-evolved with known salt stress-related genes are enriched in biological processes of ion transport, detoxification and metabolic pathways. For ion transport, five identified genes, encoding two sodium and three potassium transporters, were validated as able to take up Na+. In addition, we identified two orthologs of trichome-related AtR3-MYB genes, SaCPC1 and SaCPC2, which may be involved in salinity responses. Genes co-evolved with SaCPCs were enriched in functions related to the circadian rhythm and abiotic stress responses. Overall, this work demonstrates the feasibility of mining salt stress-related genes using evolutionary information, highlighting the potential of PP as a valuable tool for plant functional genomics. Supplementary Information: The online version contains supplementary material available at 10.1007/s42994-023-00125-5.
ABSTRACT
Optimization problems are ubiquitous in engineering and scientific research, with a large number of such problems requiring resolution. Meta-heuristics offer a promising approach to solving optimization problems. The firefly algorithm (FA) is a swarm intelligence meta-heuristic that emulates the flickering patterns and behaviour of fireflies. Although FA has been significantly enhanced to improve its performance, it still exhibits certain deficiencies. To overcome these limitations, this study presents the Q-learning based on the adaptive logarithmic spiral-Lévy flight firefly algorithm (QL-ADIFA). The Q-learning technique empowers the improved firefly algorithm to leverage the firefly's environmental awareness and memory while in flight, allowing further refinement of the enhanced firefly. Numerical experiments demonstrate that QL-ADIFA outperforms existing methods on 15 benchmark optimization functions and 12 engineering problems: cantilever arm design, pressure vessel design, three-bar truss design, and 9 constrained optimization problems from CEC2020.
ABSTRACT
Psychrophilic yeasts are distributed widely on Earth and have developed adaptation strategies to overcome the effect of low temperatures. They can adapt to low temperatures better than bacteriophyta. However, to date, whole-genome sequence analyses have been limited to single strains of psychrophilic yeasts, which cannot reveal their possible mechanisms of adaptation to low temperatures accurately and comprehensively. This study aimed to compare psychrophilic yeasts from different sources at the genomic level and investigate their cold-adaptability mechanisms in a comprehensive manner. Nine genomes of known psychrophilic yeasts and three representative genomes of mesophilic yeasts were collected and annotated. Comparative genomic analysis was performed to compare the differences in their signaling pathways, metabolic regulations, evolution, and psychrophilic genes. The results showed that fatty acid desaturase coding genes are universal and diverse in psychrophilic yeasts, and different numbers of these genes exist (delta 6, delta 9, delta 12, and delta 15) in the genomes of various psychrophilic yeasts. Therefore, they can synthesize polyunsaturated fatty acids (PUFAs) in a variety of ways and may be able to enhance the fluidity of cell membranes at low temperatures by synthesizing C18:3 or C18:4 PUFAs, thereby ensuring their ability to adapt to low-temperature environments. However, mesophilic yeasts have lost most of these genes. The results of this study suggest that psychrophilic yeasts adapt to low temperatures primarily by synthesizing PUFAs and diverse antifreeze proteins. A comparison of more psychrophilic yeasts' genomes will be useful for the study of their psychrophilic mechanisms, given the presence of additional potential psychrophilic-related genes in the genomes of psychrophilic yeasts. This study provides a reference for the study of the psychrophilic mechanisms of psychrophilic yeasts.
Subject(s)
Cold Temperature , Yeasts , Yeasts/genetics , Fatty Acids, Unsaturated , Adaptation, Physiological/genetics , Acclimatization/genetics
ABSTRACT
China implemented a strict lockdown policy to prevent the spread of COVID-19 in the worst-affected regions, including Wuhan and Shanghai. This study aims to investigate the impact of these lockdowns on the air quality index (AQI) using a deep learning framework. In addition to historical pollutant concentrations and meteorological factors, we incorporate social and spatio-temporal influences in the framework. In particular, spatial autocorrelation (SAC), which combines temporal autocorrelation with spatial correlation, is adopted to reflect the influence of neighbouring cities and historical data. Our deep learning analysis estimated the lockdown effects as -25.88 in Wuhan and -20.47 in Shanghai. The corresponding prediction errors are reduced by about 47% for Wuhan and by 67% for Shanghai, which enables much more reliable AQI forecasts for both cities.
Subject(s)
Air Pollutants , Air Pollution , COVID-19 , Deep Learning , Humans , Air Pollutants/analysis , COVID-19/epidemiology , COVID-19/prevention & control , Particulate Matter/analysis , Pandemics/prevention & control , China/epidemiology , Communicable Disease Control , Air Pollution/analysis , Cities , Spatial Analysis , Environmental Monitoring
ABSTRACT
Centrality has long been used in transportation networks, especially shipping networks, to estimate the status and importance of a node. However, most studies either treat the shipping network as an unweighted network or consider only the tie weights in weighted networks, ignoring the fact that both the number of ties and the tie weights contribute to centrality in weighted shipping networks. Therefore, we proposed a new method that combines both the number of ties and the tie weights to assess node centrality based on effective distance, integrating the studies of Opsahl et al. (2010) and Du et al. (2015). An empirical analysis of the shipping network at the country level for the 21st-century Maritime Silk Road (MSR) was performed. The correlation analysis between each country's degree centrality and the Liner Shipping Connectivity Index (LSCI) published by the United Nations Conference on Trade and Development (UNCTAD) demonstrated the superiority of our method over traditional centrality metrics. In weighted networks, both the number of ties and the tie weights should be considered by adjusting the parameters. The method proposed in this study can also be used to estimate the status and importance of nodes in various networks in other fields.
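To make the idea of combining the number of ties with tie weights concrete, the sketch below implements the generalized degree centrality of Opsahl et al. (2010), k^(1-alpha) * s^alpha, where k is the number of ties, s is the sum of tie weights, and alpha is a tuning parameter. This is only the Opsahl building block; the effective-distance component drawn from Du et al. (2015) and the authors' combined measure are not reproduced. The function name and the example weights are illustrative assumptions.

```python
def opsahl_degree_centrality(tie_weights, alpha):
    """Generalized degree centrality of one node (Opsahl et al., 2010).

    tie_weights: positive weights of the ties attached to the node.
    alpha: tuning parameter; alpha = 0 counts ties only,
           alpha = 1 returns node strength (sum of weights) only.
    """
    k = len(tie_weights)          # number of ties (degree)
    if k == 0:
        return 0.0                # isolated node
    s = float(sum(tie_weights))   # node strength
    return (k ** (1.0 - alpha)) * (s ** alpha)


# Hypothetical node with three ties of weights 2, 3 and 5.
weights = [2.0, 3.0, 5.0]
by_ties = opsahl_degree_centrality(weights, alpha=0.0)      # degree only
by_strength = opsahl_degree_centrality(weights, alpha=1.0)  # strength only
balanced = opsahl_degree_centrality(weights, alpha=0.5)     # both contribute
```

Values of alpha between 0 and 1 reward having many ties for a given total strength, while values above 1 reward concentrating the same strength on fewer ties.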
ABSTRACT
In-situ monitoring of water quality (suspended sediment concentration, SSC) and concurrent hydrodynamics was conducted in the subaqueous Yellow River Delta in China. Empirical mode decomposition and spectral analysis of the SSC time series reveal the different periodicities of each physical mechanism that contributes to the SSC variations. Based on this physical understanding, the decomposed SSC time series were trained separately with a newly proposed augmented lncosh ridge regression, in which (1) a lncosh function was incorporated in traditional ridge regression for handling outliers in the original data, and (2) the temporal auto-correlation in the decomposed SSC series was used for augmented regression. Finally, the predictions of the sub-series were summed to give the final prediction. The advantage of this decomposition-ensemble framework is that it depends on SSC only, unlike conventional process-based models, which need the concurrent hydrodynamics for estimating bed shear stress. This not only reduces the measurement uncertainties of the input when training the data-driven model, but also lowers the prediction cost, as no parameters other than SSC need to be measured and input for running the model. The framework achieved 6-hour-ahead high-accuracy forecasting with mean relative errors of 5.80-9.44% in the present case study. The proposed framework can be extended to forecast any signal that is a superposition of components with various timescales (periodicities), which is common in nature.
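The role of the lncosh function in handling outliers can be illustrated with a short Python sketch. The log-cosh loss behaves like a squared error for small residuals and like an absolute error for large ones. The objective shown is one plausible way to combine it with a ridge penalty and is an assumption, not the authors' exact formulation (which may include additional scale parameters and the auto-correlation augmentation). All function and variable names are hypothetical.

```python
import math


def log_cosh(r):
    """Numerically stable log(cosh(r)).

    Uses log(cosh(r)) = |r| + log(1 + exp(-2|r|)) - log(2),
    which avoids overflow of cosh for large |r|.
    """
    a = abs(r)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def lncosh_ridge_objective(X, y, w, lam):
    """Sum of log-cosh residual losses plus an L2 (ridge) penalty.

    X: list of feature rows, y: list of targets,
    w: list of coefficients, lam: ridge penalty weight (assumed form).
    """
    loss = 0.0
    for row, target in zip(X, y):
        pred = sum(xj * wj for xj, wj in zip(row, w))
        loss += log_cosh(target - pred)
    return loss + lam * sum(wj * wj for wj in w)


# Small residuals are penalized roughly like r**2 / 2,
# large residuals roughly like |r| - log(2).
small = log_cosh(0.01)
large = log_cosh(50.0)
```

Because the penalty grows only linearly for large residuals, a single outlier pulls the fitted coefficients far less than it would under the squared-error loss of ordinary ridge regression.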
Subject(s)
Rivers , Water Quality , Environmental Monitoring , Forecasting , Geologic Sediments/analysis , Physics
ABSTRACT
Selecting the minimal best subset out of a huge number of factors influencing the response is a fundamental and very challenging NP-hard problem: the presence of many redundant genes easily results in over-fitting, missing an important gene can have an even more detrimental impact on predictions, and exhaustive search is computationally prohibitive. We propose a modified memetic algorithm (MA) based on an improved splicing method to overcome the problems of the traditional genetic algorithm in exploitation capability and in dimension reduction of the predictor variables. The new algorithm accelerates the search for the minimal best subset of genes by incorporating the splicing method into a new local search operator and thereby improving the splicing method. The improvement is also due to two further novel aspects: (a) updating subsets of genes iteratively by splicing until there is no further reduction in the loss function, which increases the probability of selecting the true subsets of genes; and (b) introducing add and del operators based on backward sacrifice into the splicing method to limit the size of gene subsets. Moreover, the mutation operator is replaced by the improved splicing method to enhance exploitation capability, and the initial individuals are improved by it to enhance search efficiency. A dataset of the body weight of Hu sheep was used to evaluate the superiority of the modified MA against the genetic algorithm. According to our experimental results, our proposed optimizer can obtain a better minimal subset of genes within a few iterations, compared with all considered algorithms including the most advanced adaptive best-subset selection algorithm.
ABSTRACT
KEY MESSAGE: We identified 1.844 million barley pan-genome sequence anchors from 12,306 genotypes using genetic mapping and machine learning. There is increasing evidence that the genes from a given crop genotype are far from covering all genes in that species; thus, building more comprehensive pan-genomes is of great importance in genetic research and breeding. Obtaining a thousand-genotype scale pan-genome using deep-sequencing data is currently impractical for species such as barley, which has a huge and highly repetitive genome. To this end, we attempted to identify barley pan-genome sequence anchors from a large quantity of genotyping-by-sequencing (GBS) datasets by combining genetic mapping and machine learning algorithms. Based on the GBS sequences from 11,166 domesticated and 1,140 wild barley genotypes, we identified 1.844 million pan-genome sequence anchors. Of them, 532,253 were identified as presence/absence variation (PAV) tags. By aligning these PAV tags to the genome of the hulless barley genotype Zangqing320, we validated 83.6% of those from the domesticated genotypes and 88.6% of those from the wild barley genotypes. Association analyses against flowering time, plant height and kernel size showed that the relative importance of the PAV and non-PAV tags varied for different traits. The pan-genome sequence anchors based on GBS tags can facilitate the construction of a comprehensive pan-genome and greatly assist various genetic studies including identification of structural variation, genetic mapping and breeding in barley.
Subject(s)
Chromosome Mapping , Genome, Plant , Hordeum/genetics , Machine Learning , Algorithms , Genotype , Linkage Disequilibrium
ABSTRACT
Air pollution in China is becoming more serious, especially with respect to particulate matter (PM), because of rapid economic growth and the fast expansion of urbanization. To address these growing environmental problems, this paper uses daily PM2.5 and PM10 concentration data from January 1, 2015, to August 23, 2016, in Kunming and Yuxi (two important cities in Yunnan Province, China) to present a new hybrid model, CI-FPA-SVM, for forecasting PM2.5 and PM10 concentrations in air. The proposed model involves two parts. First, to address the difficulty of assessing the possible correlation between different variables, cointegration theory is introduced to obtain the input-output relationship; the nonlinear dynamical system is then obtained with a support vector machine (SVM), in which the parameters c and g are optimized by the flower pollination algorithm (FPA). Six benchmark models, including FPA-SVM, CI-SVM, CI-GA-SVM, CI-PSO-SVM, CI-FPA-NN, and a multiple linear regression model, are considered to verify the superiority of the proposed hybrid model. The empirical results demonstrate that the proposed CI-FPA-SVM model is remarkably superior to all considered benchmark models in prediction accuracy, and that applying the model to forecasting can support effective monitoring and management of future air quality.