Search | BVS Violência e Saúde

1.

High fidelity predictions of diffusion in the brain microenvironment.

Schimek, Nels; Wood, Thomas R; Beck, David A C; McKenna, Michael; Toghani, Ali; Nance, Elizabeth.

Biophys J ; 2024 Oct 09.

Article in English | MEDLINE | ID: mdl-39390745

ABSTRACT

Multiple particle tracking (MPT) is a microscopy technique capable of simultaneously tracking hundreds to thousands of nanoparticles in a biological sample and has been used extensively to characterize biological microenvironments, including the brain extracellular space (ECS). Machine learning techniques have been applied to MPT datasets to predict the diffusion mode of nanoparticle trajectories as well as more complex biological variables, such as biological age. In this study, we develop a machine learning pipeline to predict and investigate changes to the brain ECS due to injury using supervised classification and feature importance calculations. We first validate the pipeline on three related but distinct MPT datasets from the living brain ECS: age differences, region differences, and enzymatic degradation of ECS structure. We predict three ages with 86% accuracy, three regions with 90% accuracy, and healthy versus enzyme-treated tissue with 69% accuracy. Since injury across groups is normally compared with traditional statistical approaches, we first used linear mixed effects models to compare features between healthy control conditions and injury induced by two different oxygen glucose deprivation (OGD) exposure times. We then used machine learning to predict injury state using MPT features. We show that the pipeline predicts between the healthy control, 0.5-hour OGD treatment, and 1.5-hour OGD treatment with 59% accuracy in the cortex and 66% in the striatum and identifies nonlinear relationships between trajectory features that were not evident from traditional linear models. Our work demonstrates that machine learning applied to MPT data is effective across multiple experimental conditions and can find unique biologically relevant features of nanoparticle diffusion.

2.

Crop type discrimination using Geo-Stat Endmember Extraction and machine learning algorithms.

Singh, Prachi; Srivastava, Prashant K; Shah, Dharambhai; Pandey, Manish K; Anand, Akash; Prasad, Rajendra; Dave, Rucha; Verrelst, Jochem; Bhattacharya, Bimal K; Raghubanshi, A S.

Adv Space Res ; 73(2): 1331-1348, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38250579

ABSTRACT

The identification of crop diversity is crucial today to ensure the adaptation of crops to a changing climate for better productivity as well as food security. Towards this, Hyperspectral Remote Sensing (HRS) is an efficient technique based on imaging spectroscopy that offers the opportunity to discriminate crop types based on morphological as well as physiological features due to the availability of contiguous spectral bands. The current work utilized the benefits of Airborne Visible/Infrared Imaging Spectrometer-Next Generation (AVIRIS-NG) data and explored techniques for the classification and identification of crop types. The endmembers were identified using the Geo-Stat Endmember Extraction (GSEE) algorithm for pure pixel identification and to generate the spectral library of the different crop types. Spectral feature comparison was done among AVIRIS-NG, Analytical Spectral Device (ASD)-Spectroradiometer and Continuum Removed (CR) spectra. The best-fit spectra obtained with the Reference ASD-Spectroradiometer and Pure Pixel spectral library were then used for crop discrimination using ten supervised classifiers, namely Spectral Angle Mapper (SAM), Spectral Information Divergence (SID), Support Vector Machine (SVM), Minimum Distance Classifier (MDC), Binary Encoding, deep learning-based Convolution Neural Network (CNN) and different algorithms of Ensemble learning such as Tree Bag, AdaBoost (Adaptive Boosting), Discriminant and RUSBoost (Random Under Sampling). In total, nine crop types were identified, namely, wheat, maize, tobacco, sorghum, linseed, castor, pigeon pea, fennel and chickpea. The performance evaluation of the classifiers was made using various metrics like Overall Accuracy, Kappa Coefficient, Precision, Recall and F1 score. The classifier 2D-CNN was found to be the best with Overall Accuracy, Kappa Coefficient, Precision, Recall and F1 score values of 89.065%, 0.871, 87.565%, 89.541% and 88.678% respectively. The output of this work can be utilized for large scale mapping of crop types at the species level in a short interval of time of a large area with high accuracy.
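As a rough illustration of how the evaluation metrics named in this abstract are computed, the sketch below derives overall accuracy, Cohen's kappa, and per-class precision, recall and F1 from a confusion matrix. The 2x2 matrix and all function names are invented for the example and are not taken from the study; rows are assumed to be reference classes and columns predicted classes.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Agreement beyond what the row and column totals would give by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (p_observed - p_expected) / (1 - p_expected)


def precision_recall_f1(cm, k):
    """Per-class scores for class index k (assumes class k occurs and is predicted)."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column total
    actual_k = sum(cm[k])                    # row total
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical two-class matrix: rows = reference, columns = predicted.
toy_cm = [[8, 2],
          [1, 9]]
print(overall_accuracy(toy_cm), cohen_kappa(toy_cm), precision_recall_f1(toy_cm, 0))
```

For the toy matrix, overall accuracy is 0.85 and kappa is about 0.70, which shows how kappa discounts the agreement expected by chance.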

3.

Spatially Explicit Active Learning for Crop-Type Mapping from Satellite Image Time Series.

Kaijage, Beatrice; Belgiu, Mariana; Bijker, Wietske.

Sensors (Basel) ; 24(7), 2024 Mar 26.

Article in English | MEDLINE | ID: mdl-38610320

ABSTRACT

The availability of a sufficient number of annotated samples is one of the main challenges of the supervised methods used to classify crop types from remote sensing images. Creating these samples is time-consuming and costly. Active Learning (AL) offers a solution by streamlining sample annotation, resulting in more efficient training with less effort. Unfortunately, most of the developed AL methods overlook spatial information inherent in remote sensing images. We propose a novel spatially explicit AL that uses the semi-variogram to identify and discard redundant, spatially adjacent samples. It was evaluated using Random Forest (RF) and Sentinel-2 Satellite Image Time Series in two study areas from the Netherlands and Belgium. In the Netherlands, the spatially explicit AL selected 97 samples achieving an overall accuracy of 80%, compared to traditional AL selecting 169 samples with 82% overall accuracy. In Belgium, spatially explicit AL selected 223 samples and obtained 60% overall accuracy, while traditional AL selected 327 samples and obtained an overall accuracy of 63%. We concluded that the developed AL method helped RF achieve a good performance mostly for the classes consisting of individual crops with a relatively distinctive growth pattern such as sugar beets or cereals. Aggregated classes such as 'fruits and nuts' posed, however, a challenge.
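The semi-variogram mentioned above summarizes how dissimilar values become as the distance between samples grows. The abstract does not say how the authors use it to discard samples, so the sketch below only shows the standard empirical estimator, gamma(h) = sum of squared differences / (2 * number of pairs) per distance bin. The function name, the binning scheme and the toy transect are assumptions made for illustration.

```python
import math


def empirical_semivariogram(points, values, lag_width, n_lags):
    """Return gamma(h) for each lag bin [k*lag_width, (k+1)*lag_width).

    gamma is half the mean squared difference between the values of all
    point pairs whose separation falls in the bin; empty bins give None.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            lag_bin = int(distance // lag_width)
            if lag_bin < n_lags:
                sums[lag_bin] += (values[i] - values[j]) ** 2
                counts[lag_bin] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical transect: four samples one unit apart with a linear trend.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
toy_values = [0.0, 1.0, 2.0, 3.0]
print(empirical_semivariogram(toy_points, toy_values, lag_width=1.0, n_lags=3))
```

Low semivariance at short lags means that nearby samples carry similar information, which is the intuition behind treating spatially adjacent samples as redundant.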

4.

Targeting Transcutaneous Spinal Cord Stimulation Using a Supervised Machine Learning Approach Based on Mechanomyography.

Spieker, Eira Lotta; Dvorani, Ardit; Salchow-Hömmen, Christina; Otto, Carolin; Ruprecht, Klemens; Wenger, Nikolaus; Schauer, Thomas.

Sensors (Basel) ; 24(2), 2024 Jan 19.

Article in English | MEDLINE | ID: mdl-38276326

ABSTRACT

Transcutaneous spinal cord stimulation (tSCS) provides a promising therapy option for individuals with injured spinal cords and multiple sclerosis patients with spasticity and gait deficits. Before the therapy, the examiner determines a suitable electrode position and stimulation current for a controlled application. For that, amplitude characteristics of posterior root muscle (PRM) responses in the electromyography (EMG) of the legs to double pulses are examined. This laborious procedure holds potential for simplification due to time-consuming skin preparation, sensor placement, and required expert knowledge. Here, we investigate mechanomyography (MMG) that employs accelerometers instead of EMGs to assess muscle activity. A supervised machine-learning classification approach was implemented to classify the acceleration data into no activity and muscular/reflex responses, considering the EMG responses as ground truth. The acceleration-based calibration procedure achieved a mean accuracy of up to 87% relative to the classical EMG approach as ground truth on a combined cohort of 11 healthy subjects and 11 patients. Based on this classification, the identified current amplitude for the tSCS therapy was comparable to the EMG-based ground truth in 85% of cases. In healthy subjects, where both therapy current and position have been identified, 91% of the outcome matched well with the EMG approach. We conclude that MMG has the potential to make the tuning of tSCS feasible in clinical practice and even in home use.

Subjects

Spinal Cord Injuries, Spinal Cord Stimulation, Humans, Spinal Cord Stimulation/methods, Spinal Cord/physiology, Electromyography, Skeletal Muscle/physiology, Supervised Machine Learning

5.

Multi-decadal coastal change detection using remote sensing: the Mediterranean coast of Egypt between El-Dabaa and Ras El-Hekma.

El-Masry, Esraa A; Magdy, Asmaa; El-Gamal, Ayman; Mahmoud, Baher; El-Sayed, Mahmoud Kh.

Environ Monit Assess ; 196(2): 182, 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38252360

ABSTRACT

A key source of information for many decision support systems is identifying land use and land cover (LULC) based on remote sensing data. Land conservation, sustainable development, and water resource management all benefit from the knowledge obtained from detecting changes in land use and land cover. The present study aims to investigate the multi-decadal coastal change detection for Ras El-Hekma and El-Dabaa area along the Mediterranean coast of Egypt, a multi-sectoral development area. Besides, the superiority of the area is highly dependent on its proximity to three development projects: the tourism and urban growth pole at Ras El-Hekma, the beachfront Alamain New Mega City, and the Nuclear Power Plant at El-Dabaa. This study utilized multi-spectral Landsat satellite images covering 1990, 2010, and 2020 to perform a post-classification change detection analysis of the land use and land cover changes (LULCC) over 30 years. The results of the supervised classification from 1990 to 2020 showed a 47.33 km² (4.13%) expansion of the agricultural land area, whereas the bare soil land area shrank to 73.13 km² (6.24%). On the other hand, the built-up activities in the area launched in 2010 and escalated to 20.51 km² (1.77%) in 2020. The change in land use reveals the shift in the economic growth pattern in the last decade toward tourism and urban development. Meanwhile, it indicates that no conflict has yet arisen regarding the land use between the expanded socioeconomic main sectors (i.e., agriculture and tourism). Therefore, the best practices of land use management and active participation of the stakeholders and the local community should be enhanced to achieve sustainability and avoid future conflicts. An area-specific plan including resource conservation measures and the provision of livelihood alternatives should be formulated within the National Integrated Coastal Zone Management (ICZM) plan with the participation of the main stakeholders and beneficiaries. The findings of the present work may be considered useful for sustainable management and supportive to the decision-making process for the sustainable development of this area.

Subjects

Environmental Monitoring, Remote Sensing Technology, Egypt, Agriculture, Cell Cycle
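Post-classification change detection, as named in the abstract of this entry, compares two independently classified maps pixel by pixel. The sketch below is a minimal version of that idea, assuming both maps are flat lists of class labels over the same pixels; the class names, function names and toy data are hypothetical and do not come from the study.

```python
from collections import Counter


def change_matrix(labels_t1, labels_t2):
    """Count pixels for every (class at time 1, class at time 2) transition."""
    if len(labels_t1) != len(labels_t2):
        raise ValueError("both maps must cover the same pixels")
    return Counter(zip(labels_t1, labels_t2))


def net_area_change(labels_t1, labels_t2, pixel_area_km2):
    """Net area change per class in km^2 (positive means expansion)."""
    before, after = Counter(labels_t1), Counter(labels_t2)
    classes = set(before) | set(after)
    return {c: (after[c] - before[c]) * pixel_area_km2 for c in classes}


# Hypothetical four-pixel maps for two dates.
map_1990 = ["bare", "bare", "bare", "agriculture"]
map_2020 = ["agriculture", "built-up", "bare", "agriculture"]

print(change_matrix(map_1990, map_2020))
# 0.0009 km^2 corresponds to one 30 m x 30 m pixel.
print(net_area_change(map_1990, map_2020, pixel_area_km2=0.0009))
```

The transition counts show where each class came from, while the net change per class is the kind of figure reported as expansion or shrinkage in km².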

6.

Classification of longitudinal profiles using semi-parametric nonlinear mixed models with P-Splines and the SAEM algorithm.

Márquez, Maritza; Meza, Cristian; Lee, Dae-Jin; De la Cruz, Rolando.

Stat Med ; 42(27): 4952-4971, 2023 Nov 30.

Article in English | MEDLINE | ID: mdl-37668286

ABSTRACT

In this work, we propose an extension of a semiparametric nonlinear mixed-effects model for longitudinal data that incorporates more flexibility with penalized splines (P-splines) as smooth terms. The novelty of the proposed approach consists of the formulation of the model within the stochastic approximation version of the EM algorithm for maximum likelihood, the so-called SAEM algorithm. The proposed approach takes advantage of the formulation of a P-spline as a mixed-effects model and the use of the computational advantages of the existing software for the SAEM algorithm for the estimation of the random effects and the variance components. Additionally, we developed a supervised classification method for these non-linear mixed models using an adaptive importance sampling scheme. To illustrate our proposal, we consider two studies on pregnant women where two biomarkers are used as indicators of changes during pregnancy. In both studies, information about the women's pregnancy outcomes is known. Our proposal provides a unified framework for the classification of longitudinal profiles that may have important implications for the early detection and monitoring of pregnancy-related changes and contribute to improved maternal and fetal health outcomes. We show that the proposed models improve the analysis of this type of data compared to previous studies. These improvements are reflected both in the fit of the models and in the classification of the groups.

Subjects

Algorithms, Software, Female, Humans, Pregnancy, Pregnancy Outcome, Statistical Models, Longitudinal Studies

7.

Machine Learning-Based Hazard-Driven Prioritization of Features in Nontarget Screening of Environmental High-Resolution Mass Spectrometry Data.

Arturi, Katarzyna; Hollender, Juliane.

Environ Sci Technol ; 57(46): 18067-18079, 2023 Nov 21.

Article in English | MEDLINE | ID: mdl-37279189

ABSTRACT

Nontarget high-resolution mass spectrometry screening (NTS HRMS/MS) can detect thousands of organic substances in environmental samples. However, new strategies are needed to focus time-intensive identification efforts on features with the highest potential to cause adverse effects instead of the most abundant ones. To address this challenge, we developed MLinvitroTox, a machine learning framework that uses molecular fingerprints derived from fragmentation spectra (MS2) for a rapid classification of thousands of unidentified HRMS/MS features as toxic/nontoxic based on nearly 400 target-specific and over 100 cytotoxic endpoints from ToxCast/Tox21. Model development results demonstrated that using customized molecular fingerprints and models, over a quarter of toxic endpoints and the majority of the associated mechanistic targets could be accurately predicted with sensitivities exceeding 0.95. Notably, SIRIUS molecular fingerprints and XGBoost (Extreme Gradient Boosting) models with SMOTE (Synthetic Minority Oversampling Technique) for handling data imbalance were a universally successful and robust modeling configuration. Validation of MLinvitroTox on MassBank spectra showed that toxicity could be predicted from molecular fingerprints derived from MS2 with an average balanced accuracy of 0.75. By applying MLinvitroTox to environmental HRMS/MS data, we confirmed the experimental results obtained with target analysis and narrowed the analytical focus from tens of thousands of detected signals to 783 features linked to potential toxicity, including 109 spectral matches and 30 compounds with confirmed toxic activity.

Subjects

Machine Learning, Mass Spectrometry
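The abstract of this entry names SMOTE as the way class imbalance was handled. As a hedged sketch of the core idea only, the code below creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is not the authors' pipeline; the function name, parameters and toy points are assumptions for illustration.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical minority class: the four corners of the unit square.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(smote_samples(toy_minority, n_new=3))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class.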

8.

Unraveling the mechanisms underlying drug-induced cholestatic liver injury: identifying key genes using machine learning techniques on human in vitro data sets.

Jiang, Jian; van Ertvelde, Jonas; Ertaylan, Gökhan; Peeters, Ralf; Jennen, Danyel; de Kok, Theo M; Vinken, Mathieu.

Arch Toxicol ; 97(11): 2969-2981, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37603094

ABSTRACT

Drug-induced intrahepatic cholestasis (DIC) is a main type of hepatic toxicity that is challenging to predict in early drug development stages. Preclinical animal studies often fail to detect DIC in humans. In vitro toxicogenomics assays using human liver cells have become a practical approach to predict human-relevant DIC. The present study was set up to identify transcriptomic signatures of DIC by applying machine learning algorithms to the Open TG-GATEs database. A total of nine DIC compounds and nine non-DIC compounds were selected, and supervised classification algorithms were applied to develop prediction models using differentially expressed features. Feature selection techniques identified 13 genes that achieved optimal prediction performance using logistic regression combined with a sequential backward selection method. The internal validation of the best-performing model showed accuracy of 0.958, sensitivity of 0.941, specificity of 0.978, and F1-score of 0.956. Applying the model to an external validation set resulted in an average prediction accuracy of 0.71. The identified genes were mechanistically linked to the adverse outcome pathway network of DIC, providing insights into cellular and molecular processes during response to chemical toxicity. Our findings provide valuable insights into toxicological responses and enhance the predictive accuracy of DIC prediction, thereby advancing the application of transcriptome profiling in designing new approach methodologies for hazard identification.

Subjects

Adverse Outcome Pathways, Chemical and Drug Induced Liver Injury, Cholestasis, Animals, Humans, Cholestasis/chemically induced, Cholestasis/genetics, Chemical and Drug Induced Liver Injury/genetics, Machine Learning
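The abstract of this entry combines logistic regression with sequential backward selection. The sketch below shows only the generic selection loop: start from all features and repeatedly drop the one whose removal leaves the best-scoring subset. In the study the score would presumably come from the classifier's validation performance; here `score` is a hypothetical stand-in and the gene names are placeholders.

```python
def sequential_backward_selection(features, score, n_keep):
    """Greedy backward elimination down to n_keep features.

    `score` maps a list of feature names to a number (higher is better).
    At each step the feature whose removal gives the best score is dropped.
    """
    selected = list(features)
    while len(selected) > n_keep:
        candidates = [
            [f for f in selected if f != dropped] for dropped in selected
        ]
        selected = max(candidates, key=score)
    return selected


# Hypothetical per-feature contributions; a real score would retrain a model.
toy_weights = {"gene_a": 3, "gene_b": 0, "gene_c": 2, "gene_d": -1}


def toy_score(subset):
    return sum(toy_weights[f] for f in subset)


print(sequential_backward_selection(toy_weights, toy_score, n_keep=2))
```

With these toy weights the loop first removes `gene_d`, then `gene_b`, leaving `gene_a` and `gene_c`.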

9.

Semi-Supervised Classification of PolSAR Images Based on Co-Training of CNN and SVM with Limited Labeled Samples.

Zhao, Mingjun; Cheng, Yinglei; Qin, Xianxiang; Yu, Wangsheng; Wang, Peng.

Sensors (Basel) ; 23(4), 2023 Feb 13.

Article in English | MEDLINE | ID: mdl-36850703

ABSTRACT

Recently, convolutional neural networks (CNNs) have shown significant advantages in the tasks of image classification; however, these usually require a large number of labeled samples for training. In practice, it is difficult and costly to obtain sufficient labeled samples of polarimetric synthetic aperture radar (PolSAR) images. To address this problem, we propose a novel semi-supervised classification method for PolSAR images in this paper, using the co-training of CNN and a support vector machine (SVM). In our co-training method, an eight-layer CNN with residual network (ResNet) architecture is designed as the primary classifier, and an SVM is used as the auxiliary classifier. In particular, the SVM is used to enhance the performance of our algorithm in the case of limited labeled samples. In our method, more and more pseudo-labeled samples are iteratively yielded for training through a two-stage co-training of CNN and SVM, which gradually improves the performance of the two classifiers. The trained CNN is employed as the final classifier due to its strong classification capability with enough samples. We carried out experiments on two C-band airborne PolSAR images acquired by the AIRSAR systems and an L-band spaceborne PolSAR image acquired by the GaoFen-3 system. The experimental results demonstrate that the proposed method can effectively integrate the complementary advantages of SVM and CNN, providing overall classification accuracy of more than 97%, 96% and 93% with limited labeled samples (10 samples per class) for the above three images, respectively, which is superior to the state-of-the-art semi-supervised methods for PolSAR image classification.

10.

Semi-supervised classification of fundus images combined with CNN and GCN.

Duan, Sixu; Huang, Pu; Chen, Min; Wang, Ting; Sun, Xiaolei; Chen, Meirong; Dong, Xueyuan; Jiang, Zekun; Li, Dengwang.

J Appl Clin Med Phys ; 23(12): e13746, 2022 Dec.

Article in English | MEDLINE | ID: mdl-35946866

ABSTRACT

PURPOSE: Diabetic retinopathy (DR) is one of the most serious complications of diabetes, which is a kind of fundus lesion with specific changes. Early diagnosis of DR can effectively reduce the visual damage caused by DR. Due to the variety and different morphology of DR lesions, automatic classification of fundus images in mass screening can greatly save clinicians' diagnosis time. To alleviate these problems, in this paper, we propose a novel framework, the graph attentional convolutional neural network (GACNN). METHODS AND MATERIALS: The network consists of a convolutional neural network (CNN) and a graph convolutional network (GCN). The global and spatial features of fundus images are extracted by using the CNN and GCN, and an attention mechanism is introduced to enhance the adaptability of the GCN to the topology map. We adopt a semi-supervised method for classification, which greatly improves the generalization ability of the network. RESULTS: In order to verify the effectiveness of the network, we conducted comparative experiments and ablation experiments. We use confusion matrix, precision, recall, kappa score, and accuracy as evaluation indexes. Classification accuracy increases with the labeling rate. Particularly, when the labeling rate is set to 100%, the classification accuracy of GACNN reaches 93.35%. Compared with DenseNet121, the accuracy rate is improved by 6.24%. CONCLUSIONS: Semi-supervised classification based on an attention mechanism can effectively improve the classification performance of the model, and attain preferable results in classification indexes such as accuracy and recall. GACNN provides a feasible classification scheme for fundus images, which effectively reduces the human resources needed for screening.

Subjects

Diabetic Retinopathy, Computer Neural Networks, Humans, Fundus Oculi, Diabetic Retinopathy/diagnostic imaging

11.

Assessment of Three-Dimensional Kinematics of High- and Low-Calibre Hockey Skaters on Synthetic ice Using Wearable Sensors.

Khandan, Aminreza; Fathian, Ramin; Carey, Jason P; Rouhani, Hossein.

Sensors (Basel) ; 23(1), 2022 Dec 28.

Article in English | MEDLINE | ID: mdl-36616932

ABSTRACT

Hockey skating objective assessment can help coaches detect players' performance drop early and avoid fatigue-induced injuries. This study aimed to calculate and experimentally validate the 3D angles of lower limb joints of hockey skaters obtained by inertial measurement units and explore the effectiveness of the on-ice distinctive features measured using these wearable sensors in differentiating low- and high-calibre skaters. Twelve able-bodied individuals, six high-calibre and six low-calibre skaters, were recruited to skate forward on a synthetic ice surface. Five IMUs were placed on their dominant leg and pelvis. The 3D lower-limb joint angles were obtained by IMUs and experimentally validated against those obtained by a motion capture system with a maximum root mean square error of 5 deg. Additionally, among twelve joint angle-based distinctive features identified in other on-ice studies, only three were significantly different (p-value < 0.05) between high- and low-calibre skaters in this synthetic ice experiment. This study thus indicated that skating on synthetic ice alters the skating patterns such that the on-ice distinctive features can no longer differentiate between low- and high-calibre skating joint angles. This wearable technology has the potential to help skating coaches keep track of the players' progress by assessing the skaters' performance, wheresoever.

Subjects

Hockey, Wearable Electronic Devices, Humans, Biomechanical Phenomena, Ice, Hockey/injuries, Lower Extremity
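The abstract of this entry reports agreement between IMU-derived and motion-capture joint angles as a root mean square error. The sketch below shows that calculation on two equally sampled angle series; the function name and the sample values are made up for illustration and are not data from the study.

```python
import math


def rmse(estimated, reference):
    """Root mean square error between two equally long sequences."""
    if len(estimated) != len(reference):
        raise ValueError("sequences must have the same length")
    squared_errors = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical knee flexion angles in degrees at four time points.
imu_angles = [13.0, 7.0, 13.0, 7.0]
mocap_angles = [10.0, 10.0, 10.0, 10.0]
print(rmse(imu_angles, mocap_angles))  # 3.0
```

Because errors are squared before averaging, a few large deviations raise the RMSE more than many small ones.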

12.

MALDI Mass Spectrometry Imaging for the Distinction of Adenocarcinomas of the Pancreas and Biliary Tree.

Bollwein, Christine; Gonçalves, Juliana Pereira Lopes; Utpatel, Kirsten; Weichert, Wilko; Schwamborn, Kristina.

Molecules ; 27(11), 2022 May 27.

Article in English | MEDLINE | ID: mdl-35684402

ABSTRACT

Pancreatic ductal adenocarcinoma and cholangiocarcinoma constitute two aggressive tumor types that originate from the epithelial lining of the excretory ducts of the pancreatobiliary tract. Given their close histomorphological resemblance, a correct diagnosis can be challenging and almost impossible without clinical information. In this study, we investigated whether mass spectrometric peptide features could be employed to distinguish pancreatic ductal adenocarcinoma from cholangiocarcinoma. Three tissue microarrays of formalin-fixed and paraffin-embedded material (FFPE) comprising 41 cases of pancreatic ductal adenocarcinoma and 41 cases of cholangiocarcinoma were analyzed by matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI). The derived peptide features and respective intensities were used to build different supervised classification algorithms: gradient boosting (GB), support vector machine (SVM), and k-nearest neighbors (KNN). On a pixel-by-pixel level, a classification accuracy of up to 95% could be achieved. The tentative identification of discriminative tryptic peptide signatures revealed proteins that are involved in the epigenetic regulation of the genome and tumor microenvironment. Despite their histomorphological similarities, mass spectrometry imaging represents an efficient and reliable approach for the distinction of PDAC from CC, offering a promising complementary or alternative approach to the existing tools used in diagnostics such as immunohistochemistry.

Subjects

Adenocarcinoma, Biliary Tract, Pancreatic Ductal Carcinoma, Cholangiocarcinoma, Pancreatic Neoplasms, Adenocarcinoma/diagnostic imaging, Adenocarcinoma/metabolism, Biliary Tract/metabolism, Biliary Tract/pathology, Cholangiocarcinoma/diagnostic imaging, Genetic Epigenesis, Humans, Pancreas/metabolism, Pancreatic Neoplasms/diagnostic imaging, Pancreatic Neoplasms/metabolism, Paraffin Embedding, Peptides/metabolism, Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry/methods, Tumor Microenvironment

13.

Identification of Areas Highly Vulnerable to Land Conversion: A Case Study From Southern Thailand.

Tantipisanuh, Naruemon; Gale, George A.

Environ Manage ; 69(2): 323-332, 2022 Feb.

Article in English | MEDLINE | ID: mdl-34850250

ABSTRACT

Land conversion is having major impacts on wildlife globally, and thus understanding and predicting patterns of land conversion is an important component of conservation planning. Southeast Asia is undergoing rapid habitat conversion; however, most countries in the region have very limited human resources devoted to planning, and typically land-cover trend assessments are often challenging. Here we demonstrate a rapid method for land-cover change quantification for areas of terrestrial, mangrove and peat swamp forests at high risk from land conversion that can be quickly and simply predicted using southern Thailand as an example. Land-cover maps from two time periods (1995/1996 and 2015/2016) were produced and compared to determine changes between the two time periods. Five land-cover categories (terrestrial forest, mangrove forest, peat swamp forest, human settlement, agriculture) were estimated along with land-cover changes. Hot spots of high percentage change for human settlement and agriculture were identified, and vulnerable habitats were mapped including terrestrial forest, mangrove forest and peat swamp forest. Between 1996 and 2016, 22.1% of terrestrial forests, 26.2% of mangrove forests and 55% of peat swamp forests were lost. The losses of these natural habitats were clearly associated with agricultural expansion. Approximately 10.6%, 14.3% and 33% of terrestrial, mangrove and peat swamp forest remaining were identified as highly vulnerable, of which the majority were at the boundaries between natural and human-dominated areas. The technique offers promise for rapidly identifying high priority areas for more detailed analysis and potential conservation interventions.

Subjects

Conservation of Natural Resources, Forests, Agriculture, Conservation of Natural Resources/methods, Ecosystem, Thailand, Wetlands

14.

Self-Supervised Node Classification with Strategy and Actively Selected Labeled Set.

Kang, Yi; Liu, Ke; Cao, Zhiyuan; Zhang, Jiacai.

Entropy (Basel) ; 25(1), 2022 Dec 23.

Article in English | MEDLINE | ID: mdl-36673172

ABSTRACT

To alleviate the impact of insufficient labels in less-labeled classification problems, self-supervised learning improves the performance of graph neural networks (GNNs) by focusing on the information of unlabeled nodes. However, none of the existing self-supervised pretext tasks perform optimally on different datasets, and the choice of hyperparameters is also included when combining self-supervised and supervised tasks. To select the best-performing self-supervised pretext task for each dataset and optimize the hyperparameters with no expert experience needed, we propose a novel auto graph self-supervised learning framework and enhance this framework with a one-shot active learning method. Experimental results on three real world citation datasets show that training GNNs with automatically optimized pretext tasks can achieve or even surpass the classification accuracy obtained with manually designed pretext tasks. On this basis, compared with using randomly selected labeled nodes, using actively selected labeled nodes can further improve the classification performance of GNNs. Both the active selection and the automatic optimization contribute to semi-supervised node classification.

15.

Machine learning techniques to predict different levels of hospital care of CoVid-19.

Hernández-Pereira, Elena; Fontenla-Romero, Oscar; Bolón-Canedo, Verónica; Cancela-Barizo, Brais; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo.

Appl Intell (Dordr) ; 52(6): 6413-6431, 2022.

Article in English | MEDLINE | ID: mdl-34764619

ABSTRACT

In this study, we analyze the capability of several state-of-the-art machine learning methods to predict whether patients diagnosed with CoVid-19 (CoronaVirus disease 2019) will need different levels of hospital care assistance (regular hospital admission or intensive care unit admission), during the course of their illness, using only demographic and clinical data. For this research, a data set of 10,454 patients from 14 hospitals in Galicia (Spain) was used. Each patient is characterized by 833 variables, two of which are age and gender and the others are records of diseases or conditions in their medical history. In addition, for each patient, his/her history of hospital or intensive care unit (ICU) admissions due to CoVid-19 is available. This clinical history serves to label each patient and thus makes it possible to assess the predictions of the model. Our aim is to identify which model delivers the best accuracies for both hospital and ICU admissions only using demographic variables and some structured clinical data, as well as identifying which of those are more relevant in both cases. The results obtained in the experimental study show that the best models are those based on oversampling as a preprocessing phase to balance the distribution of classes. Using these models and all the available features, we achieved an area under the curve (AUC) of 76.1% and 80.4% for predicting the need of hospital and ICU admissions, respectively. Furthermore, feature selection and oversampling techniques were applied and it has been experimentally verified that the relevant variables for the classification are age and gender, since only using these two features the performance of the models is not degraded for the two mentioned prediction problems.
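The area under the ROC curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition; the function name, scores and labels are invented for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Labels are expected to be 1 (positive) or 0 (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical admission risk scores and outcomes (1 = admitted).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels))  # 5 of 6 pairs are ordered correctly
```

Unlike plain accuracy, this measure does not depend on a decision threshold, which is one reason it is often preferred when classes are imbalanced.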

16.

GEOlimma: differential expression analysis and feature selection using pre-existing microarray data.

Lu, Liangqun; Townsend, Kevin A; Daigle, Bernie J.

BMC Bioinformatics ; 22(1): 44, 2021 Feb 03.

Article in English | MEDLINE | ID: mdl-33535967

ABSTRACT

BACKGROUND: Differential expression and feature selection analyses are essential steps for the development of accurate diagnostic/prognostic classifiers of complicated human diseases using transcriptomics data. These steps are particularly challenging due to the curse of dimensionality and the presence of technical and biological noise. A promising strategy for overcoming these challenges is the incorporation of pre-existing transcriptomics data in the identification of differentially expressed (DE) genes. This approach has the potential to improve the quality of selected genes, increase classification performance, and enhance biological interpretability. While a number of methods have been developed that use pre-existing data for differential expression analysis, existing methods do not leverage the identities of experimental conditions to create a robust metric for identifying DE genes. RESULTS: In this study, we propose a novel differential expression and feature selection method, GEOlimma, which combines pre-existing microarray data from the Gene Expression Omnibus (GEO) with the widely applied Limma method for differential expression analysis. We first quantify differential gene expression across 2481 pairwise comparisons from 602 curated GEO Datasets, and we convert differential expression frequencies to DE prior probabilities. Genes with high DE prior probabilities show enrichment in cell growth and death, signal transduction, and cancer-related biological pathways, while genes with low prior probabilities are enriched in sensory system pathways. We then applied GEOlimma to four differential expression comparisons within two human disease datasets and performed differential expression, feature selection, and supervised classification analyses. Our results suggest that use of GEOlimma provides greater experimental power to detect DE genes compared to Limma, due to its increased effective sample size. Furthermore, in a supervised classification analysis using GEOlimma as a feature selection method, we observed similar or better classification performance than Limma given small, noisy subsets of an asthma dataset. CONCLUSIONS: Our results demonstrate that GEOlimma is a more effective method for differential gene expression and feature selection analyses compared to the standard Limma method. Due to its focus on gene-level differential expression, GEOlimma also has the potential to be applied to other high-throughput biological datasets.
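
The abstract above mentions converting differential expression frequencies into DE prior probabilities but does not give the formula. As a rough illustration only, the sketch below assumes the simplest possible conversion: the fraction of comparisons in which a gene was called DE, with a pseudocount so that no gene receives a prior of exactly 0 or 1. The function name `de_prior_probabilities`, the `pseudocount` parameter, and the toy gene sets are hypothetical and are not taken from GEOlimma.

```python
from collections import Counter


def de_prior_probabilities(de_calls, genes, pseudocount=1.0):
    """Turn per-comparison DE calls into per-gene prior probabilities.

    de_calls: list of sets; each set holds the genes called DE in one
              pairwise comparison.
    genes:    all genes for which a prior is wanted.
    The smoothing scheme is an assumption made for this sketch, not the
    published GEOlimma procedure.
    """
    n_comparisons = len(de_calls)
    # Count in how many comparisons each gene was called DE.
    counts = Counter(gene for called in de_calls for gene in set(called))
    return {
        gene: (counts[gene] + pseudocount) / (n_comparisons + 2 * pseudocount)
        for gene in genes
    }


# Made-up toy input: four comparisons and four genes.
comparisons = [{"TP53", "MYC"}, {"TP53"}, {"TP53", "OR2A1"}, {"MYC"}]
priors = de_prior_probabilities(comparisons, ["TP53", "MYC", "OR2A1", "GAPDH"])
for gene, prior in sorted(priors.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {prior:.3f}")
```

In this toy input the gene called DE most often receives the highest prior, which is the qualitative behavior the abstract describes; how such priors are then combined with Limma is outside the scope of this sketch.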

Subjects

Computational Biology , Gene Expression Profiling , Bayes Theorem , Child , Female , Humans , Male , Oligonucleotide Array Sequence Analysis , Sample Size

17.

Identification of plant leaf phosphorus content at different growth stages based on hyperspectral reflectance.

Siedliska, Anna; Baranowski, Piotr; Pastuszka-Wozniak, Joanna; Zubik, Monika; Krzyszczak, Jaromir.

BMC Plant Biol ; 21(1): 28, 2021 Jan 07.

Article in English | MEDLINE | ID: mdl-33413120

ABSTRACT

BACKGROUND: Modern agriculture strives to sustainably manage fertilizer for both economic and environmental reasons. The monitoring of any nutritional (phosphorus, nitrogen, potassium) deficiency in growing plants is a challenge for precision farming technology. A study was carried out on three species of popular crops, celery (Apium graveolens L., cv. Neon), sugar beet (Beta vulgaris L., cv. Tapir) and strawberry (Fragaria × ananassa Duchesne, cv. Honeoye), fertilized with four different doses of phosphorus (P) to deliver data for non-invasive detection of P content. RESULTS: Data obtained via biochemical analysis of the chlorophyll and carotenoid contents in plant material showed that P availability most strongly affected total chlorophyll content in sugar beet and celery, whereas in strawberry it affected the content of various carotenoids in leaves. The measurements performed using hyperspectral imaging, obtained at several different stages of plant development, were applied in a supervised classification experiment. Machine learning algorithms (Backpropagation Neural Network, Random Forest, Naive Bayes and Support Vector Machine) were developed to classify plants from four variants of P fertilization. The lowest prediction accuracy was obtained for the earliest measured stage of plant development. Statistical analyses showed correlations between leaf biochemical constituents, phosphorus fertilization and the mass of the leaf/roots of the plants. CONCLUSIONS: The results demonstrate that hyperspectral imaging combined with artificial intelligence methods has potential for non-invasive detection of non-homogeneous phosphorus fertilization at the crop level.

Subjects

Apium/chemistry , Beta vulgaris/chemistry , Crop Production/methods , Fertilizers , Fragaria/chemistry , Phosphorus/analysis , Plant Leaves/chemistry , Apium/growth & development , Beta vulgaris/growth & development , Carotenoids/analysis , Chlorophyll/analysis , Crops, Agricultural/chemistry , Fragaria/growth & development , Hyperspectral Imaging/methods

18.

Kernel principal components based cascade forest towards disease identification with human microbiota.

Zhou, Jiayu; Ye, Yanqing; Jiang, Jiang.

BMC Med Inform Decis Mak ; 21(1): 360, 2021 Dec 23.

Article in English | MEDLINE | ID: mdl-34949186

ABSTRACT

BACKGROUND: Numerous pieces of clinical evidence have shown that many phenotypic traits of human disease are related to the gut microbiome, e.g., inflammation, obesity, HIV, and diabetes. Through supervised classification, it is feasible to determine human disease states by revealing the compositional information of the intestinal microbiota. However, because the abundance matrix of microbiome data is so sparse, an interpretable deep model, such as the deep forest model, is crucial to further represent and mine the data. Moreover, overfitting can still exist in the original deep forest model when dealing with such "large p, small n" biology data. Feature reduction is considered as a way to improve the ensemble forest model, especially for disease identification in the human microbiota. METHODS: In this work, we propose the kernel principal components based cascade forest method, called KPCCF, to classify the disease states of patients by using taxonomic profiles of the microbiome at the family level. In detail, the kernel principal components analysis method is first used to reduce the original dimension of human microbiota datasets. The processed data are then fed into the cascade forest to preliminarily discriminate the disease state of the samples. RESULTS: The proposed KPCCF algorithm can represent the small-scale and high-dimension human microbiota datasets with the sparse feature matrix. Systematic comparison experiments demonstrate that our method consistently outperforms the state-of-the-art methods in a comparative study on 4 datasets. CONCLUSION: Despite sharing some common characteristics, a one-size-fits-all solution does not exist in any space. The traditional deep model has limitations in biological applications where small sample sizes are unbalanced against high dimensionality. KPCCF distinguishes itself from the standard deep forest model by its excellent performance in the microbiota field. Additionally, compared to other dimensionality reduction methods, the kernel principal components analysis method is more suitable for microbiota datasets.

Subjects

Gastrointestinal Microbiome , Microbiota , Algorithms , Humans , Principal Component Analysis

19.

Quantitative Detection of Chromium Pollution in Biochar Based on Matrix Effect Classification Regression Model.

Guo, Mei; Zhu, Rongguang; Zhang, Lixin; Zhang, Ruoyu; Huang, Guangqun; Duan, Hongwei.

Molecules ; 26(7), 2021 Apr 03.

Article in English | MEDLINE | ID: mdl-33916837

ABSTRACT

Returning biochar to farmland has become one of the nationally promoted technologies for soil remediation and improvement in China. Rapid detection of heavy metals in biochar derived from varied materials can provide a safeguard for contaminated soil and avoid secondary pollution. This work aims first to apply laser-induced breakdown spectroscopy (LIBS) for the quantitative detection of Cr in biochar. Learning from the principles of traditional matrix effect correction methods, calibration samples were divided into 1-3 classifications by an unsupervised hierarchical clustering method based on the main elemental LIBS data in biochar. The prediction samples were then assigned to the classifications of the calibration samples by a supervised K-nearest neighbor (KNN) algorithm. A comparison of multiple partial least squares regression (PLSR) models shows that a larger number of classifications gives a lower averaged relative standard deviation of cross-validation (ARSDCV) value, signifying better calibration performance. Therefore, the 3-classification regression model was employed in this study, which had a better prediction performance, with a lower averaged relative standard deviation of prediction (ARSDP) value of 8.13%, in comparison with our previous research and related literature results. The LIBS technology combined with the matrix effect classification regression model can weaken the influence of the complex matrix effect of biochar and achieve accurate quantification of the contaminant metal Cr in biochar.
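
The assignment step described above, in which each prediction sample is placed into one of the calibration classifications by a K-nearest neighbor vote, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: Euclidean distance, `k=3`, and two-dimensional made-up feature vectors stand in for the main elemental LIBS data. The function name `knn_assign` and all values are hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def knn_assign(sample, calibration, k=3):
    """Assign a sample to the majority class among its k nearest
    calibration samples (Euclidean distance; an assumption of this sketch).

    calibration: list of (feature_vector, class_label) pairs.
    """
    nearest = sorted(calibration, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up calibration set with two matrix classes (labels 1 and 2).
calibration = [
    ((0.10, 0.20), 1), ((0.20, 0.10), 1), ((0.15, 0.25), 1),
    ((0.90, 0.80), 2), ((0.80, 0.90), 2), ((0.85, 0.95), 2),
]

print(knn_assign((0.12, 0.18), calibration))  # sample close to class 1
print(knn_assign((0.88, 0.90), calibration))  # sample close to class 2
```

Once a prediction sample has a class label, it would be passed to the regression model calibrated on that class; that step depends on the PLSR models and is not shown here.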

Subjects

Charcoal/chemistry , Chromium/analysis , Models, Theoretical , Soil Pollutants/analysis , Calibration , Regression Analysis , Spectrometry, X-Ray Emission

20.

Identification and mapping of Algerian island vegetation using high-resolution images (Pléiades and SPOT 6/7) and random forest modeling.

Hamimeche, Mohamed; Niculescu, Simona; Billey, Antoine; Moulaï, Riadh.

Environ Monit Assess ; 193(9): 617, 2021 Sep 02.

Article in English | MEDLINE | ID: mdl-34476646

ABSTRACT

Despite their proximity to the coast, few studies have focused on identifying and mapping the vegetation of Algerian islands and islets. To fill this lacuna, our work, using satellite images and machine learning methods, is mainly aimed at identifying and mapping the main vegetation groups on a few islands, while evaluating the effectiveness of the random forest classifier, which is effectively used in the study of the vegetation of large areas. However, despite the high heterogeneity of their vegetation cover, the use of very high-resolution images (Pléiades and SPOT 6/7), through fusion bands and derived bands (NDVI), has allowed the elaboration of a fairly precise vegetation map that can be used for the preparation of management and protection plans for these habitats. Our methodological approach yielded very satisfactory results, allowing the identification of the plant communities inventoried in the field while showing high accuracy values, ranging from 0.642 for the halophilic group of Asteriscus to 1 for the endemic Chasmophyte group of the Habibas archipelago (Pléiades images). The groups identified from SPOT 6/7 images show accuracy values between 0.67 for the Mediterranean cliff formations on Garlic Islet and 1 for the two formations (shrubby and herbaceous) of the Skikda islands. Notwithstanding the great heterogeneity and the very small surface areas of our islands and islets, our methodological approach has led to very satisfactory results, reflected in good overall accuracy and kappa index values (for Pléiades: overall accuracy > 92% and kappa index > 0.90; for SPOT 6/7: overall accuracy > 83% and kappa index > 0.80).
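
The two summary metrics reported above, overall accuracy and the kappa index, are both derived from a classification confusion matrix. The sketch below shows the standard calculation of overall accuracy and Cohen's kappa; the function name `overall_accuracy_and_kappa` and the 2x2 matrix are made up for illustration and are not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion
    matrix, where confusion[i][j] counts samples of reference class i
    that were predicted as class j."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up confusion matrix for two vegetation classes.
confusion = [
    [45, 5],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would be expected by chance from the class totals alone, which is why it is lower than overall accuracy for the same matrix.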

Subjects

Ecosystem , Environmental Monitoring , Plants
