ABSTRACT
Significance: Human sleep phenotypes are diversified by genetic and environmental factors, and a quantitative classification of sleep phenotypes would advance understanding of the biomedical mechanisms underlying human sleep diversity. To achieve that, a data analysis pipeline, including a state-of-the-art sleep/wake classification algorithm, the uniform manifold approximation and projection (UMAP) dimension reduction method, and the density-based spatial clustering of applications with noise (DBSCAN) clustering method, was applied to the 100,000-arm acceleration dataset. This revealed 16 clusters, including seven different insomnia-like phenotypes. This kind of quantitative sleep analysis pipeline is expected to promote data-based diagnosis of sleep disorders and of psychiatric disorders that tend to be complicated by sleep disorders.
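The clustering stage of such a pipeline can be sketched as follows. This is a minimal pure-Python DBSCAN, not the authors' implementation: the 2D `points` are hypothetical coordinates standing in for a UMAP embedding, and the `eps` and `min_samples` values are illustrative assumptions.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1           # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels


# Hypothetical embedding: two dense groups and one isolated point.
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 5.1), (5.1, 5), (5.1, 5.1),
          (10, 0)]
labels = dbscan(points, eps=0.3, min_samples=3)
```

Unlike K-means, DBSCAN does not need the number of clusters in advance and can leave sparse points unassigned, which is one plausible reason to pair it with a low-dimensional embedding.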
Subjects
Biological Specimen Banks, Sleep Wake Disorders, Acceleration, Humans, Phenotype, Sleep, United Kingdom
ABSTRACT
BACKGROUND: The application of Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and visualization has revolutionized the analysis of single-cell RNA expression and population genetics. However, its potential in single-cell DNA sequencing data analysis, particularly for visualizing gene mutation information, has not been fully explored. RESULTS: We introduce Mugen-UMAP, a novel Python-based program that extends UMAP's utility to single-cell DNA sequencing data. This innovative tool provides a comprehensive pipeline for processing gene annotation files of single-cell somatic single-nucleotide variants and metadata to the visualization of UMAP projections for identifying clusters, along with various statistical analyses. Employing Mugen-UMAP, we analyzed whole-exome sequencing data from 365 single-cell samples across 12 non-small cell lung cancer (NSCLC) patients, revealing distinct clusters associated with histological subtypes of NSCLC. Moreover, to demonstrate the general utility of Mugen-UMAP, we applied the program to 9 additional single-cell WES datasets from various cancer types, uncovering interesting patterns of cell clusters that warrant further investigation. In summary, Mugen-UMAP provides a quick and effective visualization method to uncover cell cluster patterns based on the gene mutation information from single-cell DNA sequencing data. CONCLUSIONS: The application of Mugen-UMAP demonstrates its capacity to provide valuable insights into the visualization and interpretation of single-cell DNA sequencing data. Mugen-UMAP can be found at https://github.com/tengchn/Mugen-UMAP.
Subjects
Mutation, Single-Cell Analysis, Software, Single-Cell Analysis/methods, Humans, DNA Sequence Analysis/methods, Cluster Analysis, Non-Small-Cell Lung Carcinoma/genetics, Lung Neoplasms/genetics
ABSTRACT
The architecture of the brain is too complex to be intuitively surveyable without the use of compressed representations that project its variation into a compact, navigable space. The task is especially challenging with high-dimensional data, such as gene expression, where the joint complexity of anatomical and transcriptional patterns demands maximum compression. The established practice is to use standard principal component analysis (PCA), whose computational felicity is offset by limited expressivity, especially at great compression ratios. Employing whole-brain, voxel-wise Allen Brain Atlas transcription data, here we systematically compare compressed representations based on the most widely supported linear and non-linear methods: PCA, kernel PCA, non-negative matrix factorisation (NMF), t-stochastic neighbour embedding (t-SNE), uniform manifold approximation and projection (UMAP), and deep auto-encoding. We quantify reconstruction fidelity, anatomical coherence, and predictive utility across signalling, microstructural, and metabolic targets, drawn from large-scale open-source MRI and PET data. We show that deep auto-encoders yield superior representations across all metrics of performance and target domains, supporting their use as the reference standard for representing transcription patterns in the human brain.
Subjects
Brain, Magnetic Resonance Imaging, Genetic Transcription, Humans, Brain/diagnostic imaging, Brain/metabolism, Genetic Transcription/physiology, Positron-Emission Tomography, Computer-Assisted Image Processing/methods, Principal Component Analysis, Data Compression/methods, Atlases as Topic
ABSTRACT
BACKGROUND AND HYPOTHESIS: Specific urinary peptides hold information on disease pathophysiology, which, in combination with artificial intelligence, could enable non-invasive assessment of chronic kidney disease (CKD) aetiology. Existing approaches are generally specific for the diagnosis of single aetiologies. We present the development of models able to simultaneously distinguish and spatially visualize multiple CKD aetiologies. METHODS: The urinary peptide data of 1850 healthy control (HC) and CKD [diabetic kidney disease (DKD), immunoglobulin A nephropathy (IgAN) and vasculitis] participants were extracted from the Human Urinary Proteome Database. Uniform manifold approximation and projection (UMAP) coupled to a support vector machine algorithm was used to generate multi-peptide models to perform binary (DKD, HC) and multiclass (DKD, HC, IgAN, vasculitis) classifications. This pipeline was compared with the current state-of-the-art single-aetiology CKD urinary peptide models. RESULTS: In an independent test set, the developed models achieved 90.35% and 70.13% overall predictive accuracies, respectively, for the binary and the multiclass classifications. Omitting the UMAP step led to improved predictive accuracies (96.14% and 85.06%, respectively). As expected, the HC class was distinguished with the highest accuracy. The different classes displayed a tendency to form distinct clusters in the 3D space based on their disease state. CONCLUSION: Urinary peptide data present an effective basis for CKD aetiology differentiation using machine learning models. Although adding the UMAP step to the models did not improve prediction accuracy, it may provide a unique visualization advantage. Additional studies are warranted to further validate the pipeline's clinical potential as well as to expand it to other CKD aetiologies and also other diseases.
Subjects
IgA Glomerulonephritis, Chronic Renal Insufficiency, Vasculitis, Humans, Biomarkers, Differential Diagnosis, Artificial Intelligence, IgA Glomerulonephritis/complications, Liquid Biopsy/adverse effects, Peptides, Proteomics
ABSTRACT
PURPOSE: The previous studies that examined the effectiveness of unsupervised machine learning methods versus traditional methods in assessing dietary patterns and their association with incident hypertension showed contradictory results. Consequently, our aim is to explore the correlation between the incidence of hypertension and overall dietary patterns that were extracted using unsupervised machine learning techniques. METHODS: Data were obtained from Japanese male participants enrolled in a prospective cohort study between August 2008 and August 2010. A final dataset of 447 male participants was used for analysis. Dimension reduction using uniform manifold approximation and projection (UMAP) and subsequent K-means clustering was used to derive dietary patterns. In addition, multivariable logistic regression was used to evaluate the association between dietary patterns and the incidence of hypertension. RESULTS: We identified four dietary patterns: 'Low-protein/fiber High-sugar,' 'Dairy/vegetable-based,' 'Meat-based,' and 'Seafood and Alcohol.' Compared with 'Seafood and Alcohol' as a reference, the protective dietary patterns for hypertension were 'Dairy/vegetable-based' (OR 0.39, 95% CI 0.19-0.80, P = 0.013) and the 'Meat-based' (OR 0.37, 95% CI 0.16-0.86, P = 0.022) after adjusting for potential confounding factors, including age, body mass index, smoking, education, physical activity, dyslipidemia, and diabetes. An age-matched sensitivity analysis confirmed this finding. CONCLUSION: This study finds that relative to the 'Seafood and Alcohol' pattern, the 'Dairy/vegetable-based' and 'Meat-based' dietary patterns are associated with a lower risk of hypertension among men.
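The odds ratios and confidence intervals reported above can be related to the underlying logistic regression coefficients with a short sketch. It assumes Wald-type intervals on the log-odds scale, which the abstract does not state. The function names are hypothetical, and only the figures for 'Dairy/vegetable-based' (OR 0.39, 95% CI 0.19-0.80) are taken from the text.

```python
from math import erfc, exp, log, sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error of the log-odds from a reported CI."""
    return (log(upper) - log(lower)) / (2 * z)


def wald_p_value(beta, se):
    """Two-sided p-value for the Wald test of beta = 0."""
    return erfc(abs(beta / se) / sqrt(2))


# Figures reported for 'Dairy/vegetable-based' versus 'Seafood and Alcohol'.
beta = log(0.39)              # coefficient implied by the reported OR
se = se_from_ci(0.19, 0.80)   # standard error implied by the reported CI
or_, ci_low, ci_high = odds_ratio_ci(beta, se)
p = wald_p_value(beta, se)
```

Under this assumption the recovered interval closely matches the reported one and the implied p-value is of the same order as the reported P = 0.013. Small differences are expected because the published figures are rounded.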
Subjects
Diet, Hypertension, Machine Learning, Humans, Male, Hypertension/epidemiology, Japan/epidemiology, Incidence, Middle Aged, Prospective Studies, Diet/methods, Diet/statistics & numerical data, Cohort Studies, Adult, Risk Factors, Feeding Behavior, Dietary Patterns, East Asian People
ABSTRACT
This study used an odor sensing system with a 16-channel electrochemical sensor array to measure beef odors, aiming to distinguish odors under different storage days and processing temperatures for quality monitoring. Six storage days ranged from purchase (D0) to eight days (D8), with three temperature conditions: no heat (RT), boiling (100 °C), and frying (180 °C). Gas chromatography-mass spectrometry (GC-MS) analysis showed that odorants in the beef varied under different conditions. Compounds like acetoin and 1-hexanol changed significantly with the storage days, while pyrazines and furans were more detectable at higher temperatures. The odor sensing system data were visualized using principal component analysis (PCA) and uniform manifold approximation and projection (UMAP). PCA and unsupervised UMAP clustered beef odors by storage days but struggled with the processing temperatures. Supervised UMAP accurately clustered different temperatures and dates. Machine learning analysis using six classifiers, including support vector machine, achieved 57% accuracy for PCA-reduced data, while unsupervised UMAP reached 49.1% accuracy. Supervised UMAP significantly enhanced the classification accuracy, achieving over 99.5% with the dimensionality reduced to three or above. Results suggest that the odor sensing system can sufficiently enhance non-destructive beef quality and safety monitoring. This research advances electronic nose applications and explores data downscaling techniques, providing valuable insights for future studies.
Subjects
Gas Chromatography-Mass Spectrometry, Odorants, Principal Component Analysis, Temperature, Odorants/analysis, Cattle, Animals, Gas Chromatography-Mass Spectrometry/methods, Food Storage/methods, Electronic Nose, Red Meat/analysis, Support Vector Machine
ABSTRACT
Continuous smoking leads to adaptive regulation and physiological changes in lung tissue and cells and is an inducing factor for many diseases, placing smokers at risk of both malignant and nonmalignant disease. Research in this area is becoming increasingly detailed, but the stimulatory effects of smoke components on the main cells of the lung, their mechanisms of action, and the cellular response mechanisms have not yet been fully elucidated, and no systematic method exists for the early diagnosis and identification of the various diseases induced by smoke toxins. In this study, single-cell transcriptome data were generated by scRNA-seq from three lung samples of smokers and nonsmokers, revealing the influence of smoking on lung tissue and cells and the associated changes in immune response. UMAP cell clustering revealed 16 intermediate cell states among 23 cell clusters of the four main cell types in the lung, explained the differences in the main cell groups between smokers and nonsmokers, and clarified the cellular components of the human lung and their marker genes. New marker genes that can be used to follow the evolution of intermediate-state cells were screened, and analysis of lung cell subgroups revealed the changes in intermediate cell states under smoke stimulation, yielding a map of subtype intermediate-state cells. Pseudo-time ordering analysis was used to determine the pattern of dynamic processes experienced by the cells, and differential expression analysis of cells on different branches was used to clarify the expression patterns of cells at different positions, the evolution of intermediate cell states, and the mechanism by which lung tissue and cells respond to smoke components.
This study provides new diagnostic and therapeutic ideas for the early detection, identification, prevention, and treatment of smoking-related diseases, and lays a theoretical foundation based on cellular and molecular regulation.
ABSTRACT
A central question in sensory neuroscience is how neurons represent complex natural stimuli. This process involves multiple steps of feature extraction to obtain a condensed, categorical representation useful for classification and behaviour. It has previously been shown that central auditory neurons in the starling have composite receptive fields composed of multiple features. Whether this property is an idiosyncratic characteristic of songbirds, a group of highly specialized vocal learners or a generic property of sensory processing is unknown. To address this question, we have recorded responses from auditory cortical neurons in mice, and characterized their receptive fields using mouse ultrasonic vocalizations (USVs) as a natural and ethologically relevant stimulus and pitch-shifted starling songs as a natural but ethologically irrelevant control stimulus. We have found that these neurons display composite receptive fields with multiple excitatory and inhibitory subunits. Moreover, this was the case with either the conspecific or the heterospecific vocalizations. We then trained the sparse filtering algorithm on both classes of natural stimuli to obtain statistically optimal features, and compared the natural and artificial features using UMAP, a dimensionality-reduction algorithm previously used to analyse mouse USVs and birdsongs. We have found that the receptive-field features obtained with both types of the natural stimuli clustered together, as did the sparse-filtering features. However, the natural and artificial receptive-field features clustered mostly separately. Based on these results, our general conclusion is that composite receptive fields are not a unique characteristic of specialized vocal learners but are likely a generic property of central auditory systems. KEY POINTS: Auditory cortical neurons in the mouse have composite receptive fields with several excitatory and inhibitory features. 
Receptive-field features capture temporal and spectral modulations of natural stimuli. Ethological relevance of the stimulus affects the estimation of receptive-field dimensionality.
Subjects
Auditory Cortex, Animals, Mice, Auditory Cortex/physiology, Auditory Perception/physiology, Acoustic Stimulation/methods, Neurons/physiology, Interneurons
ABSTRACT
Messenger ribonucleic acid (mRNA) vaccination against coronavirus disease 2019 (COVID-19) is an effective prevention strategy, despite a limited understanding of the molecular mechanisms underlying the host immune system and individual heterogeneity of the variable effects of mRNA vaccination. We assessed the time-series changes in the comprehensive gene expression profiles of 200 vaccinated healthcare workers by performing bulk transcriptome and bioinformatics analyses, including dimensionality reduction utilizing the uniform manifold approximation and projection (UMAP) technique. For these analyses, blood samples, including peripheral blood mononuclear cells (PBMCs), were collected from 214 vaccine recipients before vaccination (T1) and on Days 22 (T2, after second dose), 90, 180 (T3, before a booster dose), and 360 (T4, after a booster dose) after receiving the first dose of BNT162b2 vaccine (UMIN000043851). UMAP successfully visualized the main cluster of gene expression at each time point in PBMC samples (T1-T4). Through differentially expressed gene (DEG) analysis, we identified genes that showed fluctuating expression levels and gradual increases in expression levels from T1 to T4, as well as genes with increased expression levels at T4 alone. We also succeeded in dividing these cases into five types based on the changes in gene expression levels. High-throughput and temporal bulk RNA-based transcriptome analysis is a useful approach for inclusive, diverse, and cost-effective large-scale clinical studies.
Subjects
COVID-19 Vaccines, COVID-19, Humans, Transcriptome, Mononuclear Leukocytes, SARS-CoV-2/genetics, BNT162 Vaccine, COVID-19/prevention & control, Messenger RNA/genetics, Gene Expression Profiling, Vaccination, Viral Antibodies, mRNA Vaccines
ABSTRACT
Declines in soil multifunctionality (e.g., soil capacity to provide food and energy) are closely related to changes in the soil microbiome (e.g., diversity). Determining the ecological drivers promoting such microbiome changes is critical knowledge for protecting soil functions. However, soil-microbe interactions are highly variable within environmental gradients and may not be consistent across studies. Here we propose that analysis of community dissimilarity (β-diversity) is a valuable tool for overviewing soil microbiome spatiotemporal changes. Indeed, β-diversity studies at larger scales (modelling and mapping) simplify complex multivariate interactions and refine our understanding of ecological drivers by also giving the possibility of expanding the environmental scenarios. This study represents the first spatial investigation of β-diversity in the soil microbiome of New South Wales (800,642 km²), Australia. We used metabarcoding soil data (16S rRNA and ITS genes) as exact sequence variants (ASVs) and UMAP (Uniform Manifold Approximation and Projection) as the distance metric. β-Diversity maps (1000-m resolution), with concordance correlations of 0.91-0.96 and 0.91-0.95 for bacteria and fungi, respectively, showed soil biome dissimilarities driven primarily by soil chemistry (pH and effective cation exchange capacity, ECEC) and cycles of soil temperature (land surface temperature: LST-phase and LST-amplitude). Regionally, the spatial patterns of microbes parallel the distribution of soil classes (e.g., Vertosols) beyond spatial distances and rainfall, for example. Soil classes can be valuable discriminants for monitoring approaches, for example pedogenons and pedophenons. Ultimately, cultivated soils exhibited lower richness due to declines in rare microbes, which might compromise soil functions over time.
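The map-quality figures above are concordance correlations. Assuming these are Lin's concordance correlation coefficients (the abstract does not say so explicitly), the measure can be sketched as below. The function name and the sample values are hypothetical.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient between two sequences.

    Uses population (1/n) variances and covariance. Unlike Pearson's r,
    the coefficient is penalised when predictions are shifted or rescaled
    relative to the observations.
    """
    n = len(observed)
    mean_o, mean_p = fmean(observed), fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical values: a perfect prediction and one with a constant bias.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = concordance_correlation(observed, observed)
biased = concordance_correlation(observed, [o + 1.0 for o in observed])
```

The biased prediction is perfectly linearly correlated with the observations yet scores below 1, which is why this coefficient is often preferred over Pearson's r when validating mapped predictions against measurements.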
Subjects
Microbiota, Soil, Australia, Temperature, 16S Ribosomal RNA/genetics, Soil Microbiology, Microbiota/genetics
ABSTRACT
The mechanisms involved in the homogeneous perception of odorant mixtures remain largely unknown. With the aim of enhancing knowledge about blending and masking mixture perceptions, we focused on structure-odor relationships by combining the classification and pharmacophore approaches. We built a dataset of about 5000 molecules and their related odors and reduced the multidimensional space defined by 1014 fingerprints representing the structures to a three-dimensional (3D) space using uniform manifold approximation and projection (UMAP). The self-organizing map (SOM) classification was then performed using the 3D coordinates in the UMAP space that defined specific clusters. We explored how the components of two aroma mixtures were allocated to these clusters: a blended mixture (red cordial (RC) mixture, 6 molecules) and a masking binary mixture (isoamyl acetate/whiskey-lactone [IA/WL]). Focusing on clusters containing the components of the mixtures, we looked at the odor notes carried by the molecules belonging to these clusters and also at their structural features by pharmacophore modeling (PHASE). The obtained pharmacophore models suggest that WL and IA could have one or more common binding sites at the peripheral level, but that this would be excluded for the components of RC. In vitro experiments will soon be carried out to assess these hypotheses.
Subjects
Olfactory Perception, Odorants, Pharmacophore, Algorithms, Smell
ABSTRACT
INTRODUCTION: Periprosthetic joint infection (PJI) is a serious complication after total joint arthroplasty. It is important to accurately identify PJI and monitor postoperative blood biochemical marker changes to choose the appropriate treatment strategy. In this study, we aimed to monitor the postoperative blood biochemical characteristics of PJI by contrasting them with non-PJI joint replacement cases to understand how the characteristics change postoperatively. MATERIALS AND METHODS: A total of 144 cases (52 of PJI and 92 of non-PJI) were reviewed retrospectively and split into development and validation cohorts. After exclusion of 11 cases, a total of 133 (PJI: 50, non-PJI: 83) cases were finally enrolled. A random forest (RF) classifier was developed to discriminate between PJI and non-PJI cases based on 18 preoperative blood biochemical tests. We evaluated the similarity/dissimilarity between cases based on the RF model and embedded the cases in a two-dimensional space by Uniform Manifold Approximation and Projection (UMAP). The RF model developed based on preoperative data was also applied to the same 18 blood biochemical tests at 3, 6, and 12 months after surgery to analyze postoperative pathological changes in PJI and non-PJI. A Markov chain model was applied to calculate the transition probabilities between the two clusters after surgery. RESULTS: PJI and non-PJI were discriminated with the RF classifier with an area under the receiver operating characteristic curve of 0.778. C-reactive protein (CRP), total protein, and blood urea nitrogen were identified as the important factors that discriminate between PJI and non-PJI patients. Two clusters corresponding to the high- and low-risk populations of PJI were identified in the UMAP embedding. The high-risk cluster, which included a high proportion of PJI patients, was characterized by higher CRP and lower hemoglobin. The frequency of postoperative recurrence to the high-risk cluster was higher in PJI than in non-PJI.
CONCLUSIONS: Although there was overlap between PJI and non-PJI, we were able to identify subgroups of PJI in the UMAP embedding. The machine-learning-based analytical approach is promising in consecutive monitoring of diseases such as PJI with a low incidence and long-term course.
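The Markov chain step mentioned in the methods can be sketched as a simple transition-count estimate. This is a minimal illustration under stated assumptions, not the authors' model: the cluster labels and the patient sequences below are hypothetical, with one label per follow-up time point.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities.

    Each sequence lists the cluster a case occupies at successive time
    points. Probabilities are the observed transition counts normalised
    per starting state.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


# Hypothetical cluster membership before surgery and at 3, 6, and 12 months.
sequences = [
    ["high", "high", "low", "low"],
    ["high", "low", "low", "low"],
    ["low", "low", "high", "low"],
]
probs = transition_probabilities(sequences)
```

Comparing the estimated probability of moving from the low-risk to the high-risk cluster between patient groups is one way to express the recurrence frequency described in the results.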
Subjects
Infectious Arthritis, Hip Replacement Arthroplasty, Knee Replacement Arthroplasty, Prosthesis-Related Infections, Humans, Knee Replacement Arthroplasty/adverse effects, Retrospective Studies, Prosthesis-Related Infections/diagnosis, Prosthesis-Related Infections/etiology, Biomarkers, C-Reactive Protein/analysis, Infectious Arthritis/etiology, Hip Replacement Arthroplasty/adverse effects
ABSTRACT
The outbreak of the COVID-19 pandemic has spurred the global media to produce a torrent of reports and news on the novel coronavirus. The intensity of the news chatter on various aspects of the pandemic, together with its sentiment, contributes to the uncertainty of investors in financial markets. In this research, Artificial Intelligence (AI) driven frameworks are proposed to gauge the influence of COVID-19 news on Indian stock markets through the lens of predictive modelling. Two hybrid predictive frameworks, UMAP-LSTM and ISOMAP-GBR, have been constructed to forecast the daily stock prices of 10 Indian companies from different industry verticals using several systematic media chatter indices related to the COVID-19 pandemic alongside several conventional technical indicators and macroeconomic variables. The outcome of the rigorous predictive exercise supports the utility of monitoring relevant media news worldwide and in India. Additional model interpretation using Explainable AI (XAI) methodologies indicates that a high volume of overall media hype, media coverage, fake news, etc., leads to bearish market regimes.
ABSTRACT
Pediatric SARS-CoV-2 infection is often mild or asymptomatic, and the immune responses of children are understudied compared to those of adults. Here, we present and evaluate the performance of a two-panel (16- and 17-parameter) flow cytometry-based approach for immune phenotypic analysis of cryopreserved PBMC samples from children after SARS-CoV-2 infection. The panels were optimized based on previous SARS-CoV-2 related studies for the pediatric immune system. PBMC samples from seven SARS-CoV-2 seropositive children from early 2020 and five age-matched healthy controls were stained for analysis of T-cells (panel T) and of B and innate immune cells (panel B). Performance of the panels was evaluated in two parallel approaches, namely classical manual gating of known subpopulations and unbiased clustering using the R-based algorithm PhenoGraph. Using manual gating we clearly identified 14 predefined subpopulations of interest for panel T and 19 populations in panel B in low-volume pediatric samples. PhenoGraph found 18 clusters within the T-cell panel and 21 clusters within the innate and B-cell panel that could be unmistakably annotated. Combining the data of the two panels and analysis approaches, we found expected differentially abundant clusters in SARS-CoV-2 seropositive children compared to healthy controls, underscoring the value of these two panels for the analysis of immune response to SARS-CoV-2. We established a two-panel flow cytometry approach that can be used with limited amounts of cryopreserved pediatric samples. Our workflow allowed for rapid, comprehensive, and robust pediatric immune phenotyping with comparable performance in manual gating and unbiased clustering. These panels may be adapted for large multi-center cohort studies to investigate the pediatric immune response to emerging virus variants in the ongoing and future pandemics.
Subjects
COVID-19, SARS-CoV-2, Child, Flow Cytometry, Humans, Immunity, Mononuclear Leukocytes
ABSTRACT
As the size and complexity of high-dimensional (HD) cytometry data continue to expand, comprehensive, scalable, and methodical computational analysis approaches are essential. Yet, contemporary clustering and dimensionality reduction tools alone are insufficient to analyze or reproduce analyses across large numbers of samples, batches, or experiments. Moreover, approaches that allow for the integration of data across batches or experiments are not well incorporated into computational toolkits to allow for streamlined workflows. Here we present Spectre, an R package that enables comprehensive end-to-end integration and analysis of HD cytometry data from different batches or experiments. Spectre streamlines the analytical stages of raw data pre-processing, batch alignment, data integration, clustering, dimensionality reduction, visualization, and population labelling, as well as quantitative and statistical analysis. Critically, the fundamental data structures used within Spectre, along with the implementation of machine learning classifiers, allow for the scalable analysis of very large HD datasets generated by flow cytometry, mass cytometry, or spectral cytometry. Using open and flexible data structures, Spectre can also be used to analyze data generated by single-cell RNA sequencing or HD imaging technologies, such as Imaging Mass Cytometry. The simple, clear, and modular design of analysis workflows allows these tools to be used by bioinformaticians and laboratory scientists alike. Spectre is available as an R package or Docker container. R code is available on GitHub (https://github.com/immunedynamics/spectre).
Subjects
Algorithms, Single-Cell Analysis, Cluster Analysis, Flow Cytometry/methods, Software
ABSTRACT
BACKGROUND: The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighbourhood-based dimensionality reduction of spectrograms to produce a latent space representation of calls stands out for its conceptual simplicity and effectiveness. Goal of the study/what was done: Using a dataset of manually annotated meerkat Suricata suricatta vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls. What this means: All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
Subjects
Herpestidae, Animal Vocalization, Animals
ABSTRACT
Hierarchical density-based spatial clustering of applications with noise (HDBSCAN) and uniform manifold approximation and projection (UMAP), two new state-of-the-art algorithms for clustering analysis and dimensionality reduction, respectively, are proposed for the segmentation of core-loss electron energy loss spectroscopy (EELS) spectrum images. The performances of UMAP and HDBSCAN are systematically compared to the other clustering analysis approaches used in EELS in the literature using a known synthetic dataset. Better results are found for these new approaches. Furthermore, UMAP and HDBSCAN are showcased in a real experimental dataset from a core-shell nanoparticle of iron and manganese oxides, as well as the triple combination nonnegative matrix factorization-UMAP-HDBSCAN. The results obtained indicate how the complementary use of different combinations may be beneficial in a real-case scenario to attain a complete picture, as different algorithms highlight different aspects of the dataset studied.
ABSTRACT
BACKGROUND: Due to continued advances in sequencing technology, the limitation in understanding biological systems through an "-omics" lens is no longer the generation of data, but the ability to analyze it. Importantly, much of this rich -omics data is publicly available, waiting to be further investigated. Although many code-based pipelines exist, there is a lack of user-friendly and accessible applications that enable rapid analysis or visualization of data. RESULTS: GECO (Gene Expression Clustering Optimization; http://www.theGECOapp.com) is a minimalistic GUI app that utilizes non-linear reduction techniques to rapidly visualize expression trends in many types of biological data matrices (such as bulk RNA-seq or proteomics). The required input is a data matrix with samples and any type of expression level of genes/proteins/other with a unique ID. The output is an interactive t-SNE or UMAP analysis that clusters genes (or proteins/other unique IDs) based on their expression patterns across the multiple samples, enabling visualization of expression trends. Customizable settings for dimensionality reduction and data normalization, along with visualization parameters including coloring and filters, ensure adaptability to a variety of user-uploaded data. CONCLUSION: This local and cloud-hosted web browser app enables investigation of any -omic data matrix in a rapid and code-independent manner. With the continued growth of available -omic data, the ability to quickly evaluate a dataset, including specific genes of interest, is more important than ever. GECO is intended to supplement traditional statistical analysis methods and is particularly useful when visualizing clusters of genes with similar trajectories across many samples (e.g., multiple cell types, time course, dose response).
Users will be empowered to investigate -omic data with a new lens of visualization and analysis that has the potential to uncover genes of interest, cohorts of co-regulated genes programs, and previously undetected patterns of expression.
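To make concrete why normalization matters when clustering genes by expression pattern rather than by magnitude, the following is a minimal sketch of per-gene z-scoring. This is one common choice and an assumption here: the abstract does not say which normalizations GECO offers, and the function name `zscore_rows` and the small matrix are hypothetical.

```python
import numpy as np

def zscore_rows(matrix):
    """Standardize each row (gene) to zero mean and unit variance across samples."""
    matrix = np.asarray(matrix, dtype=float)
    mean = matrix.mean(axis=1, keepdims=True)
    std = matrix.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant rows map to all zeros instead of dividing by zero
    return (matrix - mean) / std

# Hypothetical expression matrix: 4 genes (rows) x 5 samples (columns).
expression = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],       # rising trend, low magnitude
    [10.0, 20.0, 30.0, 40.0, 50.0],  # same rising trend, 10x magnitude
    [5.0, 4.0, 3.0, 2.0, 1.0],       # falling trend
    [7.0, 7.0, 7.0, 7.0, 7.0],       # flat
])
scaled = zscore_rows(expression)
```

After scaling, the first two genes have identical profiles despite their tenfold difference in magnitude, so a t-SNE or UMAP embedding of `scaled` would place them together, while the falling gene becomes their mirror image.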
Subjects
Cluster Analysis; Data Visualization; Gene Expression; Sequence Analysis, RNA; Software
ABSTRACT
Poor translatability of animal disease models has hampered the development of new inflammatory bowel disorder (IBD) therapeutics. We describe a preclinical, ex vivo system using freshly obtained and well-characterized human colorectal tissue from patients with ulcerative colitis (UC) and healthy control (HC) participants to test potential therapeutics for efficacy and target engagement, using the JAK/STAT inhibitor tofacitinib (TOFA) as a model therapeutic. Colorectal biopsies from HC participants and patients with UC were cultured and stimulated with multiple mitogens ± TOFA. Soluble biomarkers were detected using a 29-analyte multiplex ELISA. Target engagement in CD3+CD4+ and CD3+CD8+ T-cells was determined by flow cytometry in peripheral blood mononuclear cells (PBMCs) and isolated mucosal mononuclear cells (MMCs) following the activation of STAT1/3 phosphorylation. Data were analyzed using linear mixed-effects modeling, t test, and analysis of variance. Biomarker selection was performed using penalized and Bayesian logistic regression modeling, with results visualized using uniform manifold approximation and projection. Under baseline conditions, 27 of 29 biomarkers from patients with UC were increased versus HC participants. Explant stimulation increased biomarker release magnitude, expanding the dynamic range for efficacy and target engagement studies. Logistic regression analyses identified the most representative UC baseline and stimulated biomarkers. TOFA inhibited biomarkers dependent on JAK/STAT signaling. STAT1/3 phosphorylation in T-cells revealed compartmental differences between PBMCs and MMCs. Immunogen stimulation increases biomarker release in similar patterns for HC participants and patients with UC, while enhancing the dynamic range for pharmacological effects. 
This work demonstrates the power of ex vivo human colorectal tissue as a preclinical tool for evaluating target engagement and downstream effects of new IBD therapeutic agents. NEW & NOTEWORTHY: Using colorectal biopsy material from healthy volunteers and patients with clinically defined IBD supports translational research by informing the evaluation of therapeutic efficacy and target engagement for the development of new therapeutic entities. Combining experimental readouts from intact and dissociated tissue enhances our understanding of the tissue-resident immune system that contributes to disease pathology. Bayesian logistic regression modeling is an effective tool for predicting ex vivo explant biomarker release patterns.
Subjects
Colitis, Ulcerative/metabolism; Cytokines/metabolism; Intestinal Mucosa/drug effects; Piperidines/pharmacology; Protein Kinase Inhibitors/pharmacology; Pyrimidines/pharmacology; T-Lymphocytes/drug effects; Bayes Theorem; Biomarkers; Colitis, Ulcerative/pathology; Cytokines/antagonists & inhibitors; Cytokines/genetics; Gene Expression Regulation/drug effects; Humans; Intestinal Mucosa/metabolism; Intestinal Mucosa/pathology; Janus Kinases/genetics; Janus Kinases/metabolism; STAT1 Transcription Factor; STAT3 Transcription Factor; T-Lymphocytes/metabolism
ABSTRACT
Studies on the interactions between SARS-CoV-2 and humoral immunity are fundamental to developing effective therapies, including vaccines. We used polychromatic flow cytometry, coupled with unsupervised data analysis and principal component analysis (PCA), to interrogate B cells in untreated patients with COVID-19 pneumonia. COVID-19 patients displayed normal plasma levels of the main immunoglobulin classes and of antibodies against common antigens or against antigens present in common vaccines. However, we found a decreased number of total and naïve B cells, along with decreased percentages and numbers of memory switched and unswitched B cells. In contrast, IgM+ and IgM- plasmablasts were significantly increased. In vitro cell activation revealed that B lymphocytes showed a normal proliferation index and number of dividing cells per cycle. PCA indicated that B-cell number and naïve and memory B cells, but not plasmablasts, clustered with patients who were discharged, while plasma IgM level, C-reactive protein, D-dimer, and SOFA score clustered with those who died. In patients with pneumonia, the derangement of the B-cell compartment could be one of the causes of the immunological failure to control SARS-CoV-2, could have a relevant influence on several pathways, organs, and systems, and must be considered when developing vaccine strategies.
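The kind of PCA described above, in which patients and measured variables are read off the same component axes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the function name `pca_svd`, the standardization step, and the random 30 x 5 matrix are hypothetical stand-ins for a patients-by-variables table.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA of a (patients x variables) matrix via SVD of the standardized data."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Standardize so variables on different scales contribute comparably.
    scaled = centered / X.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # one row per patient
    loadings = Vt[:n_components].T                   # one row per variable
    explained = (S ** 2) / np.sum(S ** 2)            # variance fraction per component
    return scores, loadings, explained[:n_components]

# Synthetic placeholder data: 30 hypothetical patients x 5 hypothetical variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
scores, loadings, explained = pca_svd(X)
```

Plotting the rows of `scores` (patients, colored by outcome) together with the rows of `loadings` (variables, drawn as arrows) gives a biplot in which variables pointing toward a group of patients are the ones associated with that group.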