ABSTRACT
BACKGROUND: PVT1, a previously uncharacterized lncRNA, has been identified as a critical regulator of multiple tumor processes, including cell proliferation, cell motility, and angiogenesis. However, the clinical significance and underlying mechanism of PVT1 in glioma have not been fully explored. METHODS: 1210 glioma samples with transcriptome data from three independent databases (the CGGA RNA-seq, TCGA RNA-seq, and GSE16011 cohorts) were enrolled in this study. Clinical information and genomic profiles, including somatic mutations and DNA copy numbers, were collected from the TCGA cohort. R software was used for statistical calculations and graphics. Furthermore, we validated the function of PVT1 in vitro. RESULTS: Higher PVT1 expression was associated with aggressive progression of glioma, and cases with higher PVT1 expression were consistently accompanied by PTEN and EGFR alterations. In addition, functional analyses and western blot results suggested that PVT1 reduced the sensitivity of glioma cells to TMZ chemotherapy via JAK/STAT signaling, while knockdown of PVT1 increased TMZ sensitivity in vitro. Finally, high PVT1 expression was associated with reduced survival time and may serve as a strong prognostic indicator for glioma. CONCLUSIONS: This study demonstrated that PVT1 expression strongly correlates with tumor progression and chemoresistance. PVT1 may serve as a potential biomarker for the diagnosis and treatment of glioma.
Subject(s)
Drug Resistance, Neoplasm; Gene Expression Regulation, Neoplastic; Glioma; RNA, Long Noncoding; Temozolomide; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; Temozolomide/pharmacology; Temozolomide/therapeutic use; Drug Resistance, Neoplasm/genetics; Gene Knockdown Techniques; Carcinogenesis/genetics; Gene Expression Regulation, Neoplastic/genetics; Glioma/drug therapy; Glioma/genetics; Glioma/physiopathology; Survival Analysis; STAT Transcription Factors/metabolism; Janus Kinases/metabolism; Cell Line, Tumor; Humans; Male; Female; Adult; Middle Aged; Biomarkers, Tumor/metabolism
ABSTRACT
PURPOSE: Spinal ependymoma (SE) is a rare tumor that is most commonly low-grade and tends to recur when complete tumor resection is not feasible. We investigated the molecular mechanism that induces stem cell features in SE. METHODS: Immunohistochemical staining was conducted to analyze the expression of RFX2 in tumor tissues of SE patients at different stages. The expression of tumor stemness markers (Nestin and CD133) was analyzed using western blot analysis and immunofluorescence (IF), and the sphere formation efficiency of SE cells was measured. The biological activities of SE cells were analyzed by EdU proliferation, TUNEL, wound healing, and Transwell assays. The regulation of PAF1 by RFX2 was verified by ChIP-qPCR and dual-luciferase assays. SE cells were injected into the spinal cords of nude mice for in vivo assays. RESULTS: RFX2 expression was higher in the tumor tissues of SE-III patients than in those of SE-I patients. RFX2 knockdown reduced the expression of tumor stemness markers in SE cells and inhibited sphere formation efficiency. Moreover, RFX2 knockdown ameliorated the malignant progression of SE in nude mice, as manifested by prolonged survival and alleviated SE tumor infiltration. RFX2 bound to the PAF1 promoter to induce its transcription. Overexpression of PAF1 overturned the effects of RFX2 knockdown on the stem cell features and biological activities of SE cells, thereby reducing survival in mice. CONCLUSIONS: RFX2 activates PAF1 transcription, which promotes the tumor stemness of SE cells and leads to the malignant progression of SE.
Subject(s)
Ependymoma; Epigenesis, Genetic; Humans; Animals; Mice; Mice, Nude; Cell Line, Tumor; Neoplasm Recurrence, Local/pathology; Ependymoma/pathology; Cell Proliferation; Gene Expression Regulation, Neoplastic; Neoplastic Stem Cells/pathology; Regulatory Factor X Transcription Factors/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism
ABSTRACT
Uncertainty of scalar values in an ensemble dataset is often represented by the collection of their corresponding isocontours. Various techniques such as contour boxplots, contour variability plots, glyphs, and probabilistic marching cubes have been proposed to analyze and visualize ensemble isocontours. All these techniques assume that a scalar value of interest is already known to the user; little work has been done on guiding users to select the scalar values for such uncertainty analysis. Moreover, analyzing and visualizing a large collection of ensemble isocontours for a selected scalar value has its own challenges, and interpreting the visualizations of such large collections of isocontours is also a difficult task. In this work, we propose a new information-theoretic approach towards addressing these issues. Using specific information measures that estimate the predictability and surprise of specific scalar values, we evaluate the overall uncertainty associated with all the scalar values in an ensemble system. This helps scientists understand the effects of uncertainty on different data features. To understand in finer detail the contribution of individual members towards the uncertainty of the ensemble isocontours of a selected scalar value, we propose a conditional-entropy-based algorithm to quantify the individual contributions. This can help simplify analysis and visualization for systems with many members by identifying the members contributing the most towards overall uncertainty. We demonstrate the efficacy of our method by applying it to real-world datasets from material sciences, weather forecasting, and ocean simulation experiments.
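The value-selection idea above can be sketched with a simple entropy scan: for each candidate isovalue, measure how much the ensemble members disagree about which cells lie inside the isocontour. The binary-entropy formulation and the toy radial ensemble below are illustrative assumptions, not the paper's exact specific-information measures.

```python
import numpy as np

def isocontour_entropy(ensemble, isovalue):
    """Total binary entropy of the inside/outside indicator across members.

    ensemble: array of shape (n_members, *grid) of scalar fields.
    Higher entropy means members disagree more about where the
    isocontour of `isovalue` lies, i.e. more uncertainty.
    """
    inside = ensemble >= isovalue           # (n_members, *grid) booleans
    p = inside.mean(axis=0)                 # per-cell crossing probability
    eps = 1e-12                             # avoid log(0)
    h = -(p * np.log2(p + eps) + (1 - p) * np.log2(1 - p + eps))
    return h.sum()

# Toy ensemble: noisy copies of a radial field on a 2D grid.
rng = np.random.default_rng(0)
x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
base = np.sqrt(x**2 + y**2)
ensemble = np.stack([base + 0.05 * rng.standard_normal(base.shape)
                     for _ in range(20)])

# Scan candidate isovalues; peaks in this profile point at scalar
# values whose isocontours are most uncertain across the ensemble.
scores = {v: isocontour_entropy(ensemble, v) for v in (0.2, 0.5, 0.8)}
```

An isovalue far outside the data range yields near-zero entropy (all members agree), which is exactly the kind of signal that guides value selection.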
ABSTRACT
Although many deep-learning-based super-resolution approaches have been proposed in recent years, few can quantify the errors and uncertainties of their super-resolved results, because no ground truth is available at the inference stage. For scientific visualization applications, however, conveying the uncertainties of the results to scientists is crucial to avoid generating misleading or incorrect information. In this paper, we propose PSRFlow, a novel normalizing-flow-based generative model for scientific data super-resolution that incorporates uncertainty quantification into the super-resolution process. PSRFlow learns the conditional distribution of the high-resolution data based on the low-resolution counterpart. By sampling from a Gaussian latent space that captures the missing information in the high-resolution data, one can generate different plausible super-resolution outputs. The efficient sampling in the Gaussian latent space allows our model to perform uncertainty quantification for the super-resolved results. During model training, we augment the training data with samples across various scales to make the model adaptable to data of different scales, achieving flexible super-resolution for a given input. Our results demonstrate superior performance and robust uncertainty quantification compared with existing methods such as interpolation and GAN-based super-resolution networks.
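The uncertainty-quantification pattern described here — draw several plausible super-resolved fields by sampling the latent space, then summarize them with a per-voxel mean and standard deviation — can be sketched as follows. The sampler is a deliberately crude hypothetical stand-in (nearest-neighbour upscaling plus latent-driven noise), not PSRFlow's actual normalizing flow.

```python
import numpy as np

def sr_with_uncertainty(sample_fn, low_res, n_samples=32, seed=1):
    """Monte-Carlo uncertainty for a stochastic super-resolution model:
    draw plausible high-res fields from latent samples, then report the
    per-voxel ensemble mean (prediction) and std (uncertainty)."""
    rng = np.random.default_rng(seed)
    draws = np.stack([sample_fn(low_res, rng.standard_normal(low_res.size))
                      for _ in range(n_samples)])
    return draws.mean(axis=0), draws.std(axis=0)

def toy_sampler(lr, z):
    """Hypothetical stand-in for a learned sampler: 2x nearest-neighbour
    upscaling plus latent-driven perturbation at interpolated voxels.
    Voxels copied verbatim from the low-res input stay certain."""
    hr = np.repeat(np.repeat(lr, 2, axis=0), 2, axis=1)
    noise = np.repeat(np.repeat(0.1 * z.reshape(lr.shape), 2, axis=0),
                      2, axis=1)
    mask = np.ones_like(hr)
    mask[::2, ::2] = 0.0          # positions taken verbatim from `lr`
    return hr + mask * noise

lr = np.arange(16, dtype=float).reshape(4, 4)
mean, std = sr_with_uncertainty(toy_sampler, lr)
```

The resulting `std` field is zero at voxels the model is certain about and positive where different latent samples disagree, which is the quantity a visualization would convey to scientists.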
ABSTRACT
Implicit neural representations (INRs) are widely used for scientific data reduction and visualization by modeling the function that maps a spatial location to a data value. Without any prior knowledge about the spatial distribution of values, we are forced to sample densely from INRs to perform visualization tasks such as iso-surface extraction, which can be very computationally expensive. Recently, range analysis has shown promising results in improving the efficiency of geometric queries, such as ray casting and hierarchical mesh extraction, on INRs for 3D geometries by using arithmetic rules to bound the output range of the network within a spatial region. However, the analysis bounds are often too conservative for complex scientific data. In this paper, we present an improved technique for range analysis by revisiting the arithmetic rules and analyzing the probability distribution of the network output within a spatial region. We model this distribution efficiently as a Gaussian distribution by applying the central limit theorem. By excluding low-probability values, we are able to tighten the output bounds, resulting in a more accurate estimation of the value range, and hence more accurate identification of iso-surface cells and more efficient iso-surface extraction on INRs. Our approach demonstrates superior performance in terms of iso-surface extraction time on four datasets compared to the original range analysis method and can also be generalized to other geometric query tasks.
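The core idea — replacing the conservative interval-arithmetic bound with a Gaussian bound justified by the central limit theorem, excluding low-probability tails — can be illustrated for a single linear layer. The uniform-input assumption and the z = 3 cutoff below are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def interval_bound(w, lo, hi, b=0.0):
    """Conservative interval-arithmetic bound for y = w.x + b
    with each x_i independently in [lo_i, hi_i]."""
    center = (lo + hi) / 2.0
    radius = (hi - lo) / 2.0
    c = w @ center + b
    r = np.abs(w) @ radius
    return c - r, c + r

def gaussian_bound(w, lo, hi, b=0.0, z=3.0):
    """Probabilistic bound: by the central limit theorem, y = w.x + b is
    approximately Gaussian when x_i ~ Uniform[lo_i, hi_i] independently.
    Excluding the tails beyond z standard deviations yields a (usually
    much tighter) range estimate."""
    mean = w @ ((lo + hi) / 2.0) + b
    var = (w**2) @ ((hi - lo) ** 2 / 12.0)   # variance of a uniform
    s = np.sqrt(var)
    return mean - z * s, mean + z * s

rng = np.random.default_rng(2)
n = 256
w = rng.standard_normal(n)
lo, hi = -np.ones(n), np.ones(n)
ia = interval_bound(w, lo, hi)     # width grows like sum(|w_i|)
gb = gaussian_bound(w, lo, hi)     # width grows like sqrt(sum(w_i^2))
```

For many inputs, the interval bound's width grows linearly in the number of terms while the Gaussian bound's width grows only with the square root, which is why tightened bounds classify far fewer cells as potentially containing the iso-surface.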
ABSTRACT
Existing deep learning-based surrogate models facilitate efficient data generation, but fall short in uncertainty quantification, efficient parameter space exploration, and reverse prediction. In our work, we introduce SurroFlow, a novel normalizing flow-based surrogate model, to learn the invertible transformation between simulation parameters and simulation outputs. The model not only allows accurate predictions of simulation outcomes for a given simulation parameter but also supports uncertainty quantification in the data generation process. Additionally, it enables efficient simulation parameter recommendation and exploration. We integrate SurroFlow and a genetic algorithm as the backend of a visual interface to support effective user-guided ensemble simulation exploration and visualization. Our framework significantly reduces the computational costs while enhancing the reliability and exploration capabilities of scientific surrogate models.
ABSTRACT
Scene representation networks (SRNs) have been recently proposed for compression and visualization of scientific data. However, state-of-the-art SRNs do not adapt the allocation of available network parameters to the complex features found in scientific data, leading to a loss in reconstruction quality. We address this shortcoming with an adaptively placed multi-grid SRN (APMGSRN) and propose a domain decomposition training and inference technique for accelerated parallel training on multi-GPU systems. We also release an open-source neural volume rendering application that allows plug-and-play rendering with any PyTorch-based SRN. Our proposed APMGSRN architecture uses multiple spatially adaptive feature grids that learn where to be placed within the domain to dynamically allocate more neural network resources where error is high in the volume, improving the state-of-the-art reconstruction accuracy of SRNs for scientific data without requiring the expensive octree refining, pruning, and traversal of previous adaptive models. In our domain decomposition approach for representing large-scale data, we train a set of APMGSRNs in parallel on separate bricks of the volume to reduce training time while avoiding the overhead of an out-of-core solution for volumes too large to fit in GPU memory. After training, the lightweight SRNs are used for real-time neural volume rendering in our open-source renderer, where arbitrary view angles and transfer functions can be explored. A copy of this paper, all code, all models used in our experiments, and all supplemental materials and videos are available at https://github.com/skywolf829/APMGSRN.
ABSTRACT
In the biomedical domain, visualizing the document embeddings of an extensive corpus has been widely used in information-seeking tasks. However, three key challenges with existing visualizations make it difficult for clinicians to find information efficiently. First, the document embeddings used in these visualizations are generated statically by pretrained language models, which cannot adapt to the user's evolving interests. Second, existing document visualization techniques cannot effectively display how the documents are relevant to users' interests, making it difficult for users to identify the most pertinent information. Third, existing embedding generation and visualization processes suffer from a lack of interpretability, making it difficult to understand, trust, and use the result for decision-making. In this paper, we present a novel visual analytics pipeline for user-driven document representation and iterative information seeking (VADIS). VADIS introduces a prompt-based attention model (PAM) that generates dynamic document embeddings and document relevance adjusted to the user's query. To effectively visualize these two pieces of information, we design a new document map that leverages a circular grid layout to display documents based on both their relevance to the query and their semantic similarity. Additionally, to improve interpretability, we introduce a corpus-level attention visualization method to improve the user's understanding of the model focus and to enable users to identify potential oversights. This visualization, in turn, empowers users to refine, update, and introduce new queries, thereby facilitating a dynamic and iterative information-seeking experience. We evaluated VADIS quantitatively and qualitatively on a real-world dataset of biomedical research papers to demonstrate its effectiveness.
ABSTRACT
Feature grid Scene Representation Networks (SRNs) have been applied to scientific data as compact functional surrogates for analysis and visualization. Because SRNs are black-box lossy data representations, assessing the prediction quality is critical for scientific visualization applications to ensure that scientists can trust the information being visualized. Currently, existing architectures do not support inference-time reconstruction quality assessment, as coordinate-level errors cannot be evaluated in the absence of ground truth data. By employing an uncertain neural network architecture in feature grid SRNs, we obtain prediction variances during inference time to facilitate confidence-aware data reconstruction. Specifically, we propose a parameter-efficient multi-decoder SRN (MDSRN) architecture consisting of a shared feature grid with multiple lightweight multi-layer perceptron decoders. MDSRN can generate a set of plausible predictions for a given input coordinate, taking the mean as the prediction of the multi-decoder ensemble and the variance as a confidence score. The coordinate-level variance can be rendered along with the data to indicate the reconstruction quality, or integrated into uncertainty-aware volume visualization algorithms. To prevent misalignment between the quantified variance and the prediction quality, we propose a novel variance regularization loss for ensemble learning that promotes the regularized multi-decoder SRN (RMDSRN) to obtain a more reliable variance that correlates closely with the true model error. We comprehensively evaluate the quality of variance quantification and data reconstruction of Monte Carlo Dropout (MCD), Mean Field Variational Inference (MFVI), Deep Ensemble (DE), and Predicting Variance (PV) in comparison with our proposed MDSRN and RMDSRN applied to state-of-the-art feature grid SRNs across diverse scalar field datasets.
We demonstrate that RMDSRN attains the most accurate data reconstruction and competitive variance-error correlation among uncertain SRNs under the same neural network parameter budgets. Furthermore, we present an adaptation of uncertainty-aware volume rendering and shed light on the potential of incorporating uncertain predictions in improving the quality of volume rendering for uncertain SRNs. Through ablation studies on the regularization strength and decoder count, we show that MDSRN and RMDSRN are expected to perform sufficiently well with a default configuration without requiring customized hyperparameter settings for different datasets.
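The multi-decoder ensemble at the heart of this design — one shared feature vector decoded by several lightweight MLP heads, with the ensemble mean as the prediction and the variance as a confidence score — can be sketched as follows. Random weights stand in for trained decoders; the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def init_decoder(in_dim, hidden=16):
    """One lightweight MLP decoder head. Random weights stand in for a
    trained decoder; real MDSRN decoders are fitted to the data."""
    return (0.5 * rng.standard_normal((in_dim, hidden)), np.zeros(hidden),
            0.5 * rng.standard_normal((hidden, 1)), np.zeros(1))

def decode(params, x):
    """Two-layer MLP: feature vector -> hidden -> scalar value."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2

def multi_decoder_predict(feature, decoders):
    """Ensemble the decoder outputs: the mean is the prediction, the
    variance serves as a coordinate-level confidence score."""
    preds = np.stack([decode(p, feature) for p in decoders])
    return preds.mean(axis=0), preds.var(axis=0)

# A feature vector, as if interpolated from the shared feature grid
# at a query coordinate, decoded by M = 4 heads.
feat = rng.standard_normal((1, 8))
mean, var = multi_decoder_predict(feat, [init_decoder(8) for _ in range(4)])
```

Because the heads share one feature grid, the parameter overhead of the ensemble stays small while still yielding a per-coordinate variance that can be rendered alongside the reconstructed field.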
ABSTRACT
Feature-related particle data analysis plays an important role in many scientific applications, such as fluid simulations, cosmology simulations, and molecular dynamics. Compared to conventional methods that use hand-crafted feature descriptors, some recent studies focus on transforming the data into a new latent space, where features are easier to identify, compare, and extract. However, it is challenging to transform particle data into latent representations, since the convolutional neural networks used in prior studies require the data to be presented on regular grids. In this article, we adopt Geometric Convolution, a neural network building block designed for 3D point clouds, to create latent representations for scientific particle data. These latent representations capture both the particle positions and their physical attributes in the local neighborhood, so that features can be extracted by clustering in the latent space and tracked by applying tracking algorithms such as mean shift. We validate the extracted features and tracking results of our approach using datasets from three applications and show that they are comparable to those of methods that define hand-crafted features for each specific dataset.
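The clustering step mentioned above can be illustrated with a minimal flat-kernel mean-shift sketch operating on stand-in latent vectors; the bandwidth and the toy two-cluster data are assumptions for illustration, not the paper's pipeline.

```python
import numpy as np

def mean_shift(points, bandwidth, iters=30):
    """Flat-kernel mean shift: every mode repeatedly moves to the mean
    of the points within `bandwidth`, converging to cluster centers.
    In the latent-space pipeline, `points` would be the learned latent
    vectors of particle neighborhoods."""
    modes = points.copy()
    for _ in range(iters):
        for i in range(len(modes)):
            nbrs = points[np.linalg.norm(points - modes[i], axis=1) < bandwidth]
            modes[i] = nbrs.mean(axis=0)
    return modes

# Two well-separated toy clusters standing in for latent representations
# of two distinct physical features.
rng = np.random.default_rng(4)
latents = np.vstack([rng.normal(0.0, 0.1, (20, 3)),
                     rng.normal(5.0, 0.1, (20, 3))])
modes = mean_shift(latents, bandwidth=1.0)
```

After convergence, points sharing a mode belong to one feature; tracking then amounts to following each mode across time steps.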
ABSTRACT
While a multitude of studies have been conducted on graph drawing, many existing methods focus only on optimizing a single aesthetic aspect of graph layouts, which can lead to sub-optimal results. A few existing methods have attempted to develop a flexible solution for optimizing different aesthetic aspects measured by different aesthetic criteria. Furthermore, thanks to significant advances in deep learning techniques, several deep-learning-based layout methods have been proposed recently. These methods have demonstrated the advantages of deep learning approaches for graph drawing. However, none of these existing methods can be directly applied to optimizing non-differentiable criteria without special accommodation. In this work, we propose a novel Generative Adversarial Network (GAN) based deep learning framework for graph drawing that can optimize different quantitative aesthetic goals, regardless of their differentiability. To demonstrate its effectiveness and efficiency, we conducted experiments on minimizing stress, minimizing edge crossings, maximizing crossing angle, maximizing shape-based metrics, and a combination of multiple aesthetics. Compared with several popular graph drawing algorithms, the experimental results show that our framework achieves good performance both quantitatively and qualitatively.
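Of the aesthetic criteria listed, stress is the classic one; a minimal implementation of the standard normalized stress measure (an illustrative sketch, not code from the paper) shows what such a quantitative goal looks like.

```python
import numpy as np

def stress(pos, d):
    """Normalized stress of a 2D layout `pos` (n x 2) against the
    graph-theoretic distance matrix `d` (n x n); lower means the
    drawing's Euclidean distances better match the graph distances."""
    n = len(pos)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            diff = np.linalg.norm(pos[i] - pos[j]) - d[i, j]
            total += (diff / d[i, j]) ** 2
    return total

# Path graph 0-1-2: graph distances are |i - j|.
d = np.array([[0., 1., 2.], [1., 0., 1.], [2., 1., 0.]])
good = np.array([[0., 0.], [1., 0.], [2., 0.]])   # realizes d exactly
bad = np.array([[0., 0.], [0.1, 0.], [0.2, 0.]])  # collapsed layout
```

Stress is differentiable in the node positions, whereas criteria such as edge-crossing counts are not, which is precisely the gap the GAN-based framework addresses.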
ABSTRACT
Deep-learning-based latent representations have been widely used for numerous scientific visualization applications such as isosurface similarity analysis, volume rendering, flow field synthesis, and data reduction, just to name a few. However, existing latent representations are mostly generated from raw data in an unsupervised manner, which makes it difficult to incorporate domain interest to control the size of the latent representations and the quality of the reconstructed data. In this paper, we present a novel importance-driven latent representation to facilitate domain-interest-guided scientific data visualization and analysis. We utilize spatial importance maps to represent various scientific interests and take them as the input to a feature transformation network to guide latent generation. We further reduce the latent size with a lossless entropy encoding algorithm trained together with the autoencoder, improving storage and memory efficiency. We qualitatively and quantitatively evaluate the effectiveness and efficiency of latent representations generated by our method with data from multiple scientific visualization applications.
ABSTRACT
We present a novel technique for hierarchical super resolution (SR) with neural networks (NNs), which upscales volumetric data represented with an octree data structure to a high-resolution uniform grid with minimal seam artifacts on octree node boundaries. Our method uses existing state-of-the-art SR models and adds flexibility to upscale input data with varying levels of detail across the domain, instead of only the uniform grid data supported in previous approaches. The key is to use a hierarchy of SR NNs, each trained to perform 2× SR between two levels of detail, with a hierarchical SR algorithm that minimizes seam artifacts by starting from the coarsest level of detail and working up. We show that our hierarchical approach outperforms baseline interpolation and hierarchical upscaling methods, and demonstrate the usefulness of our proposed approach across three use cases: data reduction using hierarchical downsampling+SR instead of uniform downsampling+SR, computation savings for hierarchical finite-time Lyapunov exponent field calculation, and super-resolving low-resolution simulation results for a high-resolution approximate visualization.
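The coarse-to-fine traversal can be sketched as a loop that chains per-level 2× upscalers until every octree leaf reaches a uniform resolution. Nearest-neighbour doubling below is a hypothetical stand-in for the trained 2× SR networks; the leaf layout is an illustrative assumption.

```python
import numpy as np

def upscale2x(block):
    """Stand-in for one learned 2x SR network: plain nearest-neighbour
    doubling along every axis (a real system would apply the NN trained
    for this specific level-of-detail transition)."""
    for axis in range(block.ndim):
        block = np.repeat(block, 2, axis=axis)
    return block

def hierarchical_sr(octree_leaves, target_level):
    """Upscale every octree leaf to a uniform target level of detail,
    coarsest leaves first, by chaining the per-level 2x models.

    octree_leaves: list of (level, data) pairs, level 0 = coarsest.
    """
    out = []
    for level, data in sorted(octree_leaves, key=lambda leaf: leaf[0]):
        for _ in range(target_level - level):   # chain 2x steps
            data = upscale2x(data)
        out.append(data)
    return out

# Two leaves at different levels of detail end up at one resolution.
leaves = [(1, np.full((4, 4), 2.0)), (0, np.ones((2, 2)))]
uniform = hierarchical_sr(leaves, 2)
```

In the actual method, neighbouring-node context at each step is what suppresses seams on octree node boundaries; this sketch only shows the level-chaining structure.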
ABSTRACT
We explore an online reinforcement learning (RL) paradigm to dynamically optimize parallel particle tracing performance in distributed-memory systems. Our method combines three novel components: (1) a work donation algorithm, (2) a high-order workload estimation model, and (3) a communication cost model. First, we design an RL-based work donation algorithm. Our algorithm monitors workloads of processes and creates RL agents to donate data blocks and particles from high-workload processes to low-workload processes to minimize program execution time. The agents learn the donation strategy on the fly based on reward and cost functions designed to consider processes' workload changes and data transfer costs of donation actions. Second, we propose a workload estimation model, helping RL agents estimate the workload distribution of processes in future computations. Third, we design a communication cost model that considers both block and particle data exchange costs, helping RL agents make effective decisions with minimized communication costs. We demonstrate that our algorithm adapts to different flow behaviors in large-scale fluid dynamics, ocean, and weather simulation data. Our algorithm improves parallel particle tracing performance in terms of parallel efficiency, load balance, and costs of I/O and communication for evaluations with up to 16,384 processors.
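The shape of a donation action — move work from a high-workload process to a low-workload one when imbalance exceeds a threshold — can be sketched with a fixed greedy rule. This is only a hedged illustration: the paper's RL agents learn when and how much to donate from reward and cost signals, including communication costs that this sketch ignores.

```python
def donation_step(workloads, threshold=1.2):
    """One greedy donation: if the busiest process exceeds `threshold`
    times the average workload, it donates half of its surplus over the
    least-busy process. (Illustrative fixed rule, not the learned
    policy; `threshold` is a hypothetical parameter.)"""
    avg = sum(workloads) / len(workloads)
    hi = max(range(len(workloads)), key=workloads.__getitem__)
    lo = min(range(len(workloads)), key=workloads.__getitem__)
    if workloads[hi] > threshold * avg:
        gift = (workloads[hi] - workloads[lo]) / 2.0
        workloads[hi] -= gift
        workloads[lo] += gift
    return workloads

# Three processes with a heavy imbalance: one step evens out the top two.
balanced = donation_step([10.0, 1.0, 1.0])
```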
ABSTRACT
We propose VDL-Surrogate, a view-dependent neural-network-latent-based surrogate model for parameter space exploration of ensemble simulations that allows high-resolution visualizations and user-specified visual mappings. Surrogate-enabled parameter space exploration allows domain scientists to preview simulation results without having to run a large number of computationally costly simulations. Limited by computational resources, however, existing surrogate models may not produce previews with sufficient resolution for visualization and analysis. To improve the efficient use of computational resources and support high-resolution exploration, we perform ray casting from different viewpoints to collect samples and produce compact latent representations. This latent encoding process reduces the cost of surrogate model training while maintaining the output quality. In the model training stage, we select viewpoints to cover the whole viewing sphere and train corresponding VDL-Surrogate models for the selected viewpoints. In the model inference stage, we predict the latent representations at previously selected viewpoints and decode the latent representations to data space. For any given viewpoint, we make interpolations over decoded data at selected viewpoints and generate visualizations with user-specified visual mappings. We show the effectiveness and efficiency of VDL-Surrogate in cosmological and ocean simulations with quantitative and qualitative evaluations. Source code is publicly available at https://github.com/trainsn/VDL-Surrogate.
ABSTRACT
Public opinion surveys constitute a widespread, powerful tool to study people's attitudes and behaviors from comparative perspectives. However, even global surveys can have limited geographic and temporal coverage, which can hinder the production of comprehensive knowledge. To expand the scope of comparison, social scientists turn to ex-post harmonization of variables from datasets that cover similar topics but in different populations and/or at different times. These harmonized datasets can be analyzed as a single source and accessed through various data portals. However, the Survey Data Recycling (SDR) research project has identified three challenges faced by social scientists when using data portals: the lack of capability to explore data in depth or query data based on customized needs, the difficulty in efficiently identifying related data for studies, and the inability to evaluate theoretical models using sliced data. To address these issues, the SDR research project has developed the SDRQuerier, which is applied to the harmonized SDR database. The SDRQuerier includes a BERT-based model that allows for customized data queries through research questions or keywords (Query-by-Question), a visual design that helps users determine the availability of harmonized data for a given research question (Query-by-Condition), and the ability to reveal the underlying relational patterns among substantive and methodological variables in the database (Query-by-Relation), aiding in the rigorous evaluation or improvement of regression models. Case studies with multiple social scientists have demonstrated the usefulness and effectiveness of the SDRQuerier in addressing daily challenges.
Subject(s)
Computer Graphics; Databases, Factual
ABSTRACT
The Internet of Food (IoF) is an emerging field in smart foodsheds, involving the creation of a knowledge graph (KG) about the environment, agriculture, food, diet, and health. However, the heterogeneity and size of the KG present challenges for downstream tasks, such as information retrieval and interactive exploration. To address those challenges, we propose an interactive knowledge and learning environment (IKLE) that integrates three programming and modeling languages to support multiple downstream tasks in the analysis pipeline. To make IKLE easier to use, we have developed algorithms to automate the generation of each language. In addition, we collaborated with domain experts to design and develop a dataflow visualization system, which embeds the automatic language generations into components and allows users to build their analysis pipeline by dragging and connecting components of interest. We have demonstrated the effectiveness of IKLE through three real-world case studies in smart foodsheds.
ABSTRACT
Viscous and gravitational flow instabilities cause a displacement front to break up into finger-like fluid structures. The detection and evolutionary analysis of these fingering instabilities are critical in multiple scientific disciplines such as fluid mechanics and hydrogeology. However, previous detection methods for viscous and gravitational fingers are based on density thresholding, which provides limited geometric information about the fingers. The geometric structures of fingers and their evolution are important yet little studied in the literature. In this article, we explore the geometric detection and evolution of the fingers in detail to elucidate the dynamics of the instability. We propose a ridge voxel detection method to guide the extraction of finger cores from three-dimensional (3D) scalar fields. After skeletonizing finger cores into skeletons, we design a spanning-tree-based approach to capture how fingers branch spatially from the finger skeletons. Finally, we devise a novel geometric-glyph augmented tracking graph to study how the fingers and their branches grow, merge, and split over time. Feedback from earth scientists demonstrates the usefulness of our approach for performing spatio-temporal geometric analyses of fingers.
ABSTRACT
Cdc42, a conserved Rho GTPase, plays a central role in polarity establishment in yeast and animals. Cell polarity is critical for asymmetric cell division, and asymmetric cell division underlies the replicative aging of budding yeast. Yet how Cdc42 and other polarity factors impact life span is largely unknown. Here we show by live-cell imaging that the active Cdc42 level is sporadically elevated in wild-type cells during repeated cell divisions but rarely in long-lived bud8 deletion cells. We find a novel localization of Bud8 to cytokinesis remnants, which also recruit Rga1, a Cdc42 GTPase-activating protein. Genetic analyses and live-cell imaging suggest that Rga1 and Bud8 have opposite impacts on life span, likely by modulating active Cdc42 levels. An rga1 mutant, which has a shorter life span, dies in the unbudded state with a defect in polarity establishment. Remarkably, Cdc42 accumulates in old cells, and its mild overexpression accelerates aging, with frequent symmetric cell divisions, despite having no harmful effects on young cells. Our findings suggest that the interplay among these positive and negative polarity factors limits the life span of budding yeast.
Subject(s)
Saccharomycetales; Cell Polarity/physiology; GTPase-Activating Proteins/metabolism; Longevity; Saccharomyces cerevisiae/metabolism; Saccharomycetales/metabolism; Up-Regulation; cdc42 GTP-Binding Protein/metabolism; cdc42 GTP-Binding Protein, Saccharomyces cerevisiae/metabolism
ABSTRACT
Many Information Retrieval (IR) approaches have been proposed to extract relevant information from a large corpus. Among these methods, phrase-based retrieval methods have been proven to capture more concrete and concise information than word-based and paragraph-based methods. However, due to the complex relationships among phrases and a lack of proper visual guidance, achieving user-driven interactive information seeking and retrieval remains challenging. In this study, we present a visual analytics approach for users to efficiently seek information from an extensive collection of documents. The main component of our approach is a PhraseMap, where nodes and edges represent the keyphrases extracted from a large corpus and their relationships, respectively. To build the PhraseMap, we extract keyphrases from each document and link the phrases according to word attention determined using modern language models, i.e., BERT. As can be imagined, the graph is complex due to the extensive volume of information and the massive number of relationships. Therefore, we develop a navigation algorithm to facilitate information seeking. It includes (1) a question-answering (QA) model to identify phrases related to users' queries and (2) updating relevant phrases based on users' feedback. To better present the PhraseMap, we introduce a resource-controlled self-organizing map (RC-SOM) to evenly and regularly display phrases on grid cells while expecting phrases with similar semantics to stay close in the visualization. To evaluate our approach, we conducted case studies with three domain experts working with diverse literature. The results and feedback demonstrate its effectiveness, usability, and intelligence.