ABSTRACT
Storing analytical provenance generates a knowledge base with a large potential for recalling previous results and guiding users in future analyses. However, without extensive manual creation of meta information and annotations by the users, search and retrieval of analysis states can become tedious. We present KnowledgePearls, a solution for efficient retrieval of analysis states that are structured as provenance graphs containing automatically recorded user interactions and visualizations. As a core component, we describe a visual interface for querying and exploring analysis states based on their similarity to a partial definition of a requested analysis state. Depending on the use case, this definition may be provided explicitly by the user by formulating a search query or inferred from given reference states. We explain our approach using the example of efficient retrieval of demographic analyses by Hans Rosling and discuss our implementation for a fast look-up of previous states. Our approach is independent of the underlying visualization framework. We discuss the applicability for visualizations which are based on the declarative grammar Vega and we use a Vega-based implementation of Gapminder as guiding example. We additionally present a biomedical case study to illustrate how KnowledgePearls facilitates the exploration process by recalling states from earlier analyses.
ABSTRACT
Numerous bacterial genetic markers are available for the molecular detection of human sources of fecal pollution in environmental waters. However, widespread application is hindered by a lack of knowledge regarding geographical stability, limiting implementation to a small number of well-characterized regions. This study investigates the geographic distribution of five human-associated genetic markers (HF183/BFDrev, HF183/BacR287, BacHum-UCD, BacH, and Lachno2) in municipal wastewaters (raw and treated) from 29 urban and rural wastewater treatment plants (750 to 4,400,000 population equivalents) from 13 countries spanning six continents. In addition, genetic markers were tested against 280 human and nonhuman fecal samples from domesticated, agricultural and wild animal sources. Findings revealed that all genetic markers are present in consistently high concentrations in raw (median log10 7.2 to 8.0 marker equivalents (ME) per 100 mL) and biologically treated wastewater samples (median log10 4.6 to 6.0 ME per 100 mL) regardless of location and population. The false positive rates of the various markers in nonhuman fecal samples ranged from 5% to 47%. Results suggest that several genetic markers have considerable potential for measuring human-associated contamination in polluted environmental waters. This will be helpful in water quality monitoring, pollution modeling and health risk assessment (as demonstrated by QMRAcatch) to guide target-oriented water safety management across the globe.
Subject(s)
Wastewater; Water Pollution; Animals; Environmental Monitoring; Feces; Genetic Markers; Humans; Water Microbiology
ABSTRACT
Balancing accuracy gains with other objectives such as interpretability is a key challenge when building decision trees. However, this process is difficult to automate because it involves know-how about the domain as well as the purpose of the model. This paper presents TreePOD, a new approach for sensitivity-aware model selection along trade-offs. TreePOD is based on exploring a large set of candidate trees generated by sampling the parameters of tree construction algorithms. Based on this set, visualizations of quantitative and qualitative tree aspects provide a comprehensive overview of possible tree characteristics. Along trade-offs between two objectives, TreePOD provides efficient selection guidance by focusing on Pareto-optimal tree candidates. TreePOD also conveys the sensitivities of tree characteristics to variations of selected parameters by extending the tree generation process with a full-factorial sampling. We demonstrate how TreePOD supports a variety of tasks involved in decision tree selection and describe its integration in a holistic workflow for building and selecting decision trees. For evaluation, we illustrate a case study for predicting critical power grid states, and we report qualitative feedback from domain experts in the energy sector. This feedback suggests that TreePOD enables users with and without a statistical background to identify suitable decision trees confidently and efficiently.
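The focus on Pareto-optimal candidates along a trade-off between two objectives can be illustrated with a minimal sketch. It is not TreePOD's implementation: the `Candidate` type, the choice of accuracy and leaf count as the two objectives, and all numbers are assumptions made up for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one candidate tree (illustrative fields only)."""
    name: str
    accuracy: float  # higher is better
    n_leaves: int    # lower is better, used here as a proxy for interpretability


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.n_leaves <= b.n_leaves
    better = a.accuracy > b.accuracy or a.n_leaves < b.n_leaves
    return no_worse and better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up candidate set standing in for trees sampled from parameter settings.
candidates = [
    Candidate("t1", 0.80, 4),
    Candidate("t2", 0.85, 8),
    Candidate("t3", 0.84, 12),  # dominated by t2
    Candidate("t4", 0.90, 20),
    Candidate("t5", 0.78, 6),   # dominated by t1
]
front = pareto_front(candidates)
print([c.name for c in front])
```

The quadratic pairwise check is adequate for a few thousand candidates; a sort-based sweep would be a reasonable replacement for larger sets.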
ABSTRACT
A common strategy in Multi-Criteria Decision Making (MCDM) is to rank alternative solutions by weighted summary scores. Weights, however, are often abstract to the decision maker and can only be set by vague intuition. While previous work supports a point-wise exploration of weight spaces, we argue that MCDM can benefit from a regional and global visual analysis of weight spaces. Our main contribution is WeightLifter, a novel interactive visualization technique for weight-based MCDM that facilitates the exploration of weight spaces with up to ten criteria. Our technique enables users to better understand the sensitivity of a decision to changes of weights, to efficiently localize weight regions where a given solution ranks high, and to filter out solutions which do not rank high enough for any plausible combination of weights. We provide a comprehensive requirement analysis for weight-based MCDM and describe an interactive workflow that meets these requirements. For evaluation, we describe a usage scenario of WeightLifter in automotive engineering and report qualitative feedback from users of a deployed version as well as preliminary feedback from decision makers in multiple domains. This feedback confirms that WeightLifter increases both the efficiency of weight-based MCDM and the awareness of uncertainty in the ultimate decisions.
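Ranking by weighted summary scores, and the sensitivity of that ranking to the weights, can be sketched as follows. This is an illustrative sketch only, not WeightLifter's method: the alternatives, their criterion values, and the Monte Carlo sampling of the weight space are assumptions introduced here.

```python
import random


def weighted_scores(alternatives, weights):
    """Weighted sum of criterion values for every alternative."""
    return {name: sum(w * v for w, v in zip(weights, values))
            for name, values in alternatives.items()}


def rank(alternatives, weights):
    """Alternative names ordered from best to worst weighted score."""
    scores = weighted_scores(alternatives, weights)
    return sorted(scores, key=scores.get, reverse=True)


def sample_weights(n_criteria, rng):
    """Draw non-negative weights summing to 1, uniformly over the simplex."""
    raw = [rng.expovariate(1.0) for _ in range(n_criteria)]
    total = sum(raw)
    return [r / total for r in raw]


def win_share(alternatives, n_samples=2000, seed=0):
    """Fraction of sampled weight vectors for which each alternative ranks first."""
    rng = random.Random(seed)
    n_criteria = len(next(iter(alternatives.values())))
    wins = {name: 0 for name in alternatives}
    for _ in range(n_samples):
        best = rank(alternatives, sample_weights(n_criteria, rng))[0]
        wins[best] += 1
    return {name: count / n_samples for name, count in wins.items()}


# Made-up alternatives with three normalized criteria (higher is better).
alternatives = {
    "A": [0.9, 0.2, 0.5],
    "B": [0.4, 0.8, 0.6],
    "C": [0.3, 0.3, 0.4],  # worse than B on every criterion
}
shares = win_share(alternatives)
print(rank(alternatives, [1.0, 0.0, 0.0]), shares)
```

An alternative whose share is zero, such as `C` here, is a candidate for filtering because it does not rank first for any sampled combination of weights.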
ABSTRACT
Trends like decentralized energy production lead to an exploding number of time series from sensors and other sources that need to be assessed regarding their data quality (DQ). While the identification of DQ problems for such routinely collected data is typically based on existing automated plausibility checks, an efficient inspection and validation of check results for hundreds or thousands of time series is challenging. The main contribution of this paper is the validated design of Visplause, a system to support an efficient inspection of DQ problems for many time series. The key idea of Visplause is to utilize meta-information concerning the semantics of both the time series and the plausibility checks for structuring and summarizing results of DQ checks in a flexible way. Linked views enable users to inspect anomalies in detail and to generate hypotheses about possible causes. The design of Visplause was guided by goals derived from a comprehensive task analysis with domain experts in the energy sector. We reflect on the design process by discussing design decisions at four stages and we identify lessons learned. We also report feedback from domain experts after using Visplause for a period of one month. This feedback suggests significant efficiency gains for DQ assessment, increased confidence in the DQ, and the applicability of Visplause to summarize indicators also outside the context of DQ.
ABSTRACT
3D visibility analysis plays a key role in urban planning for assessing the visual impact of proposed buildings on the cityscape. A call for proposals typically yields around 30 candidate buildings that need to be evaluated with respect to selected viewpoints. Current visibility analysis methods are very time-consuming and limited to a small number of viewpoints. Further, analysts have measures neither to evaluate candidates quantitatively nor to compare them efficiently. The primary contribution of this work is the design study of Vis-A-Ware, a visualization system to qualitatively and quantitatively evaluate, rank, and compare visibility data of candidate buildings with respect to a large number of viewpoints. Vis-A-Ware features a 3D spatial view of an urban scene and non-spatial views of data derived from visibility evaluations, which are tightly integrated by linked interaction. To enable a quantitative evaluation, we developed four metrics in accordance with experts from urban planning. We illustrate the applicability of Vis-A-Ware on the basis of a use case scenario and present results from informal feedback sessions with domain experts from urban planning and development. This feedback suggests that Vis-A-Ware is a valuable tool for visibility analysis, allowing analysts to answer complex questions more efficiently and objectively.
ABSTRACT
The visual analysis of surface cracks plays an essential role in tunnel maintenance when assessing the condition of a tunnel. To identify patterns of cracks that endanger the structural integrity of the tunnel's concrete surface, analysts need an integrated solution for visual analysis of geometric and multivariate data to decide whether a repair project is necessary. The primary contribution of this work is a design study that supports tunnel crack analysis by tightly integrating geometric and attribute views, giving users a holistic visual analysis of geometric representations and multivariate attributes. Our secondary contribution is Visual Analytics and Rendering, a methodological approach which addresses challenges and recurring design questions in integrated systems. We evaluated the tunnel crack analysis solution in informal feedback sessions with experts from tunnel maintenance and surveying. We substantiated the derived methodology by providing guidelines and linking it to examples from the literature.
ABSTRACT
State-of-the-art lighting design is based on physically accurate lighting simulations of scenes such as offices. The simulation results support lighting designers in the creation of lighting configurations, which must meet contradicting customer objectives regarding quality and price while conforming to industry standards. However, current tools for lighting design impede rapid feedback cycles. On the one hand, they decouple analysis and simulation specification. On the other hand, they lack capabilities for a detailed comparison of multiple configurations. The primary contribution of this paper is a design study of LiteVis, a system for efficient decision support in lighting design. LiteVis tightly integrates global illumination-based lighting simulation, a spatial representation of the scene, and non-spatial visualizations of parameters and result indicators. This enables an efficient iterative cycle of simulation parametrization and analysis. Specifically, a novel visualization supports decision making by ranking simulated lighting configurations with regard to a weight-based prioritization of objectives that considers both spatial and non-spatial characteristics. In the spatial domain, novel concepts support a detailed comparison of illumination scenarios. We demonstrate LiteVis using a real-world use case and report qualitative feedback from lighting designers. This feedback indicates that LiteVis helps lighting designers achieve key tasks more efficiently and with greater certainty.
ABSTRACT
An increasing number of interactive visualization tools stress the integration with computational software like MATLAB and R to access a variety of proven algorithms. In many cases, however, the algorithms are used as black boxes that run to completion in isolation, which contradicts the needs of interactive data exploration. This paper structures, formalizes, and discusses possibilities to enable user involvement in ongoing computations. Based on a structured characterization of needs regarding intermediate feedback and control, the main contribution is a formalization and comparison of strategies for achieving user involvement for algorithms with different characteristics. In the context of integration, we describe considerations for implementing these strategies either as part of the visualization tool or as part of the algorithm, and we identify requirements and guidelines for the design of algorithmic APIs. To assess the practical applicability, we provide a survey of frequently used algorithm implementations within R regarding the fulfillment of these guidelines. While echoing previous calls for analysis modules which support data exploration more directly, we conclude that a range of pragmatic options for enabling user involvement in ongoing computations exists on both the visualization and algorithm side and should be used.
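One pragmatic way an algorithm can expose intermediate feedback and control, instead of running to completion as a black box, is to hand control back to the caller after every iteration. The sketch below assumes a Python generator and a toy 1D k-means; it is not one of the strategies formalized in the paper, and the function name and data are hypothetical.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Toy 1D k-means that yields (iteration, centers) after every update.

    Yielding lets the caller display intermediate results and stop early
    instead of waiting for convergence.
    """
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assign every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        yield iteration, new_centers
        if new_centers == centers:
            return  # converged
        centers = new_centers


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

# Run to convergence while observing every intermediate result.
history = list(kmeans_1d(points, [0.0, 6.0]))

# Early stop: the caller takes one intermediate result and abandons the rest.
steps = kmeans_1d(points, [0.0, 6.0])
first_iteration, first_centers = next(steps)
steps.close()
```

Here the caller, for example a visualization tool, decides when to stop, so the algorithm itself needs no knowledge of the user interface.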
Subject(s)
Algorithms; Computer Graphics; Software; User-Computer Interface; Humans
ABSTRACT
Various case studies in different application domains have shown the great potential of visual parameter space analysis to support validating and using simulation models. In order to guide and systematize research endeavors in this area, we provide a conceptual framework for visual parameter space analysis problems. The framework is based on our own experience and a structured analysis of the visualization literature. It contains three major components: (1) a data flow model that helps to abstractly describe visual parameter space analysis problems independent of their application domain; (2) a set of four navigation strategies describing how parameter space analysis can be supported by visualization tools; and (3) a characterization of six analysis tasks. Based on our framework, we analyze and classify the current body of literature, and identify three open research gaps in visual parameter space analysis. The framework and its discussion are meant to support visualization designers and researchers in characterizing parameter space analysis problems and to guide their design and evaluation processes.
ABSTRACT
Regression models play a key role in many application domains for analyzing or predicting a quantitative dependent variable based on one or more independent variables. Automated approaches for building regression models are typically limited with respect to incorporating domain knowledge in the process of selecting input variables (also known as feature subset selection). Other limitations include the identification of local structures, transformations, and interactions between variables. The contribution of this paper is a framework for building regression models addressing these limitations. The framework combines a qualitative analysis of relationship structures by visualization and a quantification of relevance for ranking any number of features and pairs of features which may be categorical or continuous. A central aspect is the local approximation of the conditional target distribution by partitioning 1D and 2D feature domains into disjoint regions. This enables a visual investigation of local patterns and largely avoids structural assumptions for the quantitative ranking. We describe how the framework supports different tasks in model building (e.g., validation and comparison), and we present an interactive workflow for feature subset selection. A real-world case study illustrates the step-wise identification of a five-dimensional model for natural gas consumption. We also report feedback from domain experts after two months of deployment in the energy sector, indicating a significant effort reduction for building and improving regression models.
Subject(s)
Algorithms; Computer Graphics; Models, Statistical; Regression Analysis; User-Computer Interface; Computer Simulation; Reproducibility of Results; Sensitivity and Specificity
ABSTRACT
Many application domains deal with multi-variate data that consist of both categorical and numerical information. Small-multiple displays are a powerful concept for comparing such data by juxtaposition. For comparison by overlay or by explicit encoding of computed differences, however, a specification of references is necessary. In this paper, we present a formal model for defining semantically meaningful comparisons between many categories in a small-multiple display. Based on pivotized data that are hierarchically partitioned by the categories assigned to the x and y axes of the display, we propose two alternatives for structure-based comparison within this hierarchy. With an absolute reference specification, categories are compared to a fixed reference category. With a relative reference specification, in contrast, a semantic ordering of the categories is considered when comparing each category to either the previous or the subsequent one. Both reference specifications can be defined at multiple levels of the hierarchy (including aggregated summaries), enabling a multitude of useful comparisons. We demonstrate the general applicability of our model in several application examples using different visualizations that compare data by overlay or explicit encoding of differences.
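The difference between an absolute and a relative reference specification can be sketched for a single level of categories. This is a simplified illustration, not the paper's formal model: the function names, the use of plain differences, and the yearly values are assumptions.

```python
def compare_absolute(values, reference):
    """Difference of every category to one fixed reference category."""
    ref_value = values[reference]
    return {category: value - ref_value for category, value in values.items()}


def compare_relative(values, order, offset=-1):
    """Difference of every category to its neighbor in a semantic ordering.

    offset=-1 compares to the previous category, offset=+1 to the subsequent
    one. Categories without such a neighbor map to None.
    """
    differences = {}
    for i, category in enumerate(order):
        j = i + offset
        if 0 <= j < len(order):
            differences[category] = values[category] - values[order[j]]
        else:
            differences[category] = None
    return differences


# Made-up aggregated values per category (here: years, ordered semantically).
values = {"2019": 100, "2020": 120, "2021": 90}
order = ["2019", "2020", "2021"]

absolute = compare_absolute(values, reference="2019")
to_previous = compare_relative(values, order, offset=-1)
to_next = compare_relative(values, order, offset=+1)
print(absolute, to_previous, to_next)
```

The resulting differences are what a display could encode explicitly, while the reference values themselves are what it could draw as an overlay.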
Subject(s)
Algorithms; Computer Graphics; Decision Support Techniques; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; User-Computer Interface; Reproducibility of Results; Sensitivity and Specificity
ABSTRACT
During continuous user interaction, it is hard to provide rich visual feedback at interactive rates for datasets containing millions of entries. The contribution of this paper is a generic architecture that ensures responsiveness of the application even when dealing with large data and that is applicable to most types of information visualizations. Our architecture builds on the separation of the main application thread and the visualization thread, which can be cancelled early due to user interaction. In combination with a layer mechanism, our architecture facilitates generating previews incrementally to provide rich visual feedback quickly. To help avoid common pitfalls of multi-threading, we discuss synchronization and communication in detail. We explicitly denote design choices to control trade-offs. A quantitative evaluation based on the system VISPLORE shows fast visual feedback during continuous interaction even for millions of entries. We describe instantiations of our architecture in additional tools.
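The idea of a visualization thread that renders layer by layer and can be cancelled early can be sketched with Python's `threading` module. This is a minimal sketch under stated assumptions, not the architecture described in the paper or the VISPLORE implementation: `render_layers`, `VisualizationWorker`, and the string "layers" are hypothetical, and `cancel()` blocks until the old thread has stopped, which a real application thread would likely avoid.

```python
import threading


def render_layers(layers, cancel, on_preview):
    """Render layers in order and publish each result as a preview.

    The cancel flag is checked before every layer, so a stale request stops
    at the next layer boundary. Returns True if all layers were rendered.
    """
    for layer in layers:
        if cancel.is_set():
            return False
        on_preview(layer())
    return True


class VisualizationWorker:
    """Runs rendering in a separate thread; a new request cancels the old one."""

    def __init__(self):
        self._thread = None
        self._cancel = threading.Event()

    def request(self, layers, on_preview):
        self.cancel()                      # stop work made stale by interaction
        self._cancel = threading.Event()   # fresh flag for the new request
        self._thread = threading.Thread(
            target=render_layers,
            args=(layers, self._cancel, on_preview),
            daemon=True,
        )
        self._thread.start()

    def cancel(self):
        self._cancel.set()
        self.wait()

    def wait(self):
        if self._thread is not None:
            self._thread.join()
            self._thread = None


# Uninterrupted request: every layer produces a preview.
previews = []
worker = VisualizationWorker()
worker.request([lambda: "coarse", lambda: "medium", lambda: "fine"], previews.append)
worker.wait()

# Simulated interaction arriving while the second layer is being rendered.
cancel = threading.Event()


def interrupting_layer():
    cancel.set()
    return "medium"


partial = []
finished = render_layers([lambda: "coarse", interrupting_layer, lambda: "fine"],
                         cancel, partial.append)
```

Checking the flag only at layer boundaries keeps synchronization simple; finer-grained checks inside a layer would shorten the reaction time at the cost of more bookkeeping.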