ABSTRACT
Contrary to the general belief that carbonic acid is too unstable to be synthesized, both solid[1,2] and gas-phase carbonic acid[3] have been prepared. It was suggested that solid carbonic acid might exist in Earth's upper troposphere and in the harsh environments of other solar bodies,[4] where it undergoes a cycle of synthesis, decomposition, and dimerization.[5] To provide spectroscopic data for probing the existence of extraterrestrial carbonic acid,[2,6] matrix-isolation infrared (MI-IR) spectroscopy has proven essential.[3,4,6-8] However, early assignments within the harmonic approximation using scaling factors impeded a full interpretation of the rather complex MI-IR spectrum of H2CO3. Recently, carbonic acid was detected in the Galactic center molecular cloud,[9] triggering new interest in the anharmonic spectrum.[10] In this regard, we substantially reassign our argon MI-IR spectra based on accurate anharmonic calculations. We calculate a four-mode potential energy surface (PES) at the explicitly correlated coupled-cluster level of theory using up to triple-zeta basis sets, i.e., CCSD(T)-F12/cc-pVTZ-F12. On this PES, we perform vibrational self-consistent field and configuration interaction (VSCF/VCI) calculations to obtain accurate vibrational transition frequencies and a resonance analysis of the fundamentals, first overtones, and combination bands. In total, 12 new bands can be assigned, extending the spectral data for carbonic acid and thus simplifying detection in more complex environments. Furthermore, we clarify disputed assignments between the cc- and ct-conformer.
ABSTRACT
In this work, earlier studies reporting α-H2CO3 are revised. The cryo-technique pioneered by Hage, Hallbrucker, and Mayer (HHM) is adapted to supposedly prepare carbonic acid from KHCO3. In methanolic solution, methylation of the salt is found, which upon acidification transforms to the monomethyl ester of carbonic acid (CAME, HO-CO-OCH3). Infrared spectroscopy data both of the solid at 210 K and of the evaporated molecules trapped and isolated in argon matrix at 10 K are presented. The interpretation of the observed bands on the basis of carbonic acid [as suggested originally by HHM in their publications from 1993-1997 and taken over by Winkel et al., J. Am. Chem. Soc. 2007 and Bernard et al., Angew. Chem. Int. Ed. 2011] is inferior to the interpretation on the basis of CAME. The assignment relies on isotope substitution experiments, including deuteration of the OH and CH3 groups as well as 12C and 13C isotope exchange, and on variation of the solvents in both preparation steps. The interpretation of the single molecule spectroscopy experiments is aided by a comprehensive calculation of high-level ab initio frequencies for gas-phase molecules and clusters in the harmonic approximation. This analysis provides evidence for the existence of not only single CAME molecules but also CAME dimers and water complexes in the argon matrix. Furthermore, different conformational CAME isomers are identified, where conformational isomerism is triggered in experiments through UV irradiation. In contrast to earlier studies, this analysis allows explanation of almost every single band of the complex spectra in the range between 4000 and 600 cm-1.
ABSTRACT
In medical diagnostics of both early disease detection and routine patient care, particle-based contamination of in-vitro diagnostics consumables poses a significant threat to patients. Objective data-driven decision-making on the severity of contamination is key for reducing patient risk, while saving time and cost in quality assessment. Our collaborators introduced us to their quality control process, including particle data acquisition through image recognition, feature extraction, and attributes reflecting the production context of particles. Shortcomings in the current process are limitations in exploring thousands of images, data-driven decision making, and ineffective knowledge externalization. Following the design study methodology, our contributions are a characterization of the problem space and requirements, the development and validation of DaedalusData, a comprehensive discussion of our study's learnings, and a generalizable framework for knowledge externalization. DaedalusData is a visual analytics system that enables domain experts to explore particle contamination patterns, label particles in label alphabets, and externalize knowledge through semi-supervised label-informed data projections. The results of our case study and user study show high usability of DaedalusData and its efficient support of experts in generating comprehensive overviews of thousands of particles, labeling of large quantities of particles, and externalizing knowledge to augment the dataset further. Reflecting on our approach, we discuss insights on dataset augmentation via human knowledge externalization, and on the scalability and trade-offs that come with the adoption of this approach in practice.
ABSTRACT
OBJECTIVES: This article describes the design and evaluation of MS Pattern Explorer, a novel visual tool that uses interactive machine learning to analyze fitness wearables' data. Applied to a clinical study of multiple sclerosis (MS) patients, the tool addresses key challenges: managing activity signals, accelerating insight generation, and rapidly contextualizing identified patterns. By analyzing sensor measurements, it aims to enhance understanding of MS symptomatology and to address the broader problem of clinical exploratory sensor data analysis. MATERIALS AND METHODS: Following a user-centered design approach, we learned that clinicians have 3 priorities for generating insights for the BarKA-MS study data: exploration and search for, and contextualization of, sequences and patterns in patient sleep and activity. We compute meaningful sequences for patients using clustering and proximity search, displaying these with an interactive visual interface composed of coordinated views. Our evaluation posed both closed and open-ended tasks to 15 participants (clinicians, data scientists, and non-experts), using a scoring system to gauge the tool's usability and its effectiveness in supporting insight generation. RESULTS AND DISCUSSION: We present MS Pattern Explorer, a visual analytics system that helps clinicians better address complex data-centric challenges by facilitating the understanding of activity patterns. It enables innovative analysis that leads to rapid insight generation and contextualization of temporal activity data, both within and between patients of a cohort. Our evaluation results indicate consistent performance across participant groups and effective support for insight generation in MS patient fitness tracker data. Our implementation offers broad applicability in clinical research, allowing for potential expansion into cohort-wide comparisons or studies of other chronic conditions.
CONCLUSION: MS Pattern Explorer successfully reduces the signal overload clinicians currently experience with activity data, introducing novel opportunities for data exploration, sense-making, and hypothesis generation.
Subjects
Machine Learning; Multiple Sclerosis; Humans; Multiple Sclerosis/physiopathology; User-Computer Interface; Sleep; Fitness Trackers
ABSTRACT
Wearable sensor technologies are becoming increasingly relevant in health research, particularly in the context of chronic disease management. They generate real-time health data that can be translated into digital biomarkers, which can provide insights into our health and well-being. Scientific methods to collect, interpret, analyze, and translate health data from wearables to digital biomarkers vary, and systematic approaches to guide these processes are currently lacking. This paper is based on an observational, longitudinal cohort study, BarKA-MS, which collected wearable sensor data on the physical rehabilitation of people living with multiple sclerosis (MS). Based on our experience with BarKA-MS, we provide and discuss ten lessons we learned in relation to digital biomarker development across key study phases. We then summarize these lessons into a guiding framework (DACIA) that aims to inform the use of wearable sensor data for digital biomarker development and chronic disease management for future research and teaching.
ABSTRACT
Time-stamped event sequences (TSEQs) are time-oriented data without value information, shifting the focus of users to the exploration of temporal event occurrences. TSEQs exist in application domains, such as sleeping behavior, earthquake aftershocks, and stock market crashes. Domain experts face four challenges, for which they could use interactive and visual data analysis methods. First, TSEQs can be large with respect to both the number of sequences and events, often leading to millions of events. Second, domain experts need validated metrics and features to identify interesting patterns. Third, after identifying interesting patterns, domain experts contextualize the patterns to foster sensemaking. Finally, domain experts seek to reduce data complexity by data simplification and machine learning support. We present IVESA, a visual analytics approach for TSEQs. It supports the analysis of TSEQs at the granularities of sequences and events, supported with metrics and feature analysis tools. IVESA has multiple linked views that support overview, sort+filter, comparison, details-on-demand, and metadata relation-seeking tasks, as well as data simplification through feature analysis, interactive clustering, filtering, and motif detection and simplification. We evaluated IVESA with three case studies and a user study with six domain experts working with six different datasets and applications. Results demonstrate the usability and generalizability of IVESA across applications and cases that had up to 1,000,000 events.
ABSTRACT
Twenty years ago two different polymorphs of carbonic acid, α- and β-H2CO3, were isolated as thin, crystalline films. They were characterized by infrared and, of late, by Raman spectroscopy. Determination of the crystal structure of these two polymorphs, using cryopowder and thin film X-ray diffraction techniques, has failed so far. Recently, we succeeded in sublimating α-H2CO3 and trapping the vapor phase in a noble gas matrix, which was analyzed by infrared spectroscopy. In the same way we have now investigated the β-polymorph. Unlike α-H2CO3, β-H2CO3 was thought to decompose upon sublimation. Still, we have succeeded in isolating undecomposed carbonic acid in the matrix and in recondensing it after removal of the matrix. This possibility of sublimation and recondensation cycles of β-H2CO3 adds a new aspect to the chemistry of carbonic acid in astrophysical environments, especially because there is a direct way of β-H2CO3 formation in space, but none for α-H2CO3. Assignments of the FTIR spectra of the isolated molecules unambiguously reveal two different carbonic acid monomer conformers (C(2v) and C(s)). In contrast to the earlier study on α-H2CO3, we do not find evidence for centrosymmetric (C(2h)) carbonic acid dimers here. This suggests that two monomers are entropically favored at the sublimation temperature of 250 K for β-H2CO3, whereas they are not at the sublimation temperature of 210 K for α-H2CO3.
Subjects
Carbonic Acid/isolation & purification; Polymers/chemistry; Volatilization
ABSTRACT
We present ManuKnowVis, the result of a design study, in which we contextualize data from multiple knowledge repositories of a manufacturing process for battery modules used in electric vehicles. In data-driven analyses of manufacturing data, we observed a discrepancy between two stakeholder groups involved in serial manufacturing processes: Knowledge providers (e.g., engineers) have domain knowledge about the manufacturing process but have difficulties in implementing data-driven analyses. Knowledge consumers (e.g., data scientists) have no first-hand domain knowledge but are highly skilled in performing data-driven analyses. ManuKnowVis bridges the gap between providers and consumers and enables the creation and completion of manufacturing knowledge. We contribute a multi-stakeholder design study, where we developed ManuKnowVis in three main iterations with consumers and providers from an automotive company. The iterative development led us to a multiple linked view tool, in which, on the one hand, providers can describe and connect individual entities (e.g., stations or produced parts) of the manufacturing process based on their domain knowledge. On the other hand, consumers can leverage this enhanced data to better understand complex domain problems and thus perform data analyses more efficiently. As such, our approach directly impacts the success of data-driven analyses of manufacturing data. To demonstrate the usefulness of our approach, we carried out a case study with seven domain experts, which demonstrates how providers can externalize their knowledge and consumers can implement data-driven analyses more efficiently.
ABSTRACT
Graph neural networks (GNNs) are a class of powerful machine learning tools that model node relations for making predictions of nodes or links. GNN developers rely on quantitative metrics of the predictions to evaluate a GNN, but similar to many other neural networks, it is difficult for them to understand if the GNN truly learns characteristics of a graph as expected. We propose an approach to corresponding an input graph to its node embedding (aka latent space), a common component of GNNs that is later used for prediction. We abstract the data and tasks, and develop an interactive multi-view interface called CorGIE to instantiate the abstraction. As the key function in CorGIE, we propose the K-hop graph layout to show topological neighbors in hops and their clustering structure. To evaluate the functionality and usability of CorGIE, we present how to use CorGIE in two usage scenarios, and conduct a case study with five GNN experts. Availability: Open-source code at https://github.com/zipengliu/corgie-ui/, supplemental materials & video at https://osf.io/tr3sb/.
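The idea of grouping topological neighbors by hop distance can be illustrated with a breadth-first search. The sketch below is not CorGIE's implementation; the function name `k_hop_neighbors`, the adjacency-dictionary input, and the toy graph are assumptions made purely for illustration.

```python
from collections import deque


def k_hop_neighbors(adjacency, focal_nodes, k):
    """Group nodes by hop distance (1..k) from a set of focal nodes.

    Hypothetical helper for illustration only; `adjacency` maps each
    node to an iterable of its neighbors.
    """
    distance = {node: 0 for node in focal_nodes}
    queue = deque(focal_nodes)
    hops = {h: set() for h in range(1, k + 1)}
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                hops[distance[neighbor]].add(neighbor)
                queue.append(neighbor)
    return hops


# Toy undirected graph (made-up data).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
hops = k_hop_neighbors(toy_graph, ["a"], 2)
```

Each hop set could then be laid out as its own layer or clustered separately; how the actual K-hop layout arranges and clusters these sets is not specified in the abstract.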
Subjects
Computer Graphics; Neural Networks, Computer; Cluster Analysis; Machine Learning; Software
ABSTRACT
In this design study, we present IRVINE, a Visual Analytics (VA) system, which facilitates the analysis of acoustic data to detect and understand previously unknown errors in the manufacturing of electrical engines. In serial manufacturing processes, signatures from acoustic data provide valuable information on how the relationship between multiple produced engines serves to detect and understand previously unknown errors. To analyze such signatures, IRVINE leverages interactive clustering and data labeling techniques, allowing users to analyze clusters of engines with similar signatures, drill down to groups of engines, and select an engine of interest. Furthermore, IRVINE allows users to assign labels to engines and clusters and to annotate the cause of an error in the acoustic raw measurement of an engine. Since labels and annotations represent valuable knowledge, they are conserved in a knowledge database to be available for other stakeholders. We contribute a design study, where we developed IRVINE in four main iterations with engineers from a company in the automotive sector. To validate IRVINE, we conducted a field study with six domain experts. Our results suggest a high usability and usefulness of IRVINE as part of the improvement of a real-world manufacturing process. Specifically, with IRVINE domain experts were able to label and annotate produced electrical engines more than 30% faster.
ABSTRACT
Event sequences are central to the analysis of data in domains that range from biology and health, to logfile analysis and people's everyday behavior. Many visualization tools have been created for such data, but people are error-prone when asked to judge the similarity of event sequences with basic presentation methods. This article describes an experiment that investigates whether local and global alignment techniques improve people's performance when judging sequence similarity. Participants were divided into three groups (basic versus local versus global alignment), and each participant judged the similarity of 180 sets of pseudo-randomly generated sequences. Each set comprised a target, a correct choice and a wrong choice. After training, the global alignment group was more accurate than the local alignment group (98 versus 93 percent correct), with the basic group getting 95 percent correct. Participants' response times were primarily affected by the number of event types, the similarity of sequences (measured by the Levenshtein distance) and the edit types (nine combinations of deletion, insertion and substitution). In summary, global alignment is superior and people's performance could be further improved by choosing alignment parameters that explicitly penalize sequence mismatches.
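The Levenshtein distance used above as the similarity measure counts the minimum number of deletions, insertions, and substitutions needed to turn one sequence into another. The following is a minimal sketch with unit edit costs, assuming event sequences are encoded as strings with one character per event type; the function name and the example sequences are illustrative and not taken from the experiment.

```python
def levenshtein(a, b):
    """Minimum number of single-event edits (unit cost) turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, event_a in enumerate(a, start=1):
        current = [i]
        for j, event_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (event_a != event_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Made-up example: pick the candidate closest to the target sequence.
target = "ABCD"
candidates = ["ABD", "DCBA"]
closest = min(candidates, key=lambda seq: levenshtein(target, seq))
```

Weighting the three edit types differently, for example charging more for a substitution, is one way alignment parameters could penalize mismatches, although the specific parameters studied are not given in the abstract.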
Subjects
Algorithms; Computer Graphics; Humans; Sequence Alignment
ABSTRACT
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typically, this analysis is based on single-number performance measures such as accuracy. A more detailed evaluation of classifiers is possible by inspecting class errors. The confusion matrix is an established way for visualizing these class errors, but it was not designed with temporal or comparative analysis in mind. More generally, established performance analysis systems do not allow a combined temporal and comparative analysis of class-level information. To address this issue, we propose ConfusionFlow, an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. ConfusionFlow is model-agnostic and can be used to compare performances for different model types, model architectures, and/or training and test datasets. We demonstrate the usefulness of ConfusionFlow in a case study on instance selection strategies in active learning. We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning.
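To make the combination of class-level and temporal analysis concrete, the sketch below builds one confusion matrix per training epoch and extracts the time series of a single cell. It is not ConfusionFlow's code; the function names, the label encoding as integers, and the toy predictions are assumptions for illustration.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def confusion_over_time(true_labels, predictions_per_epoch, n_classes):
    """One confusion matrix per epoch (hypothetical helper)."""
    return [
        confusion_matrix(true_labels, predictions, n_classes)
        for predictions in predictions_per_epoch
    ]


# Made-up labels and per-epoch predictions for a 3-class problem.
true_labels = [0, 0, 1, 1, 2, 2]
predictions_per_epoch = [
    [0, 1, 1, 2, 2, 0],  # epoch 0
    [0, 0, 1, 2, 2, 2],  # epoch 1
    [0, 0, 1, 1, 2, 2],  # epoch 2
]
history = confusion_over_time(true_labels, predictions_per_epoch, 3)

# How often true class 1 was predicted as class 2, epoch by epoch.
confusion_1_as_2 = [matrix[1][2] for matrix in history]
```

Computing the same series for several models on the same data would give the comparative dimension, with one line per model for each matrix cell.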
ABSTRACT
Electrical engines are a key technology all automotive manufacturers must master to stay competitive. Engineers need to analyze an overwhelming number of engine measurements to improve the manufacturing of this technology. They are hindered in the task of analyzing large numbers of engines, however, by the following challenges: 1) Engines comprise a complex hierarchical structure of subcomponents. 2) Locating the cause of errors along manufacturing processes is a difficult procedure. 3) Large numbers of heterogeneous measurements impair the ability to explain errors in engines. We address these challenges in a design study with automotive engineers and by developing the visual analytics system Manufacturing Explorer (ManEx), which provides interactive interfaces to analyze measurements of engines across the manufacturing process. ManEx was validated by five experts. Our results suggest high usability and usefulness scores and the improvement of a real-world manufacturing process. Specifically, with ManEx, experts reduced scrapped parts by over 3%.
ABSTRACT
In this design study, we present a visualization technique that segments patients' histories instead of treating them as raw event sequences, aggregates the segments using criteria such as the whole history or treatment combinations, and then visualizes the aggregated segments as static dashboards that are arranged in a dashboard network to show longitudinal changes. The static dashboards were developed in nine iterations, to show 15 important attributes from the patients' histories. The final design was evaluated with five non-experts, five visualization experts and four medical experts, who successfully used it to gain an overview of a 2,000 patient dataset, and to make observations about longitudinal changes and differences between two cohorts. The research represents a step-change in the detail of large-scale data that may be successfully visualized using dashboards, and provides guidance about how the approach may be generalized.
Subjects
Computer Graphics; Electronic Health Records; Medical Informatics/methods; Prostatic Neoplasms; Humans; Male; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/pathology; Prostatic Neoplasms/physiopathology; Prostatic Neoplasms/surgery; User-Computer Interface
ABSTRACT
Labeling data instances is an important task in machine learning and visual analytics. Both fields provide a broad set of labeling strategies, whereby machine learning (and in particular active learning) follows a rather model-centered approach and visual analytics employs rather user-centered approaches (visual-interactive labeling). Both approaches have individual strengths and weaknesses. In this work, we conduct an experiment with three parts to assess and compare the performance of these different labeling strategies. In our study, we (1) identify different visual labeling strategies for user-centered labeling, (2) investigate strengths and weaknesses of labeling strategies for different labeling tasks and task complexities, and (3) shed light on the effect of using different visual encodings to guide the visual-interactive labeling process. We further compare labeling of single versus multiple instances at a time, and quantify the impact on efficiency. We systematically compare the performance of visual interactive labeling with that of active learning. Our main findings are that visual-interactive labeling can outperform active learning, given the condition that dimension reduction separates well the class distributions. Moreover, using dimension reduction in combination with additional visual encodings that expose the internal state of the learning model turns out to improve the performance of visual-interactive labeling.
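As a point of reference for the model-centered side of this comparison, one common active learning strategy is least-confidence sampling, where the model asks for labels on the instances it is least sure about. The abstract does not state which active learning strategies were evaluated, so the sketch below is only an assumed example; the function name and the probability values are made up.

```python
def least_confident(probabilities, n_queries):
    """Return indices of the n_queries instances whose top predicted
    class probability is lowest (least-confidence sampling)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda index: max(probabilities[index]),
    )
    return ranked[:n_queries]


# Made-up class probabilities for four unlabeled instances.
probabilities = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]
to_label = least_confident(probabilities, 2)
```

In a visual-interactive setting the user, rather than this ranking, chooses which instances to label, typically from a dimension-reduced view of the data.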
ABSTRACT
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interaction to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
ABSTRACT
The monoesters of carbonic acid are deemed to be unstable and decompose to alcohol and carbon dioxide. In spite of this, we here report the isolation of the elusive carbonic acid monoethyl ester (CAEE) as a pure solid from ethanolic solutions of potassium bicarbonate. The hemiester is surprisingly stable in acidic solution and does not experience hydrolysis to carbonic acid. Furthermore, it is also stable in the gas phase, which we demonstrate by subliming the hemiester without decomposition. This could not be achieved in the past for any hemiester of carbonic acid. In the gas phase the hemiester experiences conformational isomerism at 210 K. Interestingly, the thermodynamically favored conformation is only reached for the torsional movement of the terminal ethyl group, but not the terminal hydrogen atom on the millisecond time scale. Accordingly, IR spectra of the gas phase trapped in an argon matrix are best explained on the basis of a 5 : 1 mixture of monomeric conformers. Our findings necessitate reevaluation of claims of the formation of a carbonic acid polymorph in methanolic solution, which is the subject of a forthcoming publication.
ABSTRACT
BACKGROUND: While the optimal use and timing of secondary therapy after radical prostatectomy (RP) remain controversial, there are limited data on patient-reported outcomes following multimodal therapy. OBJECTIVE: To assess the impact of additional radiation therapy (RT) and/or androgen deprivation therapy (ADT) on urinary continence, potency, and quality of life (QoL) after RP. DESIGN, SETTING, AND PARTICIPANTS: Among 13150 men who underwent RP from 1992 to 2013, 905 received RP + RT, 407 RP + ADT and 688 RP + RT + ADT. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSES: Urinary function, sexual function, and overall QoL were evaluated annually using self-administered validated questionnaires. Propensity score-matched and bootstrap analyses were performed, and the distributions for all functional outcomes were analyzed as a function of time after RP. RESULTS AND LIMITATIONS: Patients who received RP + RT had a 4% higher overall incontinence rate 3 yr after surgery, and 1% higher rate for severe incontinence (>3 pads/24h) compared to matched RP-only patients. ADT further increased the overall and severe incontinence rates by 4% and 3%, respectively, compared to matched RP + RT patients. RP + RT was associated with an 18% lower rate of potency compared to RP alone, while RP + RT + ADT was associated with a further 17% reduction compared to RP + RT. Additional RT reduced QoL by 10% and additional ADT by a further 12% compared to RP only and RP + RT, respectively. The timing of RT after RP had no influence on continence, but adjuvant compared to salvage RT was associated with significantly lower potency (37% vs 45%), but higher QoL (60% vs 56%). Limitations of our study include the observational study design and potential for selection bias in the treatments received. CONCLUSIONS: Secondary RT and ADT after RP have an additive negative influence on urinary function, potency, and QoL. 
Patients with high-risk disease should be counseled before RP on the potential net impairment of functional outcomes due to multimodal treatment. PATIENT SUMMARY: Men with high-risk disease choosing surgery upfront should be counseled on the potential need for additional radiation and/or androgen deprivation, and the potential net impairment of functional outcomes arising from multimodal treatment.