Búsqueda | Portal Regional de la BVS

1.

3DPro: Querying Complex Three-Dimensional Data with Progressive Compression and Refinement.

Teng, Dejun; Liang, Yanhui; Baig, Furqan; Kong, Jun; Hoang, Vo; Wang, Fusheng.

Adv Database Technol ; 25(2): 104-117, 2022.

Artículo en Inglés | MEDLINE | ID: mdl-36222820

RESUMEN

Large-scale three-dimensional spatial data has gained increasing attention with the development of self-driving, mineral exploration, CAD, and human atlases. Such 3D objects are often represented with a polygonal model at high resolution to preserve accuracy. This poses major challenges for 3D data management and spatial queries due to the massive amounts of 3D objects, e.g., trillions of 3D cells, and the high complexity of 3D geometric computation. Traditional spatial querying methods in the Filter-Refine paradigm have a major focus on indexing-based filtering using approximations like minimal bounding boxes and largely neglect the heavy computation in the refinement step at the intra-geometry level, which often dominates the cost of query processing. In this paper, we introduce 3DPro, a system that supports efficient spatial queries for complex 3D objects. 3DPro uses progressive compression of 3D objects preserving multiple levels of details, which significantly reduces the size of the objects and has the data fit into memory. Through a novel Filter-Progressive-Refine paradigm, 3DPro can have query results returned early whenever possible to minimize decompression and geometric computations of 3D objects in higher resolution representations. Our experiments demonstrate that 3DPro out-performs the state-of-the-art 3D data processing techniques by up to an order of magnitude for typical spatial queries.

2.

IDEAL: a Vector-Raster Hybrid Model for Efficient Spatial Queries over Complex Polygons.

Teng, Dejun; Baig, Furqan; Sun, Qiheng; Kong, Jun; Wang, Fusheng.

IEEE Int Conf Mob Data Manag ; 20212021 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-34650348

RESUMEN

Geometric computation can be heavy duty for spatial queries, in particular for complex geometries such as polygons with many edges based on a vector-based representation. While many techniques have been provided for spatial partitioning and indexing, they are mainly built on minimal bounding boxes or other approximation methods, which will not mitigate the high cost of geometric computation. In this paper, we propose a novel vector-raster hybrid approach through rasterization, where pixel-centric rich information is preserved to help not only filtering out more candidates but also reducing geometry computation load. Based on the hybrid model, we develop an efficient rasterization based ray casting method for point-in-polygon queries and a circle buffering method for point-to-polygon distance calculation, which is a common operation for distance based queries. Our experiments demonstrate that the hybrid model can boost the performance of spatial queries on complex polygons by up to one order of magnitude.

3.

GPU-based Real-time Contact Tracing at Scale.

Teng, Dejun; Nehe, Akshay; Emanuel, Prajeeth; Baig, Furqan; Kong, Jun; Wang, Fusheng.

Proc ACM SIGSPATIAL Int Conf Adv Inf ; 2021: 1-10, 2021 Nov.

Artículo en Inglés | MEDLINE | ID: mdl-35178539

RESUMEN

Contact tracing is gaining its importance in controlling the spread of COVID-19. However, the enormous volume of the frequently sampled tracing data brings major challenges for real-time processing. In this paper, we propose a GPU-based real-time contact tracing system based on spatial proximity queries with temporal constraints using location data. We provide dynamic indexing of moving objects using an adaptive partitioning schema on GPU with extremely low overhead. Our system optimizes the retrieval of contacted pairs to match both the requirements of contact tracing scenarios and GPU centered parallelism. We propose an efficient contacts evaluation mechanism to keep only the spatially and temporally valid contacts. Our experiments demonstrate that the system can achieve sub-second level response for large-scale contact tracing of tens of millions of people, with two magnitudes of performance boost over CPU based approach.

4.

BayesGaze: A Bayesian Approach to Eye-Gaze Based Target Selection.

Li, Zhi; Zhao, Maozheng; Wang, Yifan; Rashidian, Sina; Baig, Furqan; Liu, Rui; Liu, Wanyu; Beaudouin-Lafon, Michel; Ellison, Brooke; Wang, Fusheng; Bi, Xiaojun.

Proc (Graph Interface) ; 2021: 231-240, 2021 May.

Artículo en Inglés | MEDLINE | ID: mdl-35185272

RESUMEN

Selecting targets accurately and quickly with eye-gaze input remains an open research question. In this paper, we introduce BayesGaze, a Bayesian approach of determining the selected target given an eye-gaze trajectory. This approach views each sampling point in an eye-gaze trajectory as a signal for selecting a target. It then uses the Bayes' theorem to calculate the posterior probability of selecting a target given a sampling point, and accumulates the posterior probabilities weighted by sampling interval to determine the selected target. The selection results are fed back to update the prior distribution of targets, which is modeled by a categorical distribution. Our investigation shows that BayesGaze improves target selection accuracy and speed over a dwell-based selection method, and the Center of Gravity Mapping (CM) method. Our research shows that both accumulating posterior and incorporating the prior are effective in improving the performance of eye-gaze based target selection.

5.

SPEAR: Dynamic Spatio-Temporal Query Processing over High Velocity Data Streams.

Baig, Furqan; Teng, Dejun; Kong, Jun; Wang, Fusheng.

Proc Int Conf Data Eng ; 2021: 2279-2284, 2021 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-35572741

RESUMEN

With the advent of IoT and emerging 5G technology, real-time streaming data are being generated at unprecedented speed and volume, and coming with both temporal and spatial dimensions. Effective analysis at such scale and speed requires support for dynamically adjusting querying capabilities in real-time. In spatio-temporal domain, this warrants for data as well as query optimization strategies especially for objects with changing motion states. Contemporary spatio-temporal data stream management systems in distributed domain are mostly dominated by specified-once-applied-continuously query model. Any modification in query state requires query restart limiting system responsiveness and producing outdated or in worst case erroneous results. In this paper, we propose adaptations of principles from streaming databases, spatial data management and distributed computing to support dynamic spatio-temporal query processing over high velocity big data streams. We first formulate a set of spatio-temporal data types and functions to seamlessly handle changes in distributed query states. We develop a comprehensive set of streaming spatio-temporal querying methods, and propose geohash based dynamic spatial partitioning for effective parallel processing. We implement a prototype on top of Apache Flink, where the in-memory stream processing fits nicely with our spatio-temporal models. Comparative evaluation of our prototype demonstrates the effectiveness our strategy by maintaining high consistent processing rates for both stationary as well as moving queries over high velocity spatio-temporal big data streams.

6.

Accelerating Spatial Cross-Matching on CPU-GPU Hybrid Platform With CUDA and OpenACC.

Baig, Furqan; Gao, Chao; Teng, Dejun; Kong, Jun; Wang, Fusheng.

Front Big Data ; 32020 May.

Artículo en Inglés | MEDLINE | ID: mdl-32954255

RESUMEN

Spatial cross-matching operation over geospatial polygonal datasets is a highly compute-intensive yet an essential task to a wide array of real-world applications. At the same time, modern computing systems are typically equipped with multiple processing units capable of task parallelization and optimization at various levels. This mandates for the exploration of novel strategies in the geospatial domain focusing on efficient utilization of computing resources, such as CPUs and GPUs. In this paper, we present a CPU-GPU hybrid platform to accelerate the cross-matching operation of geospatial datasets. We propose a pipeline of geospatial subtasks that are dynamically scheduled to be executed on either CPU or GPU. To accommodate geospatial datasets processing on GPU using pixelization approach, we convert the floating point-valued vertices into integer-valued vertices with an adaptive scaling factor as a function of the area of minimum bounding box. We present a comparative analysis of GPU enabled cross-matching algorithm implementation in CUDA and OpenACC accelerated C++. We test our implementations over Natural Earth Data and our results indicate that although CUDA based implementations provide better performance, OpenACC accelerated implementations are more portable and extendable while still providing considerable performance gain as compared to CPU. We also investigate the effects of input data size on the IO / computation ratio and note that a larger dataset compensates for IO overheads associated with GPU computations. Finally we demonstrate that an efficient cross-matching comparison can be achieved with a cost-effective GPU.

7.

ACTION-EHR: Patient-Centric Blockchain-Based Electronic Health Record Data Management for Cancer Care.

Dubovitskaya, Alevtina; Baig, Furqan; Xu, Zhigang; Shukla, Rohit; Zambani, Pratik Sushil; Swaminathan, Arun; Jahangir, Md Majid; Chowdhry, Khadija; Lachhani, Rahul; Idnani, Nitesh; Schumacher, Michael; Aberer, Karl; Stoller, Scott D; Ryu, Samuel; Wang, Fusheng.

J Med Internet Res ; 22(8): e13598, 2020 08 21.

Artículo en Inglés | MEDLINE | ID: mdl-32821064

RESUMEN

BACKGROUND: With increased specialization of health care services and high levels of patient mobility, accessing health care services across multiple hospitals or clinics has become very common for diagnosis and treatment, particularly for patients with chronic diseases such as cancer. With informed knowledge of a patient's history, physicians can make prompt clinical decisions for smarter, safer, and more efficient care. However, due to the privacy and high sensitivity of electronic health records (EHR), most EHR data sharing still happens through fax or mail due to the lack of systematic infrastructure support for secure, trustable health data sharing, which can also cause major delays in patient care. OBJECTIVE: Our goal was to develop a system that will facilitate secure, trustable management, sharing, and aggregation of EHR data. Our patient-centric system allows patients to manage their own health records across multiple hospitals. The system will ensure patient privacy protection and guarantee security with respect to the requirements for health care data management, including the access control policy specified by the patient. METHODS: We propose a permissioned blockchain-based system for EHR data sharing and integration. Each hospital will provide a blockchain node integrated with its own EHR system to form the blockchain network. A web-based interface will be used for patients and doctors to initiate EHR sharing transactions. We take a hybrid data management approach, where only management metadata will be stored on the chain. Actual EHR data, on the other hand, will be encrypted and stored off-chain in Health Insurance Portability and Accountability Act-compliant cloud-based storage. The system uses public key infrastructure-based asymmetric encryption and digital signatures to secure shared EHR data. RESULTS: In collaboration with Stony Brook University Hospital, we developed ACTION-EHR, a system for patient-centric, blockchain-based EHR data sharing and management for patient care, in particular radiation treatment for cancer. The prototype was built on Hyperledger Fabric, an open-source, permissioned blockchain framework. Data sharing transactions were implemented using chaincode and exposed as representational state transfer application programming interfaces used for the web portal for patients and users. The HL7 Fast Healthcare Interoperability Resources standard was adopted to represent shared EHR data, making it easy to interface with hospital EHR systems and integrate a patient's EHR data. We tested the system in a distributed environment at Stony Brook University using deidentified patient data. CONCLUSIONS: We studied and developed the critical technology components to enable patient-centric, blockchain-based EHR sharing to support cancer care. The prototype demonstrated the feasibility of our approach as well as some of the major challenges. The next step will be a pilot study with health care providers in both the United States and Switzerland. Our work provides an exemplar testbed to build next-generation EHR sharing infrastructures.

Asunto(s)

Cadena de Bloques/normas , Manejo de Datos/métodos , Registros Electrónicos de Salud/normas , Neoplasias/epidemiología , Humanos , Proyectos Piloto

8.

SparkGIS: Resource Aware Efficient In-Memory Spatial Query Processing.

Baig, Furqan; Vo, Hoang; Kurc, Tahsin; Saltz, Joel; Wang, Fusheng.

Proc ACM SIGSPATIAL Int Conf Adv Inf ; 20172017 Nov.

Artículo en Inglés | MEDLINE | ID: mdl-30035278

RESUMEN

Much effort has been devoted to support high performance spatial queries on large volumes of spatial data in distributed spatial computing systems, especially in the MapReduce paradigm. Recent works have focused on extending spatial MapReduce frameworks to leverage high performance in-memory distributed processing capabilities of systems such as Spark. However, the performance advantage comes with the requirement of having enough memory and comprehensive configuration. Failing to fulfill this falls back to disk IO, defeating the purpose of such systems or in worst case gets out of memory and fails the job. The problem is aggravated further for spatial processing since the underlying in-memory systems are oblivious of spatial data features and characteristics. In this paper we present SparkGIS - an in-memory oriented spatial data querying system for high throughput and low latency spatial query handling by adapting Apache Spark's distributed processing capabilities. It supports basic spatial queries including containment, spatial join and k-nearest neighbor and allows extending these to complex query pipelines. SparkGIS mitigates skew in distributed processing by supporting several dynamic partitioning algorithms suitable for a rich set of contemporary application scenarios. Multilevel global and local, pre-generated and on-demand in-memory indexes, allow SparkGIS to prune input data and apply compute intensive operations on a subset of relevant spatial objects only. Finally, SparkGIS employs dynamic query rewriting to gracefully manage large spatial query workflows that exceed available distributed resources. Our comparative evaluation has shown that the performance of SparkGIS is on par with contemporary Spark based platforms for relatively smaller queries and outperforms them for larger data and memory intensive workflows by dynamic query rewriting and efficient spatial data management.

9.

SparkGIS: Efficient Comparison and Evaluation of Algorithm Results in Tissue Image Analysis Studies.

Baig, Furqan; Mehrotra, Mudit; Vo, Hoang; Wang, Fusheng; Saltz, Joel; Kurc, Tahsin.

Biomed Data Manag Graph Online Querying (2015) ; 9579: 134-146, 2016.

Artículo en Inglés | MEDLINE | ID: mdl-30198025

RESUMEN

Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparison of multiple results, and facilitate algorithm sensitivity studies. The sizes of images and analysis results in pathology image analysis pose significant challenges in algorithm evaluation. We present SparkGIS, a distributed, in-memory spatial data processing framework to query, retrieve, and compare large volumes of analytical image result data for algorithm evaluation. Our approach combines the in-memory distributed processing capabilities of Apache Spark and the efficient spatial query processing of Hadoop-GIS. The experimental evaluation of SparkGIS for heatmap computations used to compare nucleus segmentation results from multiple images and analysis runs shows that SparkGIS is efficient and scalable, enabling algorithm evaluation and algorithm sensitivity studies on large datasets.

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA