ABSTRACT
The accelerated development of technologies within the Internet of Things landscape has led to an exponential growth in the volume of heterogeneous data generated by interconnected sensors, particularly in scenarios with multiple data sources such as smart cities. Transferring, processing, and storing such vast amounts of sensed data poses significant challenges for Internet of Things systems. In this context, data reduction techniques based on artificial intelligence have emerged as promising solutions to address these challenges, alleviating the burden on the required storage, bandwidth, and computational resources. This article proposes a framework that exploits the concept of data reduction to decrease the amount of heterogeneous data in such applications. We also propose a machine learning model that predicts the distortion rate and the corresponding reduction rate of the input data, and uses the predicted values to select the most suitable approach among several reduction techniques. To support this decision, the model also considers the context of the data producer, which dictates the class of reduction algorithm that may be applied to the input stream. The achieved results indicate that the Huffman algorithm performed best for the reduction of time-series data, with significant potential for application in smart city scenarios.
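As an illustration of the reduction step named in the abstract, the following is a minimal sketch of Huffman coding applied to a quantised time series. The sample readings, the 8-bit raw baseline, and the quantisation are assumptions made for illustration only; they are not taken from the article or its dataset.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code table (symbol -> bit string) for an iterable of symbols."""
    freq = Counter(symbols)
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {s: "0" for s in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        # Prefix the lighter subtree's codes with "0" and the heavier's with "1".
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical quantised sensor readings (e.g. rounded temperature samples).
readings = [21, 21, 22, 21, 23, 21, 22, 21, 21, 24]
codes = huffman_code(readings)
encoded = "".join(codes[v] for v in readings)
reduction = 1 - len(encoded) / (len(readings) * 8)  # relative to 8 bits per raw sample
print(codes, f"reduction ~ {reduction:.0%}")
```

Because Huffman coding is lossless, this particular choice incurs no distortion; in the framework described above, a predicted distortion/reduction trade-off would guide the choice between such lossless coders and lossy alternatives.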
ABSTRACT
We present a new approach to the development of a data persistence layer for a Digital Imaging and Communications in Medicine (DICOM)-compliant Picture Archiving and Communication System (PACS) employing a hierarchical database. Our approach uses the HDF5 hierarchical data storage standard for scientific data and overcomes the limitations of hierarchical databases by employing inverted indexing for secondary-key management and for efficient, flexible access to data through secondary keys. The inverted indexing is provided by Lucene, a general-purpose document indexing tool. The approach was implemented and tested with real-world data against a traditional solution employing a relational database, in a series of store, search, and retrieval experiments performed repeatedly with DICOM datasets of different sizes. Results show that our approach outperforms the traditional solution in most situations, being more than 600% faster in some cases.
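The following is a minimal sketch of the general idea: images are stored hierarchically in HDF5 under their primary key, while secondary keys are resolved through an inverted index that maps attribute values back to the hierarchical paths. Lucene itself is a Java library; here a plain Python dictionary stands in for it, and the SOP UIDs, tag names, and pixel arrays are hypothetical placeholders rather than data from the paper.

```python
import h5py
import numpy as np
from collections import defaultdict

# Hypothetical DICOM-like records; identifiers and values are illustrative only.
studies = [
    {"sop_uid": "1.2.3.1", "patient": "P001", "modality": "CT",
     "pixels": np.zeros((64, 64), dtype=np.uint16)},
    {"sop_uid": "1.2.3.2", "patient": "P002", "modality": "MR",
     "pixels": np.ones((64, 64), dtype=np.uint16)},
]

# Primary storage: one HDF5 group per image, addressed by its SOP Instance UID.
with h5py.File("pacs_store.h5", "w") as f:
    for s in studies:
        grp = f.create_group(s["sop_uid"])
        grp.create_dataset("pixels", data=s["pixels"], compression="gzip")
        grp.attrs["patient"] = s["patient"]
        grp.attrs["modality"] = s["modality"]

# Secondary-key access: an inverted index (a stand-in for Lucene) mapping
# attribute values to the primary-key paths inside the HDF5 hierarchy.
inverted = defaultdict(list)
for s in studies:
    inverted[("modality", s["modality"])].append(s["sop_uid"])
    inverted[("patient", s["patient"])].append(s["sop_uid"])

# Query by a secondary key, then follow the primary keys into the HDF5 file.
with h5py.File("pacs_store.h5", "r") as f:
    for uid in inverted[("modality", "CT")]:
        print(uid, f[uid]["pixels"].shape)
```

The design point being illustrated is the separation of concerns: the hierarchical store is traversed only by primary key, while all secondary-key queries are answered by the index, which returns primary keys rather than data.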