Deep learning model based multimedia retrieval and its optimization in augmented reality applications.

Gupta, Yash Prakash; Gupta, Nitin

Affiliation

Gupta YP; Department of Electronics and Communication Engineering, National Institute of Technology, Hamirpur, Himachal Pradesh, India.
Mukul; Department of Electronics and Communication Engineering, National Institute of Technology, Hamirpur, Himachal Pradesh, India.
Gupta N; Department of Computer Science and Engineering, National Institute of Technology, Hamirpur, Himachal Pradesh, India.

Multimed Tools Appl ; 82(6): 8447-8466, 2023.

Article in English | MEDLINE | ID: mdl-35968406

ABSTRACT

With the rise of touchless technology, the Virtual Continuum has drawn renewed interest in upcoming products. Today, numerous gadgets support Mixed Reality, Augmented Reality (AR), and Virtual Reality. Head Mounted Displays (HMDs) such as Hololens, Google Lens, and Jio Glass blend reality with virtuality. Beyond HMDs, many organizations develop mobile AR applications to support a wide range of industries, such as medicine, education, and construction. Currently, the major issue lies in the performance parameters of these applications when deployed on mobile devices: graphics performance, latency, and CPU usage. Many industries pose real-time computation requirements in AR but do not implement an efficient algorithm in their frameworks. Offloading the computation of the deep learning models involved in the application to cloud servers strongly affects the processing parameters, so communication between the local application and the cloud computing framework needs to be optimized. For our use case, we use the Multi-Task Cascaded Convolutional Neural Network (MTCNN), a modern face detection tool built on a 3-stage neural network detector. The proposed framework defines how the parameters involved in the complete deployment of a mobile AR application can be optimized in terms of multimedia retrieval, its processing, and the augmentation of graphics, eventually enhancing performance. To implement the proposed algorithm, a mobile application is created in Unity3D. The mobile application virtually augments a 3D model of a skeleton on a target face. The experiments show that the average Media Retrieval Time (1.1471 µs) and Client Time (1.1207 µs) in the local application are far lower than the average API process time (288.934 ms). The highest latency occurs at frame rates above 80 fps.
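The three timings reported in the abstract can be illustrated with a minimal profiling sketch. This is not the authors' implementation: the record does not define how each timing was measured, so the sketch assumes that Media Retrieval Time covers fetching a frame locally, API process time covers the offloaded face detection round trip, and Client Time covers local post-processing. The functions `retrieve_frame`, `detect_faces_remote`, and `augment` are hypothetical stand-ins, and the 2 ms sleep is an arbitrary placeholder for a cloud call.

```python
import time
from statistics import mean


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical stand-ins for the three phases; none of these come from the paper.
def retrieve_frame(index):
    """Pretend to fetch one frame of media from local storage."""
    return bytes([index % 256]) * 1024


def detect_faces_remote(frame):
    """Pretend to call a cloud face detection API (assumed, simulated delay)."""
    time.sleep(0.002)
    return [(10, 10, 64, 64)]  # one dummy bounding box


def augment(frame, boxes):
    """Pretend to overlay graphics on each detected face."""
    return len(boxes)


def profile(n_frames=5):
    """Return the mean time in seconds spent in each phase over n_frames."""
    samples = {"media_retrieval": [], "api_process": [], "client": []}
    for i in range(n_frames):
        frame, elapsed = timed(retrieve_frame, i)
        samples["media_retrieval"].append(elapsed)
        boxes, elapsed = timed(detect_faces_remote, frame)
        samples["api_process"].append(elapsed)
        _, elapsed = timed(augment, frame, boxes)
        samples["client"].append(elapsed)
    return {name: mean(values) for name, values in samples.items()}


if __name__ == "__main__":
    for name, seconds in profile().items():
        print(f"{name}: {seconds * 1e3:.4f} ms")
```

Separating the phases this way makes it visible when the offloaded call dominates the per-frame budget, which is the pattern the abstract reports.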

Keywords

Augmented reality; Cloud computation; Deep learning; Latency; MTCNN; Media retrieval; Medical augmented reality; OffLoading

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Multimed Tools Appl Year: 2023 Document type: Article Country of publication: United States
