Results 1 - 2 of 2

1.
Sensors (Basel); 23(5), 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36904915

ABSTRACT

Topic modeling is a statistics-based, unsupervised machine learning technique for mapping a high-dimensional corpus onto a low-dimensional topical subspace, but its output quality leaves room for improvement. A topic produced by a topic model is expected to be interpretable as a concept, i.e., to correspond to a human understanding of a theme occurring in the texts. While discovering corpus themes, inference draws continually on the vocabulary, whose size directly affects topic quality. Because words that frequently appear in the same sentence are likely to share a latent topic, practically all topic models rely on co-occurrence signals between terms in the corpus. In languages with rich inflectional morphology, the abundance of distinct tokens dilutes these signals and weakens the topics; lemmatization is often used to preempt this problem. Gujarati is one such morphologically rich language, as a word may take several inflectional forms. This paper proposes a deterministic finite automaton (DFA) based lemmatization technique for the Gujarati language that reduces inflected words to their root forms. The set of topics is then inferred from the lemmatized corpus of Gujarati text. We employ statistical divergence measures to identify semantically less coherent (overly general) topics. The results show that the lemmatized Gujarati corpus yields more interpretable and meaningful topics than unlemmatized text. Finally, lemmatization reduces the vocabulary size by 16% and improves semantic coherence on all three measures (Log Conditional Probability, Pointwise Mutual Information, and Normalized Pointwise Mutual Information) from -9.39 to -7.49, -6.79 to -5.18, and -0.23 to -0.17, respectively.
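To make the core idea concrete, below is a minimal sketch of a DFA-based suffix-stripping lemmatizer. The suffix list is hypothetical and the trie-as-DFA construction is an assumption for illustration; the paper's actual Gujarati rule set and transition design are not reproduced here.

```python
# Hypothetical Gujarati inflectional suffixes, for illustration only;
# the paper's actual rule set is not reproduced here.
SUFFIXES = ["ો", "ી", "ું", "ઓ", "ને", "થી", "માં"]

def build_dfa(suffixes):
    """Build a trie over the reversed suffixes; each trie node is a DFA
    state, and a state that completes a suffix records its length."""
    trans = [{}]   # state -> {char: next_state}
    accept = {}    # state -> length of the suffix completed there
    for suf in suffixes:
        state = 0
        for ch in reversed(suf):
            nxt = trans[state].get(ch)
            if nxt is None:
                trans.append({})
                nxt = len(trans) - 1
                trans[state][ch] = nxt
            state = nxt
        accept[state] = max(accept.get(state, 0), len(suf))
    return trans, accept

def lemmatize(word, trans, accept):
    """Feed the reversed word through the DFA and strip the longest
    matching suffix, keeping at least one character as the root."""
    state, longest = 0, 0
    for ch in reversed(word):
        state = trans[state].get(ch)
        if state is None:
            break
        longest = max(longest, accept.get(state, 0))
    if 0 < longest < len(word):
        return word[:-longest]
    return word

trans, accept = build_dfa(SUFFIXES)
print(lemmatize("છોકરાઓ", trans, accept))  # strips the plural marker "ઓ"
```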

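The three coherence measures reported in the abstract are commonly computed from document-level co-occurrence counts. The sketch below follows that convention; the function names, the smoothing constant `eps`, and the use of whole documents rather than sliding windows are assumptions and may differ from the paper's exact setup.

```python
import math
from itertools import combinations

def coherence(topic_words, docs, eps=1e-12):
    """Average LCP, PMI, and NPMI over word pairs of one topic, with
    probabilities estimated from document-level co-occurrence counts."""
    n = len(docs)
    def p(*words):
        # Fraction of documents containing all given words.
        return sum(all(w in d for w in words) for d in docs) / n
    lcp = pmi = npmi = 0.0
    pairs = list(combinations(topic_words, 2))
    for wi, wj in pairs:
        pi, pj, pij = p(wi), p(wj), p(wi, wj)
        lcp  += math.log((pij + eps) / (pi + eps))          # log P(wj|wi)
        val   = math.log((pij + eps) / (pi * pj + eps))     # PMI
        pmi  += val
        npmi += val / -math.log(pij + eps)                  # NPMI in [-1, 1]
    k = len(pairs)
    return lcp / k, pmi / k, npmi / k

# Toy usage: documents represented as sets of tokens.
docs = [{"water", "river", "fish"}, {"river", "bank"}, {"fish", "water"}]
print(coherence(["water", "river", "fish"], docs))
```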
2.
Sensors (Basel); 23(1), 2022 Dec 24.
Article in English | MEDLINE | ID: mdl-36616797

ABSTRACT

With the rapid growth of data and of processing over the cloud, accessing those data has become easier; at the same time, it poses many technical and security challenges to the users of those provisions. Fog computing makes these issues more manageable and is one of the promising approaches for handling the big data produced by the IoT, which are often security-critical and time-sensitive. Massive IoT data analytics on a fog computing structure is an emerging area that requires extensive research to enable more proficient knowledge extraction and smarter decisions. Although big data analytics continues to advance, it does not yet consider fog data analytics, where many challenges remain, including heterogeneity, security, accessibility, resource sharing, network communication overhead, and the real-time processing of complex data. This paper explores these research challenges and their solutions using next-generation fog data analytics and IoT networks. We also performed an experimental analysis comparing fog computing and cloud architectures; the results show that fog computing outperforms the cloud in terms of network utilization and latency. The paper concludes with future trends.
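As a rough illustration of why fog placement tends to lower end-to-end latency, the toy model below compares a nearby fog node (short round trip, modest compute) against a distant cloud (long round trip, faster compute). All parameter values are invented for illustration and are not the paper's testbed measurements.

```python
import random

def request_latency(rtt_ms, service_ms, jitter_ms):
    """One request: network round trip plus processing, with random jitter."""
    return rtt_ms + service_ms + random.uniform(0, jitter_ms)

random.seed(0)
N = 10_000
# Assumed parameters: fog is close (low RTT) but slower per request;
# cloud is far (high RTT) but processes each request faster.
fog   = [request_latency(rtt_ms=5,  service_ms=12, jitter_ms=3)  for _ in range(N)]
cloud = [request_latency(rtt_ms=60, service_ms=8,  jitter_ms=10) for _ in range(N)]

print(f"mean fog latency:   {sum(fog) / N:.1f} ms")
print(f"mean cloud latency: {sum(cloud) / N:.1f} ms")
```

Under these assumed numbers the fog node wins because the saved round-trip time dominates its slower processing, which mirrors the qualitative conclusion reported in the abstract.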
