Search | Portal Regional da BVS

1.

Deep learning for religious and continent-based toxic content detection and classification.

Abbasi, Ahmed; Javed, Abdul Rehman; Iqbal, Farkhund; Kryvinska, Natalia; Jalil, Zunera.

Sci Rep ; 12(1): 17478, 2022 10 19.

Article in English | MEDLINE | ID: mdl-36261675

ABSTRACT

Over time, numerous online communication platforms have emerged that allow people to express themselves, increasing the dissemination of toxic language, such as racism, sexual harassment, and other negative behaviors that are not accepted in polite society. As a result, toxic language identification in online communication has emerged as a critical application of natural language processing. Numerous academic and industrial researchers have recently studied toxic language identification using machine learning algorithms. However, several machine learning models assigned unrealistically high toxicity ratings to nontoxic comments that include particular identity descriptors, such as Muslim, Jewish, White, and Black. This research analyzes and compares modern deep learning algorithms for multilabel toxic comment classification. We explore two scenarios: the first is a multilabel classification of religious toxic comments, and the second is a multilabel classification of race- or ethnicity-based toxic comments, with various word embeddings (GloVe, Word2vec, and FastText) and without word embeddings, using an ordinary embedding layer. Experiments show that the CNN model produced the best results for classifying multilabel toxic comments in both scenarios. We compared the performance of these modern deep learning models in terms of multilabel evaluation metrics.
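The abstract does not say which multilabel evaluation metrics were used. As a minimal sketch, assuming three common choices (Hamming loss, subset accuracy, and micro-averaged F1), the following shows how such metrics are computed from binary label vectors. The label columns and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose entire label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical data: each column is one label (for example toxic, insult, threat).
Y_TRUE = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
Y_PRED = [[1, 0, 0], [0, 0, 0], [1, 1, 0], [0, 0, 1]]

print(hamming_loss(Y_TRUE, Y_PRED))
print(subset_accuracy(Y_TRUE, Y_PRED))
print(micro_f1(Y_TRUE, Y_PRED))
```

Hamming loss rewards partially correct label sets, while subset accuracy counts a sample only when every label matches, so the two can rank models differently.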

Subjects

Deep Learning, Humans, Natural Language Processing, Machine Learning, Language, Algorithms

2.

A Novel Benchmark Dataset for COVID-19 Detection during Third Wave in Pakistan.

Jalil, Zunera; Abbasi, Ahmed; Javed, Abdul Rehman; Khan, Muhammad Badruddin; Abul Hasanat, Mozaherul Hoque; AlTameem, Abdullah; AlKhathami, Mohammed; Jilani Saudagar, Abdul Khader.

Comput Intell Neurosci ; 2022: 6354579, 2022.

Article in English | MEDLINE | ID: mdl-35990145

ABSTRACT

Coronavirus disease (COVID-19) is a highly severe infection caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The polymerase chain reaction (PCR) test is essential to confirm COVID-19 infection, but it has certain limitations: reagents are scarce, the test is time-consuming, and it requires expert clinicians. Clinicians suggest that the PCR test is not a reliable automated COVID-19 patient detection system. This study proposes a machine learning-based approach to evaluate the role of PCR in COVID-19 detection. We collected real data containing 603 COVID-19 samples from the Pakistan Institute of Medical Sciences (PIMS) Hospital in Islamabad, Pakistan, during the third COVID-19 wave. The experiments are separated into two sets. The first set comprises 24 features, including PCR test results, whereas the second comprises 24 features without the PCR test. The findings demonstrate that the decision tree achieves the best detection rate for positive and negative COVID-19 patients in both scenarios. The findings reveal that PCR does not contribute to detecting COVID-19 patients. The findings also aid in the early detection of COVID-19, mainly when PCR test results are insufficient for diagnosis, and can help developing countries with a paucity of PCR tests and specialist facilities.

Subjects

COVID-19, Benchmarking, COVID-19/diagnosis, Humans, Machine Learning, Pakistan/epidemiology, SARS-CoV-2

3.

Authorship identification using ensemble learning.

Abbasi, Ahmed; Javed, Abdul Rehman; Iqbal, Farkhund; Jalil, Zunera; Gadekallu, Thippa Reddy; Kryvinska, Natalia.

Sci Rep ; 12(1): 9537, 2022 06 09.

Article in English | MEDLINE | ID: mdl-35680983

ABSTRACT

Textual data is proliferating over time, primarily through the publication of articles. With this rapid increase in textual data, anonymous content is also increasing. Researchers are searching for alternative strategies to identify the author of an unknown text. There is a need to develop a system that identifies the actual author of unknown texts based on a given set of writing samples. This study presents a novel approach based on ensemble learning, DistilBERT, and conventional machine learning techniques for authorship identification. The proposed approach extracts the valuable characteristics of the author using a count vectorizer and bi-gram term frequency-inverse document frequency (TF-IDF). An extensive and detailed dataset, "All the news", is used in this study for experimentation. The dataset is divided into three subsets (article1, article2, and article3). We limit the scope of the dataset and select 10 authors in the first scope and 20 authors in the second scope for experimentation. The proposed ensemble learning approach and DistilBERT provide better performance on all three subsets of the "All the news" dataset. In the first scope, with 10 authors, the proposed ensemble learning approach provides an accuracy gain of 3.14% and DistilBERT a gain of 2.44% on the article1 dataset. Similarly, in the second scope, with 20 authors, the proposed ensemble learning approach provides an accuracy gain of 5.25% and DistilBERT a gain of 7.17% on the article1 dataset, which is better than previous state-of-the-art studies.
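The feature extraction named in the abstract (a count vectorizer followed by bi-gram TF-IDF) can be illustrated with a minimal standard-library sketch. The tokenization (lowercasing and whitespace splitting), the TF-IDF variant (relative frequency times log(N/df)), and the toy documents are all assumptions made for illustration; the abstract does not specify the authors' exact settings.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigrams(text: str) -> List[Bigram]:
    """Split on whitespace and pair each token with its successor."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def bigram_counts(docs: List[str]) -> List[Counter]:
    """Count-vectorizer step: raw bigram counts per document."""
    return [Counter(bigrams(doc)) for doc in docs]


def tfidf(counts: List[Counter]) -> List[Dict[Bigram, float]]:
    """Weight each bigram by relative frequency times log(N / document frequency)."""
    n_docs = len(counts)
    doc_freq: Counter = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())  # each document counts once per bigram
    weighted = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        weighted.append({
            gram: (freq / total) * math.log(n_docs / doc_freq[gram])
            for gram, freq in doc_counts.items()
        })
    return weighted


# Hypothetical writing samples, one per document.
DOCS = [
    "the market fell sharply today",
    "the market rose sharply today",
    "the senate voted today",
]

counts = bigram_counts(DOCS)
weights = tfidf(counts)
for gram, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(gram, round(weight, 4))
```

Bigrams that occur in only one document, such as ("market", "fell"), receive a higher weight than bigrams shared across documents, which is why such features can help separate authors.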

Subjects

Authorship, Machine Learning

4.

Classification of Non-Functional Requirements From IoT Oriented Healthcare Requirement Document.

Khurshid, Iqra; Imtiaz, Salma; Boulila, Wadii; Khan, Zahid; Abbasi, Almas; Javed, Abdul Rehman; Jalil, Zunera.

Front Public Health ; 10: 860536, 2022.

Article in English | MEDLINE | ID: mdl-35372217

ABSTRACT

The Internet of Things (IoT) involves a set of devices that aid in achieving a smart environment. IoT-oriented healthcare systems provide monitoring services for patients' data and help take immediate steps in an emergency. Currently, machine learning-based techniques are adopted to ensure security and other non-functional requirements in smart healthcare systems. However, no attention is given to classifying the non-functional requirements from requirement documents. The manual process of classifying non-functional requirements from documents is error-prone and laborious. Missing non-functional requirements in the Requirement Engineering (RE) phase result in an IoT-oriented healthcare system with compromised security and performance. In this research, an experiment is performed in which non-functional requirements are classified from an IoT-oriented healthcare system's requirement document. The machine learning algorithms considered for classification are Logistic Regression (LR), Support Vector Machine (SVM), Multinomial Naive Bayes (MNB), K-Nearest Neighbors (KNN), ensemble, Random Forest (RF), and hybrid KNN rule-based machine learning (ML) algorithms. The results show that our novel hybrid KNN rule-based machine learning algorithm outperforms the others, with an average classification accuracy of 75.9% in classifying non-functional requirements from IoT-oriented healthcare requirement documents. This research is novel not only in its concept of using a machine learning approach to classify non-functional requirements from IoT-oriented healthcare system requirement documents, but also in proposing a novel hybrid KNN rule-based machine learning algorithm for classification with better accuracy. A new dataset is also created for classification purposes, comprising requirements related to IoT-oriented healthcare systems. However, since this dataset is small and consists of only 104 requirements, this might affect the generalizability of the results of this research.

Subjects

Documentation/standards, Internet of Things, Bayes Theorem, Delivery of Health Care, Humans, Machine Learning

5.

Evading obscure communication from spam emails.

Rafat, Khan Farhan; Xin, Qin; Javed, Abdul Rehman; Jalil, Zunera; Ahmad, Rana Zeeshan.

Math Biosci Eng ; 19(2): 1926-1943, 2022 01.

Article in English | MEDLINE | ID: mdl-35135236

ABSTRACT

Spam is any form of annoying and unsolicited digital communication sent in bulk that may contain offensive content featuring viruses and cyber-attacks. The voluminous increase in spam has necessitated developing more reliable and robust artificial intelligence-based anti-spam filters. Besides text, an email sometimes contains multimedia content such as audio, video, and images. However, text-centric email spam filtering employing text classification techniques remains today's preferred choice. In this paper, we show that text pre-processing techniques nullify the detection of malicious content in an obscure communication framework. We use the SpamAssassin corpus with and without text pre-processing and examine it using machine learning (ML) and deep learning (DL) algorithms to classify emails as ham or spam. The proposed DL-based approach consistently outperforms ML models. In the first stage, using pre-processing techniques, the long short-term memory (LSTM) model achieves the highest results of 93.46% precision, 96.81% recall, and 95% F1-score. In the second stage, without using pre-processing techniques, LSTM achieves the best results of 95.26% precision, 97.18% recall, and 96% F1-score. Results show the supremacy of DL algorithms over the standard ones in filtering spam. However, the results are unsatisfactory for detecting encrypted communication for both forms of ML algorithms.
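The F1-score is the harmonic mean of precision and recall, so the figures reported above can be checked against each other. The short sketch below does that arithmetic with the precision and recall values from the abstract; the function and variable names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall in percent, as reported for the LSTM model.
with_preprocessing = f1_score(93.46, 96.81)
without_preprocessing = f1_score(95.26, 97.18)

print(round(with_preprocessing, 2))
print(round(without_preprocessing, 2))
```

Rounded to the nearest whole percent, the two values come out at 95 and 96, which agrees with the F1-scores stated for the two stages.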

Subjects

Artificial Intelligence, Electronic Mail, Algorithms, Communication, Machine Learning

6.

BCD-WERT: a novel approach for breast cancer detection using whale optimization based efficient features and extremely randomized tree algorithm.

Abbas, Shafaq; Jalil, Zunera; Javed, Abdul Rehman; Batool, Iqra; Khan, Mohammad Zubair; Noorwali, Abdulfattah; Gadekallu, Thippa Reddy; Akbar, Aqsa.

PeerJ Comput Sci ; 7: e390, 2021.

Article in English | MEDLINE | ID: mdl-33817036

ABSTRACT

Breast cancer is one of the leading causes of death in the current age. It often results in subpar living conditions for patients, as they have to go through expensive and painful treatments to fight this cancer. One in eight women worldwide is affected by this disease. Almost half a million women die from this disease annually. Machine learning algorithms have proven to outperform all existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes, and k-Nearest Neighbor. BCD-WERT outperformed all of them with the highest accuracy rate of 99.30%, followed by SVM with 98.60% accuracy. Experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.

7.

COVID-19 Related Sentiment Analysis Using State-of-the-Art Machine Learning and Deep Learning Techniques.

Jalil, Zunera; Abbasi, Ahmed; Javed, Abdul Rehman; Badruddin Khan, Muhammad; Abul Hasanat, Mozaherul Hoque; Malik, Khalid Mahmood; Saudagar, Abdul Khader Jilani.

Front Public Health ; 9: 812735, 2021.

Article in English | MEDLINE | ID: mdl-35096755

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic has influenced the everyday life of people around the globe. In general and during lockdown phases, people worldwide use social media networks to state their viewpoints and general feelings concerning the pandemic that has hampered their daily lives. Twitter is one of the most commonly used social media platforms, and it showed a massive increase in tweets related to coronavirus, including positive, negative, and neutral tweets, in a minimal period. Because of the diverse nature of tweets, researchers have turned to sentiment analysis to analyze the various emotions of the public toward COVID-19. Meanwhile, people have expressed their feelings regarding the vaccinations' safety and effectiveness on social networking sites such as Twitter. As an advanced step, in this paper, our proposed approach analyzes COVID-19 by focusing on Twitter users who share their opinions on this social media networking site. The proposed approach analyzes the sentiments of collected tweets for sentiment classification using various feature sets and classifiers. The early detection of COVID-19 sentiments from collected tweets allows for a better understanding and handling of the pandemic. Tweets are categorized into positive, negative, and neutral sentiment classes. We evaluate the performance of machine learning (ML) and deep learning (DL) classifiers using evaluation metrics (i.e., accuracy, precision, recall, and F1-score). Experiments show that the proposed approach provides better accuracy of 96.66, 95.22, 94.33, and 93.88% for COVIDSenti, COVIDSenti_A, COVIDSenti_B, and COVIDSenti_C, respectively, compared to all other methods used in this study as well as to the existing approaches and traditional ML and DL algorithms.

Subjects

COVID-19, Deep Learning, Algorithms, Communicable Disease Control, Humans, Machine Learning, SARS-CoV-2, Sentiment Analysis

8.

A comprehensive survey of AI-enabled phishing attacks detection techniques.

Basit, Abdul; Zafar, Maham; Liu, Xuan; Javed, Abdul Rehman; Jalil, Zunera; Kifayat, Kashif.

Telecommun Syst ; 76(1): 139-154, 2021.

Article in English | MEDLINE | ID: mdl-33110340

ABSTRACT

In recent times, phishing has become one of the most prominent attacks faced by internet users, governments, and service-providing organizations. In a phishing attack, the attacker(s) collect the client's sensitive data (e.g., user account login details and credit/debit card numbers) by using spoofed emails or fake websites. Phishing websites are common entry points of online social engineering attacks, including numerous frauds on the websites. In such attacks, the attacker(s) create website pages by copying the behavior of legitimate websites and send URL(s) to the targeted victims through spam messages, texts, or social networking. To provide a thorough understanding of phishing attack(s), this paper provides a literature review of Artificial Intelligence (AI) techniques for phishing attack detection: Machine Learning, Deep Learning, Hybrid Learning, and Scenario-based techniques. This paper also compares different studies detecting phishing attacks for each AI technique and examines the qualities and shortcomings of these methodologies. Furthermore, this paper provides a comprehensive set of current challenges of phishing attacks and future research directions in this domain.

9.

A Pilot Study of Infrared Thermography Based Assessment of Local Skin Temperature Response in Overweight and Lean Women during Oral Glucose Tolerance Test.

Jalil, Bushra; Hartwig, Valentina; Moroni, Davide; Salvetti, Ovidio; Benassi, Antonio; Jalil, Zunera; Pistoia, Laura; Minutoli Tegrimi, Tommaso; Quinones-Galvan, Alfredo; Iervasi, Giorgio; L'Abbate, Antonio; Guiducci, Letizia.

J Clin Med ; 8(2), 2019 Feb 19.

Article in English | MEDLINE | ID: mdl-30791407

ABSTRACT

Obesity is recognized as a major public health issue, as it is linked to an increased risk of severe pathological conditions. The aim of this pilot study is to evaluate the relations between adiposity (and biophysical characteristics) and temperature profiles under thermoneutral conditions in normal and overweight females, investigating the potential role of altered heat production/dissipation in obesity. We used Infrared Thermography (IRT) to evaluate the thermogenic response to a metabolic stimulus provided by an oral glucose tolerance test (OGTT). Thermographic images of the right hand and of the central abdomen (regions of interest) were obtained basally and during the oral glucose tolerance test (3 h OGTT with the ingestion of 75 g of oral glucose) in normal and overweight females. Regional temperature vs BMI, % of body fat, and abdominal skinfold were statistically compared between the two groups. The study showed that mean abdominal temperature was significantly greater in lean than in overweight participants (34.11 ± 0.70 °C compared with 32.92 ± 1.24 °C, p < 0.05). Mean hand temperature was significantly greater in overweight than in lean subjects (31.87 ± 3.06 °C compared with 28.22 ± 3.11 °C, p < 0.05). We observed differences in temperature profiles during OGTT between lean and overweight subjects: the overweight individuals show a flat response compared to the physiological rise observed in lean individuals. This observed difference in thermal pattern suggests an energy rate imbalance towards nutrient storage in the overweight subjects.
