Long text feature extraction network with data augmentation.
Appl Intell (Dordr)
; 52(15): 17652-17667, 2022.
Article
em En
| MEDLINE
| ID: mdl-35400845
The spread of COVID-19 has had a serious impact on either work or the lives of people. With the decrease in physical social contacts and the rise of anxiety on the pandemic, social media has become the primary approach for people to access information related to COVID-19. Social media is rife with rumors and fake news, causing great damage to the Society. Facing shortages, imbalance, and nosiness, the current Chinese data set related to the epidemic has not helped the detection of fake news. Besides, the accuracy of classification was also affected by the easy loss of edge characteristics in long text data. In this paper, long text feature extraction network with data augmentation (LTFE) was proposed, which improves the learning performance of the classifier by optimizing the data feature structure. In the stage of encoding, Twice-Masked Language Modeling for Fine-tuning (TMLM-F) and Data Alignment that Preserves Edge Characteristics (DA-PEC) was proposed to extract the classification features of the Chinese Dataset. Between the TMLM-F and DA-PEC processes, we use Attention to capture the dependencies between words and generate corresponding vector representations. The experimental results illustrate that this method is effective for the detection of Chinese fake news pertinent to the pandemic.
Texto completo:
1
Base de dados:
MEDLINE
Tipo de estudo:
Prognostic_studies
Idioma:
En
Ano de publicação:
2022
Tipo de documento:
Article