Topic-based automatic summarization algorithm for Chinese short text.
Ma, Ting Huai; Wang, Hong Mei; Zhao, Yu Wei; Tian, Yuan; Al-Nabhan, Najla.
Affiliation
  • Ma TH; Nanjing University of Information Science and Technology, Nanjing 210044, China.
  • Wang HM; Nanjing University of Information Science and Technology, Nanjing 210044, China.
  • Zhao YW; Nanjing University of Information Science and Technology, Nanjing 210044, China.
  • Tian Y; Nanjing Institute of Technology, Nanjing 211167, China.
  • Al-Nabhan N; King Saud University, Riyadh 11362, Saudi Arabia.
Math Biosci Eng ; 17(4): 3582-3600, 2020 05 12.
Article in En | MEDLINE | ID: mdl-32987545
ABSTRACT
Most current automatic summarization methods target English text. In Chinese text, words differ greatly from one another, the parts of speech are numerous and complex, and polysemous or ambiguous words appear frequently. As a result, extracting useful feature words is harder for Chinese text than for English text. Because of the complex syntax of Chinese, relatively few automatic summarization methods exist for Chinese text. Earlier methods could only select the important sentences from the original text and arrange them simply, which yields summaries with disordered sentences and insufficient coherence. Moreover, because Chinese short text usually contains more redundant information and its sentence structure is irregular, we propose a topic-based automatic summarization method for Chinese short text. First, we propose a key sentence selection method that combines topic words and TF-IDF to score each text in the original data against each topic. The highest-scoring sentence is then selected as the topic sentence for that topic. Because Weibo short texts may contain much irrelevant information and sometimes lack important components of the topic, we propose three retouching mechanisms to improve the conciseness, richness, and readability of the extracted topic sentences. We validate our approach on natural disaster and social hot event datasets from Sina Weibo. The experimental results show that the polished topic summary not only accurately reflects the relationship between topic sentences and natural disasters or social hot events, but also carries rich semantic information. More importantly, the basic elements of a natural disaster or social hot event can largely be grasped from the topic sentence, which can help the government guide disaster relief and help users quickly obtain information about social hot events.
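The key sentence selection step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this assumes one plausible reading: each text's score for a topic is the sum of the TF-IDF weights of the topic words it contains, and the highest-scoring text becomes the topic sentence. The function names, the smoothed IDF variant, and the assumption that texts are already segmented into tokens are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf} dict per tokenized document.

    Assumption: docs are already word-segmented lists of tokens.
    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, chosen for
    illustration only.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        weights.append({
            tok: (cnt / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        })
    return weights


def select_topic_sentence(docs, topic_words):
    """Score each document against one topic and pick the best.

    Hypothetical scoring: sum of TF-IDF weights of the topic words
    present in the document. Returns (best_index, scores).
    """
    weights = tfidf_weights(docs)
    scores = [sum(w.get(t, 0.0) for t in topic_words) for w in weights]
    best = max(range(len(docs)), key=scores.__getitem__)
    return best, scores
```

Calling `select_topic_sentence(docs, topic_words)` with a list of tokenized Weibo posts and a set of topic words returns the index of the candidate topic sentence together with all scores. A text containing none of the topic words scores zero under this scheme. The retouching mechanisms described in the abstract are not covered by this sketch.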

Full text: 1 Collection: 01-internacional Database: MEDLINE Type of study: Prognostic_studies Language: En Journal: Math Biosci Eng Year: 2020 Document type: Article Affiliation country: China