An Efficient Method of Summarizing Documents Using Impression Measurements

Authors

  • Abdunabi Ubul, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima
  • El-Sayed Atlam, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima
  • Hiroya Kitagawa, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima
  • Masao Fuketa, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima
  • Kazuhiro Morita, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima
  • Jun-ichi Aoe, Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima

Keywords:

Impressive expressions, NMF methods, precision, relevancy

Abstract

Automatic generic document summarization based on unsupervised schemes is a very useful approach because it does not require training data. Although techniques using latent semantic analysis (LSA) and non-negative matrix factorization (NMF) have been applied to determine the topics of documents, no prior work has addressed reducing the matrix size or speeding up the computation of the NMF method. To address this, this paper uses generic impressive expressions from newspapers to extract important sentences as the summary. Consequently, the method requires neither stemming nor stop-word filtering. Novels are the typical kind of document that leaves a sentimental impression on readers. Newspapers, however, convey a different kind of impression, one tied to new knowledge, because they inform readers about current events and carry informative articles and diverse features. The proposed method introduces impressive expressions for newspapers and applies their measurements to the NMF method. Experiments on 100 KB of text data show that the proposed method reduces the matrix size by 80% and makes the NMF computation 7 times faster than the original method, without degrading the relevance of the extracted sentences.
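
The general idea of NMF-based sentence extraction over a reduced matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the expression list, the function names (build_matrix, nmf, summarize), the multiplicative-update solver, and the sentence-scoring rule are all assumptions chosen for illustration. The sketch only shows why restricting matrix rows to a small set of expressions shrinks the matrix that NMF must factorize.

```python
import numpy as np

def build_matrix(sentences, expressions):
    """Rows are expressions, columns are sentences (raw occurrence counts).

    Only the listed expressions become rows, so no stemming or stop-word
    filtering is performed. The expression list is a hypothetical input.
    """
    return np.array(
        [[s.lower().count(e) for s in sentences] for e in expressions],
        dtype=float,
    )

def nmf(A, k, iters=200, seed=0):
    """Factorize A ~ W @ H with standard multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + 1e-3
    H = rng.random((k, n)) + 1e-3
    eps = 1e-9  # avoids division by zero
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def summarize(sentences, expressions, k=2, n=2):
    """Return the n highest-scoring sentences in their original order.

    Scoring is an assumed rule: each topic is weighted by its share of the
    total activation in H, and a sentence's score is the weighted sum of
    its topic activations.
    """
    A = build_matrix(sentences, expressions)
    _, H = nmf(A, k)
    topic_weight = H.sum(axis=1) / (H.sum() + 1e-9)
    scores = topic_weight @ H
    top = sorted(np.argsort(-scores)[:n])
    return [sentences[i] for i in top]
```

With a full vocabulary the matrix has one row per distinct term; here it has one row per expression, so its size scales with the length of the expression list rather than with the vocabulary.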

Published

2013-05-23

How to Cite

Ubul, A., Atlam, E.-S., Kitagawa, H., Fuketa, M., Morita, K., & Aoe, J.-ichi. (2013). An Efficient Method of Summarizing Documents Using Impression Measurements. COMPUTING AND INFORMATICS, 32(2), 371–391. Retrieved from http://www.cai.sk/ojs/index.php/cai/article/view/1626