Compact Indexes Based on Core Content in Personal Dataspace Management System

Authors

  • Ning Wang, School of Computer and Information Technology, Beijing Jiaotong University, No. 3 Shangyuancun, Haidian District, 100044 Beijing
  • Hongfang Du, School of Computer and Information Technology, Beijing Jiaotong University, No. 3 Shangyuancun, Haidian District, 100044 Beijing
  • Baomin Xu, School of Computer and Information Technology, Beijing Jiaotong University, No. 3 Shangyuancun, Haidian District, 100044 Beijing
  • Guojun Dai, Computer School, Hangzhou Dianzi University, Hangzhou, 310018

Keywords:

Keyword query, indexing, result quality, semantic analysis, personal dataspace management system

Abstract

A Personal DataSpace Management System is a platform for managing personal data of heterogeneous types, in which keyword query is a primary query form for users who know little about the structure of the dataspace. Unlike exploratory queries in web search, a user of a personal dataspace usually has a specific search target and wants to find items already known to exist. To improve result quality in terms of query relevance in a personal dataspace, we propose the concept of a compact index in this paper. We refer to the most important and representative semantics of a document as its core content, and build a compact index on it. We propose an algorithm for selecting core content from a document based on semantic analysis, which can process English and Chinese documents uniformly. Furthermore, a software platform named Versatile is introduced for flexible personal data management, in which core content is extracted for building compact indexes and generating query-biased snippets efficiently and accurately. Finally, extensive experiments have been conducted to show the effectiveness and feasibility of compact indexes in a personal dataspace management system.
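
The idea of indexing only core content can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the paper selects core content by semantic analysis, whereas this sketch assumes a simple term-frequency stand-in, and the names `core_content`, `build_compact_index`, `search`, the stopword list, and the sample documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def core_content(text, k=2):
    """Stand-in for core-content selection: the k most frequent non-stopword terms.
    Assumption: term frequency replaces the paper's semantic analysis."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def build_compact_index(docs, k=2):
    """Inverted index over core terms only: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in core_content(text, k):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents whose core content contains every query keyword."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical sample documents, for illustration only.
docs = {
    "trip.txt": "Flight booking for the Beijing trip. Flight leaves Monday; "
                "booking confirmed. Budget attached.",
    "budget.txt": "Budget review: budget totals and budget review notes. "
                  "A flight was mentioned once.",
}
index = build_compact_index(docs, k=2)
```

In this sketch, a query for "budget" matches only `budget.txt`, even though `trip.txt` mentions the word in passing, because incidental terms never enter the compact index. That is the intended trade-off: a smaller index and fewer marginally relevant results, at the cost of missing documents whose relevant terms fall outside the selected core content.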

Published

2014-06-24

How to Cite

Wang, N., Du, H., Xu, B., & Dai, G. (2014). Compact Indexes Based on Core Content in Personal Dataspace Management System. COMPUTING AND INFORMATICS, 33(2), 281–302. Retrieved from https://www.cai.sk/ojs/index.php/cai/article/view/999