A Double Scoring Method for XML Element Retrieval

Authors

  • Tanakorn Wichaiwong, Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok
  • Chuleerat Jaruskulchai, Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok

Keywords

Ranking strategies, indexing units, XML retrieval, BM25F

Abstract

Efficient retrieval of XML elements and documents is essential to the effective use of the XML format. The BM25F ranking function combines several document fields with potentially different degrees of importance; these selected fields give substantial improvements over the baseline BM25. BM25F has performed well in past evaluations; however, two issues require additional attention. First, which elements should be treated as fields? Second, what is an appropriate weight for each field? Previously, document fields were selected manually, and the weight for each chosen field was tuned before being assigned. This paper introduces two automatic methods that extract fields from document-centric XML documents and assign weights to the selected fields. Our experiments show an improvement of up to 28% over BM25 and up to 15% over BM25F at iP[0.01], based on INEX evaluations.
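
To make the role of fields and field weights concrete, the following is a minimal sketch of the commonly used BM25F formulation with per-field length normalization. It is not the automatic field-extraction or weight-assignment method of this paper. The function name `bm25f_score`, the field names, the sample weights, and the non-negative IDF variant are all assumptions chosen for illustration.

```python
import math


def bm25f_score(query_terms, doc_fields, field_weights, avg_field_len,
                doc_freq, num_docs, k1=1.2, b=0.75):
    """Score one document with a simple BM25F (hypothetical helper).

    doc_fields:    field name -> list of tokens in that field
    field_weights: field name -> boost (assumed values, normally tuned)
    avg_field_len: field name -> average token count of that field
    doc_freq:      term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        # Combine the per-field term frequencies into one weighted
        # pseudo-frequency, normalizing each field by its own length.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            norm = 1.0 - b + b * len(tokens) / avg_field_len[field]
            pseudo_tf += field_weights.get(field, 1.0) * tf / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # Assumption: the log(1 + ...) IDF variant, which stays non-negative.
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        # Saturation is applied once, after the fields are combined.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Illustrative data only: a tiny document with two fields.
doc = {"title": ["xml", "retrieval"],
       "body": ["ranking", "of", "xml", "elements"]}
avg_len = {"title": 2.0, "body": 4.0}
df = {"xml": 2}

flat = bm25f_score(["xml"], doc, {"title": 1.0, "body": 1.0}, avg_len, df, 10)
boosted = bm25f_score(["xml"], doc, {"title": 3.0, "body": 1.0}, avg_len, df, 10)
```

With all field weights equal, the score depends only on the length-normalized term counts; raising the `title` weight increases the score of documents that match in the title. Choosing which elements become fields, and what weight each receives, are the two questions the abstract raises.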

Published

2013-05-23

How to Cite

Wichaiwong, T., & Jaruskulchai, C. (2013). A Double Scoring Method for XML Element Retrieval. COMPUTING AND INFORMATICS, 32(2), 411–440. Retrieved from http://www.cai.sk/ojs/index.php/cai/article/view/1629