Web retrieval studies have mostly represented Web documents using the URL, title, body, and anchor text fields. HTML standards, however, provide a rich set of elements for defining different parts of a Web page. For example, meta elements provide structured metadata about a Web page that is intended for browsers and crawlers rather than end users. Because most previous studies relied on the URL, title, body, and anchor text fields, it remains unclear whether meta tags are useful for Web retrieval. In this work, we examine the usefulness of two meta tags, namely keywords and description, on the ad-hoc tasks of previous TREC studies. Experiments on the standard TREC Web datasets and several query sets, using state-of-the-art term-weighting models, show that using the description field systematically increases retrieval effectiveness, and that the increase is statistically significant most of the time. By contrast, using the keywords field may cause a significant deterioration in retrieval effectiveness for certain term-weighting models.
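To make the two fields concrete, the following minimal sketch shows one way the keywords and description meta tags could be pulled out of a page as separate fields alongside the title. It is an illustration only, not the extraction pipeline used in the study; the names `MetaFieldParser` and `extract_fields` are hypothetical, and the sketch relies solely on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class MetaFieldParser(HTMLParser):
    """Hypothetical helper: collects the title text and the content of the
    keywords and description meta tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "keywords": "", "description": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # The value of the name attribute is matched case-insensitively.
            name = (attr.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.fields[name] = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside the <title> element.
        if self._in_title:
            self.fields["title"] += data


def extract_fields(html):
    """Return a dict with the title, keywords, and description fields."""
    parser = MetaFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    sample = (
        "<html><head><title>Demo</title>"
        '<meta name="Description" content=" A page. ">'
        '<meta name="keywords" content="a, b">'
        "</head><body>x</body></html>"
    )
    print(extract_fields(sample))
```

Each extracted field could then be indexed separately, so that a term-weighting model can be evaluated with or without the keywords and description fields.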
Primary Language | English |
---|---|
Subjects | Engineering |
Journal Section | Articles |
Authors | |
Publication Date | March 31, 2020 |
Published in Issue | Year 2020 |