The Search Evaluation Dataset
This dataset was created to support research on session search evaluation. We conducted a user study comprising 450 search sessions across 9 search tasks. Users' interactions and explicit feedback were collected during the search process, and document relevance assessments were collected through a crowdsourcing platform.
Motivation
User satisfaction is an important variable in Web search evaluation studies and has received growing attention in recent years. Many studies regard user satisfaction as the ground truth for designing better evaluation metrics. However, most existing studies focus on designing Cranfield-like evaluation metrics that reflect user satisfaction at the query level.
As information needs become more complex, users often need multiple queries and multi-round search interactions to complete a search task (e.g., exploratory search). In these cases, how to characterize a user's satisfaction over a whole search session remains to be investigated.
In this study, we collect a dataset through a laboratory study in which users complete several complex search tasks. With the help of hierarchical linear models (HLMs), we try to reveal how users' query-level and session-level satisfaction are affected by different cognitive effects.
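To illustrate the kind of analysis an HLM enables (this is a minimal sketch, not the exact model from the paper), the snippet below fits a mixed-effects linear model with `statsmodels`, treating queries as nested within participants via a per-participant random intercept. The column names (`user_id`, `query_sat`, `session_sat`) and the file name are assumptions for the example.

```python
# A minimal HLM sketch, assuming a per-query table with hypothetical
# columns: user_id, query_sat (1-5), session_sat (1-5).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("sessions.csv")  # hypothetical file name

# Random intercept per participant absorbs individual rating tendencies;
# the fixed effect estimates how query-level satisfaction relates to
# session-level satisfaction.
model = smf.mixedlm("session_sat ~ query_sat", data=df, groups=df["user_id"])
result = model.fit()
print(result.summary())
```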
Data description
The dataset consists of 450 search sessions across 9 tasks. For each search task, the participant first reads and memorizes the task description, then repeats it without viewing it. The participant can submit queries and click on results to collect information, as they usually do with commercial search engines. They are asked to rate the usefulness of each clicked document (4-level) and to give 5-level graded satisfaction feedback for each query. Finally, they are required to give an answer to the search task and an overall 5-level graded satisfaction rating of the search experience in the task.
We also collected relevance assessments (4-level) for all the documents in our user study through a popular Chinese crowdsourcing platform. The fields are described in the following table.
| Measure | Type | Description |
|---|---|---|
| task | index (1–9) | Index used to distinguish tasks |
| query | text | User-submitted query |
| clicked_url | URL | URL clicked by the user |
| start/end time | numerical | The time at which the user behavior occurs |
| usefulness | 1 (low)–4 (high) | User's usefulness feedback on a document |
| query satisfaction | 1 (low)–5 (high) | User's satisfaction feedback on a search query |
| session satisfaction | 1 (low)–5 (high) | User's satisfaction feedback on a search session |
| answer | text | User's answer to a task |
| relevance | 0 (low)–3 (high) | Crowdsourced relevance annotation |
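For readers who want to work with the released files, here is a hedged loading sketch using pandas. The file name `search_log.csv` and the exact column names are assumptions based on the table above; the actual release format may differ.

```python
# Loading sketch; file name and column names are assumptions,
# not the guaranteed release format.
import pandas as pd

log = pd.read_csv("search_log.csv")

# Per-task mean query satisfaction (1-5) and mean usefulness (1-4).
per_task = log.groupby("task").agg(
    mean_query_sat=("query satisfaction", "mean"),
    mean_usefulness=("usefulness", "mean"),
)
print(per_task)
```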
How to obtain the full dataset
We provide the data used in the paper we published at the KDD 2019 conference. To obtain the full dataset, which contains the detailed user behavior logs, please contact us (maojiaxin@gmail.com). After you sign an application form online, we will send you the data.
Citation
If you use this dataset in your research, please cite it with the following BibTeX entry. A preprint of the paper can be found here.
@inproceedings{DBLP:conf/kdd/LiuMLZM19,
author = {Mengyang Liu and
Jiaxin Mao and
Yiqun Liu and
Min Zhang and
Shaoping Ma},
title = {Investigating Cognitive Effects in Session-level Search User Satisfaction},
booktitle = {Proceedings of the 25th {ACM} {SIGKDD} International Conference on
Knowledge Discovery {\&} Data Mining, {KDD} 2019, Anchorage, AK,
USA, August 4-8, 2019},
pages = {923--931},
year = {2019},
crossref = {DBLP:conf/kdd/2019},
url = {https://doi.org/10.1145/3292500.3330981},
doi = {10.1145/3292500.3330981},
timestamp = {Mon, 04 Nov 2019 09:51:27 +0100},
biburl = {https://dblp.org/rec/bib/conf/kdd/LiuMLZM19},
bibsource = {dblp computer science bibliography, https://dblp.org}
}