Abstract
Most prior work on goal-oriented dialog systems has concentrated on systems that rely heavily on the relevant domain APIs to generate a response. In the real world, however, users frequently make requests that the provided APIs cannot handle; we call these 'off-script' queries. Ideally, existing information retrieval approaches could leverage an enterprise's unstructured data sources to retrieve the information needed to synthesize responses to such queries. In multi-turn dialogs, however, these queries are often not self-contained, which renders most existing information retrieval methods ineffective, and the dialog system ends up responding 'sorry I don't know this'. That is, off-script queries may refer to entities from previous dialog turns (often through pronouns) or may not mention the referred entities at all. These two problems are known as coreference resolution and ellipsis, respectively, and both have been studied extensively in supervised settings. In this paper, we first build a dataset of off-script and contextual user queries for goal-oriented dialog systems. We then propose a zero-label approach that rewrites a contextual query as a self-contained one by leveraging the dialog's state. We propose two parallel pipelines, for coreference and ellipsis resolution, to synthesize candidate queries; we rank and select the candidates with the pre-trained language model GPT-2 and refine the selected self-contained query with pre-trained BERT. We show that our approach produces higher-quality expanded questions than state-of-the-art supervised methods, both on our dataset and on existing datasets. The key advantage of our zero-label approach is that it requires no labeled training data and can be applied to any domain, in contrast to previous work that requires labeled training data for each new domain.
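The candidate-generation and ranking steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pronoun list, the prepositions used for ellipsis candidates, and the inclusion of the original query as a candidate are all assumptions made for illustration. In particular, `toy_score` is a hand-written stand-in for a GPT-2 based score (lower is better), and the BERT refinement step is omitted.

```python
from typing import Callable, List, Sequence

# Hypothetical pronoun list; a real system would need a fuller inventory.
PRONOUNS = {"it", "they", "them", "there", "that", "this"}
PUNCT = "?.,!"


def coreference_candidates(query: str, entities: Sequence[str]) -> List[str]:
    """Replace each pronoun in the query with each dialog-state entity."""
    tokens = query.split()
    candidates = []
    for i, tok in enumerate(tokens):
        stem = tok.rstrip(PUNCT)
        if stem.lower() in PRONOUNS:
            tail = tok[len(stem):]  # keep trailing punctuation such as "?"
            for ent in entities:
                candidates.append(" ".join(tokens[:i] + [ent + tail] + tokens[i + 1:]))
    return candidates


def ellipsis_candidates(
    query: str, entities: Sequence[str], preps: Sequence[str] = ("for", "at", "in")
) -> List[str]:
    """Append each entity to the query with an (assumed) preposition."""
    stem = query.rstrip(PUNCT + " ")
    tail = query[len(stem):].strip()
    return [f"{stem} {p} {ent}{tail}" for ent in entities for p in preps]


def rewrite(query: str, entities: Sequence[str], score: Callable[[str], float]) -> str:
    """Pool candidates from both pipelines and return the best-scoring one."""
    candidates = [query]  # assumption: the query may already be self-contained
    candidates += coreference_candidates(query, entities)
    candidates += ellipsis_candidates(query, entities)
    return min(candidates, key=score)


def toy_score(text: str) -> float:
    """Stand-in for a language-model score: penalize unresolved pronouns and length."""
    words = [w.strip(PUNCT).lower() for w in text.split()]
    return sum(w in PRONOUNS for w in words) + 0.01 * len(words)


# Entities would come from the dialog state, e.g. a restaurant named earlier.
print(rewrite("Do they have parking?", ["Pizza Hut"], toy_score))
# prints: Do Pizza Hut have parking?
```

Passing the scorer as a parameter keeps the sketch independent of any particular language model; a model-based perplexity function could be supplied in place of `toy_score`. The toy scorer only handles the pronoun case: it always prefers the shorter original query when no pronoun is present, so it cannot select ellipsis candidates.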
Original language | English |
---|---|
Title of host publication | Proceedings - 16th IEEE International Conference on Semantic Computing, ICSC 2022 |
Pages | 217-224 |
Number of pages | 8 |
ISBN (Electronic) | 9781665434188 |
State | Published - 2022 |
Event | 16th IEEE International Conference on Semantic Computing, ICSC 2022 - Virtual, Online, United States. Duration: Jan 26 2022 → Jan 28 2022 |
Publication series
Name | Proceedings - 16th IEEE International Conference on Semantic Computing, ICSC 2022 |
---|---|
Conference
Conference | 16th IEEE International Conference on Semantic Computing, ICSC 2022 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 1/26/22 → 1/28/22 |
Bibliographical note
Publisher Copyright: © 2022 IEEE.
Keywords
- Contextual Query Rewrite
- Dialog Systems
- Goal-Oriented Dialog Systems
- Zero-label Learning
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Networks and Communications
- Computer Science Applications
- Information Systems and Management