Zero-label Anaphora Resolution for Off-Script User Queries in Goal-Oriented Dialog Systems

M. H. Maqbool, Luxun Xu, A. B. Siddique, Niloofar Montazeri, Vagelis Hristidis, Hassan Foroosh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Most prior work on goal-oriented dialog systems has concentrated on developing systems that rely heavily on the relevant domain APIs to generate a response. In the real world, however, users frequently make requests that the provided APIs cannot handle; we call these 'off-script' queries. Ideally, existing information retrieval approaches could leverage the relevant enterprise's unstructured data sources to retrieve the appropriate information and synthesize responses for such queries. In multi-turn dialogs, however, these queries are often not self-contained, which renders most existing information retrieval methods ineffective, and the dialog system ends up responding 'sorry I don't know this'. That is, off-script queries may mention entities from previous dialog turns (often expressed through pronouns) or may not mention the referred entities at all. These two problems are known as coreference resolution and ellipsis, respectively, and both have been studied extensively in supervised settings. In this paper, we first build a dataset of off-script and contextual user queries for goal-oriented dialog systems. Then, we propose a zero-label approach that rewrites the contextual query as a self-contained one by leveraging the dialog's state. We propose two parallel pipelines, one for coreference resolution and one for ellipsis resolution, to synthesize candidate queries; we rank and select the candidates using the pre-trained language model GPT-2, and refine the selected self-contained query with the pre-trained BERT. We show that our approach leads to higher-quality expanded questions than state-of-the-art supervised methods, both on our dataset and on existing datasets. The key advantage of our novel zero-label approach is that it requires no labeled training data and can be applied to any domain seamlessly, in contrast to previous work that requires labeled training data for each new domain.
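The candidate-and-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the pronoun list, the "at" template used for ellipsis, the function names, and the dialog-state layout are all hypothetical, and the scoring function is left pluggable. In the paper's setting the score would presumably come from a pre-trained language model such as GPT-2 (for example, a perplexity-style score where lower is better); the BERT refinement step is not modeled here.

```python
from typing import Callable, Dict, List

# Hypothetical pronoun list, for illustration only.
PRONOUNS = {"it", "they", "them"}
PUNCT = "?.,!"


def coreference_candidates(query: str, entities: List[str]) -> List[str]:
    """Replace each pronoun in the query with each dialog-state entity."""
    tokens = query.split()
    candidates = []
    for i, tok in enumerate(tokens):
        core = tok.rstrip(PUNCT)
        if core.lower() in PRONOUNS:
            trailing = tok[len(core):]  # keep punctuation attached to the pronoun
            for ent in entities:
                rewritten = tokens[:i] + [ent + trailing] + tokens[i + 1:]
                candidates.append(" ".join(rewritten))
    return candidates


def ellipsis_candidates(query: str, entities: List[str]) -> List[str]:
    """Append each entity to the query, for queries that omit the referent.

    The "at <entity>" template is an assumption made for this sketch.
    """
    body = query.rstrip(PUNCT + " ")
    end = query[len(body):].strip()
    return [f"{body} at {ent}{end}" for ent in entities]


def select_rewrite(
    query: str,
    dialog_state: Dict[str, str],
    score: Callable[[str], float],
) -> str:
    """Pool candidates from both pipelines and return the lowest-scoring one."""
    entities = list(dialog_state.values())
    candidates = coreference_candidates(query, entities)
    candidates += ellipsis_candidates(query, entities)
    if not candidates:
        return query  # nothing to resolve against; keep the query unchanged
    return min(candidates, key=score)


if __name__ == "__main__":
    state = {"hotel_name": "Hilton Hotel", "city": "Orlando"}

    def toy_score(text: str) -> float:
        # Placeholder so the sketch runs; a real scorer would be an LM score.
        return float(len(text))

    print(select_rewrite("Does it have parking?", state, toy_score))
```

Keeping the scorer as a plain callable means a language-model score can be swapped in without touching the candidate generators, which is the only design point this sketch is meant to convey.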

Original language: English
Title of host publication: Proceedings - 16th IEEE International Conference on Semantic Computing, ICSC 2022
Pages: 217-224
Number of pages: 8
ISBN (Electronic): 9781665434188
State: Published - 2022
Event: 16th IEEE International Conference on Semantic Computing, ICSC 2022 - Virtual, Online, United States
Duration: Jan 26 2022 → Jan 28 2022

Publication series

Name: Proceedings - 16th IEEE International Conference on Semantic Computing, ICSC 2022

Conference

Conference: 16th IEEE International Conference on Semantic Computing, ICSC 2022
Country/Territory: United States
City: Virtual, Online
Period: 1/26/22 → 1/28/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

  • Contextual Query Rewrite
  • Dialog Systems
  • Goal-Oriented Dialog Systems
  • Zero-label Learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems and Management
