Deep keyphrase generation

Rui Meng, Sanqiang Zhao, Shuguang Han, Daqing He, Peter Brusilovsky, Yu Chi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

253 Scopus citations

Abstract

Keyphrases provide highly summative information that can be used effectively for understanding, organizing, and retrieving text content. Although previous studies have provided many workable solutions for automated keyphrase extraction, they commonly divide the to-be-summarized content into multiple text chunks, then rank and select the most meaningful ones. These approaches can neither identify keyphrases that do not appear in the text nor capture the real semantic meaning behind the text. We propose a generative model for keyphrase prediction with an encoder-decoder framework, which can effectively overcome these drawbacks. We name it deep keyphrase generation because it attempts to capture the deep semantic meaning of the content with a deep learning method. Empirical analysis on six datasets demonstrates that the proposed model not only achieves a significant performance boost in extracting keyphrases that appear in the source text but can also generate absent keyphrases based on the semantic meaning of the text. Code and dataset are available at https://github.com/memray/seq2seqkeyphrase.
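
The distinction the abstract draws between keyphrases that appear in the source text and absent keyphrases can be illustrated with a minimal Python sketch. The function name `split_present_absent`, the lowercase whitespace tokenization, and the sample inputs are assumptions made for illustration; they are not the authors' preprocessing or evaluation code.

```python
def split_present_absent(source_text, keyphrases):
    """Partition keyphrases by whether they occur verbatim in the source text.

    Hypothetical helper: matching is done on lowercased, whitespace-split
    tokens, which is an assumption and not the paper's preprocessing.
    """
    tokens = source_text.lower().split()
    present, absent = [], []
    for phrase in keyphrases:
        phrase_tokens = phrase.lower().split()
        n = len(phrase_tokens)
        # A phrase is "present" if its tokens appear as a contiguous run.
        found = n > 0 and any(
            tokens[i:i + n] == phrase_tokens
            for i in range(len(tokens) - n + 1)
        )
        (present if found else absent).append(phrase)
    return present, absent


# Illustrative inputs only; the keyphrase list is made up for this sketch.
text = ("We propose a generative model for keyphrase prediction "
        "with an encoder-decoder framework")
gold = ["generative model", "keyphrase prediction",
        "sequence to sequence learning"]
present, absent = split_present_absent(text, gold)
```

An extraction-only method can at best return the phrases in `present`; recovering those in `absent` requires a model that generates text rather than selecting spans, which is the motivation the abstract gives for the generative approach.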

Original language: English
Title of host publication: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Pages: 582-592
Number of pages: 11
ISBN (Electronic): 9781945626753
State: Published - 2017
Event: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017 - Vancouver, Canada
Duration: Jul 30 2017 to Aug 4 2017

Publication series

Name: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Volume: 1

Conference

Conference: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Country/Territory: Canada
City: Vancouver
Period: 7/30/17 to 8/4/17

Bibliographical note

Funding Information:
We would like to thank Jiatao Gu and Miltiadis Allamanis for sharing the source code and giving helpful advice. We also thank Wei Lu, Yong Huang, Qikai Cheng and other IRLAB members at Wuhan University for their assistance with dataset development. This work is partially supported by the National Science Foundation under Grant No. 1525186.

Publisher Copyright:
© 2017 Association for Computational Linguistics.

Funding

Funders | Funder number
National Science Foundation Arctic Social Science Program | 1525186
Wuhan University |

    ASJC Scopus subject areas

    • Language and Linguistics
    • Artificial Intelligence
    • Software
    • Linguistics and Language
