Text retrieval using sparsified concept decomposition matrix

Jing Gao, Jun Zhang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We examine text retrieval strategies using the sparsified concept decomposition matrix. The centroid vector of a tightly structured text collection provides a general description of text documents in that collection. The union of the centroid vectors forms a concept matrix. The original text data matrix can be projected into the concept space spanned by the concept vectors. We propose a procedure to conduct text retrieval based on the sparsified concept decomposition (SCD) matrix. Our experimental results show that text retrieval based on SCD may enhance the retrieval accuracy and reduce the storage cost, compared with the popular text retrieval technique based on latent semantic indexing with singular value decomposition.
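The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the clustering method, the sparsification rule, or the scoring function, so everything below is an assumption made for illustration. Cluster labels are taken as given, the concept matrix is sparsified by dropping entries below a hypothetical threshold `eps`, the data matrix is projected onto the concept vectors by least squares, and documents are ranked by cosine similarity to the query. The function names and the toy matrix are invented for this example.

```python
import numpy as np

def concept_matrix(A, labels, k):
    """Build a terms-by-k matrix whose columns are unit-length cluster centroids."""
    C = np.zeros((A.shape[0], k))
    for j in range(k):
        centroid = A[:, labels == j].mean(axis=1)
        C[:, j] = centroid / np.linalg.norm(centroid)
    return C

def sparsify(C, eps):
    """Zero out entries with magnitude below eps (assumed rule, not from the paper)."""
    return np.where(np.abs(C) >= eps, C, 0.0)

def decompose(A, C):
    """Least-squares coefficients Z minimizing ||A - C Z|| in the Frobenius norm."""
    Z, *_ = np.linalg.lstsq(C, A, rcond=None)
    return Z

def retrieve(q, C, Z):
    """Rank documents by cosine similarity between q and the approximation C Z."""
    approx = C @ Z
    norms = np.linalg.norm(approx, axis=0) * np.linalg.norm(q)
    scores = (q @ approx) / np.where(norms == 0, 1.0, norms)
    return np.argsort(-scores), scores

# Toy term-document matrix (5 terms, 4 documents); values are made up.
A = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
])
A = A / np.linalg.norm(A, axis=0)   # unit-length document vectors
labels = np.array([0, 0, 1, 1])     # assumed output of a prior clustering step

C = concept_matrix(A, labels, k=2)
C_s = sparsify(C, eps=0.05)         # fewer nonzeros means less storage
Z = decompose(A, C_s)

q = np.array([1.0, 1.0, 0.0, 0.0, 0.0])  # query using the first two terms
ranking, scores = retrieve(q, C_s, Z)
print(ranking, np.round(scores, 3))
```

In this toy case the sparsified concept matrix keeps only the dominant terms of each cluster, and the query is matched against the two documents of the first cluster. How well such a scheme compares with latent semantic indexing on real collections is what the paper's experiments address; the sketch makes no claim about that.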

Original language: English
Pages (from-to): 523-529
Number of pages: 7
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3314
State: Published - 2004

Bibliographical note

Funding Information:
The research work of the authors was supported in part by the U.S. National Science Foundation under grants CCR-0092532 and ACR-0202934, and in part by the U.S. Department of Energy Office of Science under grant DE-FG02-02ER45961.

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
