Comparison of overlap detection techniques

Krisztián Monostori, Raphael Finkel, Arkady Zaslavsky, Gábor Hodász, Máté Pataki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

25 Scopus citations


Easy access to the World Wide Web has raised concerns about copyright issues and plagiarism. It is easy to copy someone else's work and submit it as one's own. Many systems have targeted this problem, and they use very similar approaches. This paper compares these approaches and suggests when particular strategies are more applicable than others. Some alternative approaches are proposed that perform better than previously presented methods. These previous methods share two common stages: chunking of documents and selection of representative chunks. We study both stages and also propose alternatives that are better in terms of accuracy and space requirements. The applications of these methods are not limited to plagiarism detection but may target other copy-detection problems. We also propose a third stage to be applied in the comparison that uses suffix trees and suffix vectors to identify the overlapping chunks.
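The two common stages named in the abstract, chunking and selection of representative chunks, can be illustrated with a minimal Python sketch. It is not taken from the paper: the overlapping word k-gram chunking, the "hash value divisible by `modulus`" selection rule, and the names `chunks`, `fingerprint`, `overlap`, `k`, and `modulus` are all assumptions chosen for illustration, and the suffix-tree third stage is not shown.

```python
import hashlib


def chunks(text, k=3):
    """Stage 1 (assumed strategy): overlapping chunks of k consecutive words."""
    words = text.lower().split()
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def fingerprint(text, k=3, modulus=4):
    """Stage 2 (assumed strategy): keep only chunks whose hash is 0 mod `modulus`.

    A larger modulus keeps fewer chunks, trading accuracy for space.
    """
    selected = set()
    for chunk in chunks(text, k):
        digest = hashlib.md5(chunk.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # first 64 bits of the hash
        if value % modulus == 0:
            selected.add(value)
    return selected


def overlap(a, b, k=3, modulus=4):
    """Fraction of the smaller fingerprint that also occurs in the other one."""
    fa = fingerprint(a, k, modulus)
    fb = fingerprint(b, k, modulus)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / min(len(fa), len(fb))
```

Calling `overlap(doc_a, doc_b, k=3, modulus=1)` keeps every chunk and so compares the documents on all of their word 3-grams; raising `modulus` shrinks the stored fingerprint at the cost of possibly missing short overlaps.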

Original language: English
Title of host publication: Computational Science, ICCS 2002 - International Conference, Proceedings
Number of pages: 10
Edition: PART 1
State: Published - 2002
Event: International Conference on Computational Science, ICCS 2002 - Amsterdam, Netherlands
Duration: Apr 21 2002 to Apr 24 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 2329 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: International Conference on Computational Science, ICCS 2002

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

