Detail of Publication

Text Language Japanese
Authors Masaki Nouso, Koichi Kise, Masakazu Iwamura
Title Speeding up Local Text Reuse Detection by Locality-Sensitive Hashing
Journal Proc. First Forum on Data Engineering and Information Management
Presentation number D8-6
Reviewed or not Not reviewed
Month & Year March 2009
Abstract Local text reuse detection is a technique for detecting partial copies of documents and is useful for plagiarism detection. A central problem for local text reuse detection methods is computing the similarity of text parts efficiently, since the computation is generally based on nearest neighbor search, whose cost depends on the size of the database. To solve this problem, we propose a local text reuse detection method that uses approximate nearest neighbor search. Both Locality-Sensitive Hashing (LSH) and its variant, Spherical LSH, are employed to speed up the processing.
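
The idea of replacing exhaustive nearest neighbor search with hashing can be illustrated with a minimal sketch. It assumes random-hyperplane LSH over cosine similarity, which is one common LSH family and not necessarily the one used in the paper; the abstract does not give the hash family, the feature vectors, or the parameters, so every name below (`make_hyperplanes`, `hash_key`, `build_index`, `query`) is hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (assumed hash family, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_key(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group database vectors into buckets by their hash key."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[hash_key(vec, hyperplanes)].append(idx)
    return buckets


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def query(vec, vectors, buckets, hyperplanes):
    """Compare the query only against vectors in its own bucket.

    Returns the index of the most similar candidate, or None if the
    bucket is empty (an approximate search can miss the true neighbor).
    """
    candidates = buckets.get(hash_key(vec, hyperplanes), [])
    if not candidates:
        return None
    return max(candidates, key=lambda i: cosine(vec, vectors[i]))


# Toy database: each vector stands in for the features of one text part.
database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
planes = make_hyperplanes(num_bits=8, dim=3)
index = build_index(database, planes)
match = query([0.0, 2.0, 0.0], database, index, planes)
```

The query is compared only with the entries in its own bucket, so the cost no longer depends directly on the size of the whole database; the trade-off is that a true nearest neighbor in a different bucket is missed, which practical systems usually mitigate by using several hash tables.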