Journal article

An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs

Abstract:
Many modern applications of AI, such as web search, mobile browsing, image processing, and natural language processing, rely on finding similar items in a large database of complex objects. Due to the very large scale of data involved (e.g., users’ queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations that improve performance and are suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall than “vanilla” LSH, even when using the same amount of space.
Graham Cormode; Anirban Dasgupta; Amit Goyal; Chi Hoon Le
Publication status:
Published
Peer review status:
Peer reviewed

Access Document

Files:
Publisher copy:
10.1371/journal.pone.0191175

Authors

Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-0698-0922
Role:
Author
ORCID:
0000-0002-8494-3692
Role:
Author
ORCID:
0000-0002-4004-8039


Publisher:
Public Library of Science
Journal:
PLoS ONE
Volume:
13
Issue:
1
Pages:
e0191175-e0191175
Publication date:
2018-01-18
DOI:
10.1371/journal.pone.0191175
EISSN:
1932-6203
ISSN:
1932-6203


Language:
English
Keywords:
Pubs id:
2364721
Local pid:
pubs:2364721
Source identifiers:
W2790518931
Deposit date:
2026-01-30
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
