Journal article

Constructive visual analytics for text similarity detection

Abstract:
Detecting similarity between texts is a frequently encountered text mining task. Because the measurement of similarity is typically composed of a number of metrics, and some measures are sensitive to subjective interpretation, a generic detector obtained using machine learning often has difficulties balancing the roles of different metrics according to the semantic context exhibited in a specific collection of texts. In order to facilitate human interaction in a visual analytics process for text similarity detection, we first map the problem of pairwise sequence comparison to that of image processing, allowing patterns of similarity to be visualized as a 2D pixelmap. We then devise a visual interface to enable users to construct and experiment with different detectors using primitive metrics, in a way similar to constructing an image processing pipeline. We deployed this new approach for the identification of commonplaces in 18th-century literary and print culture. Domain experts were then able to make use of the prototype system to derive new scholarly discoveries and generate new hypotheses.
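As a rough illustration of the idea described in the abstract, the sketch below maps a pairwise comparison of two token sequences onto a binary 2D pixelmap and then applies one filter to it, in the manner of a small image processing pipeline. It is not the authors' implementation: the function names (`pixelmap`, `diagonal_filter`, `shared_runs`), the exact-match metric, and the run length `k` are all assumptions chosen for illustration.

```python
def pixelmap(a, b):
    """Binary similarity map: cell (i, j) is 1 when token a[i] equals token b[j].

    Exact token equality is an assumed stand-in for a primitive metric.
    """
    return [[1 if x == y else 0 for y in b] for x in a]


def diagonal_filter(pm, k):
    """Keep pixel (i, j) only if it starts a run of k matches along the diagonal."""
    rows = len(pm)
    cols = len(pm[0]) if pm else 0
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows - k + 1):
        for j in range(cols - k + 1):
            if all(pm[i + d][j + d] for d in range(k)):
                out[i][j] = 1
    return out


def shared_runs(a, b, k=3):
    """Pipeline: build the pixelmap, filter it, return the surviving start positions."""
    filtered = diagonal_filter(pixelmap(a, b), k)
    return [(i, j) for i, row in enumerate(filtered) for j, v in enumerate(row) if v]


if __name__ == "__main__":
    a = "to be or not to be that is the question".split()
    b = "whether to be or not is the question".split()
    print(shared_runs(a, b, k=3))
```

Each returned pair gives the positions in the two sequences where a run of `k` matching tokens begins. Swapping in a different metric or adding further filters would correspond to experimenting with a different detector.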
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1111/cgf.12798

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Sub department:
Oxford e-Research Centre
Role:
Author


Publisher:
Wiley
Journal:
Computer Graphics Forum
Publication date:
2016-02-19
Acceptance date:
2015-12-17
DOI:
10.1111/cgf.12798
EISSN:
1467-8659
ISSN:
0167-7055
