Thesis icon

Thesis

Category-theoretic quantitative compositional distributional models of natural language semantics

Abstract:

This thesis is about the problem of compositionality in distributional semantics. Distributional semantics presupposes that the meanings of words are a function of their occurrences in textual contexts. It models words as distributions over these contexts and represents them as vectors in high dimensional spaces. The problem of compositionality for such models concerns itself with how to produce distributional representations for larger units of text (such as a verb and its arguments) by composing the distributional representations of smaller units of text (such as individual words).

This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations. This thesis shows how this approach can be theoretically extended and practically implemented to produce concrete compositional distributional models of natural language semantics. It furthermore demonstrates that such models can perform on par with, or better than, other competing approaches in the field of natural language processing.

There are three principal contributions to computational linguistics in this thesis. The first is to extend the DisCoCat framework on the syntactic front and semantic front, incorporating a number of syntactic analysis formalisms and providing learning procedures allowing for the generation of concrete compositional distributional models. The second contribution is to evaluate the models developed from the procedures presented here, showing that they outperform other compositional distributional models present in the literature. The third contribution is to show how using category theory to solve linguistic problems forms a sound basis for research, illustrated by examples of work on this topic, that also suggest directions for future research.

Actions


Access Document


Files:

Authors


More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Research group:
Quantum group; Computational Linguistics Group
Oxford college:
Balliol College
Role:
Author
More by this author
Division:
MPLS
Department:
Computer Science
Role:
Author

Contributors

Division:
MPLS
Department:
Computer Science
Role:
Supervisor
Division:
MPLS
Department:
Computer Science
Role:
Supervisor
Division:
MPLS
Department:
Computer Science
Role:
Supervisor


Publication date:
2013
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
Oxford University, UK


Language:
English
Keywords:
Subjects:
UUID:
uuid:d7f9433b-24c0-4fb5-925b-d8b3744b7012
Local pid:
ora:8598
Deposit date:
2014-06-13

Terms of use



Views and Downloads






If you are the owner of this record, you can report an update to it here: Report update to this record

TO TOP