Journal article

A Hierarchical Bayesian Language Model based on Pitman-Yor Processes

Abstract:

We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor processes which produce power-law distributions more closely resembling those in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that...
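
The abstract's claim that Pitman-Yor processes yield power-law behaviour can be illustrated with a small simulation of the Chinese restaurant process representation. This is a minimal sketch and not the paper's implementation: the function name `pitman_yor_crp` and its parameters `discount` and `strength` are illustrative assumptions. Setting `discount` to 0 gives the ordinary Dirichlet-process case, where the number of occupied tables grows only logarithmically. A positive discount makes the table count grow polynomially in the number of customers.

```python
import random


def pitman_yor_crp(n, discount, strength, seed=0):
    """Seat n customers with a Pitman-Yor Chinese restaurant process.

    Returns a list of table sizes. Assumes 0 <= discount < 1 and
    strength > -discount (the usual parameter constraints).
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers at table k
    for i in range(n):  # i customers are already seated
        if not tables:
            tables.append(1)  # the first customer always opens a table
            continue
        # Probability of opening a new table given the current seating.
        p_new = (strength + discount * len(tables)) / (strength + i)
        if rng.random() < p_new:
            tables.append(1)
        else:
            # Join existing table k with weight (size_k - discount).
            weights = [c - discount for c in tables]
            k = rng.choices(range(len(tables)), weights=weights)[0]
            tables[k] += 1
    return tables


if __name__ == "__main__":
    for d in (0.0, 0.8):
        sizes = pitman_yor_crp(2000, discount=d, strength=1.0)
        print(f"discount={d}: {len(sizes)} tables, largest={max(sizes)}")
```

In a language-modelling reading, each table loosely corresponds to a distinct word type and each customer to a token, so a positive discount produces many rare types alongside a few frequent ones.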

Publication status: Published

Authors
Institution: University of Oxford
Department: Oxford, MPLS, Statistics
Role: Author

Journal: COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE
Volume: 1
Pages: 985-992
Publication date: 2006-01-01
URN: uuid:5fc3cc33-dca0-4179-805f-9226c263e2cb
Source identifiers: 353269
Local pid: pubs:353269
Language: English