Journal article

On Markov chain Monte Carlo Methods for Tall Data

Abstract:
Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number n of individual data points, also known as tall datasets. In scenarios where data are assumed independent, various approaches to scale up the Metropolis-Hastings algorithm in a Bayesian inference context have recently been proposed in machine learning and computational statistics. These approaches can be grouped into two categories: divide-and-conquer approaches and subsampling-based algorithms. The aims of this article are as follows. First, we present a comprehensive review of the existing literature, commenting on the underlying assumptions and theoretical guarantees of each method. Second, by leveraging our understanding of these limitations, we propose an original subsampling-based approach relying on a control variate method which samples, under regularity conditions, from a distribution provably close to the posterior distribution of interest, yet can require fewer than O(n) data point likelihood evaluations at each iteration for certain statistical models in favourable scenarios. Finally, we emphasize that we have so far only been able to propose subsampling-based methods which display good performance in scenarios where the Bernstein-von Mises approximation of the target posterior distribution is excellent. It remains an open challenge to develop such methods in scenarios where the Bernstein-von Mises approximation is poor.
Publication status:
Published
Peer review status:
Peer reviewed

Authors


Institution:
University of Oxford
Oxford college:
Hertford College
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Role:
Author


Funding agency for:
Holmes, C
Grant:
MC UP A390 1107
Funding agency for:
Bardenet, R
Grant:
ANR-16-CE23-0003
Funding agency for:
Bardenet, R
Doucet, A
Holmes, C
Grant:
ANR-16-CE23-0003
EP/K000276/1
MC UP A390 1107


Publisher:
Journal of Machine Learning Research
Journal:
Journal of Machine Learning Research
Volume:
18
Issue:
47
Pages:
1-43
Publication date:
2017-05-01
Acceptance date:
2017-03-08
EISSN:
1533-7928
ISSN:
1532-4435


Pubs id:
pubs:686656
UUID:
uuid:a148cceb-11f6-4e35-b8c7-60883621624f
Local pid:
pubs:686656
Source identifiers:
686656
Deposit date:
2017-03-22
