Journal article
ENFrame: a framework for processing probabilistic data
- Abstract:
-
This article introduces ENFrame, a framework for processing probabilistic data. Using ENFrame, users can write programs in a fragment of Python with constructs such as loops, list comprehension, aggregate operations on lists, and calls to external database engines. Programs are then interpreted probabilistically by ENFrame. We exemplify ENFrame on three clustering algorithms (k-means, k-medoids, and Markov clustering) and one classification algorithm (k-nearest-neighbour).
A key component of ENFrame is an event language to succinctly encode correlations, trace the computation of user programs, and allow for computation of discrete probability distributions for program variables. We propose a family of sequential and concurrent, exact, and approximate algorithms for computing the probability of interconnected events. Experiments with k-medoids clustering and k-nearest-neighbour show orders-of-magnitude improvements of exact processing using ENFrame over naïve processing in each possible world, of approximate over exact, and of concurrent over sequential processing.
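To make the baseline mentioned above concrete, the following minimal sketch shows what "naïve processing in each possible world" can look like for a single event. It is an illustration only and not ENFrame's API or event language: the function `event_probability`, the variable names `x1` to `x3`, and the assumption that the input variables are independent Boolean random variables are all hypothetical choices made for this example.

```python
from itertools import product


def event_probability(probs, event):
    """Naively compute P(event) by enumerating every possible world.

    probs maps a (hypothetical) variable name to the probability that the
    variable is true; variables are ASSUMED independent for this sketch.
    event is a predicate over one world, given as a dict name -> bool.
    """
    names = sorted(probs)
    total = 0.0
    # One iteration per possible world: 2 ** len(names) worlds in total.
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        # Probability of this world under the independence assumption.
        weight = 1.0
        for name in names:
            weight *= probs[name] if world[name] else 1.0 - probs[name]
        if event(world):
            total += weight
    return total


# Hypothetical input: three uncertain data points, each present with some probability.
probs = {"x1": 0.5, "x2": 0.5, "x3": 0.25}

# Event: at least two of the three data points are present.
p = event_probability(probs, lambda world: sum(world.values()) >= 2)
print(p)
```

The loop visits 2^n worlds for n variables, so its cost grows exponentially with the size of the input. That is the cost the abstract's exact and approximate event-based algorithms are reported to improve on by orders of magnitude.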
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher copy:
- 10.1145/2877205
Authors
- Publisher:
- Association for Computing Machinery
- Journal:
- ACM Transactions on Database Systems
- Volume:
- 41
- Issue:
- 1
- Article number:
- 3
- Publication date:
- 2016-03-18
- Acceptance date:
- 2015-12-01
- DOI:
- 10.1145/2877205
- EISSN:
- 1557-4644
- ISSN:
- 0362-5915
- Language:
- English
- Pubs id:
- pubs:609202
- UUID:
- uuid:10f1d59b-d5f3-4a71-a19d-0f27c1140a59
- Local pid:
- pubs:609202
- Source identifiers:
- 609202
- Deposit date:
- 2016-03-10
Terms of use
- Copyright holder:
- ACM
- Copyright date:
- 2016
- Rights statement:
- Copyright © 2016 ACM.
- Notes:
- This is the author accepted manuscript version of the article. The final version is available online from ACM at: http://dx.doi.org/10.1145/2877205