Counting in the wild

Abstract:: In this paper we explore the scenario of learning to count multiple instances of objects from images that have been dot-annotated through crowdsourcing. Specifically, we work with a large and challenging image dataset of penguins in the wild, for which tens of thousands of volunteer annotators have placed dots on instances of penguins in tens of thousands of images. The dataset, introduced and released with this paper, shows such a high degree of object occlusion and scale variation that individual object detection or simple counting-density estimation is not able to estimate the bird counts reliably. To address the challenging counting task, we augment and interleave density estimation with foreground-background segmentation and explicit local uncertainty estimation. The three tasks are solved jointly by a new deep multi-task architecture. Using this multi-task learning, we show that the spread between the annotators can provide hints about local object scale and aid the foreground-background segmentation, which can then be used to set a better target density for learning density prediction. Considerable improvements in counting accuracy over a single-task density estimation approach are observed in our experiments.

Files:: arteta16.pdf

(Accepted manuscript, pdf, 4.9MB)

Copyright holder:: Springer International Publishing AG
Notes:: This is an accepted manuscript of a book chapter published by Springer in ECCV 2016: Computer Vision – ECCV 2016 on 2016-09-16, available online: http://dx.doi.org/10.1007/978-3-319-46478-7_30

Licence:: Terms and Conditions of Use for Oxford University Research Archive

Conference item