Conference item
The truth about cats and dogs
- Abstract:
- Template-based object detectors such as the deformable parts model of Felzenszwalb et al. [11] achieve state-of-the-art performance for a variety of object categories, but are still outperformed by simpler bag-of-words models for highly flexible objects such as cats and dogs. In these cases we propose to use the template-based model to detect a distinctive part for the class, followed by detecting the rest of the object via segmentation on image specific information learnt from that part. This approach is motivated by two observations: (i) many object classes contain distinctive parts that can be detected very reliably by template-based detectors, whilst the entire object cannot; (ii) many classes (e.g. animals) have fairly homogeneous coloring and texture that can be used to segment the object once a sample is provided in an image. We show quantitatively that our method substantially outperforms whole-body template-based detectors for these highly deformable object categories, and indeed achieves accuracy comparable to the state-of-the-art on the PASCAL VOC competition, which includes other models such as bag-of-words. © 2011 IEEE.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher copy:
- 10.1109/ICCV.2011.6126398
Authors
- Publisher:
- IEEE
- Host title:
- 2011 International Conference on Computer Vision
- Pages:
- 1427-1434
- Publication date:
- 2012-01-12
- Event title:
- International Conference on Computer Vision (ICCV 2011)
- Event location:
- Barcelona, Spain
- Event start date:
- 2011-11-06
- Event end date:
- 2011-11-13
- DOI:
- 10.1109/ICCV.2011.6126398
- EISSN:
- 2380-7504
- ISSN:
- 1550-5499
- EISBN:
- 978-1-4577-1102-2
- ISBN:
- 978-1-4577-1101-5
- Language:
- English
- Keywords:
- Pubs id:
- pubs:314498
- UUID:
- uuid:bc56fd8f-c89e-4bc0-bf14-344b3c1df9d1
- Local pid:
- pubs:314498
- Source identifiers:
- 314498
- Deposit date:
- 2012-12-19
- ARK identifier:
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2011
- Rights statement:
- © 2011 IEEE.
- Notes:
- This paper was presented at the International Conference on Computer Vision (ICCV 2011), 6th-13th November 2011, Barcelona, Spain. This is the accepted manuscript version of the article. The final version is available online from IEEE at https://dx.doi.org/10.1109/ICCV.2011.6126398