Thesis

Implicit shape representation for 2D/3D tracking and reconstruction

Abstract:

This thesis develops and describes methods for real-time tracking, segmentation and 3-dimensional (3D) model acquisition, in the context of developing games for stroke patients who are rehabilitating at home. Real-time tracking and reconstruction of a stroke patient's feet, hands and the control objects that they are touching not only enables the graphical visualization of the virtual avatar in the rehabilitation games, but also permits measurement of the patient's performance.

Depth or combined colour and depth imagery from a Kinect sensor is used as input data. The 3D signed distance function (SDF) is used as the implicit shape representation, and a series of probabilistic graphical models are developed for the problems of model-based 3D tracking, simultaneous 3D tracking and reconstruction, and 3D tracking of multiple objects with identical appearance. The work is based on the assumption that the observed imagery is generated jointly by the pose(s) and the shape(s). The depth of each pixel is randomly and independently sampled from the likelihood of the pose(s) and the shape(s). The pose tracking and 3D shape reconstruction problems are then cast as the maximum likelihood (ML) or maximum a posteriori (MAP) estimate of the pose(s) or 3D shape.
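
Under the per-pixel independence assumption, the estimation problem can be written compactly. The following is only a sketch in notation assumed here rather than taken from the thesis: $\Omega$ is the set of pixels, $d_i$ the observed depth at pixel $i$, $p$ the pose(s), $\Phi$ the SDF shape(s), and $P(p)$ a prior over poses. For reconstruction, the same form would apply with $\Phi$ as the unknown.

```latex
P(\{d_i\}_{i \in \Omega} \mid p, \Phi) = \prod_{i \in \Omega} P(d_i \mid p, \Phi)

\hat{p}_{\mathrm{ML}} = \arg\max_{p} \sum_{i \in \Omega} \log P(d_i \mid p, \Phi)

\hat{p}_{\mathrm{MAP}} = \arg\max_{p} \Big[ \log P(p) + \sum_{i \in \Omega} \log P(d_i \mid p, \Phi) \Big]
```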

This methodology first leads to a novel probabilistic model for tracking rigid 3D objects with only depth data. For a known 3D shape, optimization aims to find the optimal pose that back-projects all object region pixels onto the zero level set of the 3D shape, thus effectively maximising the likelihood of the pose. The method is extended to consider colour information for more robust tracking in the presence of outliers and occlusions. Initialised with a coarse 3D model, the extended method is also able to simultaneously reconstruct and track an unknown 3D object in real time. Finally, the concept of 'shape union' is introduced to solve the problem of tracking multiple 3D objects with identical appearance. The shape union is formulated as the minimum value of all SDFs in camera coordinates, which (i) leads to a per-pixel soft membership weight for each object, thus providing an elegant solution for the data association in multi-target tracking, and (ii) allows probabilistic physical constraints that avoid collisions between objects to be naturally enforced.
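
The shape union idea can be illustrated with a minimal NumPy sketch. It assumes two spheres as stand-in objects and uses a softmin over the per-object SDF values as the soft membership weight. The function names, the `temperature` parameter and the softmin form are assumptions made for illustration, not the formulation derived in the thesis.

```python
import numpy as np

def sphere_sdf(points, centre, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return np.linalg.norm(points - centre, axis=-1) - radius

def shape_union(sdf_values):
    """Union of shapes as the element-wise minimum over per-object SDFs (axis 0)."""
    return np.min(sdf_values, axis=0)

def soft_membership(sdf_values, temperature=0.05):
    """Softmin weights over objects (illustrative assumption, not the thesis formula)."""
    scores = -sdf_values / temperature
    scores -= scores.max(axis=0, keepdims=True)  # subtract the max for numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=0, keepdims=True)

# Three 3D points in camera coordinates (hypothetical values).
points = np.array([[0.0, 0.0, 1.0], [0.5, 0.0, 1.0], [1.0, 0.0, 1.0]])

# One row of SDF values per object, one column per point.
sdfs = np.stack([
    sphere_sdf(points, np.array([0.0, 0.0, 1.0]), 0.2),
    sphere_sdf(points, np.array([1.0, 0.0, 1.0]), 0.2),
])

union = shape_union(sdfs)        # SDF of the combined scene
weights = soft_membership(sdfs)  # each column sums to one
```

A point lying inside one sphere receives almost all of its weight from that object, while the point midway between the two spheres is shared equally, which is the behaviour a soft data association needs.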

The thesis also explores the possibility of using implicit shape representation for online shape learning. We use the harmonics of the 2D discrete cosine transform (DCT) to represent 2D shapes. Low frequency harmonics are decoupled from high frequency ones, so that the former represent the coarse information and the latter the details of the 2D shape. A regression model relating the high and low frequency harmonics is learnt online using Locally Weighted Projection Regression (LWPR). We demonstrate that the learned regression model is able to detect occlusions and recover the complete shape from an occluded one.
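
The split into low and high frequency harmonics can be sketched as follows, assuming the 2D shape is stored as a signed distance image and that the low band is the top-left `n_low` x `n_low` block of DCT coefficients. The helper names, the image size and the band size are hypothetical, and the LWPR regression step is not shown.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k is the k-th harmonic)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def circle_sdf(size, centre, radius):
    """Signed distance image of a circle, used here as a stand-in 2D shape."""
    ys, xs = np.mgrid[0:size, 0:size]
    return np.hypot(xs - centre[0], ys - centre[1]) - radius

size, n_low = 64, 8  # hypothetical image size and low-frequency band size
D = dct_matrix(size)
sdf = circle_sdf(size, (32, 32), 15)

coeffs = D @ sdf @ D.T  # forward 2D DCT

# Decouple the harmonics: low frequencies carry the coarse shape,
# the remaining coefficients carry the details.
low = np.zeros_like(coeffs)
low[:n_low, :n_low] = coeffs[:n_low, :n_low]
high = coeffs - low

coarse = D.T @ low @ D   # coarse shape from the low harmonics only
full = D.T @ coeffs @ D  # exact reconstruction from all harmonics
```

Because the basis is orthonormal, the error of the coarse reconstruction equals the energy held in the high frequency coefficients, which is the part a regression from low to high harmonics would have to predict.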

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Research group:
Active Vision Group
Oxford college:
Oriel College
Role:
Author

Contributors

Division:
MPLS
Department:
Engineering Science
Role:
Supervisor
Division:
MPLS
Department:
Engineering Science
Role:
Supervisor


Funding agency for:
Ren, Y


Publication date:
2014
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford


Language:
English
UUID:
uuid:c70dc663-ee7c-4100-b492-3a85bf8640d1
Local pid:
ora:10607
Deposit date:
2015-03-16
