
Recently, the focus in developing object tracking algorithms has mainly been on trackers that can adapt to the changing appearance of the object in an online setting, making them more robust. These trackers, however, are less suitable for manipulation setups because of their slow convergence. In my research I study high-DOF registration-based trackers that tackle this issue.

Thesis Draft

TMT - Tracking Manipulation Tasks Dataset

Benchmark methods for Online Learned Trackers (OLTs) test a tracker against varied challenges, with a major focus on evaluating the tracker's appearance model. Naturally, these videos are less relevant for testing registration-based trackers. The videos also lack the structure that would help researchers fine-tune their trackers. To this effect, we (in collaboration with Xi Zhang) present the Tracking Manipulation Tasks (TMT) dataset.

Salient features of the dataset are:



Fig1. Sample frames from a video recorded by a human user [LEFT] and by a robot hand [RIGHT], with trackers [Baker-Matthews Inverse Compositional (BMIC), Efficient Second Order Minimization (ESM), and a cascade tracker combining kd-tree based Approximate Nearest Neighbour search with BMIC (NNBMIC)] tracking the target.



We report a new metric, "Speed Sensitivity", which models large object motion and compares how robust trackers are in handling it. Speed Sensitivity is best represented as a 2D plot of success rate (fraction of successfully tracked frames out of total frames) against varying object motion, where object motion is quantified by the change in pose between two successive frames.
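
As a concrete illustration, the sketch below computes a speed-sensitivity curve from per-frame alignment errors and interframe motion values. The array names, the error measure, and the binning scheme are assumptions for illustration, not the exact procedure used to produce Fig2.

import numpy as np

def speed_sensitivity(errors, interframe_motion, threshold=5.0, n_bins=10):
    """Success rate vs. object speed: bin frames by interframe motion and
    report, per bin, the fraction of frames whose alignment error is below
    `threshold` pixels."""
    errors = np.asarray(errors, dtype=float)
    motion = np.asarray(interframe_motion, dtype=float)
    bins = np.linspace(motion.min(), motion.max(), n_bins + 1)
    idx = np.clip(np.digitize(motion, bins) - 1, 0, n_bins - 1)
    centers = 0.5 * (bins[:-1] + bins[1:])
    rates = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            rates[b] = np.mean(errors[in_bin] <= threshold)  # success rate in this speed bin
    return centers, rates

Plotting the returned rates against the bin centers gives a curve of the kind shown in Fig2 [RIGHT].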

Fig2. The graphs show how the trackers perform (Success Rate = number of frames tracked within a set threshold / total number of frames) on [LEFT] simulated large object motion on the standard "Lena.jpg" image, generated by sampling warps from a Gaussian with varying sigma, and [RIGHT] actual interframe motion, quantified by how much the object moved between two subsequent frames. Computing success against actual interframe motion shows that GNN-based search tracks larger object motion than other registration-based trackers. The threshold for a successfully tracked frame was set to 5 pixels.
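
For the simulated case in Fig2 [LEFT], one minimal way of generating such motions is sketched below, assuming the warp is an 8-DOF homography obtained by perturbing the four template corners with Gaussian noise of standard deviation sigma; the exact warp parameterisation used for the figure may differ.

import numpy as np
import cv2

def sample_warp(corners, sigma):
    """Perturb the four template corners with zero-mean Gaussian noise of
    std `sigma` (pixels) and fit the 8-DOF homography between them."""
    corners = np.asarray(corners, dtype=np.float64).reshape(4, 2)
    perturbed = corners + np.random.normal(0.0, sigma, corners.shape)
    H, _ = cv2.findHomography(corners, perturbed)
    return H, perturbed

def warp_image(image, H):
    """Apply the sampled warp to the full image (e.g. "Lena.jpg")."""
    h, w = image.shape[:2]
    return cv2.warpPerspective(image, H, (w, h))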

Graph-Based Approximate Nearest Neighbour Search in Tracking


Dick et al. (RSS, 2013) showed the use of an Approximate Nearest Neighbour (ANN) algorithm in a registration-based tracker. They modelled the problem as a cascade of two trackers: a coarse ANN-based tracker followed by precise alignment using an inverse compositional tracker. The use of randomised kd-trees for ANN allowed them to handle large motion.
That said, randomised kd-trees do not take the sequential nature of video data into account. In this research we adapt a Sequential Graph-based Approximate Nearest Neighbour Search (SGNNS) algorithm as an alternative to kd-tree based ANN.
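
A minimal sketch of the graph-based search idea follows: sampled template appearances are connected in a k-NN graph built offline, and at runtime a greedy hill-climbing search moves towards the node closest to the patch extracted from the current frame; exploiting the sequential property simply means starting the search from the node returned for the previous frame. The sample representation and graph construction here are simplified assumptions, not the exact SGNNS implementation.

import numpy as np

def build_knn_graph(samples, k=10):
    """samples: (N, D) array, one flattened patch per sampled warp.
    Returns, for each node, the indices of its k nearest neighbours."""
    dists = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    return np.argsort(dists, axis=1)[:, :k]

def gnn_search(samples, graph, query, start):
    """Greedy graph search: hill-climb from `start` towards the sample
    closest to `query` (the patch from the current frame)."""
    best = start
    best_dist = np.linalg.norm(samples[best] - query)
    improved = True
    while improved:
        improved = False
        for n in graph[best]:
            d = np.linalg.norm(samples[n] - query)
            if d < best_dist:
                best, best_dist, improved = n, d, True
    return best  # coarse estimate; its warp is then refined by the inverse compositional tracker

In the cascade, the warp associated with the returned sample serves as the coarse estimate that the precise registration step refines.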

Fig3. A GNN-based search method using two distance measures (Euclidean, GNNL2, and Manhattan, GNNL1) is compared with existing registration-based trackers, including RKLT.

We evaluate this new search method on the following criteria:

Fig4. Comparison of the SGNNS algorithm with randomised kd-trees.

Detection-based reinitialization of trackers

Trackers used in manipulation are often subject to occlusion. Since most registration-based trackers do not adapt to changes in the appearance of the object, an additional detection system is used which reinitializes the trackers; a minimal sketch of such a loop follows Fig5 below.
The system:


Fig5. Sample frames where the tracker was reinitialised. The tracker here is NNIC.
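
Below is a minimal sketch of such a detection-based reinitialization loop; the tracker and detector interfaces (update, detect, initialize) and the confidence test are hypothetical names used only for illustration.

def track_with_reinit(frames, tracker, detector, conf_threshold=0.6):
    """Run a registration-based tracker and fall back to a detector
    whenever tracking confidence drops (e.g. under occlusion)."""
    poses = []
    for frame in frames:
        pose, confidence = tracker.update(frame)        # registration-based update
        if confidence < conf_threshold:                  # likely occlusion or drift
            detection = detector.detect(frame)           # appearance-based detector
            if detection is not None:
                tracker.initialize(frame, detection)     # re-seed template and pose
                pose, confidence = tracker.update(frame)
        poses.append(pose)
    return poses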

Further Reading


Related Publications and Reports

Ankush Roy, Xi Zhang, Nina Wolleb, Camilo Perez and Martin Jägersand, "Tracking Benchmark and Evaluation for Manipulation Tasks", accepted at ICRA 2015, Seattle, USA.

Ankush Roy, Xi Zhang and Martin Jägersand, "Technical Report: Tracking Benchmark and Evaluation for Manipulation Tasks".


References


Simon Baker and Iain Matthews, "Lucas-Kanade 20 years on: A unifying framework," International Journal of Computer Vision (IJCV), vol. 56, no. 3, 2004, pp. 221-255.
Selim Benhimane and Ezio Malis, "Real-time image-based tracking of planes using efficient second-order minimization," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004, vol. 1.
Ankush Roy, Xi Zhang, Nina Wolleb, Camilo Perez Quintero and Martin Jägersand, "Tracking Benchmark and Evaluation for Manipulation Tasks," in IEEE International Conference on Robotics and Automation (ICRA), Seattle, USA, May 26-29, 2015.
David Ross, Jongwoo Lim and Ming-Hsuan Yang, "Adaptive probabilistic visual tracking with incremental subspace update," in European Conference on Computer Vision (ECCV), Springer, 2004, pp. 470-482.
Chenglong Bao, Yi Wu, Haibin Ling and Hui Ji, "Real time robust L1 tracker using accelerated proximal gradient approach," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 1830-1837.
Travis Dick, Camilo Perez Quintero, Martin Jägersand and Azad Shademan, "Realtime Registration-Based Tracking via Approximate Nearest Neighbour Search," in Robotics: Science and Systems (RSS), 2013.
Zdenek Kalal, Krystian Mikolajczyk and Jiri Matas, "Tracking-learning-detection," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 34, no. 7, 2012, pp. 1409-1422.
Xi Zhang, Abhineet Singh and Martin Jägersand, "RKLT: 8 DOF real-time robust video tracking combining coarse RANSAC features and accurate fast template registration," in Canadian Conference on Computer and Robot Vision (CRV), Halifax, Canada, 2015.
Matej Kristan, Roman Pflugfelder, Aleš Leonardis, Jiri Matas, Luka Čehovin, Georg Nebehay, Tomáš Vojíř et al., "The visual object tracking VOT2014 challenge results," in European Conference on Computer Vision (ECCV) Workshops, Springer, 2014, pp. 191-217.
Kiana Hajebi and Hong Zhang, "An Efficient Index for Visual Search in Appearance-based SLAM," in IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 31-June 5, 2014.