Lately, the focus in developing object tracking algorithms
has mainly been on trackers that can adapt to the changing appearance of the object in an online setting, which makes them more robust. These trackers, however, are less suitable in a manipulation setup because of their low convergence. In my research I study high-DOF registration-based trackers that tackle this issue.
Benchmarks for Online Learned Trackers (OLT) test a tracker against varied challenges, with a major focus on evaluating the tracker's appearance model. Naturally, their videos are less relevant for testing registration-based trackers. The videos also lack the structure that would help researchers fine-tune their trackers. To this end, we (in collaboration with Xi Zhang) present the Tracking Manipulation Dataset (TMT).
Salient features of the dataset are:
Over 100 videos of daily tasks using different objects
Videos of natural motion
Videos recorded by both a human user and a 7-DOF WAM arm using a Barrett Hand
Publicly available ground truth data for all videos
Videos tagged with challenges they present
New evaluation methodology
Detailed evaluation of 6 trackers, with evaluation scripts made public
We report a new metric, "Speed Sensitivity". This metric models large object motion and compares how robustly trackers handle it. Speed Sensitivity is best represented as a 2D plot of success rate (the fraction of successfully tracked frames out of the total number of frames) against varying object motion. Object motion is quantised by the change in pose between two successive frames.
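As a rough illustration of how such a curve could be computed, the sketch below bins frames by ground-truth motion and reports the success rate in each bin. It is not the evaluation script released with the dataset. The function name, the four-corner pose representation, the use of mean corner displacement as the motion measure, and the pixel threshold for "success" are all assumptions made for this example.

```python
import numpy as np

def speed_sensitivity(gt_corners, tracked_corners, err_thresh=5.0, n_bins=4):
    """Success rate as a function of inter-frame object motion.

    gt_corners, tracked_corners: arrays of shape (T, 4, 2) holding the four
    corner points of the object region per frame (assumed pose format).
    Returns (bin_edges, success_rate_per_bin); empty bins are NaN.
    """
    gt = np.asarray(gt_corners, dtype=float)
    tr = np.asarray(tracked_corners, dtype=float)

    # Assumed error measure: mean Euclidean distance between matching corners.
    err = np.linalg.norm(tr - gt, axis=2).mean(axis=1)
    # A frame counts as successfully tracked if its error is below the threshold.
    # Frame 0 is skipped because it has no preceding frame to measure motion from.
    success = err[1:] < err_thresh

    # Assumed motion measure: mean ground-truth corner displacement
    # between two successive frames.
    motion = np.linalg.norm(gt[1:] - gt[:-1], axis=2).mean(axis=1)

    # Quantise motion into equal-width bins and average success per bin.
    edges = np.linspace(0.0, motion.max(), n_bins + 1)
    idx = np.digitize(motion, edges[1:-1])  # bin index in 0 .. n_bins - 1
    rates = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rates[b] = success[mask].mean()
    return edges, rates
```

Plotting `rates` against the bin centres gives the 2D curve described above; a tracker that handles large motion well keeps a high success rate in the rightmost bins.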
Graph Based Approximate Nearest Neighbour for Search in Tracking
Dick et al. (RSS, 2013) showed the use of an Approximate Nearest Neighbour (ANN) algorithm in a registration-based tracker.
They modelled the problem as a cascade of two trackers: a coarse ANN-based tracker, followed by precise alignment using
an inverse compositional tracker. The use of randomised KD-Trees for ANN allowed them to handle large motion.
That said, randomised KD-Trees do not take into account the sequential nature of video data. In this research
we adapt the Sequential Graph based Approximate Nearest Neighbour Search (SGANNS) algorithm as an alternative to KD-Tree based ANN.
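To give a feel for why a graph index suits sequential data, here is a minimal sketch of greedy nearest-neighbour search on a k-NN graph, where each query starts from the node returned for the previous query. This is a simplified illustration of the general idea, not the SGANNS algorithm itself. The function names are hypothetical, the graph is built by brute force, and the stored vectors stand in for whatever samples the search index holds.

```python
import numpy as np

def build_knn_graph(points, k):
    """Return an (N, k) array of each point's k nearest neighbours (brute force)."""
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def greedy_graph_search(points, graph, query, start):
    """Hill-climb over the graph towards the query, beginning at node `start`.

    Returns (index, squared_distance) of the local optimum that is reached.
    The result is approximate: the walk can stop at a node that is not the
    true nearest neighbour.
    """
    current = int(start)
    best = float(np.sum((points[current] - query) ** 2))
    while True:
        nbrs = graph[current]
        dists = ((points[nbrs] - query) ** 2).sum(axis=1)
        j = int(np.argmin(dists))
        if dists[j] >= best:  # no neighbour is closer: stop here
            return current, best
        current, best = int(nbrs[j]), float(dists[j])

def sequential_search(points, graph, queries, start=0):
    """Answer queries in order, seeding each search with the previous answer."""
    results = []
    for q in queries:
        start, _ = greedy_graph_search(points, graph, q, start)
        results.append(start)
    return results
```

When successive queries are close to each other, as consecutive video frames usually are, the walk from the previous answer tends to be short, which is the property a KD-Tree lookup does not exploit.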
We evaluate this new search method on the following criteria:
Compare Success Rate with other existing trackers
Compare with KD-Tree based ANN on success rate and speed
Compare Speed Sensitivity with other existing trackers
Results reported on all videos in TMT dataset
Detection based reinitialization of trackers
Trackers used in manipulation are often subject to occlusion. Since most registration-based trackers do not adapt to changes in the appearance
of the object, an additional detection system is used, which reinitializes the trackers.
The system:
Uses a model of the object, built from the positional information with which the user initialised the tracker
Detects tracker failure based on the residual error and reinitialises the tracker
Uses object detection based on SIFT feature matching together with RANSAC for reinitialization
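The control flow of such a system can be sketched as below. This is an illustrative outline under stated assumptions, not the implementation used in this work: the residual is taken to be the mean squared intensity difference between the template and the currently tracked patch, and `track_step`, `extract_patch`, `detect` and `max_residual` are hypothetical names. In the real system the `detect` callable would be the SIFT and RANSAC based object detector; here it is left as a pluggable function.

```python
import numpy as np

def residual_error(template, patch):
    """Mean squared intensity difference between template and current patch."""
    t = np.asarray(template, dtype=float)
    p = np.asarray(patch, dtype=float)
    return float(np.mean((t - p) ** 2))

def track_with_reinit(frames, template, init_state, track_step,
                      extract_patch, detect, max_residual):
    """Run a tracker and reinitialise it from a detector when it fails.

    track_step(frame, state) -> new state         (the registration tracker)
    extract_patch(frame, state) -> image patch    (region the state points at)
    detect(frame) -> state or None                (object detector, assumed)
    Returns (states, reinit_frames).
    """
    state = init_state
    states, reinit_frames = [], []
    for i, frame in enumerate(frames):
        state = track_step(frame, state)
        # A large residual means the tracked patch no longer matches the template.
        if residual_error(template, extract_patch(frame, state)) > max_residual:
            detected = detect(frame)
            if detected is not None:  # keep the old state if detection fails too
                state = detected
                reinit_frames.append(i)
        states.append(state)
    return states, reinit_frames
```

Keeping the detector behind a plain callable means the failure test and the reinitialisation policy can be tuned independently of how the object is actually detected.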
Ankush Roy, Xi Zhang, Nina Wolleb, Camilo Perez Quintero, Martin Jägersand, "Tracking Benchmark and Evaluation for
Manipulation Tasks", accepted at ICRA 2015, Seattle, USA
Ankush Roy, Xi Zhang, Martin Jägersand, "Technical Report: Tracking Benchmark and Evaluation for Manipulation Tasks"
References
Simon Baker and Iain Matthews. "Lucas-Kanade 20 years on: A unifying framework." International Journal of Computer Vision (IJCV), vol. 56, no. 3 (2004): pp. 221-255.
Selim Benhimane and Ezio Malis. "Real-time image-based tracking of planes using efficient second-order minimization." in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 1, 2004.
Ankush Roy, Xi Zhang, Nina Wolleb, Camilo Perez Quintero, and Martin Jägersand. "Tracking Benchmark and Evaluation for Manipulation Tasks." in IEEE International Conference on Robotics and Automation (ICRA), Seattle, USA, May 26-29, 2015.
David Ross, Jongwoo Lim, and Ming-Hsuan Yang. "Adaptive probabilistic visual tracking with incremental subspace update." in European Conference on Computer Vision (ECCV), Springer Berlin Heidelberg, 2004, pp. 470-482.
Chenglong Bao, Yi Wu, Haibin Ling, and Hui Ji. "Real time robust L1 tracker using accelerated proximal gradient approach." in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 1830-1837.
Travis Dick, Camilo Perez Quintero, Martin Jägersand, and Azad Shademan. "Realtime Registration-Based Tracking via Approximate Nearest Neighbour Search." in Robotics: Science and Systems (RSS), 2013.
Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas. "Tracking-learning-detection." IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 34, no. 7 (2012): pp. 1409-1422.
Xi Zhang, Abhineet Singh, and Martin Jägersand. "RKLT: 8 DOF real-time robust video tracking combining coarse RANSAC features and accurate fast template registration." in Canadian Conference on Robot Vision (CRV), Halifax, Canada, 2015.
Matej Kristan, Roman Pflugfelder, Aleš Leonardis, Jiri Matas, Luka Čehovin, Georg Nebehay, Tomáš Vojíř et al. "The Visual Object Tracking VOT2014 challenge results." in European Conference on Computer Vision (ECCV) 2014 Workshops, Springer International Publishing, 2014, pp. 191-217.
Kiana Hajebi and Hong Zhang. "An Efficient Index for Visual Search in Appearance-based SLAM." in IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, May 31-June 5, 2014.