A Highly Efficient and Extensible Library for Registration based Visual Tracking
✔ Fully modular implementation that is easy to extend
✔ Based on Eigen for speed and simplicity
✔ Python and MATLAB interfaces for ease of use
✔ Ability to run as a standalone library
✔ Seamless integration with ROS
✔ Cross platform support
Fast and high precision visual tracking is crucial to the success of several robotics and virtual reality applications such as SLAM, autonomous navigation and visual servoing. In recent years, online learning and detection based trackers have become more popular in the vision community because their robustness to changes in the object's appearance makes them better suited to long term tracking. However, they are often unsuitable for the aforementioned applications for two reasons. Firstly, they are too slow for real time execution of tasks where multiple trackers have to run simultaneously, or where tracking is only a small part of a larger system whose more computationally intensive modules use its result to make higher level deductions about the environment. Secondly, they are not precise enough to give the exact object pose with the sub pixel alignment required for these tasks, being usually limited to estimating simple transformations of the target patch such as translation and scaling. Registration based trackers are therefore more suitable for these applications, since they are several times faster and can estimate more complex transformations such as affine and homography.
Though several major advances have been made in this domain since the original Lucas Kanade tracker was introduced almost thirty five years ago, efficient open source implementations of recent trackers are surprisingly difficult to find. In fact, the only such tracker offered by the popular OpenCV library uses a pyramidal implementation of the original algorithm. In the absence of good open source implementations of modern trackers, most robotics and VR research groups either use these outdated trackers or implement their own custom trackers. These, in turn, are often not made publicly available or are tailored to very specific needs, and so require significant reprogramming to be useful for an unrelated project. To address this need, we introduce the Modular Tracking Framework (MTF), a generic system for registration based tracking that provides highly efficient implementations of a large subset of the trackers introduced in the literature to date and is designed to be easily extensible with additional methods.
Each tracker within this framework comprises the following 3 modules:
State Space Model (SSM): Spline (50+ DOF), TPS (50+ DOF), Homography (8 DOF), Affine (6 DOF), ASRT (Anisotropic Scaling, Rotation and Translation - 5 DOF), Similitude (4 DOF), AST (Anisotropic Scaling and Translation - 4 DOF), Isometry (3 DOF), IST (Isotropic Scaling and Translation - 3 DOF) or Translation (2 DOF)
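To make the degrees of freedom listed for the state space models more concrete, the following is a minimal sketch in plain Python of how a few of them can be written as 3x3 warp matrices acting on homogeneous image coordinates. The function names and the particular parameterizations are textbook choices made for illustration only; they are assumptions and not MTF's API or its internal parameterization.

```python
import math

# Illustrative only: textbook parameterizations, not MTF's own classes.

def translation(tx, ty):
    """2 DOF: shift the patch by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def isometry(tx, ty, theta):
    """3 DOF: rotation by theta (radians) followed by translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def similitude(tx, ty, theta, scale):
    """4 DOF: isotropic scaling, rotation and translation."""
    c, s = scale * math.cos(theta), scale * math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def affine(a11, a12, a21, a22, tx, ty):
    """6 DOF: arbitrary 2x2 linear map plus translation."""
    return [[a11, a12, tx], [a21, a22, ty], [0.0, 0.0, 1.0]]

def homography(h):
    """8 DOF: h holds the first eight matrix entries in row-major
    order; the last entry is fixed to 1 to remove the scale ambiguity."""
    assert len(h) == 8
    return [list(h[0:3]), list(h[3:6]), [h[6], h[7], 1.0]]

def warp_point(H, x, y):
    """Apply the 3x3 warp H to the point (x, y) and dehomogenize."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

if __name__ == "__main__":
    # Warp the corners of a unit square with a 4 DOF similitude.
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    H = similitude(tx=10.0, ty=5.0, theta=math.pi / 6, scale=2.0)
    for x, y in corners:
        print((x, y), "->", warp_point(H, x, y))
```

Each model in this sketch is a special case of the one below it, which is why the higher DOF models can describe more complex motions of the target patch at the cost of a larger search space.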
Please refer to the IROS 2017 paper listed below for more details on the system design and to the CRV 2016 paper for some preliminary results.
A newer results paper was also published at WACV 2017.
Finally, the complete thesis based on this framework is available here for readers who need even more detail. It can also serve as the official documentation until the Doxygen version is completed.
A complete list of related papers is also given below.
The library is implemented entirely in C++ though interfaces for Python and MATLAB are also provided to aid its use in research applications.
A simple interface for ROS is likewise provided for seamless integration with robotics projects.
Finally, MTF comes bundled with several state of the art learning and detection based trackers whose C++ implementations are publicly available, including DSST, KCF, CMT, TLD, RCT, MIL, Struck, FragTrack, GOTURN and DFT. As a result, combined with the datasets provided below, MTF can also serve as a test bed for general purpose tracking. We are always looking to add more such trackers to MTF, so please let us know if there is a tracker with an open source C++ implementation that you would like to see integrated.
MTF supports both Unix and Windows platforms. Though it has been tested comprehensively only under Linux, specifically Ubuntu 14.04, it should work on Macintosh systems too. The Windows build system is in its early stages and needs some manual setting of variables but it is quite usable and we are working on making it more user friendly.
MTF is provided under the BSD license and so is free for research and commercial applications. We do request, however, that this paper be cited by any publications resulting from projects that use MTF so that more people can get to know about it and benefit from it.
Following is an extended version of the IROS supplementary video showing several usage examples:
Abhineet Singh and Martin Jagersand, "Modular Tracking Framework: A Fast Library for High Precision Tracking", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 [pdf][video]
Xuebin Qin, Shida He, Camilo Alfonso Perez Quintero, Abhineet Singh, Masood Dehghan, Martin Jagersand, "Real-Time Salient Closed Boundary Tracking via Line Segments Perceptual Grouping", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 [pdf][video]
Lin Chen, Fan Zhou, Yu Shen, Xiang Tian, Haibin Ling and Yaowu Chen, "Illumination Insensitive Efficient Second-order Minimization for Planar Object Tracking", in the IEEE International Conference on Robotics and Automation (ICRA), June 2017 [pdf]
Mennatullah Siam, Abhineet Singh, Camilo Perez and Martin Jagersand, "4-DoF Tracking for Robot Fine Manipulation Tasks", in the 14th Conference on Computer and Robot Vision (CRV), May 2017 [pdf]
Abhineet Singh, "Modular Tracking Framework: A Unified Approach to Registration based Tracking", MSc Thesis, March 2017 [pdf][ppt]
Abhineet Singh, Mennatullah Siam and Martin Jagersand, "Unifying Registration based Tracking: A Case Study with Structural Similarity", in the Winter Conference on Applications of Computer Vision (WACV), March 2017
Vincent Zhang, "PCA based appearance model for tracking", Project Report, 2016 [pdf]
Mennatullah Siam, "CNN Based Appearance Model with Approximate Nearest Neigbour Search", Project Report, 2016 [pdf]
Abhineet Singh, Ankush Roy, Xi Zhang and Martin Jagersand, "Modular Decomposition and Analysis of Registration based Trackers", in 13th Conference on Computer and Robot Vision (CRV), 2016, pp.85-92, June 2016 [pdf][ppt]
Abhineet Singh, "Hessian after Convergence: A New Perspective on Lucas Kanade Tracking", Report, November 2015 [pdf]
Xi Zhang, Abhineet Singh and Martin Jagersand, "RKLT: 8 DOF Real-Time Robust Video Tracking Combing Coarse Ransac Features and Accurate Fast Template Registration," in 12th Conference on Computer and Robot Vision (CRV), 2015, pp.70-77, June 2015 [pdf]
Several publicly available tracking datasets with full ground truth formatted to work with MTF out of the box are also made available here for convenience:
Sequences are in the form of avi video files rather than jpg images to save space; set img_source to m and seq_fmt to avi in mtf.cfg before using this dataset in MTF
The creators of this dataset have only provided ground truth for every other frame (all even numbered frames along with the first frame that is used for initialization);
A very high performing tracker has been used to fill in the ground truth for the unlabeled frames, but some frames that feature too much occlusion or too large an inter-frame motion are virtually untrackable, so the ground truth is not correct for these frames; we are working to correct these manually;