Our research group primarily focuses on several areas of robotics. We are interested in vision-based autonomous and semi-autonomous control of robot manipulators, including visual servoing and learning-based methods. Additionally, we develop haptic leader-follower teleoperation systems for human-in-the-loop telemanipulation. We are also interested in methods for dynamic vision (tracking, on-line 3D modeling from images, and predictive display). Below are some of our recent representative research projects. For a full list of publications, please visit the Publications page. For a list that includes more historical projects, please see the full Research page.
Point and Go: Intuitive Reference Frame Reallocation in Mode Switching
People: Allie Wang; Chen Jiang; Michael Przystupa; Justin Valentine; Martin Jagersand
Operating high degree-of-freedom robots can be difficult for users of wheelchair-mounted robotic manipulators. Mode switching in Cartesian space has several drawbacks, such as unintuitive control reference frames, separate translation and orientation control, and limited movement capabilities, all of which hinder performance. We propose Point and Go mode switching, which reallocates the Cartesian mode-switching reference frames into a more intuitive action space comprised of new translation and rotation modes.
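As a rough illustration of the idea, a pointed direction can define a new control frame so that a single input axis moves the end effector straight toward the indicated target. The sketch below is ours, not the published controller; the frame construction and the joystick mapping are assumptions.

    import numpy as np

    def pointing_frame(direction, up=np.array([0.0, 0.0, 1.0])):
        # Rotation matrix whose x-axis is the user's pointed direction.
        x = direction / np.linalg.norm(direction)
        y = np.cross(up, x)
        if np.linalg.norm(y) < 1e-6:       # pointed direction parallel to 'up'
            y = np.cross(np.array([0.0, 1.0, 0.0]), x)
        y /= np.linalg.norm(y)
        z = np.cross(x, y)
        return np.column_stack([x, y, z])  # columns are the new frame axes

    def joystick_to_velocity(axes, R):
        # Two joystick axes command forward/lateral motion in the new frame.
        return R @ np.array([axes[0], axes[1], 0.0])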
Robot Manipulation through Image Segmentation and Geometric Constraints
People: Chen Jiang; Allie Wang; Martin Jagersand
To solve robot manipulation tasks in real-world environments, CLIPU²Net is first employed to segment regions most relevant to the target specified by referring language. Geometric constraints are then applied to the segmented region, generating context-relevant motions for uncalibrated image-based visual servoing (UIBVS) control.
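Classically, UIBVS estimates the image Jacobian online rather than relying on camera calibration, for example with a Broyden rank-one update. The sketch below shows that standard update; in this project the image-space error e would come from the geometric constraints placed on the segmented region.

    import numpy as np

    def broyden_update(J, dq, dy):
        # Rank-one secant update of the estimated image Jacobian (dy ~ J dq).
        denom = dq @ dq
        return J if denom < 1e-12 else J + np.outer(dy - J @ dq, dq) / denom

    def servo_step(J, e, gain=0.1):
        # Least-squares joint-space step that drives the image error e to zero.
        return -gain * np.linalg.pinv(J) @ e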
Visual Servoing Based Path Following Controller
People: Cole Dewis; Martin Jagersand
We develop a visual servoing based path following controller to allow robot arms to follow arbitrary paths or contours. This allows tasks to be specified as paths in image space.
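A minimal sketch of the waypoint logic such a controller needs, assuming the path is given as a sequence of image points and a tracked feature position is available; a servoing step like the one above would then drive the feature toward the current waypoint.

    import numpy as np

    def advance_waypoint(path, feature, idx, tol=5.0):
        # Move to the next image-space waypoint once within tol pixels.
        while idx < len(path) - 1 and np.linalg.norm(path[idx] - feature) < tol:
            idx += 1
        return path[idx] - feature, idx   # current image error and new index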
Back2Future-SIM: Interactable Immersive Virtual World For Teleoperation
People: Sait Akturk; Justin Valentine; Martin Jagersand
In this thesis, our focus is on providing predictive haptic feedback and immersive visuals from the virtual replica of the remote scene in a physics simulator. Our system acts as a bridge between the operator and the follower robot, considering real-world constraints. We create a cyber-physical system using a real-time 3D surface mesh reconstruction algorithm and a digital twin of the Barrett WAM arm robot. The Gazebo physics simulator integrates the digital twin and an incremental surface mesh to create a real-time virtual replica of the remote scene. This virtual replica is used to provide haptic surface interaction feedback through collision detection from the physics simulator. Additionally, we address the operator's spatial awareness by using an immersive display for predictive visualization.
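At a high level, the bridge runs a loop like the sketch below. All API names here are hypothetical; the real system also reconstructs the scene mesh and renders the operator's immersive view concurrently.

    def bridge_step(leader, sim, follower):
        # Hypothetical interfaces: haptic leader device, physics simulator
        # hosting the digital twin plus scene mesh, and the remote follower.
        q_cmd = leader.read_pose()                 # operator command
        sim.set_target(q_cmd)                      # move the digital twin first
        contacts = sim.step()                      # collide against scene mesh
        leader.render_force(sim.contact_force())   # predictive haptic feedback
        follower.send_target(q_cmd)                # forward to the real robot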
Learning Geometry from Vision for Robotic Manipulation
People: Jun Jin; Martin Jagersand
This thesis studies how to enable a real-world robot to efficiently learn a new task by watching human demonstration videos. Learning by watching provides a more intuitive task-teaching interface than methods requiring coordinate programming, reward/cost design, kinesthetic teaching, or teleoperation. However, the need for massive numbers of human demonstrations, tedious data annotation, and heavy training of a robot controller impedes its acceptance in real-world applications. To overcome these challenges, we introduce a geometric task structure into the problem solution.
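A geometric task structure builds on task functions of this kind: image-space errors that vanish exactly when a geometric constraint (e.g. point-on-point, point-on-line) is satisfied. The sketch below shows two standard examples; the thesis's actual constraint set and learning pipeline go well beyond this.

    import numpy as np

    def point_to_point(p, q):
        # Zero exactly when feature p coincides with target point q.
        return p - q

    def point_to_line(p, a, b):
        # Signed distance from p to the line through a and b (2D, homogeneous).
        line = np.cross(np.append(a, 1.0), np.append(b, 1.0))
        return np.append(p, 1.0) @ line / np.linalg.norm(line[:2])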
Understanding Manipulation Contexts by Vision and Language for Robotic Vision
People: Chen Jiang; Martin Jagersand
In Activities of Daily Living (ADLs), humans perform thousands of arm and hand object-manipulation tasks, such as picking up, pouring, and drinking a drink. In a pouring task, the manipulation context involves a sequence of actions executed over time on specific objects. This study serves as a fundamental baseline for processing robotic vision together with natural language understanding. In the future, we aim to enhance this framework for knowledge-guided assistive robotics.
Actuation Subspace Prediction with Neural Householder Transforms
People: Kerrick Johnstonbaugh; Martin Jagersand
Choosing an appropriate action representation is an integral part of solving robotic manipulation problems. Published approaches include latent action models, which train context-conditioned neural networks to map low-dimensional latent actions to high-dimensional actuation commands. Such models can have a large number of parameters, and can be difficult to interpret from a user perspective. In this thesis, we propose that similar performance gains in robotics tasks can be achieved by restructuring the neural network to map observations to a basis for context-dependent linear actuation subspaces. This results in an action interface wherein a user's actions determine a linear combination of state-conditioned actuation basis vectors. We introduce the Neural Householder Transform (NHT) as a method for computing this basis.
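As a simplified illustration of the construction (the published NHT differs in detail), a unit vector predicted by the network defines a Householder reflection, which is orthogonal by construction, so its first k columns give an orthonormal basis for a state-dependent actuation subspace:

    import numpy as np

    def householder_basis(v, k):
        # First k columns of the Householder reflection H = I - 2 v v^T.
        v = v / np.linalg.norm(v)
        H = np.eye(len(v)) - 2.0 * np.outer(v, v)  # orthogonal by construction
        return H[:, :k]

    # A low-dimensional user action a then selects the actuation command:
    # u = householder_basis(v, k) @ a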
People: Laura Petrich; Martin Jagersand
Wheelchair-mounted robotic manipulators have the potential to help the elderly and individuals living with disabilities carry out their activities of daily living independently. While robotics researchers focus on assistive tasks from the perspective of various control schemes and motion types, health research tends to concentrate on clinical assessment and rehabilitation. This difference in perspective often leads to the design and evaluation of experimental tasks that are tailored to specific robotic capabilities rather than solving tasks that would support independent living. In addition, there are many studies in healthcare on which activities are relevant to functional independence, but little is known about how often these activities occur. Understanding which activities are frequently carried out during the day can help guide the development and prioritization of assistive robotic technology. By leveraging the strength of robotics (i.e., performing well on repeated tasks), these activities can be automated, significantly improving the quality of life for our target population.
People: Laura Petrich; Martin Jagersand
Human assistive robotics can help the elderly and those with disabilities with Activities of Daily Living (ADLs). Robotics researchers approach this bottom-up, publishing methods for controlling different types of movements. Health research, on the other hand, focuses on hospital clinical assessment and rehabilitation using the International Classification of Functioning (ICF), leaving arguably important differences between the two domains. In particular, little is known quantitatively about which ADLs humans perform in their ordinary environments - at home, at work, etc. This information can guide robotics development and prioritize what technology to deploy for in-home assistive robotics. This study targets several large lifelogging databases, from which we compute (i) ADL task frequency from long-term, low-sampling-frequency video and Internet of Things (IoT) sensor data, and (ii) short-term arm and hand movement data from 30 fps video of domestic tasks. Robotics and health care use different terms and taxonomies for representing tasks and motions. From the quantitative ADL task and ICF motion data, we derive and discuss a robotics-relevant taxonomy in an attempt to ameliorate these taxonomic differences.
People: Junaid Ahmad; Martin Jagersand
In this thesis, we take the natural next step of computing and verifying 3D planes bottom-up from lines. Our system takes the real-time stream of new cameras and 3D points from a SLAM system and incrementally builds a 3D surface model of the scene. In previous work, 3D line segments were detected in relevant keyframes and fed to the modeling algorithm for surface reconstruction. This has an immediate drawback: some of the line segments generated in each keyframe are redundant and mark similar (shifted) objects, creating clutter in the map. To avoid this issue, we track the 3D planes detected across keyframes for consistency and data association. Furthermore, the smoother and better-aligned model surfaces result in more photo-realistic rendering using keyframe texture images.
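For illustration, verifying a plane from line segments can be as simple as a least-squares fit over segment endpoints plus a distance test; the thesis's incremental tracking and data association go beyond this sketch.

    import numpy as np

    def fit_plane(points):
        # Least-squares plane through points: returns (unit normal n, offset d).
        c = points.mean(axis=0)
        n = np.linalg.svd(points - c)[2][-1]   # direction of least variance
        return n, -n @ c

    def supports_plane(segment, n, d, tol=0.01):
        # A 3D segment supports the plane if both endpoints lie within tol.
        return all(abs(n @ p + d) < tol for p in segment)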
People: Shida He; Xuebin Qin; Zichen Zhang; Martin Jagersand
It is challenging to use the large-scale point clouds produced by semi-dense SLAM for real-time surface reconstruction. To obtain meaningful surfaces and reduce the number of points used in reconstruction, we propose simplifying the point clouds generated by semi-dense SLAM using 3D line segments. Specifically, we present a novel incremental approach for real-time 3D line segment extraction. Our experimental results show that the 3D line segments generated by our method are highly accurate compared to other methods, and that using them greatly improves the quality of the reconstructed 3D surfaces compared to using the 3D points directly from SLAM systems.
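A minimal sketch of the core fitting step, assuming a cluster of roughly collinear semi-dense points has already been grouped: a PCA line fit yields the segment's endpoints, replacing many points with just two.

    import numpy as np

    def fit_line_segment(points):
        # PCA fit: replace a cluster of collinear points with one 3D segment.
        c = points.mean(axis=0)
        d = np.linalg.svd(points - c)[2][0]    # dominant direction
        t = (points - c) @ d                   # projections along the line
        return c + t.min() * d, c + t.max() * d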