
Learning for Task-Oriented Grasping

Time: Thu 2020-10-22 14.00

Location: F3, Lindstedtsvägen 26, Stockholm

Language: English

Subject area: Computer Science

Doctoral student: Mia Kokic, Computer Science, Centre for Autonomous Systems (CAS), Robotics, Perception and Learning (RPL)

Opponent: Professor Barbara Caputo

Supervisor: Danica Kragic, Numerical Analysis and Computer Science (NADA), Robotics, Perception and Learning (RPL), Centre for Autonomous Systems (CAS)



Task-oriented grasping refers to the problem of computing stable grasps on objects that allow for the subsequent execution of a task. Although grasping objects in a task-oriented manner comes naturally to humans, it remains very challenging for robots. Take, for example, a service robot deployed in a household. Such a robot should be able to execute complex tasks such as cutting a banana or flipping a pancake. To do this, the robot needs to know what to grasp and how to grasp it so that the task can be executed. This raises several challenges. First, the robot needs to select an appropriate object for the task; this pertains to the theory of affordances. Second, it needs to know how to place the hand so that the task can be executed, for example grasping a knife by the handle when cutting. Finally, algorithms for task-oriented grasping should be scalable and generalize well across many object classes and tasks. This is challenging because no available datasets contain information about the mutual relations between objects, tasks and grasps.

In this thesis, we present methods and algorithms for task-oriented grasping that rely on deep learning. We use deep learning to detect object affordances, to predict task-oriented grasps on novel objects, and to parse human activity datasets for the purpose of transferring this knowledge to a robot.

For learning affordances, we present a method for detecting functional parts given a visual observation of an object and a task. We utilize the detected affordances, together with other object properties, to plan stable, task-oriented grasps on novel objects.

For task-oriented grasping, we present a system for predicting grasp scores that take into account both the task and grasp stability. The grasps are then executed on a real robot and refined via Bayesian optimization.
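To illustrate the refinement step, the sketch below shows a minimal Bayesian-optimization loop over a single grasp parameter. It is not the system described in the thesis: the objective `grasp_quality` is a hypothetical stand-in for real-robot grasp trials, the parameter range and GP-UCB acquisition are assumptions, and a real pipeline would optimize over full grasp poses.

```python
import numpy as np

rng = np.random.default_rng(0)

def grasp_quality(x):
    # Hypothetical black-box grasp score: stands in for executing a
    # grasp on the robot and measuring task success; peaks near x = 0.3.
    return np.exp(-((x - 0.3) ** 2) / 0.02)

def rbf(a, b, ls=0.1):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, Xq, noise=1e-6):
    # Gaussian-process posterior mean and std at query points Xq.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xq)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = 1.0 - np.einsum("ij,ij->j", Ks, Kinv @ Ks)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def refine_grasp(n_init=3, n_iter=10, kappa=2.0):
    # GP-UCB Bayesian optimization of a single grasp parameter in [0, 1]:
    # evaluate a few random grasps, then repeatedly try the parameter
    # with the highest upper confidence bound.
    X = rng.uniform(0.0, 1.0, n_init)
    y = grasp_quality(X)
    Xq = np.linspace(0.0, 1.0, 200)
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, Xq)
        x_next = Xq[np.argmax(mu + kappa * sigma)]
        X = np.append(X, x_next)
        y = np.append(y, grasp_quality(x_next))
    best = X[np.argmax(y)]
    return best, y.max()

best_x, best_score = refine_grasp()
```

The key design choice this mirrors is sample efficiency: each "evaluation" of a grasp on a physical robot is expensive, so the surrogate model concentrates trials where improvement is likely instead of sweeping the parameter space.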
Finally, for parsing human activity datasets, we present an algorithm for estimating 3D hand and object poses and shapes from 2D images, so that information about contacts and relative hand placement can be extracted. We demonstrate that this information can be used to teach a robot task-oriented grasps, which we verify in experiments with a real robot on a set of novel objects.
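Once 3D hand and object geometry has been estimated, one simple way to recover contacts is a proximity test between the two point sets. The sketch below is an assumed simplification, not the thesis algorithm: the function name, the distance threshold, and the brute-force pairwise distances are all illustrative choices.

```python
import numpy as np

def extract_contacts(hand_points, object_points, threshold=0.005):
    # Label hand points lying within `threshold` metres of the object
    # as contacts (a crude proxy for contact detection on estimated
    # meshes). Returns the indices of contacting hand points and the
    # index of each one's nearest object point.
    d = np.linalg.norm(
        hand_points[:, None, :] - object_points[None, :, :], axis=-1
    )
    nearest_dist = d.min(axis=1)          # closest object point per hand point
    contact_idx = np.where(nearest_dist < threshold)[0]
    nearest_obj = d.argmin(axis=1)[contact_idx]
    return contact_idx, nearest_obj

# Toy example: a fingertip at the origin touching an object point 1 mm
# away, and a second hand point 10 cm away that should not register.
hand = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
obj = np.array([[0.001, 0.0, 0.0]])
idx, obj_idx = extract_contacts(hand, obj)
```

The extracted hand-point/object-point pairs are the kind of signal that can then be mapped onto a robot hand to reproduce the demonstrated grasp.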