OpenTAMP is an open-source library for optimization-based Task and Motion Planning (TAMP), and Guided Imitation of TAMP with Python. OpenTAMP aims to make defining and solving new TAMP problems both easy and straightforward, even for users familiar with only the high-level ideas behind TAMP.
To install and begin using OpenTAMP on an Ubuntu (>14.04) Linux Machine, follow these steps:
- Install Poetry by following instructions from here
- If you're on Ubuntu 20.04 or later, make sure you have Python 3.7 available. Run the below commands if you're not sure
-
sudo add-apt-repository ppa:deadsnakes/ppa sudo apt-get update sudo apt-get install python3.7 sudo apt-get install python3.7-dev
-
- Make sure you have tkinter installed:
sudo apt-get install python3.7-tk
- Make sure you have libglew installed
sudo apt-get install libglfw3 libglew2.1
- Install MuJoCo
- Make sure you install openmpi for linux (for use with MuJoCo):
sudo apt install libopenmpi-dev
- Download the correct MuJoCo binary for your OS from here. Be sure to use version 2.1.0 and not a higher version!
- Extract the downloaded
mujoco210
directory into~/.mujoco/mujoco210
- Make sure to run the following (for future convenience, it's recommended to add this line to your
.bashrc
file)export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
- Make sure you install openmpi for linux (for use with MuJoCo):
- Clone the OpenTAMP repository from GitHub to a folder of your choice:
https://github.com/Algorithmic-Alignment-Lab/OpenTAMP.git
cd
into the newly-installed library and runpoetry shell
, thenpoetry install
- Especially on ubuntu 20.04, you might first have to manually specify the path to python 3.7 via
poetry env use /usr/bin/python3.7
- Especially on ubuntu 20.04, you might first have to manually specify the path to python 3.7 via
- Now, you should have a nice virtual environment with python configured to run OpenTAMP! Whenever you want to use this, simply
cd
into the OpenTAMP folder and then runpoetry shell
- (Optional) If you'd like to use Gurobi as a backend solver for motion-planning problems, then follow steps here to obtain and activate a license (note: free licenses are available for students and academic users!)
- Note that for obtaining a license, you must either install gurobi via conda or from source
ToDo: Might need to add baselines and h-baselines because they're currently in setup.sh
Try running python opentamp/src/test_grip_det.py
from the root of the repository, and if this script successfully completes and displays a short video at the end, your installation is correct!
This should take under a minute to run.
If you wish to train policies from the code, verify that a Mujoco key titled mjkey.txt
is in your home directory an that FULL_INSTALL=true
in setup.sh
. Once this has completed, make sure you are on the virtual env (by running tampenv
) and try running test_training.sh
; this script will attempt to train policies for a two object pick-place problem with some default parameter settings. Once completed, is will generate some plots on performance and videos of rollouts into the ~/Dropbox
directory. Note the script will use about 16 proccesses, so it's reccomended to run it from at least an 8-physical core machine.
Specifications for all domains should be placed in the domains
folder directly under the opentamp
directory
For a concrete example, refer to opentamp/domains/namo_domain/generate_namo_domain.py
. This is script designed to generate a domain file (these files are cumbersome to write directly by hand)
The first portion of the file will look like
Types: Can, Target, RobotPose, Robot, Grasp, Obstacle
Here, everything following Types:
is a "type" that parameters in the domain can take on.
The next portion specifies where to find various necessary code
First:
Attribute Import Paths: RedCircle core.util_classes.items, Vector1d core.util_classes.matrix, Vector2d core.util_classes.matrix, Wall core.util_classes.items, NAMO core.util_classes.robots
Here, we tell the planner where to find the python classes parameter attributes can take on (used in Primitive Predicates)
Second:
Predicates Import Path: core.util_classes.namo_predicates
Here, we tell the planner where to find the python classes defining predicates
The next portion specifies primitive and derived predicates
First:
Primitive Predicates: geom, Can, RedCircle; pose, Can, Vector2d
etc...
This defines a list of 3-tuples of the form (attribute_name, parameter_type, attribute_type)
and effectively tells the planner how to build parameters
Second:
Derived Predicates: At, Can, Target; RobotAt, Robot, RobotPose; InGripper, Robot, Can, Grasp; Obstructs, Robot, Target, Target, Can
This defines a list of tuples whose first element is a predicate class and the remaining are the type signatures for parameters of the predicate. In the above, the At
predicate is defined to take both a Can
object and a Target
symbol
An action has five components: a name, a number of timesteps, a list of typed arguments, a "pre" list, and a "post" list
In the generate file above, these are defined as attributes of an Action
class
Suppose we have a moveto action.
For args:
args = '(?robot - Robot ?can - Can ?target - Target ?sp - RobotPose ?gp - RobotPose ?g - Grasp)'
Means the action takes in a robot, a can, a target, a start pose, a goal pose, and a grasp
The pre
list contains both precondition and midconditions for the action (originally, everything here was preconditions but that was impractical)
Preconditions are identified by being active at 0:0
i.e. only the first timestep
Items of this list are pairs of strings, the first describing the constraint and the second specifying what timesteps to enforce that constraint on
('(At ?can ?target)', '0:0')
specifies that a precondition should be enforced constraining ?can
to be at ?target
('(forall (?w - Obstacle) (not (RCollides ?robot ?w)))', '0:{0}'.format(end))
specifies that for all objects of type Obstacle
(not just those in the action arguments), the action should include a constraint prohibiting collision between the object and the robot from time 0 to time end
Note the syntax: forall
allows ranging over the global space of parameters, not just those immediatley visible to the action, while not
enforces the negation of a constraint (here, RCollides
is a constraint to be in collision, so adding the not
instead contrains to avoid collision)
The eff
list is similar, but everything in this list will be considered a postcondition by the task planner and is something that must be true when the action finishes.
('(forall (?obj - Can) (Stationary ?obj))', '{0}:{1}'.format(end, end-1))
Note in the above how the timesteps are (end, end-1)
; this is special syntax telling the motion planner to ignore the constraint even though the task planner will include it, and allows adding extra information to guide the task planner that is uncessary for the motion planner
Specifications for all problem files should be placed under the relevant domain directory
For a concrete example, refer to domains/namo_domain/generate_namo_prob.py
. This is script designed to generate a problem file (these files are cumbersome to write directly by hand)
A problem file will define a list of objects, initial attribute values for those objects, a goal, and a set of initial conditions
The first part is of the form Objects: ...
; everything on this line is of the form ObjectType (name {insert name})
and will list EVERY object and symbol the planner will have access to. Semicolons delimit separate objects and the end of this line must be two new lines \n\n
The next part is of the form Init: ...
; everything in this part is of the form (attribute objectname value)
and specifies the initial value of every attribute for every object.
(pose can0 [0,0])
for example specifies that the initial pose of can0
will be at [0,0]
(pose can1 undefined)
on the other hand specifies that the initial pose of can1
is not fixed and the planner will determine its value. Items here are delimited by commas and the end of this part must be a semicolon.
The next part is then a list of predicates; these are constraints that ARE true at the initial timesteps. These must be satisfied by the initial values from the preceding part. This part is ended with two new lines \n\n
Finally, the last line is of the form GOAL: ...
and specifies what constraints you want to be true at the end of the planning process.
During planning, the list of initial predicates will be converted to the initial state in PDDL while the goal will be converted likewise
Predicates will define both constraints for the trajectory optimization problem as well as PDDL-style predicates for the task planner
All predicates will be subclasses of ExprPredicate
from core.util_classes.common_predicates
There are roughly two types of predicates: linear and non-linear
Each predicate has a priority which can be interpreted as order-to-add: the solver will iteratively optimize problems with constraints restricted up-to the current priority, only adding the next priority when the current problem is solved. Usually, higher priority means a harder to solve constraint like collision avoidance
An example linear predicate:
class At(ExprPredicate):
def __init__(self, name, params, expected_param_types, env=None):
self.can, self.targ = params
attr_inds = OrderedDict([(self.can, [("pose", np.array([0,1], dtype=np.int))]),
(self.targ, [("value", np.array([0,1], dtype=np.int))])])
A = np.c_[np.eye(2), -np.eye(2)]
b = np.zeros((2, 1))
val = np.zeros((2, 1))
aff_e = AffExpr(A, b)
e = EqExpr(aff_e, val)
super(At, self).__init__(name, e, attr_inds, params, expected_param_types, priority=-2)
What's going on here:
attr_inds
is a dictionary describing which indices of which attributes will be used from each parameter; e.g. indices 0 and 1 of the can's pose will be included in the state vector xA
defines a matrix such that Ax = [0., 0.] iff. x[:2] == x[2:]; in this context that means can.pose == targ.valueaff_e
is an affine expression of the formAx+b
e
is then an affine constraint of the formAx+b=val
Predicates can be defined over multiple timesteps
An example multi-timestep predicate:
class RobotStationary(ExprPredicate):
def __init__(self, name, params, expected_param_types, env=None, debug=False):
self.c, = params
attr_inds = OrderedDict([(self.c, [("pose", np.array([0, 1], dtype=np.int))])])
A = np.array([[1, 0, -1, 0],
[0, 1, 0, -1]])
b = np.zeros((2, 1))
e = EqExpr(AffExpr(A, b), np.zeros((2, 1)))
super(RobotStationary, self).__init__(name, e, attr_inds, params, expected_param_types, active_range=(0,1), priority=-2)
What's going on here:
active_range
implies the state vector will be constructed using both the current timestep and the next; generally anactive_range
of the form(a,b)
implies the state vector will use every timestep fromcur_ts+a
tocur_ts+b
; using negative values to specifiy earlier timesteps
An example non-linear predicate
class InGraspAngle(ExprPredicate):
def __init__(self, name, params, expected_param_types, env=None, sess=None, debug=False):
self.r, self.can = params
self.dist = 0.6
attr_inds = OrderedDict([(self.r, [("pose", np.array([0, 1], dtype=np.int)),
("theta", np.array([0], dtype=np.int))]),
(self.can, [("pose", np.array([0, 1], dtype=np.int))]),
])
def f(x):
targ_loc = [-self.dist * np.sin(x[2]), self.dist * np.cos(x[2])]
can_loc = x[3:5]
return np.array([[((x[0]+targ_loc[0])-can_loc[0])**2 + ((x[1]+targ_loc[1])-can_loc[1])**2]])
def grad(x):
curdisp = x[3:5] - x[:2]
theta = x[2]
targ_disp = [-self.dist * np.sin(theta), self.dist * np.cos(theta)]
off = (curdisp[0]-targ_disp[0])**2 + (curdisp[1]-targ_disp[1])**2
(x1, y1), (x2, y2) = x[:2], x[3:5]
x1_grad = -2 * ((x2-x1)+self.dist*np.sin(theta))
y1_grad = -2 * ((y2-y1)-self.dist*np.cos(theta))
theta_grad = 2 * dist * ((x2-x1)*np.cos(theta) + (y2-y1)*np.sin(theta))
x2_grad = 2 * ((x2-x1)+self.dist*np.sin(theta))
y2_grad = 2 * ((y2-y1)-self.dist*np.cos(theta))
return np.array([x1_grad, y1_grad, theta_grad, x2_grad, y2_grad]).reshape((1,5))
self.f = f
self.grad = grad
angle_expr = Expr(f, grad)
e = EqExpr(angle_expr, np.zeros((1,1)))
super(InGraspAngle, self).__init__(name, e, attr_inds, params, expected_param_types, priority=1)
What's going on here:
- This is a constraint specifiying that
can
must be at distance0.6
along directiontheta
fromr
f
replacesaff_expr
from above with a black-box function call;f
can only take the state vector as input and can only return a 1-d arraygrad
will return the jacobian off
with respect to the state vector, or more precisely an array of the gradients of each output off
.grad
can only return a 2-d array of shape(len(f(x)), len(x))
such thatgrad(x).dot(x) + f(x)
is valide
is an equality constraint of the formf(x)=0
;angle_expr
provides bothf
and it's gradientgrad
to the solver