Merge pull request #82 from facebookresearch/pypi
Refactored to upload to pypi
luisenp authored May 7, 2021
2 parents bab531c + ebc59f7 commit ca3ac4e
Showing 63 changed files with 1,537 additions and 148 deletions.
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,10 @@
# Changelog

-# v0.1.0
+## v0.1.1
+- Multiple bug fixes
+- Added `third_party` folder for `pytorch_sac` and `dmc2gym`
+- Library now available in `pypi`
+
+## v0.1.0

Initial release
6 changes: 6 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,6 @@
+include LICENSE README.md
+include requirements/*.txt
+include mbrl/examples/conf/*.yaml
+include mbrl/examples/conf/algorithm/*.yaml
+include mbrl/examples/conf/dynamics_model/*.yaml
+include mbrl/examples/conf/overrides/*.yaml
88 changes: 42 additions & 46 deletions README.md
@@ -1,3 +1,4 @@
+[![PyPi Version](https://img.shields.io/pypi/v/mbrl)](https://pypi.org/project/mbrl/)
[![Master](https://github.com/facebookresearch/mbrl-lib/workflows/CI/badge.svg)](https://github.com/facebookresearch/mbrl-lib/actions?query=workflow%3ACI)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/facebookresearch/mbrl-lib/tree/master/LICENSE)
[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-360/)
@@ -6,7 +7,7 @@

# MBRL-Lib

-``mbrl-lib`` is a toolbox for facilitating development of
+``mbrl`` is a toolbox for facilitating development of
Model-Based Reinforcement Learning algorithms. It provides easily interchangeable
modeling and planning components, and a set of utility functions that allow writing
model-based RL algorithms with only a few lines of code.
@@ -17,43 +18,28 @@ See also our companion [paper](https://arxiv.org/abs/2104.10159).

### Installation

-``mbrl-lib`` is a Python 3.7+ library. To install it, clone the repository,
+#### Standard Installation

-    git clone https://github.com/facebookresearch/mbrl-lib.git
-
-then run
+``mbrl`` requires Python 3.7+ and [PyTorch (>= 1.7)](https://pytorch.org).
+To install the latest stable version, run

-    cd mbrl-lib
-    pip install -e .
+    pip install mbrl

-If you are interested in contributing, please install the developer tools as well
+#### Developer installation
+If you are interested in modifying the library, clone the repository and set up
+a development environment as follows

+    git clone https://github.com/facebookresearch/mbrl-lib.git
    pip install -e ".[dev]"

-Finally, make sure your Python environment has
-[PyTorch (>= 1.7)](https://pytorch.org) installed with the appropriate
-CUDA configuration for your system.
-
-For testing your installation, run
+And test it by running the following from the root folder of the repository

    python -m pytest tests/core
    python -m pytest tests/algorithms

-### Mujoco
-
-Mujoco is a popular library for testing RL methods. Installing Mujoco is not
-required to use most of the components and utilities in MBRL-Lib, but if you
-have a working Mujoco installation (and license) and want to test MBRL-Lib
-on it, please run
-
-    pip install -r requirements/mujoco.txt
-
-and to test our mujoco-related utilities, run
-
-    python -m pytest tests/mujoco

### Basic example
-As a starting point, check out our [tutorial notebook](notebooks/pets_example.ipynb)
+As a starting point, check out our [tutorial notebook](https://github.com/facebookresearch/mbrl-lib/tree/master/notebooks/pets_example.ipynb)
on how to write the PETS algorithm
([Chua et al., NeurIPS 2018](https://arxiv.org/pdf/1805.12114.pdf))
using our toolbox, and running it on a continuous version of the cartpole
@@ -62,20 +48,23 @@ environment.
## Provided algorithm implementations
MBRL-Lib provides implementations of popular MBRL algorithms
as examples of how to use this library. You can find them in the
-[mbrl/algorithms](mbrl/algorithms) folder. Currently, we have implemented
-[PETS](mbrl/algorithms/pets.py) and [MBPO](mbrl/algorithms/mbpo.py), and
+[mbrl/algorithms](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/algorithms) folder. Currently, we have implemented
+[PETS](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/algorithms/pets.py) and [MBPO](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/algorithms/mbpo.py), and
we plan to keep increasing this list in the near future.

The implementations rely on [Hydra](https://github.com/facebookresearch/hydra)
to handle configuration. You can see the configuration files in
-[this](conf) folder. The [overrides](conf/overrides) subfolder contains
+[this](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/examples/conf)
+folder.
+The [overrides](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/examples/conf/overrides)
+subfolder contains
environment specific configurations for each environment, overriding the
default configurations with the best hyperparameter values we have found so far
for each combination of algorithm and environment. You can run training
by passing the desired override option via command line.
For example, to run MBPO on the gym version of HalfCheetah, you should call
```python
-python main.py algorithm=mbpo overrides=mbpo_halfcheetah
+python -m mbrl.examples.main algorithm=mbpo overrides=mbpo_halfcheetah
```
By default, all algorithms will save results in a csv file called `results.csv`,
inside a folder whose path looks like
@@ -90,20 +79,27 @@ such as the type of dynamics model
(e.g., `dynamics_model=basic_ensemble`), or the number of models in the ensemble
(e.g., `dynamics_model.model.ensemble_size=some-number`). To learn more about
all the available options, take a look at the provided
-[configuration files](conf).
+[configuration files](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/examples/conf).

-Note that running the provided examples and `main.py` requires Mujoco, but
+### Note
+Running the provided examples requires Mujoco, but
you can try out the library components (and algorithms) on other environments
-by creating your own entry script and Hydra configuration.
+by creating your own entry script and Hydra configuration (see [examples]).

+If you do have a working Mujoco installation (and license), you can check
+that it works correctly with our library by running the following
+(this also requires [`dm_control`](https://github.com/deepmind/dm_control)).

+    python -m pytest tests/mujoco

## Visualization tools
Our library also contains a set of
-[visualization](mbrl/diagnostics) tools, meant to facilitate diagnostics and
-development of models and controllers. These currently require Mujoco installation, but we are
-planning to add more support and extensions in the future. Currently,
-the following tools are provided:
+[visualization](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/diagnostics) tools, meant to facilitate diagnostics and
+development of models and controllers. These currently require a Mujoco
+installation (see previous subsection), but we are planning to add support for other environments
+and extensions in the future. Currently, the following tools are provided:

-* [``Visualizer``](visualize_model_preds.py): Creates a video to qualitatively
+* ``Visualizer``: Creates a video to qualitatively
assess model predictions over a rolling horizon. Specifically, it runs a
user specified policy in a given environment, and at each time step, computes
the model's predicted observation/rewards over a lookahead horizon for the
@@ -116,35 +112,35 @@ assess model predictions over a rolling horizon. Specifically, it runs a
be trained independently. The following gif shows an example of 200 steps
of pre-trained MBPO policy on Inverted Pendulum environment.

-![Example of Visualizer](docs/resources/inv_pendulum_mbpo_vis.gif)
+![Example of Visualizer](http://raw.githubusercontent.com/facebookresearch/mbrl-lib/master/docs/resources/inv_pendulum_mbpo_vis.gif)

-* [``DatasetEvaluator``](eval_model_on_dataset.py): Loads a pre-trained model
+* ``DatasetEvaluator``: Loads a pre-trained model
and a dataset (can be loaded from separate directories), and computes
predictions of the model for each output dimension. The evaluator then
creates a scatter plot for each dimension comparing the ground truth output
vs. the model's prediction. If the model is an ensemble, the plot shows the
mean prediction as well as the individual predictions of each ensemble member.

-![Example of DatasetEvaluator](docs/resources/dataset_evaluator.png)
+![Example of DatasetEvaluator](http://raw.githubusercontent.com/facebookresearch/mbrl-lib/master/docs/resources/dataset_evaluator.png)

-* [``FineTuner``](finetune_model_with_controller.py): Can be used to train a
+* ``FineTuner``: Can be used to train a
model on a dataset produced by a given agent/controller. The model and agent
can be loaded from separate directories, and the fine tuner will roll the
environment for some number of steps using actions obtained from the
controller. The final model and dataset will then be saved under directory
"model_dir/diagnostics/subdir", where `subdir` is provided by the user.

-* [``True Dynamics Multi-CPU Controller``](control_env.py): This script can run
+* ``True Dynamics Multi-CPU Controller``: This script can run
a trajectory optimizer agent on the true environment using Python's
multiprocessing. Each environment runs in its own CPU, which can significantly
speed up costly sampling algorithm such as CEM. The controller will also save
a video if the ``render`` argument is passed. Below is an example on
HalfCheetah-v2 using CEM for trajectory optimization.

-![Control Half-Cheetah True Dynamics](docs/resources/halfcheetah-break.gif)
+![Control Half-Cheetah True Dynamics](http://raw.githubusercontent.com/facebookresearch/mbrl-lib/master/docs/resources/halfcheetah-break.gif)

Note that the tools above require Mujoco installation, and are specific to
-models of type [``OneDimTransitionRewardModel``](../models/one_dim_tr_model.py).
+models of type [``OneDimTransitionRewardModel``](https://github.com/facebookresearch/mbrl-lib/tree/master/mbrl/models/one_dim_tr_model.py).
We are planning to extend this in the future; if you have useful suggestions
don't hesitate to raise an issue or submit a pull request!

@@ -153,7 +149,7 @@ Please check out our **[documentation](https://facebookresearch.github.io/mbrl-l
and don't hesitate to raise issues or contribute if anything is unclear!

## License
-`mbrl-lib` is released under the MIT license. See [LICENSE](LICENSE) for
+`mbrl` is released under the MIT license. See [LICENSE](LICENSE) for
additional details about it. See also our
[Terms of Use](https://opensource.facebook.com/legal/terms) and
[Privacy Policy](https://opensource.facebook.com/legal/privacy).
45 changes: 13 additions & 32 deletions docs/index.rst
@@ -1,59 +1,40 @@
Documentation for mbrl-lib
========================================
-``mbrl-lib`` is library to facilitate research on Model-Based Reinforcement Learning.
+``mbrl`` is a library to facilitate research on Model-Based Reinforcement Learning.

Getting started
===============

Installation
------------

-``mbrl-lib`` is a Python 3.7+ library. To install it, clone the repository,
+Standard Installation
+^^^^^^^^^^^^^^^^^^^^^
+``mbrl`` requires Python 3.7+ and `PyTorch (>= 1.7) <https://pytorch.org/>`_.

-.. code-block:: bash
-   git clone https://github.com/facebookresearch/mbrl-lib.git
-then run
+To install the latest stable version, run

.. code-block:: bash
-   cd mbrl-lib
-   pip install -e .
+   pip install mbrl
-If you also want the developer tools for contributing, run
+Development Installation
+^^^^^^^^^^^^^^^^^^^^^^^^
+If you are interested in modifying parts of the library, you can clone the repository
+and set up a development environment, as follows

.. code-block:: bash
+   git clone https://github.com/facebookresearch/mbrl-lib.git
pip install -e ".[dev]"
-Finally, make sure your Python environment has
-`PyTorch (>= 1.7) <https://pytorch.org/>`_ installed with the appropriate CUDA configuration
-for your system.


-To test your installation, run
+And test it by running

.. code-block:: bash
python -m pytest tests/core
python -m pytest tests/algorithms
-Mujoco
-------
-Mujoco is a popular library for testing RL methods. Installing Mujoco is not
-required to use most of the components and utilities in MBRL-Lib, but if you
-have a working Mujoco installation (and license) and want to test MBRL-Lib
-on it, you please install
-
-.. code-block:: bash
-   pip install -r requirements/mujoco.txt
-and to test our mujoco-related utilities, run
-
-.. code-block:: bash
-   python -m pytest tests/mujoco
Basic Example
-------------
2 changes: 1 addition & 1 deletion mbrl/__init__.py
@@ -2,4 +2,4 @@
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
__version__ = "0.1.0"
__version__ = "0.1.1"
Empty file added mbrl/algorithms/__init__.py
2 changes: 1 addition & 1 deletion mbrl/algorithms/mbpo.py
@@ -9,12 +9,12 @@
import hydra.utils
import numpy as np
import omegaconf
-import pytorch_sac.utils
import torch

import mbrl.constants
import mbrl.models
import mbrl.planning
+import mbrl.third_party.pytorch_sac as pytorch_sac
import mbrl.types
import mbrl.util
import mbrl.util.common
@@ -15,7 +15,7 @@ num_eval_episodes: 1
# SAC Agent configuration
# --------------------------------------------
agent:
-  _target_: pytorch_sac.agent.sac.SACAgent
+  _target_: mbrl.third_party.pytorch_sac.agent.sac.SACAgent
  obs_dim: ??? # to be specified later
  action_dim: ??? # to be specified later
  action_range: ??? # to be specified later
@@ -38,14 +38,14 @@ agent:
target_entropy: ${overrides.sac_target_entropy}

double_q_critic:
-  _target_: pytorch_sac.agent.critic.DoubleQCritic
+  _target_: mbrl.third_party.pytorch_sac.agent.critic.DoubleQCritic
obs_dim: ${algorithm.agent.obs_dim}
action_dim: ${algorithm.agent.action_dim}
hidden_dim: 1024
hidden_depth: ${overrides.sac_hidden_depth}

diag_gaussian_actor:
-  _target_: pytorch_sac.agent.actor.DiagGaussianActor
+  _target_: mbrl.third_party.pytorch_sac.agent.actor.DiagGaussianActor
obs_dim: ${algorithm.agent.obs_dim}
action_dim: ${algorithm.agent.action_dim}
hidden_depth: ${overrides.sac_hidden_depth}
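These `_target_` strings are dotted import paths that Hydra resolves at runtime, which is why vendoring `pytorch_sac` under `mbrl.third_party` forces every one of them to change. Below is a minimal sketch of that mechanism, assuming `mbrl` 0.1.1 is installed; the dimension values are made-up placeholders, not values from this commit.

```python
import hydra.utils
import omegaconf

# Hypothetical config mirroring the double_q_critic block above;
# obs_dim/action_dim are placeholder values for illustration.
critic_cfg = omegaconf.OmegaConf.create(
    {
        "_target_": "mbrl.third_party.pytorch_sac.agent.critic.DoubleQCritic",
        "obs_dim": 17,
        "action_dim": 6,
        "hidden_dim": 1024,
        "hidden_depth": 2,
    }
)

# Hydra imports the _target_ class and calls it with the remaining keys as
# keyword arguments; the old top-level `pytorch_sac` path would now fail to
# import after a plain `pip install mbrl`.
critic = hydra.utils.instantiate(critic_cfg)
```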
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
7 changes: 5 additions & 2 deletions mbrl/planning/core.py
@@ -132,8 +132,11 @@ def load_agent(agent_path: Union[str, pathlib.Path], env: gym.Env) -> Agent:
    agent_path = pathlib.Path(agent_path)
    cfg = omegaconf.OmegaConf.load(agent_path / ".hydra" / "config.yaml")

-    if cfg.algorithm.agent._target_ == "pytorch_sac.agent.sac.SACAgent":
-        import pytorch_sac
+    if (
+        cfg.algorithm.agent._target_
+        == "mbrl.third_party.pytorch_sac.agent.sac.SACAgent"
+    ):
+        import mbrl.third_party.pytorch_sac as pytorch_sac

        from .sac_wrapper import SACAgent

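For reference, a minimal sketch of calling the updated function; `trained_dir` is a hypothetical directory holding the `.hydra/config.yaml` and agent checkpoint written by a previous training run, and the environment is just an example (it needs a working Mujoco setup).

```python
import gym

import mbrl.planning

# Hypothetical path and environment, for illustration only.
env = gym.make("HalfCheetah-v2")
agent = mbrl.planning.load_agent("trained_dir", env)

obs = env.reset()
action = agent.act(obs)  # query the loaded policy for a single action
```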
7 changes: 4 additions & 3 deletions mbrl/planning/sac_wrapper.py
@@ -3,10 +3,11 @@
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import numpy as np
-import pytorch_sac
-import pytorch_sac.utils
import torch

+import mbrl.third_party.pytorch_sac as pytorch_sac
+import mbrl.third_party.pytorch_sac.utils as pytorch_sac_utils

from .core import Agent


@@ -40,5 +41,5 @@ def act(
Returns:
(np.ndarray): the action.
"""
-        with pytorch_sac.utils.eval_mode(), torch.no_grad():
+        with pytorch_sac_utils.eval_mode(), torch.no_grad():
            return self.sac_agent.act(obs, sample=sample, batched=batched)
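To see the wrapper in context, here is a sketch of a deterministic evaluation rollout built on the `act` method above; `agent` and `env` are assumed to exist already (e.g., from the `load_agent` sketch earlier), and the loop uses the 4-tuple `step` API of the gym versions this release targets.

```python
# Assumes `agent` (the SACAgent wrapper) and `env` are already constructed.
obs = env.reset()
done = False
episode_reward = 0.0
while not done:
    # sample=False asks the SAC policy for its deterministic (mean) action;
    # eval_mode() and no_grad() are applied inside act(), as shown above.
    action = agent.act(obs, sample=False)
    obs, reward, done, _ = env.step(action)
    episode_reward += reward
print(f"episode reward: {episode_reward:.2f}")
```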
Empty file added mbrl/third_party/__init__.py
21 changes: 21 additions & 0 deletions mbrl/third_party/dmc2gym/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2020 Denis Yarats
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.