This repository contains a C++ multithreaded implementation of the asynchronous advantage actor-critic (A3C) algorithm based on NVIDIA's GA3C. It has been tested on the CartPole-v0 OpenAI Gym environment using TensorFlow and integeruser/gym-uds-api, with model configuration and parameters as described in jaromiru/AI-blog.
This project requires building TensorFlow 1.3 from source. Example instructions are provided for macOS (tested on macOS Catalina 10.15.3).
-
Install Homebrew.
-
Install OpenJDK 8:
~$ brew cask install homebrew/cask-versions/adoptopenjdk8
-
Download bazel-0.4.5-jdk7-installer-darwin-x86_64.sh, make it executable with chmod, and install Bazel 0.4.5 (to $HOME/bin/bazel):
Downloads$ env JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" ./bazel-0.4.5-jdk7-installer-darwin-x86_64.sh --user
-
Install pyenv and Python 3.6.10 (to $HOME/.pyenv/versions/3.6.10/bin/python):
~$ brew install pyenv
~$ pyenv install 3.6.10
-
Install the wheel, NumPy and dlib pip packages:
~$ $HOME/.pyenv/versions/3.6.10/bin/pip install wheel numpy dlib
-
Clone TensorFlow from the official repository:
~$ git clone https://github.com/tensorflow/tensorflow
-
cd to the TensorFlow directory (assumed to be the working directory for all the next steps), then switch to version 1.3:
tensorflow$ git checkout r1.3
-
Configure TensorFlow, specifying, when asked, $HOME/.pyenv/versions/3.6.10/bin/python as the location of Python (but expanding $HOME to its value):
tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" ./configure
-
Build the TensorFlow shared library (to bazel-bin/tensorflow/libtensorflow_cc.so):
tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" bazel build //tensorflow:libtensorflow_cc.so
Building may fail if (an incompatible version of) protobuf is already installed on the machine, in which case you need to make sure that TensorFlow builds and uses its internal version of protobuf instead.
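As a quick way to see whether a system-wide protobuf might be present before building, the sketch below looks for a `protoc` binary on `PATH`. The function name `report_system_protoc` is hypothetical and not part of this repository. Finding `protoc` only hints that protobuf is installed; it does not prove that the build will conflict with it.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the repository):
# reports whether a protoc binary is visible on PATH.
report_system_protoc() {
    # "command -v" prints the resolved path and fails if nothing is found.
    if protoc_path=$(command -v protoc); then
        echo "protoc found at $protoc_path"
    else
        echo "no protoc on PATH"
    fi
}

report_system_protoc
```

If a `protoc` is reported, it may be worth checking how it was installed before starting the build.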
-
Build and install the TensorFlow pip package:
tensorflow$ env PATH="$HOME/bin:$PATH" JAVA_HOME="/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home" bazel build //tensorflow/tools/pip_package:build_pip_package
tensorflow$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
tensorflow$ $HOME/.pyenv/versions/3.6.10/bin/pip install /tmp/tensorflow_pkg/tensorflow-1.3.1-cp36-cp36m-macosx_10_15_x86_64.whl
-
Install the OpenAI Gym pip package:
$ $HOME/.pyenv/versions/3.6.10/bin/pip install gym
-
Clone this repository:
$ git clone https://github.com/integeruser/GA3C-cpp.git
-
cd to the GA3C-cpp directory and compile the code for testing on CartPole-v0:
GA3C-cpp$ make TENSORFLOW_DIRPATH=/absolute/path/to/tensorflow/repository GA3C-cartpole-v0
-
cd to the GA3C-cpp/cartpole-v0 directory and start the gym-uds servers:
GA3C-cpp/cartpole-v0$ env PATH="$HOME/.pyenv/versions/3.6.10/bin:$PATH" ./start-gym-uds-servers.sh
-
Generate the untrained TensorFlow model:
GA3C-cpp/cartpole-v0$ $HOME/.pyenv/versions/3.6.10/bin/python ./cartpole-v0.py generate
-
Train the agent (specifying DYLD_LIBRARY_PATH for finding libtensorflow_cc.so):
GA3C-cpp/cartpole-v0$ env DYLD_LIBRARY_PATH="/absolute/path/to/tensorflow/repository/bazel-bin/tensorflow" bin/GA3C-cartpole-v0
-
The updated weights of the model are saved back to disk. Lastly, see the trained agent in action:
GA3C-cpp/cartpole-v0$ $HOME/.pyenv/versions/3.6.10/bin/python ./cartpole-v0.py test
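The training step depends on DYLD_LIBRARY_PATH pointing at the directory that contains libtensorflow_cc.so. A minimal sketch of a check that can be run before training is shown here; the function name `tf_lib_dir` is hypothetical and not part of this repository, and the path layout is the one used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the repository):
# prints the directory to use as DYLD_LIBRARY_PATH if libtensorflow_cc.so
# exists under the given TensorFlow checkout, and fails otherwise.
tf_lib_dir() {
    tf_repo="$1"
    lib_dir="$tf_repo/bazel-bin/tensorflow"
    if [ -f "$lib_dir/libtensorflow_cc.so" ]; then
        printf '%s\n' "$lib_dir"
    else
        echo "libtensorflow_cc.so not found in $lib_dir" >&2
        return 1
    fi
}

# Example use from GA3C-cpp/cartpole-v0 (left commented out so that the
# sketch has no side effects when sourced):
# lib_dir=$(tf_lib_dir /absolute/path/to/tensorflow/repository) &&
#     env DYLD_LIBRARY_PATH="$lib_dir" bin/GA3C-cartpole-v0
```

Chaining with `&&` means the agent is only launched when the library was actually found, which may make a missing or misplaced build easier to spot than a loader error at startup.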