Deep RL implementations. DQN, SAC, DDPG, TD3, PPO and VPG implemented in PyTorch. Tested environments: LunarLander-v2 and Pendulum-v0.

akashe/DeepReinforcementLearning

Deep RL algorithms implemented using PyTorch

Algo list:

  1. DQN
  2. Vanilla Policy Gradient
  3. Deep Deterministic Policy Gradient
  4. Twin Delayed Deep Deterministic Policy Gradient
  5. Soft Actor Critic
  6. Proximal Policy Optimization - CLIP

Article: a deeper look into policy gradients
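
For readers new to the last entry in the list, the "CLIP" in Proximal Policy Optimization - CLIP refers to a clipped surrogate objective. Below is a minimal sketch of that objective in plain Python. It is not taken from this repository, which uses PyTorch tensors; the function name `ppo_clip_objective`, its arguments, and the default `clip_eps=0.2` are assumptions made for illustration only.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage the gain from raising an action's probability is capped at `1 + clip_eps` times the advantage, while with a negative advantage the unclipped (worse) term is kept, so the objective never rewards large policy changes.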

Experimental Results:

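| Algorithm | Discrete env: LunarLander-v2 | Continuous env: Pendulum-v0 |
|-----------|------------------------------|-----------------------------|
| DQN       | LunarLander-DQN              | -                           |
| VPG       | LunarLander-VPG              | -                           |
| DDPG      | -                            | Pendulum-DDPG               |
| TD3       | -                            | Pendulum-TD3                |
| SAC       | -                            | Pendulum-SAC                |
| PPO       | -                            | Pendulum-PPO                |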

Usage:

Just run each algorithm's file directly. The algorithms do not share a common structure, because I implemented each one as I learned it, and different algorithms are inspired by different sources.

Resources:

  1. RL course by David Silver
  2. Lecture slides for above course
  3. Spinning up by OpenAI
  4. More exhaustive RL guide by Denny Britz

Future projects:

  1. If time permits, I will add a simple elevator-control program using RL.
  2. Better graphs
