This repository houses the code for the NYU Deep Reinforcement Learning Fall 2020 Final Project by Rahul Zalkikar and Noah Kasmanoff, "Limitation Learning: Capturing Adverse Dialog with GAIL".
We apply generative adversarial imitation learning (GAIL) to produce a proxy for the reward function underlying everyday conversation, training on the Cornell Movie-Dialogs Corpus. The actor network is a Seq2Seq model with attention, pre-trained via imitation learning to produce coherent replies to input utterances.
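For reference, here is a minimal sketch of what a Seq2Seq-with-attention actor could look like in PyTorch. The class names, dot-product attention, and single-layer GRUs are illustrative choices, not the project's exact implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EncoderRNN(nn.Module):
    """GRU encoder over token embeddings (hyperparameters are illustrative)."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        embedded = self.embedding(src)            # (batch, src_len, hidden)
        outputs, hidden = self.gru(embedded)      # outputs: (batch, src_len, hidden)
        return outputs, hidden

class AttnDecoderRNN(nn.Module):
    """GRU decoder with dot-product attention over the encoder states."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size * 2, vocab_size)

    def forward(self, tgt_token, hidden, encoder_outputs):
        # tgt_token: (batch, 1); the decoder runs one step at a time
        embedded = self.embedding(tgt_token)                          # (batch, 1, hidden)
        output, hidden = self.gru(embedded, hidden)                   # (batch, 1, hidden)
        # score each encoder state against the current decoder state
        scores = torch.bmm(output, encoder_outputs.transpose(1, 2))   # (batch, 1, src_len)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, encoder_outputs)                 # (batch, 1, hidden)
        logits = self.out(torch.cat([output, context], dim=-1))       # (batch, 1, vocab)
        return logits, hidden
```

During imitation pre-training, the decoder's logits at each step are scored against the expert's next token with cross-entropy, exactly as in supervised sequence-to-sequence training.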
Down the line, our focus is on an auxiliary product of GAIL: using the discriminator network as a proxy for a reward function.
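Concretely, the discriminator scores a (context, reply) pair, and one common GAIL surrogate rewards the policy for fooling it (sign conventions vary across implementations). The sketch below assumes a discriminator that outputs the probability a pair came from the expert corpus; the utterance encoder and network shapes are illustrative:

```python
import torch
import torch.nn as nn

class DialogDiscriminator(nn.Module):
    """Scores a (context, reply) pair; sigmoid output estimates
    P(pair comes from the expert data). A shared GRU encoder is
    an illustrative choice, not the project's exact architecture."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size * 2, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def encode(self, tokens):                     # tokens: (batch, seq_len)
        _, hidden = self.encoder(self.embedding(tokens))
        return hidden[-1]                         # (batch, hidden)

    def forward(self, context, reply):
        features = torch.cat([self.encode(context), self.encode(reply)], dim=-1)
        return torch.sigmoid(self.classifier(features)).squeeze(-1)

def proxy_reward(discriminator, context, reply, eps=1e-8):
    """GAIL-style surrogate reward: large when the discriminator
    believes the reply came from the expert corpus."""
    with torch.no_grad():
        d = discriminator(context, reply)
    return -torch.log(1.0 - d + eps)
```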
We hope that, after training the policy and discriminator networks to equilibrium, this proxy reward function can be used to probe black-box language models with direct feedback: feed an input to a publicly available conversational AI, extract its response, and pass the (input, response) pair through the proxy reward to build intuition about that model's behavior.
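A probing loop of this kind might look like the following sketch, which reuses `proxy_reward` from above. Here `query_blackbox_bot` and `encode_utterance` are hypothetical stand-ins for a real chatbot API call and the project's tokenizer:

```python
def probe_blackbox_model(discriminator, prompts, query_blackbox_bot, encode_utterance):
    """Score a black-box chatbot's replies with the learned proxy reward."""
    scores = []
    for prompt in prompts:
        reply_text = query_blackbox_bot(prompt)       # black-box response (hypothetical API)
        context = encode_utterance(prompt)            # (1, src_len) token ids
        reply = encode_utterance(reply_text)          # (1, tgt_len) token ids
        reward = proxy_reward(discriminator, context, reply).item()
        scores.append((prompt, reply_text, reward))
    # low scores flag replies the discriminator finds unlike human dialog
    return sorted(scores, key=lambda t: t[2])
```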
This work is just the beginning. We emphasize that GAIL is a method of imitation learning, not inverse reinforcement learning. The distinction matters: we do not recover the system's underlying reward function, only a proxy learned through imitation. If GAIL succeeds, we consider applying more advanced techniques such as guided cost learning a worthwhile next step.