This version adds two new action optimizers:
- Improved CEM as described here.
- MPPI as used in PDDM.

Other changes:
- Changed the config structure so that the action optimizer is passed as a separate config file.
- Added a new iterator for sequences that returns a fixed number of random batches on every loop.
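
To illustrate the behavior of the new sequence iterator, here is a minimal sketch of an iterator that yields a fixed number of random batches on every pass. The class name `FixedRandomBatchIterator` and its parameters (`batch_size`, `num_batches`, `seed`) are hypothetical and chosen for illustration; they are not the names used in this release, and the real iterator may sample differently (for example, without replacement).

```python
import random
from typing import Iterator, List, Sequence


class FixedRandomBatchIterator:
    """Hypothetical sketch: yields `num_batches` random batches per loop."""

    def __init__(
        self,
        sequences: Sequence,
        batch_size: int,
        num_batches: int,
        seed: int = 0,
    ):
        self.sequences = sequences
        self.batch_size = batch_size
        self.num_batches = num_batches
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        # The length is fixed, independent of how many sequences are stored.
        return self.num_batches

    def __iter__(self) -> Iterator[List]:
        for _ in range(self.num_batches):
            # Assumption: sample with replacement, so a batch can always be
            # filled even when few sequences are available.
            yield self._rng.choices(self.sequences, k=self.batch_size)


# Usage: 100 toy sequences, 5 batches of 8 sequences on every loop.
sequences = [[i, i + 1, i + 2] for i in range(100)]
iterator = FixedRandomBatchIterator(sequences, batch_size=8, num_batches=5)
first_pass = list(iterator)
second_pass = list(iterator)
```

Unlike an iterator that sweeps the whole dataset once per epoch, the cost of each loop here depends only on `num_batches` and `batch_size`, not on the dataset size.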