
v0.1.3 -- Tons of CLI logging improvements!

@natolambert natolambert released this 04 Oct 23:34
c8f3fd1

The rewardbench CLI can now be run on any instruction dataset, with fancy logging of scores.
This means rewardbench can be used to quickly throw together a rejection sampling pipeline once given generations.
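To make the rejection sampling idea concrete, here is a minimal sketch of the selection step, assuming you already have several generations per prompt and a reward score for each. The record layout (`prompt`, `text`, `score`) and the `best_of_n` helper are hypothetical names for illustration, not part of rewardbench.

```python
from collections import defaultdict


def best_of_n(records):
    """Keep the highest-scoring generation for each prompt.

    `records` is an iterable of dicts with hypothetical keys
    "prompt", "text", and "score" (not a rewardbench format).
    """
    by_prompt = defaultdict(list)
    for rec in records:
        by_prompt[rec["prompt"]].append(rec)
    # For each prompt, select the generation with the largest reward score.
    return {
        prompt: max(candidates, key=lambda r: r["score"])
        for prompt, candidates in by_prompt.items()
    }


# Made-up scored generations, standing in for the scores a reward model produces.
scored = [
    {"prompt": "Q1", "text": "a", "score": 0.2},
    {"prompt": "Q1", "text": "b", "score": 0.9},
    {"prompt": "Q2", "text": "c", "score": -1.0},
]
chosen = best_of_n(scored)
print({prompt: rec["text"] for prompt, rec in chosen.items()})
```

In practice the scores would come from the logged rewardbench output rather than a hand-written list.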

Specifically, I think this type of logging is really great for evaluation. It's similar to what wandb does for training: when using the CLI, you pass one arg that saves:

  • All the scores, input text, etc. to HuggingFace
  • The command used to launch the eval
  • The current python env for reproducibility
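For a rough idea of what capturing the launch command and Python environment can look like, here is a sketch using only the standard library. This is not rewardbench's actual implementation; `capture_run_metadata` and the example argv are hypothetical, and the real CLI may record this information differently.

```python
import json
import shlex
import sys
from importlib import metadata


def capture_run_metadata(argv=None):
    """Collect the launch command and installed packages (illustrative helper)."""
    argv = list(sys.argv if argv is None else argv)
    # One "name==version" entry per installed distribution, similar to `pip freeze`.
    packages = sorted(
        f"{dist.metadata.get('Name')}=={dist.metadata.get('Version')}"
        for dist in metadata.distributions()
        if dist.metadata.get("Name")
    )
    return {
        "command": shlex.join(argv),  # shell-quoted command line
        "python_version": sys.version.split()[0],
        "packages": packages,
    }


# Hypothetical command line, used only so the example is deterministic.
meta = capture_run_metadata(["my_eval_cli", "--example-flag", "value"])
print(json.dumps(meta, indent=2)[:300])
```

Saving a dictionary like this alongside the scores is what makes an evaluation run reproducible later.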

Examples are in the readme: https://github.com/allenai/reward-bench?tab=readme-ov-file#logging

What's Changed

New Contributors

Full Changelog: v0.1.2...v0.1.3