docs: update documentation
gmickel committed Aug 15, 2024
1 parent 507bb01 commit 5ccbc4c
Showing 2 changed files with 43 additions and 0 deletions.
38 changes: 38 additions & 0 deletions README.md
@@ -19,6 +19,7 @@ AI-Powered End-to-End Task Implementation & blazingly fast Codebase-to-LLM Conte
[Templates](#-templates)
[Configuration](#-configuration)
[API](#-api)
[Benchmarking](#-benchmarking)
[Contributing](#-contributing)
[Roadmap](#-roadmap)
[FAQ](#-faq)
@@ -386,6 +387,43 @@ For more detailed instructions on using the GitHub integration and other CodeWhi

CodeWhisper can be used programmatically in your Node.js projects. For detailed API documentation and examples, please refer to [USAGE.md](USAGE.md).

## 🏋️ Benchmarking

CodeWhisper includes a benchmarking tool that evaluates its performance on Exercism Python exercises, so you can compare the capabilities of different AI models and configurations.

### Key Features

- Docker-based execution for consistent environments
- Concurrent worker support for faster benchmarking
- Detailed Markdown reports with performance metrics
- Options to customize test runs (number of tests, planning mode, diff mode)

### Usage

1. Build the Docker image:

```
./benchmark/docker_build.sh
```

2. Export the API key for the model provider you plan to use as an environment variable.

3. Run the benchmark:

```
./benchmark/run_benchmark.sh --model <model_name> --workers <num_workers> --tests <num_tests> [options]
```
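
As a minimal sketch of steps 2 and 3, the snippet below assembles the benchmark command from the three flags shown above and prints it for review before you run it. The values `my-model`, `2`, and `5` and the helper name `benchmark_cmd` are illustrative assumptions, and the commented-out `YOUR_PROVIDER_API_KEY` is a hypothetical placeholder, since the actual variable name depends on your model provider.

```shell
#!/usr/bin/env bash
# Sketch only: prints the benchmark command instead of executing it.

# Hypothetical placeholder; the real variable name depends on your provider.
# export YOUR_PROVIDER_API_KEY="..."

# Illustrative values (assumptions); substitute your own.
MODEL_NAME="my-model"
NUM_WORKERS=2
NUM_TESTS=5

benchmark_cmd() {
  # Builds the command line from the flags documented in the usage line:
  # $1 = model name, $2 = number of workers, $3 = number of tests.
  printf './benchmark/run_benchmark.sh --model %s --workers %s --tests %s\n' "$1" "$2" "$3"
}

# Print the command; copy it into your terminal once it looks right.
benchmark_cmd "$MODEL_NAME" "$NUM_WORKERS" "$NUM_TESTS"
```

Starting with a small worker and test count keeps token usage low while you confirm that the setup works.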

### Output

The benchmark generates a detailed Markdown report including:

- Summary statistics (total time, cost, pass percentage)
- Per-exercise results (time, cost, mode, model, tests passed)

Reports are saved in `benchmark/reports/` with timestamped filenames.
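
Because each run writes a new timestamped file, it can be handy to locate the newest report quickly. The helper below is a sketch that relies only on file modification times; the function name `latest_report` is an assumption and not part of CodeWhisper.

```shell
#!/usr/bin/env bash
# Sketch only: finds the most recently modified file in a reports directory.

latest_report() {
  # $1 = directory to search (defaults to benchmark/reports).
  # Prints the file name (not the full path) of the newest entry,
  # or nothing if the directory is missing or empty.
  dir="${1:-benchmark/reports}"
  ls -t "$dir" 2>/dev/null | head -n 1
}

latest_report benchmark/reports
```

To open the result, join the printed name with the directory, for example `less "benchmark/reports/$(latest_report)"`.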

For full details on running benchmarks, interpreting results, and available options, please refer to the [Benchmark README](./benchmark/README.md).

## 🤝 Contributing

We welcome contributions to CodeWhisper! Please read our [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and the process for submitting pull requests.
5 changes: 5 additions & 0 deletions benchmark/README.md
@@ -2,6 +2,11 @@

This benchmark tool is designed to evaluate the performance of CodeWhisper on Exercism Python exercises.

## Please note

- Running the full benchmark consumes a significant number of tokens.
- Using too many concurrent workers is likely to trigger rate limiting.

## Usage

1. Build the Docker image: