AI-Powered End-to-End Task Implementation & Blazingly Fast Codebase-to-LLM Context Bridge
CodeWhisper is a powerful tool that bridges the gap between your codebase and Large Language Models (LLMs). It serves two primary functions:
- AI-Powered End-to-End Task Implementation: Tackle complex, codebase-spanning tasks with ease. CodeWhisper doesn't just suggest snippets; it plans, generates, and applies comprehensive code changes across your entire project, from backend logic to frontend integration. In the project's own benchmarks, CodeWhisper's one-shot generations are state-of-the-art (SOTA) and outperform other AI code generation tools. See Benchmarking for more details.
- Precision-Guided Context Curation for LLMs: Harness the power of human insight to feed AI exactly what it needs. Quickly transform carefully selected parts of your codebase into rich, relevant context for LLMs, ensuring more accurate and project-aligned results.
Whether you're implementing comprehensive features, tackling complex refactoring, conducting thorough code reviews, or seeking AI-driven architectural insights, CodeWhisper equips your AI tools with the comprehensive understanding they need. It's not just about coding assistance – it's about enabling AI to be a true collaborator in your software development process.
Connect with fellow users and developers, share insights, discuss features, and get support for leveraging CodeWhisper in your coding workflow by joining our CodeWhisper Discord.
CodeWhisper was born out of a simple yet powerful idea: to provide AI models with meticulously curated context from your entire codebase in the most comfortable way possible. What started as a tool to generate comprehensive and customizable prompts from your codebase has evolved into a full-fledged AI-assisted development workflow solution.
Many AI coding assistants and tools fall short when tackling tasks that demand a comprehensive understanding of your project. They often lack the big-picture context necessary for making informed decisions about your codebase as a whole. CodeWhisper addresses this limitation through its unique manually curated context approach, delivering end-to-end task implementation with a git-first workflow:
- Precision Through Human-Guided Curation: CodeWhisper trusts you to handpick the most relevant parts of your codebase for any given task. This ensures the AI model receives exactly the context it needs, leading to more accurate and comprehensive task implementation.
  Example: For a task to "Implement user authentication":
  - You select core auth components, user models, and key API endpoints.
  - CodeWhisper then generates and applies all necessary code modifications across selected files.
  - The result is a fully implemented feature, from backend logic to frontend integration.
- Project-Specific Knowledge Integration: Manual curation allows you to include non-code context that automated tools might miss, such as architectural decisions or business logic explanations.
  Example: When enhancing your payment system, you can include:
  - Relevant code files
  - Snippets from financial compliance documents
  - Notes on transaction flow architecture
  CodeWhisper uses this rich context to generate compliant, architecturally sound code modifications.
- Noise Reduction, Signal Amplification: By manually curating the context, you eliminate irrelevant information, enabling CodeWhisper to generate more focused and effective code modifications.
  Example: For a UI redesign task, you can exclude backend complexities, allowing CodeWhisper to concentrate on generating precise frontend component updates and style changes.
- Adaptive to Project Evolution: As your project evolves, manual curation ensures CodeWhisper always works with the most up-to-date and relevant information.
  Example: After adopting a new state management library, you can immediately update the context, ensuring CodeWhisper's generated code aligns with your new architecture.
- Seamless Integration of External Knowledge: CodeWhisper's approach allows you to easily incorporate relevant code snippets or documentation from outside your current project.
  Example: When implementing a new API integration, you could include:
  - Your existing API service files
  - Official documentation of the third-party API
  - Example implementations from other projects
  CodeWhisper will then use this context to generate a fully functional integration, handling authentication, data mapping, and error scenarios.
- Git-First Workflow: CodeWhisper automatically creates new branches before applying any code modifications, ensuring a clean and organized development process.
  Example: For a task to "Add user profile management":
  - CodeWhisper creates a new branch (e.g., `feature/user-profile-management`)
  - Generates and applies all necessary code changes within this branch
  - Optionally prepares a commit with a descriptive message
  This approach makes it straightforward to track CodeWhisper's output and review the changes in a dedicated branch.
By leveraging manually curated context and a git-first approach, CodeWhisper transforms from a simple code assistant into a comprehensive task implementation tool. It doesn't just suggest code snippets; it generates, applies, and organizes entire feature implementations. This approach combines the best of both worlds: the vast knowledge and processing power of AI models with the nuanced understanding and decision-making capabilities of experienced developers.
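For orientation, the git-first workflow described above corresponds roughly to the manual git steps sketched below. This is an illustration run against a throwaway repository, not CodeWhisper's internal implementation; the function name `demo_git_first_workflow`, the file `profile.txt`, the commit messages, and the committer identity are all placeholders chosen for the example.

```shell
# Hypothetical demo: reproduces the branch-then-commit flow by hand in a temp repo.
demo_git_first_workflow() (
  set -e
  repo=$(mktemp -d)
  cd "$repo"
  git init -q
  # Placeholder identity so the commits work on a machine without git config.
  git config user.name "Example Dev"
  git config user.email "dev@example.com"
  git commit -q --allow-empty -m "Initial commit"

  # 1. Create a dedicated branch before touching any files.
  git checkout -q -b feature/user-profile-management

  # 2. Apply the changes inside that branch (a placeholder file here).
  echo "profile settings" > profile.txt
  git add profile.txt

  # 3. Commit with a descriptive message.
  git commit -q -m "Add user profile management"

  # Remember which branch holds the change, then clean up the temp repo.
  branch=$(git rev-parse --abbrev-ref HEAD)
  cd /
  rm -rf "$repo"
  echo "$branch"
)

demo_git_first_workflow
```

Because every change lands on its own branch, reviewing or discarding the generated work is an ordinary git operation rather than something tool-specific.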
While CodeWhisper excels at performing individual coding tasks and even large feature implementations, its true power shines in its flexibility to also tackle scenarios that require understanding the big picture:
- Refactoring: Make informed decisions about restructuring your code based on a comprehensive understanding of your project's architecture.
- Architectural Insights: Get AI-driven suggestions for improving your overall code structure and design patterns.
- Code Reviews: Conduct more thorough and context-aware code reviews with AI assistance.
- Documentation: Generate more accurate and comprehensive documentation that takes into account the entire project structure.
Feature | Description |
---|---|
🧠 AI-powered task planning and code generation | Leverage AI to plan and implement complex coding tasks |
🚀 SOTA generations | CodeWhisper's generations are SOTA and outperform other AI-code generation tools in benchmarks, even though it uses one-shot generation. See Benchmarking for more details. |
🔄 Full git integration | Version control of AI-generated changes |
🔄 Diff-based code modifications | Handle larger edits within output token limits |
🌍 Support for various LLM providers | Compatible with Anthropic, OpenAI, Ollama and Groq |
🔐 Support for local models | Use local models via Ollama |
🚀 Blazingly fast code processing | Concurrent workers for improved performance |
🎯 Customizable file filtering and exclusion | Fine-tune which files to include in the context |
📊 Intelligent caching | Improved performance through smart caching |
🔧 Extensible template system | Interactive variable prompts for flexible output |
🖊️ Custom variables in templates | Support for single-line and multi-line custom variables |
💾 Value caching | Quick template reuse with cached values |
🖥️ CLI and programmatic API | Use CodeWhisper in scripts or as a library |
🔒 Respect for .gitignore | Option to use custom include and exclude globs |
🌈 Full language support | Compatible with all text-based file types |
🤖 Interactive mode | Granular file selection and template customization |
⚡ Optimized for large repositories | Efficient processing of extensive codebases |
📝 Detailed logging | Log AI prompts, responses, and parsing results |
🔗 GitHub integration | Fetch and work with issues (see Configuration) |
Both videos feature CodeWhisper using Claude 3.5 Sonnet for the plan and code generation steps.
codegen.mp4
CodeWhisper.mp4
Get started with CodeWhisper in just a few steps:
# Install CodeWhisper globally
npm install -g codewhisper
# Navigate to your project directory
cd /path/to/your/project
# Start an AI-assisted coding task
# CodeWhisper will prompt you to select a model from the list of available models
codewhisper task
# The mode of operation for generating code modifications is set automatically based on the model.
# You can override this by using the --diff or --no-diff option.
codewhisper task --diff
codewhisper task --no-diff
# You can also specify a model directly
# Claude-3.5 Sonnet
codewhisper task -m claude-3-5-sonnet-20240620
# GPT-4o
codewhisper task -m gpt-4o-2024-08-06
# DeepSeek Coder
codewhisper task -m deepseek-coder
# Or use a local Ollama model (not recommended as it will be slow and inaccurate for comprehensive feature implementation tasks)
codewhisper task -m ollama:llama3.1:70b --context-window 131072 --max-tokens 8192
# To undo changes made by an AI-assisted task, use the --undo option
codewhisper task --undo
# To redo the last task with the option to change the model, file selection or plan, use the --redo option.
# Note: CodeWhisper saves the plan, instructions, model and selected files from the last task. Other options (such as --dry-run) need to be specified again.
codewhisper task --redo
# Run the task without the AI-generated planning step
codewhisper task --no-plan
# Automatically accept the AI-generated plan and directly proceed to the code generation step
codewhisper task --accept-plan
# Generate an AI-friendly prompt using interactive mode
codewhisper interactive
# List available models
codewhisper list-models
Note: If you are using CodeWhisper's LLM integration with `codewhisper task`, you will need to set the respective environment variable for the model you want to use (e.g., `export ANTHROPIC_API_KEY=your_api_key`, `export OPENAI_API_KEY=your_api_key`, `export GROQ_API_KEY=your_api_key`, or `export DEEPSEEK_API_KEY=your_api_key`).
For more detailed instructions, see the Installation and Usage sections.
While CodeWhisper supports a variety of providers and models, our current recommendations are based on extensive testing and real-world usage. Here's an overview of the current status:
This section is still under development. We are actively testing and evaluating models.
Model | Provider | Recommendation | Editing Mode | Plan Quality | Code Quality | Edit Precision | Notes |
---|---|---|---|---|---|---|---|
Claude-3.5-Sonnet | Anthropic | Highest | Diff | Excellent | Excellent | High | Generates exceptional quality plans and results |
GPT-4o | OpenAI | Excellent | Diff | Very Good | Good | Medium | Produces high-quality plans and good results, long max output length (16384 tokens) |
GPT-4o-mini | OpenAI | Strong | Diff | Good | Good | Medium | Good quality plans and results, long max output length (16384 tokens) |
GPT-4o-mini | OpenAI | Strong | Whole* | Good | Very Good | High | Improved code quality and precision in whole-file edit mode |
DeepSeek Coder | DeepSeek | Good | Diff | Good | Good | Medium | Good quality plans and results, long max output length (16384 tokens) |
* Whole-file edit mode is generally more precise but may lead to issues with maximum output token length, potentially limiting the ability to process larger files or multiple files simultaneously. It can also result in incomplete outputs for very large files, with the model resorting to placeholders like "// other functions here" instead of providing full implementations.
For more details, see the Benchmarking section.
- Groq as a provider
- We're eager to test Llama 3.1 405B on Groq
- Current rate limits are too restrictive for thorough testing
- Awaiting access to paid plans and the larger Llama 3.1 model for further evaluation
Local models (via Ollama) are currently not recommended for complex tasks. Models tested include:
- Llama 3.1 (8B to 70B variants)
- DeepSeek Coder V2
- Mistral Nemo
- Mistral Large
These models currently struggle to follow instructions accurately for comprehensive task implementation. However, we are actively working on:
- Improving the workflow for smaller local models
- Developing an evaluation pipeline for consistent performance measurement
- Fine-tuning prompts to better suit the capabilities of local models
You can use CodeWhisper without installation using `npx`, or install it globally:
# Using npx (no installation required)
npx codewhisper <command>
# Global installation
npm install -g codewhisper
CodeWhisper offers several commands to cater to different use cases:
Command | Description |
---|---|
`task` | Begin an AI-assisted coding task |
`generate` | Generate a codebase summary |
`interactive` | Start an interactive session for codebase summary generation |
`apply-task <file>` | Apply a previously AI-generated task |
`list-templates` | List available templates |
`list-models` | List available AI models |
`clear-cache` | Clear CodeWhisper's cache |
`export-templates` | Export templates to the current or specified directory |
For detailed usage instructions and examples, please refer to USAGE.md.
To undo changes made by an AI-assisted task, use the `--undo` option with the `task` command:
codewhisper task --undo
This command will:
- Discard uncommitted changes if any
- Delete the AI-generated branch if not on the main branch
- Offer to revert the last commit if on the main branch
The command will always ask for confirmation before making any changes. It will show you the exact actions it's about to perform, including the full commit message of any commit it's about to revert.
Always review the proposed changes carefully before confirming, as this operation may result in loss of work.
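The three cases above can be summarised as a small decision function. This is only a sketch of the documented behaviour, not CodeWhisper's code: the function name `plan_undo`, its arguments, and the printed wording are assumptions made for the example, and the real command also asks for confirmation before doing anything.

```shell
# Hypothetical helper: prints the undo steps that the list above describes.
# $1 = current branch, $2 = main branch name, $3 = "dirty" if there are uncommitted changes
plan_undo() {
  current_branch=$1
  main_branch=$2
  worktree_state=$3

  # Uncommitted changes are discarded regardless of the branch.
  if [ "$worktree_state" = "dirty" ]; then
    echo "discard uncommitted changes"
  fi

  # Off the main branch the AI-generated branch is deleted;
  # on the main branch the last commit is offered for revert instead.
  if [ "$current_branch" != "$main_branch" ]; then
    echo "delete branch $current_branch"
  else
    echo "offer to revert the last commit"
  fi
}

plan_undo feature/user-profile-management main dirty
plan_undo main main clean
```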
CodeWhisper now supports redoing AI-assisted tasks from the review plan stage. This feature allows you to restart your last task with the option to modify the generated plan as well as the model and file selection. To use this feature, use the `--redo` option with the `task` command:
codewhisper task --redo
When you use the `--redo` option:
- CodeWhisper will retrieve the last task data for the current project directory.
- It will display the details of the last task, including the task description, instructions, model used, and files included.
- You'll be given the option to change the AI model for code generation.
- You'll also have the opportunity to modify the file selection for the task.
- The task will then continue from the review plan stage, where you can then modify the plan to your liking.
This feature is particularly useful when:
- You want to try a different AI model for the same task
- You need to adjust the file selection for the same task
- You want to quickly tweak the plan without starting from scratch
Note: The redo functionality uses a cache stored in your home directory, so it persists across different sessions and is not affected by git operations like branch switching or resetting.
Example workflow:
- Run an initial task: `codewhisper task`
- Review the code modifications and decide you want to tweak the plan or try a different model
- Undo the task: `codewhisper task --undo`
- Redo the task: `codewhisper task --redo`
- Optionally change the model when prompted
- Optionally adjust the file selection
- Adjust the plan as needed
This feature enhances the flexibility of CodeWhisper's AI-assisted workflow, allowing for quick iterations and experimentation with different models or scopes for your tasks.
CodeWhisper uses Handlebars templates to generate output. You can use pre-defined templates or create custom ones. For in-depth information on templating, see TEMPLATES.md.
For more details on custom templates, extending CodeWhisper, and integrating with other tools, check CUSTOMIZATION.md.
To use CodeWhisper's LLM integration, you need to set the appropriate environment variable for the model you want to use:
Provider | Environment Variable | Example |
---|---|---|
Anthropic | `ANTHROPIC_API_KEY` | `export ANTHROPIC_API_KEY=your_api_key` |
OpenAI | `OPENAI_API_KEY` | `export OPENAI_API_KEY=your_api_key` |
Groq | `GROQ_API_KEY` | `export GROQ_API_KEY=your_api_key` |
DeepSeek | `DEEPSEEK_API_KEY` | `export DEEPSEEK_API_KEY=your_api_key` |
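A missing key is easier to catch before starting a task than partway through one. The sketch below shows one way to check for it in a wrapper script. The helper names `api_key_var_for` and `has_api_key` are hypothetical and not part of CodeWhisper; only the variable names come from this README.

```shell
# Hypothetical helper: maps a provider name to the environment variable
# this README names for it. Returns non-zero for unknown providers.
api_key_var_for() {
  case "$1" in
    anthropic) echo "ANTHROPIC_API_KEY" ;;
    openai)    echo "OPENAI_API_KEY" ;;
    groq)      echo "GROQ_API_KEY" ;;
    deepseek)  echo "DEEPSEEK_API_KEY" ;;
    *)         return 1 ;;
  esac
}

# Succeeds only when the key for the given provider is set and non-empty.
has_api_key() {
  var_name=$(api_key_var_for "$1") || return 1
  # var_name comes from the fixed list above, so the eval is safe here.
  eval "value=\${$var_name:-}"
  [ -n "$value" ]
}

if ! has_api_key anthropic; then
  echo "Set ANTHROPIC_API_KEY before running: codewhisper task" >&2
fi
```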
To use the GitHub issue integration feature, you need to set the `GITHUB_TOKEN` environment variable with a valid GitHub personal access token.
You can create a fine-grained personal access token in your GitHub account settings. The token needs the following permission:
- "Issues" repository permission with read access
This allows CodeWhisper to list repository issues.
To set up the token:
- Create a fine-grained personal access token following GitHub's documentation.
- Set the environment variable:
export GITHUB_TOKEN=your_github_personal_access_token
To use the GitHub integration in your CodeWhisper tasks:
- Use the `--github-issue` flag to select from open issues in the current repository:
  codewhisper task --github-issue -m claude-3-5-sonnet-20240620
- Use the `--github-issue-filters` flag to filter issues by label, assignee, or other criteria:
  codewhisper task --github-issue --github-issue-filters assignee:abc,label:p1 -m claude-3-5-sonnet-20240620
The `--github-issue-filters` option accepts comma-separated key:value pairs. For a full list of available filter options, refer to the GitHub API documentation.
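To make the filter format concrete, the sketch below splits a filter string into its individual key/value pairs. It illustrates the comma-separated key:value syntax only; the function `parse_issue_filters` is a hypothetical name and is not how CodeWhisper itself parses the option.

```shell
# Illustrative only: turns "assignee:abc,label:p1" into one key=value line per filter.
parse_issue_filters() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=: read -r key value; do
    # Skip empty segments such as those produced by a trailing comma.
    if [ -n "$key" ]; then
      printf '%s=%s\n' "$key" "$value"
    fi
  done
}

parse_issue_filters "assignee:abc,label:p1"
# prints:
# assignee=abc
# label=p1
```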
Note: GitHub's issues endpoint can be used without authentication if only public resources are requested. However, using a token is recommended to avoid rate limiting and to access private repositories.
For more detailed instructions on using the GitHub integration and other CodeWhisper features, please refer to USAGE.md.
CodeWhisper can be used programmatically in your Node.js projects. For detailed API documentation and examples, please refer to USAGE.md.
CodeWhisper includes a benchmarking tool to evaluate its performance on Exercism Python exercises. This tool allows you to assess the capabilities of different AI models and configurations.
- Docker-based execution for consistent environments
- Concurrent worker support for faster benchmarking
- Detailed Markdown reports with performance metrics
- Options to customize test runs (number of tests, planning mode, diff mode)
To run a benchmark:
- Build the Docker image:
  ./benchmark/docker_build.sh
- Set up the appropriate API key as an environment variable.
- Run the benchmark:
  ./benchmark/run_benchmark.sh --model <model_name> --workers <num_workers> --tests <num_tests> [options]
The benchmark generates a detailed Markdown report including:
- Summary statistics (total time, cost, pass percentage)
- Per-exercise results (time, cost, mode, model, tests passed)
Reports are saved in `benchmark/reports/` with timestamped filenames.
CodeWhisper's performance has been evaluated across different models using the Exercism Python exercises. Below is a summary of the benchmark results:
Model | Tests Passed | Time (s) | Cost ($) | Command |
---|---|---|---|---|
claude-3-5-sonnet-20240620 | 80.26% | 1619.49 | 3.4000 | ./benchmark/run_benchmark.sh --workers 5 --no-plan |
gpt-4o-2024-08-06 | 81.51% | 986.68 | 1.6800 | ./benchmark/run_benchmark.sh --workers 5 --no-plan --model gpt-4o-2024-08-06 |
deepseek-coder | 76.98% | 5850.58 | 0.0000* | ./benchmark/run_benchmark.sh --workers 5 --no-plan --model deepseek-coder |
*The cost calculation was not working properly for this benchmark run.
Note: All benchmarks are one-shot only, unlike other benchmarks which use multiple generations that depend on the results of the test run.
The full reports used to generate these results are available in the `benchmark/reports/` directory.
These results provide insights into the efficiency and accuracy of different models when used with CodeWhisper. The "Tests Passed" percentage indicates the proportion of Exercism tests successfully completed, while the time and cost metrics offer a view of the resource requirements for each model.
As we continue to run benchmarks with various models and configurations, this table will be updated to provide a comprehensive comparison, helping users make informed decisions about which model might best suit their needs.
For full details on running benchmarks, interpreting results, and available options, please refer to the Benchmark README.
We welcome contributions to CodeWhisper! Please read our CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
- Add AI-assisted task creation and code generation
- Add GitHub integration for fetching issues and pull requests
- Add other integrations for fetching issues and pull requests (GitLab, Jira, Linear, etc.)
- Finish OpenAI and Groq support
- Add support for other LLMs
- Add support for local models via Ollama
- Experiment with partial file modifications
- Experiment with generateObject with a fixed schema
- Run evaluations on generated code
- Possibly add agentic behaviors
Question | Answer |
---|---|
How does CodeWhisper handle large codebases? | CodeWhisper uses concurrent workers and intelligent caching for optimal performance with large repositories. For very large projects, use specific file filters or interactive mode to focus on relevant parts. |
Can I use custom templates? | Yes, you can create custom Handlebars templates and use them with the --custom-template option or by placing them in the templates/ directory. See TEMPLATES.md for more information. |
Does CodeWhisper support languages other than JavaScript/TypeScript? | Yes, CodeWhisper supports all text-based file types and has language detection for a wide range of programming languages. |
How can I use CodeWhisper in my CI/CD pipeline? | CodeWhisper can be integrated into CI/CD pipelines. Install it as a dependency and use the CLI or API in your scripts. You can generate code summaries for pull requests or create documentation automatically on each release. See USAGE.md for CI/CD integration examples. |
Can I use CodeWhisper with other AI tools or language models? | Yes, CodeWhisper generates code summaries that can be used as input for various AI tools and language models. You can pipe the output to any AI tool or LLM of your choice. |
How does CodeWhisper handle sensitive information in the code? | CodeWhisper respects .gitignore files by default, helping to exclude sensitive files. Always review generated summaries before sharing, especially with confidential codebases. |
MIT License © 2024-PRESENT Gordon Mickel
- Handlebars for templating
- Commander.js for CLI support
- fast-glob for file matching
- Inquirer.js for interactive prompts
- Vercel AI SDK for the great AI SDK
Gordon Mickel - @gmickel - gordon@mickel.tech
Project Link: https://github.com/gmickel/CodeWhisper
⭐ If you find CodeWhisper useful, please consider giving it a star on GitHub to show your support! ⭐
Made with ❤️ by Gordon Mickel.