This repository contains an implementation of the grep tool built in C++ as part of the Codecrafters "Build Your Own Grep" Challenge. The project focuses on building a regular expression (Regex) engine from scratch, capable of searching through text with support for various Regex patterns and features.
This project is an attempt to implement grep, a command-line utility for searching text using regular expressions, in C++. Along the way, it digs into regex syntax, character classes, quantifiers, and more. By the end of the challenge, you will have a deeper understanding of how regex works under the hood and be able to search and match patterns in text just like the real grep tool.
- Matching literal characters
- Supporting various regex features such as character classes, anchors, quantifiers, and more
- Backreference support
- Regex alternation
- Wildcard matches
The project is developed using:
- C++: The core language for implementing our custom grep functionality.
- POSIX standard: For guiding the implementation to match grep's expected behavior.
To get started with building and running your own grep implementation, follow the instructions below.
Make sure you have the following installed:
- GCC or any other C++ compiler.
- Make (optional, for build automation).
- Clone the repository:
  git clone https://github.com/your_username/grep_cpp.git
- Navigate to the project directory:
  cd grep_cpp
- Compile the project:
  g++ -o grep_cpp main.cpp
- Run the program with a test file:
  ./grep_cpp "pattern" test_file.txt
The most basic functionality of grep is to search for a literal string within a text.
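As a rough illustration of literal matching, the sketch below checks whether a pattern occurs anywhere in a line. The function name match_literal is hypothetical and is not taken from this repository's source; it only shows one simple way the idea could be expressed.

```cpp
#include <string>

// Hypothetical helper (not from this repository): returns true when
// `pattern` occurs as a contiguous substring anywhere in `line`.
bool match_literal(const std::string& line, const std::string& pattern) {
    return line.find(pattern) != std::string::npos;
}
```

With this sketch, match_literal("hello world", "lo w") returns true and match_literal("hello", "xyz") returns false.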
Our implementation supports matching digits (e.g., \d).
Match alphanumeric characters using character classes.
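One possible way to test a single character against these classes is sketched below. The helper names are hypothetical, and the sketch assumes that \d means an ASCII digit and that the alphanumeric class is written \w and also accepts an underscore; the actual project may define these differently.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: does `c` belong to the class named by `cls`?
// Assumption: 'd' stands for \d (digits) and 'w' stands for \w
// (letters, digits, and underscore).
bool matches_class(char cls, char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    switch (cls) {
        case 'd': return std::isdigit(uc) != 0;
        case 'w': return std::isalnum(uc) != 0 || c == '_';
        default:  return false;  // unknown class: no match
    }
}

// True if at least one character of `line` belongs to the class.
bool contains_class(const std::string& line, char cls) {
    for (char c : line) {
        if (matches_class(cls, c)) return true;
    }
    return false;
}
```

For example, contains_class("apple123", 'd') is true, while contains_class("apple", 'd') is false.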
We can match specific groups of characters like [a-zA-Z].
Supports negation in character groups, like [^a-z].
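The following is a minimal sketch of how a character group could be evaluated for one character. It assumes the caller has already extracted the text between the brackets (for example "a-zA-Z" or "^a-z"), and the name matches_group is hypothetical. Escapes inside the group are not handled.

```cpp
#include <string>

// Hypothetical helper: `group` is the text between '[' and ']'.
// A leading '^' negates the group; "x-y" denotes an inclusive range.
bool matches_group(const std::string& group, char c) {
    std::size_t i = 0;
    bool negated = false;
    if (!group.empty() && group[0] == '^') {
        negated = true;
        i = 1;
    }
    bool found = false;
    while (i < group.size()) {
        if (i + 2 < group.size() && group[i + 1] == '-') {
            // Range such as a-z: check lo <= c <= hi.
            if (group[i] <= c && c <= group[i + 2]) found = true;
            i += 3;
        } else {
            // Single literal member of the group.
            if (group[i] == c) found = true;
            i += 1;
        }
    }
    return found != negated;  // flip the result for negated groups
}
```

Under these assumptions, matches_group("a-zA-Z", 'Q') is true and matches_group("^a-z", 'b') is false.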
You can combine character classes and patterns for complex matching.
Use the ^ (start of string) and $ (end of string) anchors to match text at the beginning or end of a string.
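To make the anchor behavior concrete, here is a small sketch restricted to literal pattern bodies. The function match_anchored is a hypothetical name, and treating every character other than a leading ^ and a trailing $ as a literal is a simplifying assumption.

```cpp
#include <string>

// Hypothetical helper: handles an optional leading '^' and trailing '$'
// around a purely literal pattern body.
bool match_anchored(const std::string& line, std::string pattern) {
    bool at_start = !pattern.empty() && pattern.front() == '^';
    if (at_start) pattern.erase(0, 1);
    bool at_end = !pattern.empty() && pattern.back() == '$';
    if (at_end) pattern.pop_back();

    if (at_start && at_end) return line == pattern;
    if (at_start) {
        // The line must begin with the pattern.
        return line.compare(0, pattern.size(), pattern) == 0;
    }
    if (at_end) {
        // The line must end with the pattern.
        return line.size() >= pattern.size() &&
               line.compare(line.size() - pattern.size(),
                            pattern.size(), pattern) == 0;
    }
    // No anchors: plain substring search.
    return line.find(pattern) != std::string::npos;
}
```

With this sketch, match_anchored("a log", "log$") is true, while match_anchored("a log", "^log") is false.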
- +: Match one or more times
- ?: Match zero or one times
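The two quantifiers can be sketched with a small recursive backtracking matcher. This is an illustration only: the names match_here and search_quantified are hypothetical, and the sketch assumes the pattern contains nothing but literal characters, each optionally followed by + or ?.

```cpp
#include <string>

// Hypothetical sketch: does pat[pi..] match text starting exactly at ti?
bool match_here(const std::string& pat, std::size_t pi,
                const std::string& text, std::size_t ti) {
    if (pi == pat.size()) return true;  // whole pattern consumed
    char c = pat[pi];
    char q = (pi + 1 < pat.size()) ? pat[pi + 1] : '\0';

    if (q == '+') {
        // One or more: consume a repetition, then try the rest.
        // Shorter repetitions are tried first; for a yes/no answer
        // the order does not change the result.
        while (ti < text.size() && text[ti] == c) {
            ++ti;
            if (match_here(pat, pi + 2, text, ti)) return true;
        }
        return false;
    }
    if (q == '?') {
        // Zero or one: try consuming the character, then try skipping it.
        if (ti < text.size() && text[ti] == c &&
            match_here(pat, pi + 2, text, ti + 1)) {
            return true;
        }
        return match_here(pat, pi + 2, text, ti);
    }
    // Plain literal character.
    return ti < text.size() && text[ti] == c &&
           match_here(pat, pi + 1, text, ti + 1);
}

// Try every starting offset, as an unanchored search would.
bool search_quantified(const std::string& pat, const std::string& text) {
    for (std::size_t start = 0; start <= text.size(); ++start) {
        if (match_here(pat, 0, text, start)) return true;
    }
    return false;
}
```

For instance, search_quantified("ca+t", "caaat") is true, search_quantified("ca+t", "ct") is false, and search_quantified("dogs?", "dog") is true.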
Supports . wildcard to match any single character.
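A wildcard can be illustrated with a fixed-length comparison in which a . in the pattern accepts whatever character sits at that position. The function names below are hypothetical, and the sketch assumes the pattern holds only literals and . (no quantifiers or escapes).

```cpp
#include <string>

// Hypothetical helper: does `pat` match `text` starting at `start`?
// A '.' in the pattern matches any single character.
bool match_wildcard_at(const std::string& pat, const std::string& text,
                       std::size_t start) {
    if (start + pat.size() > text.size()) return false;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        if (pat[i] != '.' && pat[i] != text[start + i]) return false;
    }
    return true;
}

// Slide the pattern across the text and report the first success.
bool search_wildcard(const std::string& pat, const std::string& text) {
    for (std::size_t s = 0; s + pat.size() <= text.size(); ++s) {
        if (match_wildcard_at(pat, text, s)) return true;
    }
    return false;
}
```

Here search_wildcard("d.g", "dog") is true, while search_wildcard("d.g", "dg") is false because . must consume exactly one character.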
Supports the | operator to match alternative patterns (e.g., abc|xyz).
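One simple way to think about alternation is to split the pattern on | and accept the line if any alternative matches. The sketch below does this for literal alternatives only; split_alternatives and match_alternation are hypothetical names, and grouping with parentheses or escaped | characters is deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split "abc|xyz" into {"abc", "xyz"}.
// Assumption: '|' never appears escaped or inside a group.
std::vector<std::string> split_alternatives(const std::string& pattern) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : pattern) {
        if (c == '|') {
            parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    parts.push_back(current);  // last (or only) alternative
    return parts;
}

// The line matches if any literal alternative occurs in it.
bool match_alternation(const std::string& line, const std::string& pattern) {
    for (const std::string& alt : split_alternatives(pattern)) {
        if (line.find(alt) != std::string::npos) return true;
    }
    return false;
}
```

With the example pattern above, match_alternation("xyz123", "abc|xyz") is true and match_alternation("ab yz", "abc|xyz") is false.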
- Single Backreference: Allows matching repeated patterns.
- Multiple Backreferences: Matches more complex repeated groups.
- Nested Backreferences: Supports matching nested patterns using backreferences.
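The core step behind a backreference can be sketched as follows: once group N has captured some text, \N is matched by comparing that captured text literally at the current position. This is an assumption about one reasonable design, not a description of this repository's code; match_backreference and the captures vector are hypothetical names, and how groups get captured in the first place is outside the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: captures[n - 1] holds the text captured by group n.
// On success, `pos` advances past the repeated text; on failure it is
// left unchanged so the caller can backtrack.
bool match_backreference(const std::string& text, std::size_t& pos,
                         const std::vector<std::string>& captures,
                         std::size_t n) {
    if (n < 1 || n > captures.size()) return false;  // no such group
    if (pos > text.size()) return false;
    const std::string& cap = captures[n - 1];
    if (text.compare(pos, cap.size(), cap) != 0) return false;
    pos += cap.size();
    return true;
}
```

For a pattern like (cat) and \1 matched against "cat and cat", group 1 would hold "cat" by the time \1 is reached at position 8; the helper then succeeds and moves the position to 11. Against "cat and dog" it fails and leaves the position at 8.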
- Match literal characters
- Implement digit matching and alphanumeric character groups
- Add support for character groups (positive and negative)
- Implement anchors (start/end of string)
- Add quantifiers and alternation support
- Implement backreferences
- Optimize regex matching algorithm for performance
Contributions are welcome! If you'd like to contribute to the project, feel free to fork the repo and submit a pull request.
- Fork the project.
- Create your feature branch (git checkout -b feature/newFeature).
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature/newFeature).
- Open a pull request.
Distributed under the MIT License. See LICENSE for more information.
Your Name - @Linkedin - obidur.shawal@gmail.com
Project Link: https://github.com/Ashfinn/grepper