- Create a conda environment:
conda create --name env-name ipykernel gitpython
- Clone the repository from GitHub, for example with GitPython:
from git import Repo
Repo.clone_from("https://github.com/ihamdi/Dogs-vs-Cats-Classification.git","/your/directory/")
Alternatively, download and extract a copy of the files.
- Install the PyTorch build that matches your machine (operating system and CUDA version). For example, on my machine:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
- Install the dependencies from the requirements.txt file:
pip install -r requirements.txt
- Download the data:
The code is designed to download the data through the Kaggle API and extract it automatically. If you haven't used the Kaggle API before, please look at the section at the bottom on how to download your API key.
Alternatively, you can download the data from the official Dogs vs. Cats competition page and extract "train.zip" to the train folder.
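The manual route above amounts to unpacking the archive into the train folder. A minimal sketch using only the Python standard library is shown here; the function name `extract_train_zip` and the paths are illustrative and not taken from the notebook:

```python
import zipfile
from pathlib import Path


def extract_train_zip(zip_path, target_dir="train"):
    """Extract an archive such as train.zip into target_dir.

    Returns the paths of the extracted files (directories are skipped).
    The name and signature are hypothetical, chosen for this sketch.
    """
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # Directory entries in a zip end with "/", so filter them out.
    return [target / name for name in names if not name.endswith("/")]
```

Calling `extract_train_zip("train.zip", "/your/directory/train")` would then leave the images under the chosen folder.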
Open the pytorch-cat-vs-dog.ipynb Jupyter notebook and click Run All. After the necessary libraries are imported, you will be asked to input the following:
- Number of epochs
- Dropout rate
- Batch size
- Number of workers
- Learning rate
- Local path (to download data)
- Amount of the dataset to use
- Ratio for splitting the dataset (into training : validation : testing)
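To illustrate how the last two inputs could interact, here is a short sketch. It assumes the amount is entered as a fraction between 0 and 1 and the ratio as a string such as `70:20:10`; both formats and the function name `split_indices` are assumptions made for illustration, not the notebook's actual interface.

```python
import random


def split_indices(n_samples, fraction=1.0, ratio="70:20:10", seed=0):
    """Return (train, val, test) index lists for a shuffled subset of the data.

    Hypothetical helper: the input formats are assumptions of this sketch.
    """
    parts = [float(p) for p in ratio.split(":")]
    if len(parts) != 3 or min(parts) < 0 or sum(parts) == 0:
        raise ValueError("ratio must look like 'train:val:test', e.g. '70:20:10'")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Shuffle all indices reproducibly, then keep only the requested fraction.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    indices = indices[: round(n_samples * fraction)]

    # Size the training and validation parts; the test part takes the remainder.
    total = sum(parts)
    n_train = int(len(indices) * parts[0] / total)
    n_val = int(len(indices) * parts[1] / total)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test
```

For a dataset of 1,000 images, `split_indices(1000, 0.5, "8:1:1")` keeps 500 images and divides them into 400 for training, 50 for validation and 50 for testing.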
Afterwards, the notebook trains the model, printing a summary after each epoch and plotting the training and validation losses and accuracies.
DenseNet121 seems to be quite powerful for this task. Even with 20% dropout, the model accuracy passes 90% by the 3rd epoch, after which the model starts overfitting.
Figures: Training Loss & Accuracy (left), Validation Loss & Accuracy (right).
This was created purely to gain hands-on experience with Python and PyTorch. Only the training data is used, and no submission is made to the competition.
For any questions or feedback, please feel free to post comments/issues or contact me at ibraheem.hamdi@mbzuai.ac.ae
"[pytorch] cat vs dog" was used as the base for this code.
Dogs vs. Cats competition on Kaggle.
DenseNet paper ("Densely Connected Convolutional Networks") by Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger.