This git repository contains code and configurations for implementing a Convolutional Neural Network (CNN) to classify images of cats and dogs. The data was sourced from the dogs-vs-cats Kaggle competition and from freeimages.com using a web scraper. Docker containers were used to deploy the application on EC2 spot instances in order to scale up hardware and computation power.
- The aws subdirectory contains batch and shell scripts for configuring ec2 spot instances and deploying the docker container remotely.
- The conda subdirectory contains batch and shell scripts for creating a local conda environment for the project.
- The data_prep subdirectory contains python utility scripts for data cleansing and processing for modelling.
- The kaggle subdirectory contains python scripts for downloading and unzipping competition data from Kaggle.
- The model subdirectory contains python scripts for initiating and training CNN models.
- The ref subdirectory contains previous analyses and kernels on dogs vs cats classification from Kaggle community members.
- The report subdirectory contains reportable images and plots generated by the application.
- The webscrapers subdirectory contains webscraping tools for downloading cats and dogs images from freeimages.com.
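
As an illustration of the kind of processing the data_prep utilities might perform, here is a minimal sketch that derives class labels from image file names. It assumes the Kaggle training naming convention (`cat.0.jpg`, `dog.0.jpg`); the function names `label_from_filename` and `build_label_table` are hypothetical and are not taken from this repository.

```python
from pathlib import Path


def label_from_filename(path):
    """Return 'cat' or 'dog' from a file name such as 'cat.0.jpg'.

    Assumption: files follow the '<label>.<index>.jpg' pattern used by
    the Kaggle dogs-vs-cats training set.
    """
    label = Path(path).name.split(".")[0].lower()
    if label not in {"cat", "dog"}:
        raise ValueError(f"unexpected file name: {path}")
    return label


def build_label_table(paths):
    """Pair each path with a binary target: 0 for cat, 1 for dog."""
    target = {"cat": 0, "dog": 1}
    return [(str(p), target[label_from_filename(p)]) for p in paths]
```

Scraped images are unlikely to follow this naming pattern, so a real pipeline would probably need a separate labelling rule for them.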
The main dog and cat image classification application is contained within the root scripts:
- The 01_prg_kaggle_data.py script downloads and unzips the dogs vs cats competition data.
- The 02_prg_scrape_imgs.py script scrapes additional cat and dog images from freeimages.com.
- The 03_prg_keras_model.py script trains a CNN model and makes predictions on the cat and dog images.
- The analysis_results.ipynb file contains a high level summary of the analysis results.
- The cons.py script contains programme constants and configurations.
- The Dockerfile builds the application container for deployment on ec2.
- The exeDocker.bat script executes the Docker build process locally on Windows.
- The requirements.txt file contains the python package dependencies for the application.
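
The numbered prefixes suggest the three root scripts are meant to be run in sequence. A minimal sketch of such a runner follows; it assumes the scripts take no command-line arguments and are run from the repository root, neither of which is confirmed here. The names `PIPELINE`, `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script names come from the list above; running them in numeric order
# with no arguments is an assumption.
PIPELINE = [
    "01_prg_kaggle_data.py",
    "02_prg_scrape_imgs.py",
    "03_prg_keras_model.py",
]


def pipeline_commands(python=sys.executable):
    """Build one command per script, in pipeline order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in pipeline_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing script
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` would execute them.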
See the analysis results notebook for a summary of the project, including image processing, CNN architecture and model performance.
The application docker container is available on dockerhub here:
https://hub.docker.com/repository/docker/oislen/cat-classifier