
0xd3ba/deep-vision


Deep Vision

Computer vision research papers implemented from scratch in PyTorch.

  • /arch/* contains various CNN architectures
  • /semantic_segmentation/* contains CNN models for semantic segmentation
  • /gan/* contains models for generative adversarial networks

The following papers are currently implemented:

Architectures

Model | Year | Paper
LeNet-5 | 1998 | Gradient-Based Learning Applied to Document Recognition
AlexNet | 2012 | ImageNet Classification with Deep Convolutional Neural Networks
ZFNet | 2013 | Visualizing and Understanding Convolutional Networks
VGG-16 | 2014 | Very Deep Convolutional Networks for Large-Scale Image Recognition
GoogLeNet | 2014 | Going Deeper with Convolutions
ResNet | 2015 | Deep Residual Learning for Image Recognition
Inception-v2 | 2015 | Rethinking the Inception Architecture for Computer Vision
Xception | 2016 | Xception: Deep Learning with Depthwise Separable Convolutions
MobileNet | 2017 | MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Semantic Segmentation

Model | Year | Paper
FCN | 2014 | Fully Convolutional Networks for Semantic Segmentation
SegNet | 2015 | SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
U-Net | 2015 | U-Net: Convolutional Networks for Biomedical Image Segmentation
PSPNet | 2016 | Pyramid Scene Parsing Network
ENet | 2016 | ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
ICNet | 2017 | ICNet for Real-Time Semantic Segmentation on High-Resolution Images

Generative Adversarial Networks (GANs)

Model | Year | Paper
GAN | 2014 | Generative Adversarial Networks
DCGAN | 2015 | Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Pix2Pix | 2016 | Image-to-Image Translation with Conditional Adversarial Networks
WGAN | 2017 | Wasserstein GAN

Training / Configuring the Models

The only dependencies are:

  • torch (v1.9.0 or similar)
  • numpy (v1.20.1 or similar)
  • tqdm
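
To confirm that the packages listed above are installed, a small check like the following may help. It is only a sketch: the `EXPECTED` table and the `check_dependencies` helper are hypothetical names, not part of this repository, and the version prefixes are treated as hints because the list says "or similar".

```python
from importlib import metadata

# Packages from the dependency list; the prefixes are hints, not hard pins
# (an assumption based on the "or similar" wording).
EXPECTED = {"torch": "1.9", "numpy": "1.20", "tqdm": None}


def check_dependencies(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_hint)}."""
    report = {}
    for name, prefix in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            report[name] = (None, False)
            continue
        matches = (
            prefix is None
            or installed == prefix
            or installed.startswith(prefix + ".")
        )
        report[name] = (installed, matches)
    return report


if __name__ == "__main__":
    for package, (version, matches) in check_dependencies().items():
        if version is None:
            status = "missing"
        elif matches:
            status = "ok"
        else:
            status = "different version"
        print(f"{package}: {version} ({status})")
```

A "different version" result is not necessarily a problem; it only means the installed version does not start with the version named in the list.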

NOTE #1: Many of the models use external datasets, some of which are not included in this repository because of their size. Check the /[model_class]/[model]/dataset/ directory and read the *.py file there for information about the dataset to use. Alternatively, you can use any relevant dataset of your choice, but it is up to you to write the dataset class and change the required parameters (more on this below).
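
If you bring your own dataset, the dataset class could look roughly like the sketch below. It assumes a map-style PyTorch dataset that keeps everything in memory. The class name `InMemoryImageDataset`, the tensor shapes, and the random stand-in data are all hypothetical, and the dataset classes in this repository may be structured differently.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class InMemoryImageDataset(Dataset):
    """Hypothetical map-style dataset holding images and labels in memory."""

    def __init__(self, images, labels, transform=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform  # optional callable applied per image

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[index]


# Random stand-in data: 8 RGB images of 32x32 pixels, 10 possible classes.
images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

dataset = InMemoryImageDataset(images, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_images, batch_labels = next(iter(loader))
print(batch_images.shape, batch_labels.shape)
```

For a dataset that does not fit in memory, `__getitem__` would typically load one sample from disk instead of indexing a tensor.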

Follow these steps to train a model:

  • Change to the relevant model directory: $ cd /[model_class]/[model]/
  • Download or replace the dataset, if necessary, and place its contents in the dataset/ directory.
  • Update main.py and/or train.py (necessary if using a different dataset)
  • Execute main.py: $ python3 main.py
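
When adapting main.py or train.py to a different dataset, it can help to keep the general shape of a supervised PyTorch training loop in mind. The sketch below is not the repository's actual code: the `train` function, its parameters, the toy linear model, and the random data are all assumptions for illustration. It covers a classification setup only; the GAN models need a different loop.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model, loader, epochs=2, lr=0.01, device="cpu"):
    """Run a plain supervised loop and return the mean loss of each epoch."""
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    history = []
    for _ in range(epochs):
        model.train()
        total_loss, seen = 0.0, 0
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Weight the batch loss by batch size to get a per-sample mean.
            total_loss += loss.item() * inputs.size(0)
            seen += inputs.size(0)
        history.append(total_loss / seen)
    return history


# Stand-in data and model (hypothetical): 16 grayscale 28x28 images, 10 classes.
torch.manual_seed(0)
data = TensorDataset(torch.rand(16, 1, 28, 28), torch.randint(0, 10, (16,)))
loader = DataLoader(data, batch_size=8)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

losses = train(model, loader)
print(losses)
```

Swapping in another dataset mostly means changing what `loader` yields and making sure the model's input and output sizes match it.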

Disclaimer

Some of the implementations differ slightly from the configurations described in the corresponding papers (especially the older papers). Wherever such differences exist, they are noted in the code comments.
