ImageNet: VGGNet, ResNet, Inception, and Xception with Keras

June 1, 2018 • 2-minute read

Notes • DeepLearning

ImageNet • Inception • VGG • Xception • Keras • DeepLearning • CNNs

Key concepts:

CNN: Convolutional Neural Network
ImageNet dataset
ImageNet Large Scale Visual Recognition Challenge (ILSVRC): train a model that correctly classifies an input image into 1,000 separate object categories. The ImageNet challenge is the de facto benchmark for computer vision classification algorithms
Weight serialization
Weights of a neural network
Pooling
Softmax
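
Two of the concepts listed above, softmax and pooling, are small enough to sketch in plain Python. This is only an illustration on toy data (the scores and the 4×4 grid are made up), not how Keras implements either operation.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow and does not change the result.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 on a 2D list with even height and width."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]


probs = softmax([2.0, 1.0, 0.1])
pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 1],
                       [0, 0, 5, 6],
                       [1, 2, 7, 8]])
print(probs)   # three probabilities, largest first
print(pooled)  # [[4, 2], [2, 8]]
```

In an ImageNet classifier the softmax runs over 1,000 scores, one per object category.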

Keras ships with the VGG16, VGG19, ResNet50, Inception V3, and Xception network models (see the keras.applications submodule)
ImageNet: a dataset of images manually labeled into roughly 22,000 object categories
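
A minimal sketch of picking one of the bundled networks by name. The short names and the load_pretrained helper are hypothetical, introduced here for illustration; the class names are the ones exposed by keras.applications. Keras is imported lazily, so the lookup table itself works without it installed.

```python
import importlib

# Short names (hypothetical) mapped to class names in keras.applications.
MODEL_CLASSES = {
    "vgg16": "VGG16",
    "vgg19": "VGG19",
    "resnet50": "ResNet50",
    "inception_v3": "InceptionV3",
    "xception": "Xception",
}


def load_pretrained(name, weights="imagenet"):
    """Instantiate one of the bundled networks with ImageNet weights."""
    if name not in MODEL_CLASSES:
        raise ValueError(
            f"unknown model {name!r}; choose from {sorted(MODEL_CLASSES)}"
        )
    # Imported here so that the table above is usable without Keras.
    applications = importlib.import_module("keras.applications")
    model_class = getattr(applications, MODEL_CLASSES[name])
    return model_class(weights=weights)
```

With Keras installed, load_pretrained("resnet50").summary() should print the layer list. Depending on the installation, the package may need to be imported as tensorflow.keras.applications instead.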

arXiv:
- https://arxiv.org/abs/1409.1556
The VGG network architecture was introduced by Simonyan and Zisserman in their 2014 paper, Very Deep Convolutional Networks for Large-Scale Image Recognition.
Uses only 3×3 convolutional layers stacked on top of each other in increasing depth
The “16” and “19” stand for the number of weight layers in the network
The smaller networks were trained to convergence first and then used as initializations for the larger, deeper networks; this process is called pre-training.
Slow to train
Very large serialized weights compared to the other networks, which costs disk space and download bandwidth
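
The "16" and "19", and the size of the weights, can be checked with a little arithmetic. The sketch below assumes the layer layout of configurations D and E from the VGG paper (that layout is not part of these notes) and counts weight layers and parameters for a 224×224 RGB input.

```python
# Convolutional stages of VGG16 and VGG19 (configurations D and E, assumed).
# A number is the output channel count of a 3x3 convolution; "M" is 2x2 max pooling.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_UNITS = [4096, 4096, 1000]  # three fully connected layers, 1,000 classes


def describe(config, input_size=224, in_channels=3):
    """Return (weight_layers, parameters) for a VGG-style configuration."""
    layers, params, size, channels = 0, 0, input_size, in_channels
    for item in config:
        if item == "M":
            size //= 2  # pooling halves height and width and has no weights
        else:
            params += 3 * 3 * channels * item + item  # kernel weights + biases
            channels = item
            layers += 1
    features = size * size * channels  # flattened input to the first FC layer
    for units in FC_UNITS:
        params += features * units + units
        features = units
        layers += 1
    return layers, params


for name, config in [("VGG16", VGG16), ("VGG19", VGG19)]:
    layers, params = describe(config)
    print(f"{name}: {layers} weight layers, {params:,} parameters, "
          f"about {params * 4 / 2**20:.0f} MiB as float32")
```

Most of the parameters sit in the first fully connected layer, which is why the serialized weights are so large.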

arXiv:
- https://arxiv.org/abs/1512.03385
- https://arxiv.org/abs/1603.05027
ResNet was first introduced by He et al. in their 2015 paper, Deep Residual Learning for Image Recognition
Relies on micro-architecture modules: building blocks used to construct the network
ResNet50: 50 weight layers, based on the 2015 paper
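
The building block behind residual learning computes F(x) + x: the layers learn a correction F that is added back to the block's input through an identity shortcut. The toy sketch below uses small dense layers in place of the convolutions and batch normalization of the real network, so it only shows the shortcut itself.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def dense(values, weights):
    """Tiny fully connected layer: one row of weights per output unit."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two layers with a ReLU in between."""
    fx = dense(relu(dense(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])  # identity shortcut


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
# With all-zero weights F(x) is zero, so the block passes x through unchanged.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```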

arXiv:
- https://arxiv.org/abs/1409.4842
- https://arxiv.org/abs/1512.00567
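The Inception micro-architecture was first introduced by Szegedy et al. in their 2014 paper, Going Deeper with Convolutions
The goal of the inception module is to act as a “multi-level feature extractor” by computing 1×1, 3×3, and 5×5 convolutions within the same module of the network
The original architecture was called GoogLeNet; later versions are named Inception vN, where N is the version number assigned by Google
Keras includes Inception V3, from the 2015 paper Rethinking the Inception Architecture for Computer Vision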
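
The parallel 1×1, 3×3, and 5×5 branches are stacked along the channel axis, which only works if every branch keeps the same height and width. The shape arithmetic below is a rough sketch: the filter counts are hypothetical, and the pooling branch and 1×1 reductions of the real module are left out.

```python
def conv_output_size(size, kernel, stride=1, padding="same"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid" padding


def inception_output_shape(height, width, branch_filters):
    """Shape after concatenating parallel branches along the channel axis.

    branch_filters maps a kernel size (1, 3, 5) to that branch's filter count.
    """
    shapes = {
        (conv_output_size(height, k), conv_output_size(width, k))
        for k in branch_filters
    }
    # "same" padding keeps all branches at one spatial size, which is
    # what makes the channel-wise concatenation possible.
    assert len(shapes) == 1
    out_h, out_w = shapes.pop()
    return out_h, out_w, sum(branch_filters.values())


# Hypothetical filter counts, chosen only for illustration.
print(inception_output_shape(28, 28, {1: 64, 3: 128, 5: 32}))  # (28, 28, 224)
```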

arXiv:
- https://arxiv.org/abs/1610.02357
Xception was proposed by François Chollet in the paper Xception: Deep Learning with Depthwise Separable Convolutions
Smallest weight serialization of the networks listed here
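
A depthwise separable convolution splits a standard convolution into a per-channel spatial filter followed by a 1×1 convolution that mixes channels. The parameter count below shows why that is cheaper; the layer sizes are made up for the example and biases are ignored.

```python
def standard_conv_params(kernel, in_ch, out_ch):
    """Weights in a regular k x k convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


def separable_conv_params(kernel, in_ch, out_ch):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * in_ch  # one k x k filter per input channel
    pointwise = in_ch * out_ch           # 1x1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 128 input channels, 256 output channels.
std = standard_conv_params(3, 128, 256)
sep = separable_conv_params(3, 128, 256)
print(std, sep, round(std / sep, 1))  # 294912 33920 8.7
```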

Typical input image sizes for networks trained on ImageNet are 224×224, 227×227, 256×256, and 299×299; however, you may see other dimensions as well.
VGG16: 224×224
VGG19: 224×224
ResNet50: 224×224
Inception V3: 299×299
Xception: 299×299
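
The sizes above matter when preparing an image for prediction. The sketch below keeps them in a table and resizes an image to match the chosen network. The helper names are hypothetical, and it assumes a Keras release that exposes load_img and img_to_array under keras.utils (older releases have them under keras.preprocessing.image).

```python
import importlib

# Input sizes from the notes above, keyed by keras.applications module name.
INPUT_SIZES = {
    "vgg16": (224, 224),
    "vgg19": (224, 224),
    "resnet50": (224, 224),
    "inception_v3": (299, 299),
    "xception": (299, 299),
}


def input_shape(model_name, channels=3):
    """Return (height, width, channels) expected by the named network."""
    height, width = INPUT_SIZES[model_name]
    return height, width, channels


def load_image_batch(path, model_name):
    """Load one image as a preprocessed batch of shape (1, height, width, 3)."""
    # Imported lazily: this part needs NumPy, Pillow, and Keras installed.
    import numpy as np
    from keras.utils import img_to_array, load_img

    module = importlib.import_module(f"keras.applications.{model_name}")
    image = load_img(path, target_size=INPUT_SIZES[model_name])
    batch = np.expand_dims(img_to_array(image), axis=0)
    # Each network family ships its own preprocess_input function.
    return module.preprocess_input(batch)


print(input_shape("inception_v3"))  # (299, 299, 3)
```

Using each network's own preprocess_input matters because the families scale pixel values differently.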

The weights for the chosen neural network are downloaded on first use and cached locally, so later uses with Keras do not download them again
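
To see which weights have already been downloaded, the cache folder can be listed. This is a sketch under assumptions: it takes the default cache location to be the models folder under ~/.keras, with the KERAS_HOME environment variable overriding the base directory, and it assumes the weight files end in .h5. The helper names are hypothetical.

```python
import os


def keras_models_dir():
    """Best guess at the directory where Keras stores downloaded weights."""
    # Assumption: KERAS_HOME, when set, replaces the default ~/.keras base.
    default_base = os.path.join(os.path.expanduser("~"), ".keras")
    base = os.environ.get("KERAS_HOME", default_base)
    return os.path.join(base, "models")


def cached_weight_files():
    """List weight files already downloaded, or an empty list if none."""
    folder = keras_models_dir()
    if not os.path.isdir(folder):
        return []
    return sorted(f for f in os.listdir(folder) if f.endswith(".h5"))


print(keras_models_dir())
print(cached_weight_files())
```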