ImageNet: VGGNet, ResNet, Inception, and Xception with Keras

Key concepts:

  • CNN: Convolutional Neural Network
  • ImageNet dataset
  • ImageNet Large Scale Visual Recognition Challenge
  • Weight serialization
  • Weights of a neural network
  • Pooling
  • Softmax

Notes:

  • Keras ships with the VGG16, VGG19, ResNet50, Inception V3, and Xception network models (see the keras.applications submodule)
  • ImageNet: a project to manually label and categorize images into almost 22,000 object categories
  • ImageNet Large Scale Visual Recognition Challenge: train a model that correctly classifies an input image into 1,000 separate object categories.
  • The ImageNet challenge is the de facto benchmark for computer vision classification algorithms.
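
As a rough sketch of how the bundled models can be reached, the snippet below looks a model class up by name in keras.applications. The helper build_model and the MODEL_NAMES tuple are hypothetical names used only for illustration; the class names themselves are the ones Keras exposes. Keras is imported lazily, so the lookup table can be inspected even where Keras is not installed.

```python
# Class names as exposed by keras.applications.
MODEL_NAMES = ("VGG16", "VGG19", "ResNet50", "InceptionV3", "Xception")


def build_model(name, weights="imagenet"):
    """Instantiate one of the bundled architectures by its class name.

    Illustrative helper (not part of Keras). Pass weights=None to build
    the architecture with random weights and skip the download.
    """
    if name not in MODEL_NAMES:
        raise ValueError(f"unknown model {name!r}; expected one of {MODEL_NAMES}")
    # Imported here so the module loads even without Keras installed.
    from keras import applications
    return getattr(applications, name)(weights=weights)


# Example (requires Keras and, with weights="imagenet", network access):
# model = build_model("ResNet50")
# model.summary()
```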

VGG16 and VGG19

  • arXiv:
  • The VGG network architecture was introduced by Simonyan and Zisserman in their 2014 paper, Very Deep Convolutional Networks for Large-Scale Image Recognition.
  • Uses only 3×3 convolutional layers stacked on top of each other in increasing depth
  • The “16” and “19” stand for the number of weight layers in the network
  • Because the deep networks were hard to train from scratch, smaller versions with fewer weight layers were trained first; once they converged, they were used as initializations for the larger, deeper networks. This process is called pre-training.
  • Slow to train
  • The serialized weights are very large compared to the other networks (hundreds of megabytes), which is costly in bandwidth and disk usage
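
The "16"/"19" naming and the size of the weights can be checked with a little arithmetic. The sketch below counts weight layers and parameters from the layer configurations in the VGG paper, assuming a 224×224 RGB input, 1,000 output classes, and a bias per unit. The names VGG16_CFG, VGG19_CFG, and summarize are illustrative only.

```python
# Integers are the output channels of a 3x3 convolution;
# "M" marks a 2x2 max-pooling layer, which has no weights.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]  # the three fully connected layers


def summarize(cfg, in_channels=3, input_size=224):
    """Return (number of weight layers, number of parameters)."""
    weight_layers = 0
    params = 0
    channels, size = in_channels, input_size
    for item in cfg:
        if item == "M":
            size //= 2  # pooling halves the spatial resolution
            continue
        params += 3 * 3 * channels * item + item  # kernel weights + biases
        channels = item
        weight_layers += 1
    features = channels * size * size  # flattened input to the first FC layer
    for units in FC_SIZES:
        params += features * units + units
        features = units
        weight_layers += 1
    return weight_layers, params


print(summarize(VGG16_CFG))  # (16, 138357544)
print(summarize(VGG19_CFG))  # (19, 143667240)
```

At 4 bytes per float32 parameter this works out to roughly 528 MiB for VGG16 and 548 MiB for VGG19. Most of the parameters sit in the first fully connected layer.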

ResNet

Inception V3

  • arXiv:
  • First introduced by Szegedy et al. in their 2014 paper, Going Deeper with Convolutions
  • The goal of the inception module is to act as a “multi-level feature extractor” by computing 1×1, 3×3, and 5×5 convolutions within the same module of the network
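  • The original architecture was called GoogLeNet; later versions are named Inception vN, where N is the version number assigned by Google
  • The Inception V3 implementation in Keras follows the 2015 paper by Szegedy et al., Rethinking the Inception Architecture for Computer Vision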
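
To make the "multi-level feature extractor" idea concrete, here is a deliberately simplified, pure-Python sketch: the same input is filtered with 1×1, 3×3, and 5×5 kernels using zero padding, and the resulting feature maps are stacked as channels. It assumes a single-channel image and fixed averaging kernels; a real inception module uses learned multi-channel filters plus 1×1 reductions and a pooling branch. The function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """Slide a square, odd-sized kernel over a 2D list with zero padding,
    so the output has the same height and width as the input."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside counts as zero
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def box_kernel(k):
    """A k x k averaging kernel, standing in for learned weights."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_scale_features(image, sizes=(1, 3, 5)):
    """Apply every kernel size to the same input and stack the results,
    one output channel per kernel size."""
    return [conv2d_same(image, box_kernel(k)) for k in sizes]


image = [[1.0] * 5 for _ in range(5)]
features = multi_scale_features(image)
print(len(features), len(features[0]), len(features[0][0]))  # 3 5 5
```

Because every branch preserves the spatial size, the branch outputs can be concatenated along the channel axis, which is what lets one module see the input at several scales at once.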

Xception

  • arXiv:
  • Proposed by François Chollet in the original publication, Xception: Deep Learning with Depthwise Separable Convolutions
  • Smallest weight serialization of the Keras models listed above
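
The depthwise separable convolutions in the paper's title help explain the compact weights. As a back-of-the-envelope sketch (biases ignored, layer sizes chosen only for illustration), the functions below compare the parameter count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution.

```python
def standard_conv_params(k, c_in, c_out):
    """Every output channel has its own k x k filter over all input channels."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """One k x k filter per input channel (depthwise), then a 1x1
    convolution that mixes channels (pointwise)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 128 input channels, 256 output channels.
print(standard_conv_params(3, 128, 256))   # 294912
print(separable_conv_params(3, 128, 256))  # 33920
```

For this example layer the separable version needs fewer than one eighth of the weights.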

SqueezeNet


Classifying images with VGGNet, ResNet, Inception, and Xception with Python and Keras

CNN input image sizes (pixels)

  • Typical input sizes for CNNs trained on ImageNet: 224×224, 227×227, 256×256, and 299×299; however, you may see other dimensions as well.
  • VGG16: 224×224
  • VGG19: 224×224
  • ResNet: 224×224
  • Inception V3: 299×299
  • Xception: 299×299
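
Putting the sizes above to work, the following is a minimal sketch of a classification helper, assuming a recent Keras release in which keras.utils.load_img and keras.utils.img_to_array are available. The names INPUT_SIZES, PREPROCESS_MODULES, and classify are hypothetical; the Keras calls are the documented ones. Each model family has its own preprocess_input, so the matching submodule is looked up by name.

```python
# Expected input size (height, width) per model, as listed above.
INPUT_SIZES = {
    "VGG16": (224, 224),
    "VGG19": (224, 224),
    "ResNet50": (224, 224),
    "InceptionV3": (299, 299),
    "Xception": (299, 299),
}

# Submodule of keras.applications that holds each model's
# preprocess_input and decode_predictions functions.
PREPROCESS_MODULES = {
    "VGG16": "vgg16",
    "VGG19": "vgg19",
    "ResNet50": "resnet50",
    "InceptionV3": "inception_v3",
    "Xception": "xception",
}


def classify(image_path, model_name="VGG16", top=5):
    """Return the top ImageNet predictions for one image file."""
    # Heavy imports are kept local so the tables above stay importable
    # without Keras installed.
    import importlib
    import numpy as np
    import keras

    module = importlib.import_module(
        f"keras.applications.{PREPROCESS_MODULES[model_name]}")
    model = getattr(keras.applications, model_name)(weights="imagenet")

    # Resize to the size the network expects and add a batch dimension.
    image = keras.utils.load_img(image_path, target_size=INPUT_SIZES[model_name])
    batch = np.expand_dims(keras.utils.img_to_array(image), axis=0)
    batch = module.preprocess_input(batch)  # model-specific scaling

    predictions = model.predict(batch)
    # Each entry is (class_id, class_name, probability).
    return module.decode_predictions(predictions, top=top)[0]


# Example (requires Keras, NumPy, Pillow, and an image file of your own):
# for _, label, prob in classify("example.jpg", "InceptionV3"):
#     print(f"{label}: {prob:.2%}")
```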

The weights for the chosen network are downloaded the first time the model is used and are cached locally, so later uses with Keras do not download them again.
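
To see which weight files have already been fetched, something like the sketch below can help. It assumes the default Keras cache location, ~/.keras/models; installations that override the cache directory will keep the files elsewhere. The function names are illustrative.

```python
from pathlib import Path


def keras_weight_cache():
    """Default directory for downloaded weights (an assumption: the
    location can be changed in a given installation)."""
    return Path.home() / ".keras" / "models"


def cached_weight_files(cache_dir=None):
    """List the file names found in the weight cache, or [] if the
    directory does not exist yet (nothing downloaded so far)."""
    cache = Path(cache_dir) if cache_dir is not None else keras_weight_cache()
    if not cache.is_dir():
        return []
    return sorted(p.name for p in cache.iterdir() if p.is_file())


for name in cached_weight_files():
    print(name)
```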
