The machine learning ecosystem has evolved a lot in recent years.
I am amazed that I could run a fairly sophisticated experiment, classifying dogs vs. cats with 90% accuracy, on my regular laptop.
It has an NVIDIA GPU with 2GB of memory and 8GB of RAM.
As recently as 2012, the state-of-the-art result for dogs vs. cats classification was 80%.
I based the experiment on an excellent course provided by fast.ai (http://course.fast.ai/).
The competition is organized by Kaggle:
Here's an overview of the approach taken to achieve 90% accuracy.
First, retrieve VGG16, a publicly available model that researchers trained for the ImageNet image recognition competition. Then remove its last layer and replace it with a yes/no layer that distinguishes cats from dogs. The remaining layers are set as non-trainable, so only the new layer learns when the training process is run on this model.
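To get a feel for why this fits on a modest laptop GPU, here is a rough parameter count. It is only a sketch: it assumes the standard VGG16 layout (thirteen 3x3 convolutions, a 224x224 input, and two 4096-unit dense layers) and assumes the new yes/no layer is a two-unit dense layer. The helper names are made up for illustration and the course code may differ in detail.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one square convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(inputs, outputs):
    """Weights plus biases of one fully connected layer."""
    return inputs * outputs + outputs


# Assumed standard VGG16 layout: output channels of each 3x3 convolution.
conv_channels = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

frozen = 0
in_channels = 3  # RGB input
for out_channels in conv_channels:
    frozen += conv_params(in_channels, out_channels)
    in_channels = out_channels

# A 224x224 image halved by five pooling steps leaves a 7x7x512 feature map.
frozen += dense_params(7 * 7 * 512, 4096)
frozen += dense_params(4096, 4096)

removed = dense_params(4096, 1000)  # the original 1000-class ImageNet layer
trainable = dense_params(4096, 2)   # assumed two-unit cats vs. dogs layer

print("frozen parameters:   ", frozen)
print("removed parameters:  ", removed)
print("trainable parameters:", trainable)
```

Under these assumptions only a few thousand weights are updated during training, while everything else is reused exactly as it was trained on ImageNet.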
The main libraries used here are Keras with the TensorFlow backend.
The full code is available on the fast.ai website. Here is an overview of the most important parts.
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
# Hand the configured TensorFlow session to Keras
set_session(tf.Session(config=config))
# Import our class, and instantiate
import vgg16; reload(vgg16)
from vgg16 import Vgg16
vgg = Vgg16()
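path = "data/dogscats/"
#path = "data/dogscats/sample/"
batch_size = 64  # assumed value, not given in this overview; lower it if the GPU runs out of memory
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size)
vgg.finetune(batches)  # swap in the new cats vs. dogs output layer before training
vgg.fit(batches, val_batches, nb_epoch=1)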
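The body of get_batches is not shown here. Assuming it wraps the Keras directory iterator (flow_from_directory), labels come from the subfolder names under train and valid, taken in alphanumeric order. The sketch below mimics only that mapping with the standard library. The folder names cats and dogs and the helper class_indices are assumptions for illustration, not part of the course code.

```python
import os
import tempfile


def class_indices(directory):
    """Map each class subfolder to an integer label, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway copy of the assumed layout: <path>/<split>/<class>/
root = tempfile.mkdtemp()
for split in ("train", "valid"):
    for label in ("cats", "dogs"):
        os.makedirs(os.path.join(root, split, label))

train_classes = class_indices(os.path.join(root, "train"))
valid_classes = class_indices(os.path.join(root, "valid"))
print(train_classes)
```

If the training and validation folders do not contain the same class subfolders, the two mappings disagree and the validation accuracy stops being meaningful, so it is worth checking the layout before a long run.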
The code uses the vgg.finetune call to update the last layer of the model. Here's what it looks like:
model = self.model
# Freeze the existing layers so training leaves their weights unchanged
for layer in model.layers: layer.trainable = False
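The excerpt shows only the freezing step. The order of the full procedure described earlier (drop the old output layer, freeze what is left, then add the new layer) can be sketched without Keras at all. Layer and finetune_layers below are hypothetical stand-ins that track nothing but a name and a trainable flag. They are not the Keras API or the course implementation.

```python
class Layer(object):
    """Hypothetical stand-in for a Keras layer: a name and a trainable flag."""

    def __init__(self, name, trainable=True):
        self.name = name
        self.trainable = trainable


def finetune_layers(layers, new_layer_name):
    """Replace the output layer and freeze everything that was kept."""
    layers = list(layers)       # new list; the Layer objects are shared
    layers.pop()                # 1. drop the old output layer
    for layer in layers:        # 2. freeze the remaining layers in place
        layer.trainable = False
    layers.append(Layer(new_layer_name))  # 3. new layer, trainable by default
    return layers


pretrained = [Layer("conv_blocks"), Layer("fc1"), Layer("fc2"),
              Layer("imagenet_output")]
tuned = finetune_layers(pretrained, "cats_dogs_output")

layer_names = [layer.name for layer in tuned]
trainable_names = [layer.name for layer in tuned if layer.trainable]
print(trainable_names)
```

Freezing before adding the new layer is what keeps the new layer trainable. Running the loop after the append would freeze it too, and training would change nothing.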