Sunday, April 16, 2017

Landing on The Moon using AI Bot

Recently I was playing around with OpenAI Gym and the Keras reinforcement learning library (keras-rl). I was able to train an AI agent to land on the Moon.

OpenAI Gym provides all sorts of different environments to explore using AI bots. One of them is lunar lander, which I'll focus on here. 
Keras, on the other hand, is a high-level library built on top of TensorFlow (or Theano). It provides mechanisms for constructing deep learning models easily.

Creating a new environment using OpenAI Gym is as easy as this:

import gym
env = gym.make('LunarLander-v2')

Here's how you can add a video recorder to capture progress during training:

env = gym.wrappers.Monitor(env, 
                           'recording', 
                           resume=True, 
                           video_callable=lambda count: count % record_video_every == 0
                           )
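
To see which episodes the video_callable above selects, the predicate can be tried on its own. This is only a minimal sketch: the value 100 for record_video_every is an assumption for illustration, since the post does not state the value used during training.

```python
# Hypothetical value for illustration; the original post does not state it.
record_video_every = 100


def should_record(count):
    """Same predicate as the lambda passed to video_callable."""
    return count % record_video_every == 0


# Episode ids for which a video would be written during the first 500 episodes.
recorded = [episode for episode in range(501) if should_record(episode)]
print(recorded)
```

Episode 0 always passes the check, so the very first (untrained) attempt is recorded as well, which makes a handy baseline for comparison.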

The model itself is a fairly simple DQN agent with a LinearAnnealedPolicy. The most important layer is the internal Dense layer with 512 neurons, which is responsible for understanding the current situation during landing. The small Dense layer on top of it makes the final decisions about the lunar lander's actions (steering the engines).

Here's how it can be instantiated:

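from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy
from rl.memory import SequentialMemory

WINDOW_LENGTH = 1  # matches the leading (1,) in input_shape below
nb_actions = env.action_space.n  # number of discrete actions (4 in the summary below)

# Q-network: flattened observation -> 512 hidden units -> one Q-value per action
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
print(model.summary())

memory = SequentialMemory(limit=1000000, window_length=WINDOW_LENGTH)

policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=1., value_min=.1, value_test=.05, nb_steps=1000000)

dqn = DQNAgent(model=model, nb_actions=nb_actions, policy=policy, memory=memory, nb_steps_warmup=50000, gamma=.99, target_model_update=10000, train_interval=4, delta_clip=1.)

dqn.compile(Adam(lr=.00025), metrics=['mae'])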
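
The LinearAnnealedPolicy arguments above describe an exploration rate (eps) that starts at 1.0 and decays to 0.1 over the first million steps. The sketch below reproduces that schedule in plain Python, assuming straight linear interpolation followed by a floor at value_min; the function name annealed_eps is made up for this illustration and is not part of keras-rl.

```python
def annealed_eps(step, value_max=1.0, value_min=0.1, nb_steps=1000000):
    """Linearly move from value_max to value_min, then hold at value_min."""
    slope = -(value_max - value_min) / float(nb_steps)
    return max(value_min, slope * step + value_max)


# Exploration rate at a few points of a 3.5 million step run.
for step in (0, 250000, 500000, 1000000, 3500000):
    print(step, round(annealed_eps(step), 3))
```

Under these assumptions the agent acts almost entirely at random at the start and keeps a 10% chance of a random action for every step after the first million.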


Here's a summary of the model:

Layer (type)                     Output Shape          Param #     Connected to
flatten_1 (Flatten)              (None, 8)             0           flatten_input_1[0][0]            
dense_1 (Dense)                  (None, 512)           4608        flatten_1[0][0]                  
activation_1 (Activation)        (None, 512)           0           dense_1[0][0]                    
dense_2 (Dense)                  (None, 4)             2052        activation_1[0][0]               
activation_2 (Activation)        (None, 4)             0           dense_2[0][0]                    

Total params: 6,660
Trainable params: 6,660
Non-trainable params: 0
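
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-output pair plus one bias per output unit. A small sketch, where dense_params is a helper name invented for this check:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


hidden = dense_params(8, 512)   # 8 observation values -> 512 hidden units
output = dense_params(512, 4)   # 512 hidden units -> 4 actions
total = hidden + output
print(hidden, output, total)
```

The Flatten and Activation layers have no weights, so the two Dense layers account for all trainable parameters.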


Training can take a lot of time. In this example I used 3.5 million steps.
The outcome is quite reasonable behavior for the lunar lander.
Note that, depending on the environment, learning can take less time (if the environment is not very tricky). In this case I found that the difficulties are related to the sensitivity of the last phase of landing (touchdown). It took some time for the model to figure this part out.

As a side note, OpenAI provides a lunar lander agent that follows an optimal trajectory. In this example, by contrast, the agent polls the environment to figure out which actions give the highest output of the Q function.
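
Picking the action with the highest Q-value, with occasional random exploration, can be sketched in a few lines. This is an illustration of the idea only, not keras-rl's code: the function names and the Q-values are made up.

```python
import random


def greedy_action(q_values):
    """Index of the action with the highest estimated Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def eps_greedy_action(q_values, eps, rng=random):
    """With probability eps pick a random action, otherwise the greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


# Made-up Q-values for the four lander actions.
q = [0.2, -1.3, 1.7, 0.4]
best = greedy_action(q)
print(best)
```

With eps set to 0 the choice is always greedy; during training eps is larger, so the agent still tries actions the network currently rates poorly.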



Saturday, March 18, 2017

Classifying Dogs vs Cats on a Regular Laptop with 2GB GPU and 90% Accuracy


The machine learning ecosystem has evolved a lot in recent years.
I am amazed that I could run a very sophisticated experiment, classifying dogs vs cats with 90% accuracy, on my regular laptop.
It has a 2GB NVidia GPU and 8GB of RAM.
As recently as 2012, the state-of-the-art result for dogs vs cats classification was 80%.

I ran it based on an excellent course provided by fast.ai (http://course.fast.ai/).
The competition is organized by Kaggle:
https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition

Here's an overview of the approach taken to achieve 90% accuracy.
First, retrieve the publicly available VGG16 model, which was prepared by scientists for an image recognition competition (ImageNet). Then remove its last layer and replace it with a Yes / No layer for recognizing cats vs dogs. The remaining layers are set as non-trainable. Finally, run the learning process for this model.

The main library used here is Keras with the TensorFlow backend.

The full code is available on the fast.ai website. Here is an overview of the most important parts.
Training code:

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.allow_soft_placement=True
config.log_device_placement=True
set_session(tf.Session(config=config))

# Import our class, and instantiate
import vgg16; reload(vgg16)
from vgg16 import Vgg16
vgg = Vgg16()

batch_size=16
path = "data/dogscats/"
#path = "data/dogscats/sample/"
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)
vgg.model.save('vgg2.h5')

The code uses the vgg.finetune call to replace the last layer of the model. Here's what it looks like:

model = self.model
model.pop()
for layer in model.layers: layer.trainable=False
model.add(Dense(num, activation='softmax'))
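
The new softmax layer turns its raw scores into one probability per class (here num is 2, for cats and dogs). A minimal plain-Python sketch of that last step, using made-up scores rather than real network outputs:

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for the two classes.
probs = softmax([2.0, 0.5])
print(probs)
```

The class with the larger score gets the larger probability, and the prediction is simply whichever class has the highest one.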

Next, it trains the model using the vgg.fit call and saves the result to the vgg2.h5 file.

I had to make a few tweaks to the model's TensorFlow device placement so that it could fit in GPU memory. The last few layers were placed on the CPU. Here's the code:

        model = self.model = Sequential()
        model.add(Lambda(vgg_preprocess, input_shape=(3,224,224), output_shape=(3,224,224)))

        with tf.device('/gpu:0'):
            self.ConvBlock(2, 64)
            self.ConvBlock(2, 128)
            self.ConvBlock(3, 256)
            self.ConvBlock(3, 512)
            self.ConvBlock(3, 512)

        with tf.device('/cpu:0'):
            model.add(Flatten())
            self.FCBlock()
            self.FCBlock()
            model.add(Dense(1000, activation='softmax'))

        fname = 'vgg16.h5'
        model.load_weights(get_file(fname, self.FILE_PATH+fname, cache_subdir='models'))


Here's the result of a learning process:

23000/23000 [==============================] - 2103s - loss: 0.5482 - acc: 0.8676 - val_loss: 0.4194 - val_acc: 0.9060

The training process completed in about 35 minutes (2103 seconds) with 90% accuracy on the validation set.
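
The timing follows directly from the log line above; the same numbers also give a rough throughput figure for this laptop. A small sketch of the arithmetic, with variable names chosen only for this example:

```python
samples = 23000   # training images processed in the epoch (from the log)
seconds = 2103    # wall-clock time reported by Keras (from the log)

minutes = seconds / 60.0
images_per_second = samples / float(seconds)
print(minutes, images_per_second)
```

That works out to roughly 11 images per second for one pass over the training set.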

I'm very pleasantly surprised that such powerful machine learning tools are available these days and can run on regular computers. Moreover, the approach presented by fast.ai is very interesting and resembles the natural evolution of intelligence by adding new layers.