RNN in Python for Beginners

Recurrent Neural Network

Introduction

What makes us humans special? It's our ability to comprehend things in context that makes us stand apart. We don't reset our memory each time we say a new word; the very act of speaking is to produce words sequentially, each one in context with the words uttered before it. Machines could not do this: traditional models treat each input and output as independent of all the others. Recurrent Neural Networks (RNNs) were introduced to solve this problem of processing sequential data.

Figure: a basic RNN neuron and its unfolding through time. The subscript 'h' denotes the hidden layer and the subscript 'o' denotes the output layer.
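Concretely, at every timestep t the hidden state is computed from the current input and the previous hidden state, and the output is read from the hidden state. Here is a minimal sketch of a single recurrence step, using the same U, W, V weight names that appear in the code later in this post (sigmoid hidden activation and linear output, mirroring that code):

import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    # s_t = sigmoid(U·x_t + W·s_{t-1});  o_t = V·s_t
    s_t = 1 / (1 + np.exp(-(np.dot(U, x_t) + np.dot(W, s_prev))))
    o_t = np.dot(V, s_t)
    return s_t, o_t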
Applications of RNNs

  • Language Modelling and Generating Text: Given a sequence of words as input, we try to predict the probability of the next word.
  • Machine Translation: The input is text in a source language and the output is the same text in the target language the user wants.
  • Speech Recognition: RNNs can predict phonetic segments from sound waves. The inputs are phonemes or acoustic signals extracted from audio and processed into a suitable form.
  • Generating Image Descriptions: A combination of CNNs and RNNs is used to describe what is happening inside an image. The CNN handles the segmentation, and the RNN then uses the segmented data to generate the description.
  • Video Tagging: RNNs can be used for video search, generating image descriptions for a video divided into numerous frames.
  • Call Center Analysis: One of the major applications of RNNs in audio processing. Customer metrics are usually measured on the outcome of a call rather than the call itself; analyzing the call itself can show businesses why the support staff succeeded and which steps resolved the customer's issue.
  • Image Recognition (visual search, face detection, OCR): Humans think visually and rely on an extensive visual shorthand to navigate the world. Search engines, e-commerce, and OCR apps exploit this to better satisfy customer needs.
  • Other applications, such as music composition.
Figure: the cosine sequence we will train the RNN to predict.

RNN code in Python.

Import necessary packages

import math
import numpy as np
import matplotlib.pyplot as plt

# generate the cosine sequence we will learn to predict
cos_wave = np.array([math.cos(x) for x in np.arange(200)])
plt.plot(cos_wave[:50])
plt.show()
X = []
Y = []
seq_len = 50
num_records = len(cos_wave) - seq_len

# build the training set: each record is 50 consecutive values (X)
# and the single value that follows them (Y)
for i in range(num_records - 50):
    X.append(cos_wave[i:i+seq_len])
    Y.append(cos_wave[i+seq_len])

X = np.array(X)
X = np.expand_dims(X, axis=2)
Y = np.array(Y)
Y = np.expand_dims(Y, axis=1)
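As a quick sanity check (an addition to the walkthrough), each training input should be a sequence of 50 values with a single feature, giving 100 training records:

print(X.shape, Y.shape)  # expected: (100, 50, 1) (100, 1)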
X_val = []
Y_val = []

# the last 50 records are held out for validation
for i in range(num_records - 50, num_records):
    X_val.append(cos_wave[i:i+seq_len])
    Y_val.append(cos_wave[i+seq_len])

X_val = np.array(X_val)
X_val = np.expand_dims(X_val, axis=2)
Y_val = np.array(Y_val)
Y_val = np.expand_dims(Y_val, axis=1)
learning_rate = 0.0001
nepoch = 25
T = 50                 # length of each input sequence
hidden_dim = 100
output_dim = 1
bptt_truncate = 5      # how many timesteps back to propagate the error
min_clip_value = -10   # clip gradients to this range to avoid exploding gradients
max_clip_value = 10
  1. U is the weight matrix for weights between the input and hidden layers
  2. V is the weight matrix for weights between the hidden and output layers
  3. W is the weight matrix for the shared (recurrent) weights in the RNN layer (hidden layer)
U = np.random.uniform(0, 1, (hidden_dim, T))           # input -> hidden
W = np.random.uniform(0, 1, (hidden_dim, hidden_dim))  # hidden -> hidden (recurrent)
V = np.random.uniform(0, 1, (output_dim, hidden_dim))  # hidden -> output
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
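The backward pass below needs the derivative of the sigmoid. For a sigmoid output s = sigmoid(x), the derivative is s * (1 - s), which is why the gradient code multiplies the stored activation by one minus itself. A minimal check:

s = sigmoid(np.array([0.0, 2.0]))
dsdx = s * (1 - s)   # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
print(dsdx)          # [0.25, ~0.105]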
The training procedure for each epoch has three stages:
  • Check the loss on the training data: do a forward pass and calculate the error.
  • Check the loss on the validation data: again, a forward pass and an error calculation.
  • Start the actual training: do a forward pass, backpropagate the error, and update the weights.
for epoch in range(nepoch):
    # check loss on train
    loss = 0.0

    # do a forward pass to get a prediction for each training record
    for i in range(Y.shape[0]):
        x, y = X[i], Y[i]                  # get input, output values of each record
        prev_s = np.zeros((hidden_dim, 1)) # prev_s is the previous hidden-layer activation, initialized to all zeros
        for t in range(T):
            new_input = np.zeros(x.shape)  # we do a forward pass for every timestep in the sequence
            new_input[t] = x[t]            # for this, we define a single input for that timestep
            mulu = np.dot(U, new_input)
            mulw = np.dot(W, prev_s)
            add = mulw + mulu
            s = sigmoid(add)
            mulv = np.dot(V, s)
            prev_s = s

        # calculate error
        loss_per_record = (y - mulv)**2 / 2
        loss += loss_per_record
    loss = loss / float(Y.shape[0])        # average over all training records

    # check loss on val
    val_loss = 0.0
    for i in range(Y_val.shape[0]):
        x, y = X_val[i], Y_val[i]
        prev_s = np.zeros((hidden_dim, 1))
        for t in range(T):
            new_input = np.zeros(x.shape)
            new_input[t] = x[t]
            mulu = np.dot(U, new_input)
            mulw = np.dot(W, prev_s)
            add = mulw + mulu
            s = sigmoid(add)
            mulv = np.dot(V, s)
            prev_s = s

        loss_per_record = (y - mulv)**2 / 2
        val_loss += loss_per_record
    val_loss = val_loss / float(Y_val.shape[0])

    print('Epoch: ', epoch + 1, ', Loss: ', loss, ', Val Loss: ', val_loss)
Within the training stage, the forward pass at each timestep consists of the following steps:
  • Multiply the input at the current timestep with the weights between the input and hidden layers
  • Add this to the multiplication of the weights in the RNN layer with the previous hidden state; this is how we capture the knowledge of the previous timestep
  • Pass the result through a sigmoid activation function
  • Multiply this with the weights between the hidden and output layers
  • At the output layer we use a linear activation, so we do not explicitly pass the value through an activation function
  • Save the state at the current timestep and the state at the previous timestep in a list
    # train model: for each record, forward pass, backpropagate, update weights
    for i in range(Y.shape[0]):
        x, y = X[i], Y[i]

        layers = []
        prev_s = np.zeros((hidden_dim, 1))
        dU = np.zeros(U.shape)
        dV = np.zeros(V.shape)
        dW = np.zeros(W.shape)

        dU_t = np.zeros(U.shape)
        dV_t = np.zeros(V.shape)
        dW_t = np.zeros(W.shape)

        dU_i = np.zeros(U.shape)
        dW_i = np.zeros(W.shape)

        # forward pass
        for t in range(T):
            new_input = np.zeros(x.shape)
            new_input[t] = x[t]
            mulu = np.dot(U, new_input)
            mulw = np.dot(W, prev_s)
            add = mulw + mulu
            s = sigmoid(add)
            mulv = np.dot(V, s)
            layers.append({'s': s, 'prev_s': prev_s})
            prev_s = s

        # derivative of the prediction (linear output, squared error)
        dmulv = (mulv - y)

        # backward pass (truncated backpropagation through time)
        for t in range(T):
            dV_t = np.dot(dmulv, np.transpose(layers[t]['s']))
            dsv = np.dot(np.transpose(V), dmulv)

            ds = dsv
            # sigmoid'(add) = s * (1 - s), using the activation stored for timestep t
            dadd = layers[t]['s'] * (1 - layers[t]['s']) * ds
            dmulw = dadd * np.ones_like(mulw)
            dprev_s = np.dot(np.transpose(W), dmulw)

            # walk back through at most bptt_truncate earlier timesteps
            # (index b, so as not to shadow the record index i)
            for b in range(t-1, max(-1, t-bptt_truncate-1), -1):
                ds = dsv + dprev_s
                dadd = layers[b]['s'] * (1 - layers[b]['s']) * ds
                dmulw = dadd * np.ones_like(mulw)
                dmulu = dadd * np.ones_like(mulu)

                dW_i = np.dot(W, layers[b]['prev_s'])
                dprev_s = np.dot(np.transpose(W), dmulw)

                new_input = np.zeros(x.shape)
                new_input[b] = x[b]
                dU_i = np.dot(U, new_input)
                dx = np.dot(np.transpose(U), dmulu)

                dU_t += dU_i
                dW_t += dW_i

            dV += dV_t
            dU += dU_t
            dW += dW_t

            # clip the gradients elementwise to avoid exploding gradients
            if dU.max() > max_clip_value:
                dU[dU > max_clip_value] = max_clip_value
            if dV.max() > max_clip_value:
                dV[dV > max_clip_value] = max_clip_value
            if dW.max() > max_clip_value:
                dW[dW > max_clip_value] = max_clip_value
            if dU.min() < min_clip_value:
                dU[dU < min_clip_value] = min_clip_value
            if dV.min() < min_clip_value:
                dV[dV < min_clip_value] = min_clip_value
            if dW.min() < min_clip_value:
                dW[dW < min_clip_value] = min_clip_value

        # update
        U -= learning_rate * dU
        V -= learning_rate * dV
        W -= learning_rate * dW
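As a side note (an addition to the walkthrough), the twelve clipping lines above can be collapsed into three calls to np.clip with identical behavior:

dU = np.clip(dU, min_clip_value, max_clip_value)
dV = np.clip(dV, min_clip_value, max_clip_value)
dW = np.clip(dW, min_clip_value, max_clip_value)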
Output after training.
preds = []
for i in range(Y.shape[0]):
    x, y = X[i], Y[i]
    prev_s = np.zeros((hidden_dim, 1))
    # forward pass (here the whole sequence x is fed at once, which works
    # dimensionally because U has shape (hidden_dim, T))
    for t in range(T):
        mulu = np.dot(U, x)
        mulw = np.dot(W, prev_s)
        add = mulw + mulu
        s = sigmoid(add)
        mulv = np.dot(V, s)
        prev_s = s
    preds.append(mulv)

preds = np.array(preds)

plt.plot(preds[:, 0, 0], 'g')   # predictions in green
plt.plot(Y[:, 0], 'r')          # actual values in red
plt.show()
preds = []
for i in range(Y_val.shape[0]):
    x, y = X_val[i], Y_val[i]
    prev_s = np.zeros((hidden_dim, 1))
    # forward pass on the validation data
    for t in range(T):
        mulu = np.dot(U, x)
        mulw = np.dot(W, prev_s)
        add = mulw + mulu
        s = sigmoid(add)
        mulv = np.dot(V, s)
        prev_s = s
    preds.append(mulv)

preds = np.array(preds)

plt.plot(preds[:, 0, 0], 'g')   # predictions in green
plt.plot(Y_val[:, 0], 'r')      # actual values in red
plt.show()
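As a final, added check (not part of the original code), the validation fit can be summarized with a root mean squared error:

rmse = np.sqrt(np.mean((preds[:, :, 0] - Y_val)**2))
print('Validation RMSE:', rmse)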
