Building a Multi-layer Perceptron Model

The multi-layer perceptron (MLP) is the "Hello, World" of deep learning. In this article we will build an MLP with two hidden layers: three hidden units in the first layer and two units in the second, with three inputs X_{1}, X_{2}, X_{3} in the input layer and a single output unit. The model will be a binary classifier. A demo Python implementation using sklearn is provided at the end.

Before going further in this article, make sure you understand the perceptron neuron model well; if you don't, then click here.

So let's build a neural net. Before going into the mathematical details, let's first visualize our neural network.

Since there are three inputs and three units in the first hidden layer, our first weight matrix W_{1} will be of size 3x3, with each column holding the weights for one neuron (so each row of W_{1}^{T} is one neuron's weights). Likewise, for the second hidden layer the weight matrix will be of size 3x2.

The weight matrix at the first hidden layer will look something like this:

The activation function used for each neuron in both hidden layers is ReLU, and for the final output layer it is the sigmoid. The main reason for not using the sigmoid at every layer is the vanishing gradient problem: the sigmoid's derivative is at most 0.25 and shrinks toward zero for inputs of large magnitude, so gradients multiplied through many sigmoid layers become vanishingly small. This is one of the biggest drawbacks of using the sigmoid at every layer, and it makes the network very difficult to train.
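A minimal sketch of the two activation functions (the function names here are illustrative, not from the original code):

```python
import numpy as np

def relu(z):
    # ReLU passes positive values through unchanged and zeroes out negatives
    return np.maximum(0, z)

def sigmoid(z):
    # Squashes any real input into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1 - s)

# The sigmoid's gradient peaks at 0.25 (at z = 0) and shrinks toward 0
# for large |z| -- this is what drives the vanishing-gradient problem.
print(sigmoid_grad(0.0))   # 0.25
print(sigmoid_grad(10.0))  # very close to 0
```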

Training a neural network consists of three steps:

  1. Forward Pass
  2. Calculating loss
  3. Backpropagating the loss to update the weights

If you are unaware of the backpropagation algorithm, then click here.

Forward pass:


  1. Compute Z_{1} = W_{1}^{T}\ast X + b_{1}, where W_{1} is the weight matrix for the 1st hidden layer and b_{1} is its bias vector
  2. Compute X^{'} = ReLU(Z_{1})
  3. Pass the output of the 1st hidden layer, i.e. X^{'}, as input to the 2nd hidden layer:
    1. Z_{2} = W_{2}^{T}\ast X' + b_{2}
    2. X'' = ReLU(Z_{2})
  4. Pass the output of the 2nd hidden layer, i.e. X'', as input to our output layer
  5. Then compute sigmoid(W_{3}^{T}\ast X'' + b_{3})
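The five steps above can be sketched in NumPy as follows. The weights are randomly initialized here purely for illustration, and the bias names b1, b2, b3 are assumptions (one bias vector per layer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes follow the article: 3 inputs -> 3 hidden -> 2 hidden -> 1 output
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)
W3, b3 = rng.normal(size=(2, 1)), np.zeros(1)

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    x1 = relu(W1.T @ x + b1)        # 1st hidden layer (3 units)
    x2 = relu(W2.T @ x1 + b2)       # 2nd hidden layer (2 units)
    return sigmoid(W3.T @ x2 + b3)  # output layer (1 unit)

p = forward(np.array([0.5, -1.0, 2.0]))
print(p)  # a single value in (0, 1)
```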

Calculating loss: We will be using the logistic (binary cross-entropy) loss, L = -\left[ y\log p + (1-y)\log(1-p) \right], where p is the sigmoid output. Passing the output to the loss and calculating it is a straightforward task.
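A minimal sketch of the logistic loss for a single example (the eps clipping is an assumption added to guard against log(0)):

```python
import numpy as np

def logistic_loss(p, y):
    # Binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)]
    eps = 1e-12
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

print(logistic_loss(0.9, 1))  # small loss for a confident correct prediction
print(logistic_loss(0.1, 1))  # large loss for a confident wrong prediction
```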



Backpropagating the loss is the most involved step, since differentiating the loss function with respect to every weight by hand is complex. For this step we will provide the derivative directly: the derivative of the loss with respect to the output layer's pre-activation is p_{i} - y_{i}, where p_{i} = sigmoid(W_{3}^{T}\ast X'' + b_{3}). Because the inputs to each layer depend on the outputs of the previous layers, the loss must be propagated backward through the network layer by layer using backpropagation.

After this we will update the weights using gradient descent. If you are unfamiliar with how gradient descent works, then just click here.

We will repeat this procedure until the difference between successive values of each weight W_i is negligible.
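As a sketch of one gradient-descent step, here is the update for the output layer only, using the p - y derivative given above (the function name, the learning rate lr, and the variable names are illustrative assumptions; the hidden-layer updates would follow from the chain rule in the same way):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update_output_layer(W3, b3, x2, y, lr=0.1):
    # Forward through the output layer
    p = sigmoid(W3.T @ x2 + b3)
    # dL/dz = p - y (derivative of the loss w.r.t. the pre-activation)
    dz = p - y
    # Gradient-descent step: move weights against the gradient
    W3 = W3 - lr * np.outer(x2, dz)
    b3 = b3 - lr * dz
    return W3, b3
```

Repeating this update shrinks the loss on the example until the weights stop changing appreciably, which is the stopping criterion described above.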

Python Implementation:

Importing Libraries
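A minimal set of imports for the sklearn demo might look like this (make_classification is an assumption here, used only to generate toy data for the demo):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
```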

Applying Classifier
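A sketch of applying sklearn's MLPClassifier with the architecture from this article, two hidden layers of 3 and 2 units with ReLU activations (the toy dataset, solver, and iteration count are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Toy binary-classification data with 3 features, matching the 3 inputs
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# hidden_layer_sizes=(3, 2): 3 units in the 1st hidden layer, 2 in the 2nd
clf = MLPClassifier(hidden_layer_sizes=(3, 2), activation='relu',
                    solver='adam', max_iter=2000, random_state=42)
clf.fit(X_train, y_train)

print(accuracy_score(y_test, clf.predict(X_test)))
```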

The MLP is not a new architecture, but it is a powerful one and the most beginner-friendly implementation of a neural network. We will be covering many more such articles on deep learning, so stay tuned.
