MLP Multiclass Classifier

Iris Dataset

Task: Classify the species of flowers by given attributes.
Data: The well known Iris Flower Dataset consisting of sepal length, sepal width, petal length, petal width and species.

The last column tells us the flower's specie, the class we would like to estimate by using the first 4 numerical columns. We start by importing the dataset and splitting the data into input and output by using pandas read_csv function. This is already described in the binary classifier example.

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('iris.data')

X = dataset.iloc[:, 0:3].values
y = dataset.iloc[:, 4].value

The output is a categorical variable and must be transformed by the LabelEncoder and OneHotEncoder. The onehotencoder.fit_transform function expect a matrix as input but the output is only a single vector. We can transform a single vector into a one line matrix by applying .reshape(-1,1). That way we are able to apply the function.

# Encoding the output

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_Y = LabelEncoder()
y = labelencoder_Y.fit_transform(y)

onehotencoder = OneHotEncoder()
y = onehotencoder.fit_transform(y.reshape(-1,1)).toarray()

Again we split our data in training (80%) and test data (20%). Afterwards we use the training data to fit the meta parameters of the StandardScaler and apply it to the training and test sets.

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

We create a 4 layer MLP with 2 hidden layers with 6 ReLUs each. The problem is a multiclass classification problem. Therefore we use 1 output layer with a 3 softmax activated nodes and the categorical crossentropy loss function.

# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense

# Initialising the ANN
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 3))

# Adding the second hidden layer
classifier.add(Dense(units = 6, kernel_initializer = 'uniform', activation = 'relu'))

# Adding the output layer
classifier.add(Dense(units = 3, kernel_initializer = 'uniform', activation = 'softmax'))

# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

We fit our data with batch size = 1 in 200 epochs.

# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 1, epochs = 200)

This produces the output:

Epoch 1/200
119/119 [==============================] - 0s - loss: 1.0986 - acc: 0.2773          
Epoch 2/200
119/119 [==============================] - 0s - loss: 1.0689 - acc: 0.3613  
...
Epoch 199/200
119/119 [==============================] - 0s - loss: 0.0758 - acc: 0.9664     
Epoch 200/200
119/119 [==============================] - 0s - loss: 0.0735 - acc: 0.9664

After 200 epochs we end up with a pretty high accuracy of 0.9664 for the training set.

Predicting the test set

We use our trained classifier to predict the results for our unseen test dataset by using the evaluate function. This returns the loss and the accuracy metric for the test data, which is similar to the previous scores of the trainings process.

# Predicting the Test set
score = classifier.evaluate(X_test, y_test)

Most times the user likes to classify new data with a multiclass classifier. This is performed by the predict method. Since the network uses the softmax function, the given output are only probabilities to fall into this class. Often the the user likes to receive the original class name of the most likely classes for each prediction. The is archived by choosing the class with the highest score and apply the reverse_transform function to receive to original class names.

prediction = classifier.predict(X_test)
predicted_labels = labelencoder_Y.inverse_transform(np.argmax(prediction, axis=1))
original_labels = labelencoder_Y.inverse_transform(np.argmax(y_test, axis=1))

The original labels of the test set equal match the prediction.

results matching ""

    No results matching ""