Classifier: MLP and CNN

This notebook implements a classifier of MNIST digits. This is an example where a model maps a tensor to a discrete value, i.e., $f: \mathbb{R}^{d_1\times d_2} \rightarrow \{0, \dots, 9\}$. We will implement two classifiers: a multi-layer perceptron (MLP) and a convolutional neural network (CNN).

MLP Classifier

Let's try a multi-layer perceptron (MLP), where $x$ is the input and $W, b$ are the parameters (to be learned). We will use $\text{ReLU}$ as the non-linear activation function, defined elementwise as $\text{ReLU}(z) = \max(0, z)$; it is a built-in function available in Kokoyi.

Click here to reveal the answer, then copy-paste it into the cell below:
\Module{MLP}{x; W, b}
L \gets |W| \\
h[0 \leq i \leq L] \gets
    \begin{cases}
        x & i = 0 \\
        \ReLU (W[i-1] @ h[i-1] + b[i-1]) & i < L \\
        W[i-1] @ h[i-1] + b[i-1] & \text{otherwise} \\
    \end{cases} \\
\Return h[L] \\
\EndModule

Here we first get the number of layers, then define the transformation at each layer. Note how layers (and arrays in general) are defined and indexed with brackets. The pattern is very much like recursion: 1) specify the boundary conditions (using cases), 2) write the transition function (on the right-hand side), and 3) specify the iteration range (on the left-hand side).
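To see what the recursion computes, here is a plain NumPy sketch of the same forward pass (a hypothetical, framework-agnostic rendering; the layer sizes below are made up for illustration and are not part of the Kokoyi module):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def mlp_forward(x, W, b):
    """Mirror of the MLP module: h[0] = x, every hidden layer applies
    ReLU after the affine map, and the last layer is affine only."""
    L = len(W)
    h = x
    for i in range(1, L + 1):
        h = W[i - 1] @ h + b[i - 1]
        if i < L:  # ReLU on all but the final layer ("otherwise" case)
            h = relu(h)
    return h

# Made-up sizes: 784 -> 128 -> 10 (e.g., a flattened 28x28 MNIST digit).
rng = np.random.default_rng(0)
W = [rng.standard_normal((128, 784)) * 0.01,
     rng.standard_normal((10, 128)) * 0.01]
b = [np.zeros(128), np.zeros(10)]
logits = mlp_forward(rng.standard_normal(784), W, b)
print(logits.shape)  # (10,): one score per digit class
```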

Initialization. We have just used Kokoyi to write our module. Its initialization is just like that of a standard PyTorch module, and Kokoyi automatically establishes the names of the modules:

However, you can also let Kokoyi set it up and just fill in the blanks. To do so, place the cursor in the cell of a Kokoyi module and hit the button in the top menu.

Click here to see the default initialization code generated by Kokoyi for this model:
class MLP(torch.nn.Module):
    def __init__(self):
        """ Add your code for parameter initialization here (not necessarily the same names)."""
        super().__init__()
        self.W = None
        self.b = None

    def get_parameters(self):
        """ Change the following code to return the parameters as a tuple in the order of (W, b)."""
        return None

    forward = kokoyi.symbol["MLP"]

Now you can build an MLP classifier by instantiating one:

CNN Classifier

Before we move on, let's define a model that uses CNNs in addition to the MLP. CNNs process images much more efficiently, thanks to the inductive bias they build in for visual data.

We will use a $ConvBlock$ module to extract features from the input images and feed the resulting features into the $MLP$ module. $ConvBlock$ consists of two convolutions, each followed by a rectified linear unit ($ReLU$). Note that $Flatten$ is a built-in Kokoyi function that flattens its input by reshaping it into a one-dimensional tensor.
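The ingredients of $ConvBlock$ can be illustrated with a naive NumPy sketch (hypothetical: a single channel, no padding, stride 1, and made-up kernels; real Conv2d layers handle multiple channels, strides, and padding):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive single-channel 'valid' cross-correlation: slide the kernel
    over the image and take the elementwise-product sum at each position."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(z):
    return np.maximum(z, 0.0)

# Two conv + ReLU stages followed by a flatten, mirroring ConvBlock + Flatten.
img = np.random.default_rng(0).standard_normal((28, 28))
k1 = np.full((3, 3), 1 / 9)  # made-up 3x3 averaging kernels
k2 = np.full((3, 3), 1 / 9)
features = relu(conv2d_valid(relu(conv2d_valid(img, k1)), k2))
flat = features.reshape(-1)  # Flatten: reshape into a one-dimensional tensor
print(flat.shape)  # (576,): 28 -> 26 -> 24, then 24 * 24 = 576
```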

Similarly, you need to initialize the CNN module, as in the cell below, or you can use the auto-init feature.

Click here to see the default initialization code generated by Kokoyi for this model:
class ConvBlock(torch.nn.Module):
    def __init__(self):
        """ Add your code for parameter initialization here (not necessarily the same names)."""
        super().__init__()
        self.Conv2d_0 = None
        self.Conv2d_1 = None

    def get_parameters(self):
        """ Change the following code to return the parameters as a tuple in the order of (Conv2d_0, Conv2d_1)."""
        return None

    forward = kokoyi.symbol["ConvBlock"]


class CNN(torch.nn.Module):
    def __init__(self):
        """ Add your code for parameter initialization here (not necessarily the same names)."""
        super().__init__()
        self.ConvBlock = None
        self.MLP = None

    def get_parameters(self):
        """ Change the following code to return the parameters as a tuple in the order of (ConvBlock, MLP)."""
        return None

    forward = kokoyi.symbol["CNN"]

Note that we call kokoyi.nn.Conv2d, not torch.nn.Conv2d, in the ConvBlock definition. NN modules in Kokoyi are basically the same as NN modules in torch, but with some changes to facilitate auto-batching. As you may have noticed, all definitions in Kokoyi are written on a single sample; auto-batching refers to the Kokoyi compiler's ability to batch samples automatically during training. There is a separate note on how to port PyTorch modules into Kokoyi, as this is a slightly more advanced topic. For now, all you need to know is that configuring a Kokoyi module takes the same parameters as in PyTorch.

Now you can build a $CNN$ classifier by instantiating one:

Loss. We use the standard cross-entropy loss:
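For reference, the cross-entropy loss for a single sample with logits $z$ and true class $y$ is $-\log \text{softmax}(z)_y$. A minimal NumPy sketch (not the Kokoyi/PyTorch implementation):

```python
import numpy as np

def cross_entropy(logits, label):
    """-log softmax(logits)[label], shifted by the max logit
    for numerical stability."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

# A confident, correct prediction gives a small loss...
print(cross_entropy(np.array([10.0, 0.0, 0.0]), 0))
# ...while a confidently wrong one gives a large loss.
print(cross_entropy(np.array([10.0, 0.0, 0.0]), 1))
```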

Training loop and data pipeline

We will use MNIST from torchvision, which is basically a collection of 2D images (of handwritten digits) and their class labels (from 0 to 9). Let's continue with some basic setup.
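At its core, the data pipeline shuffles the dataset and slices it into mini-batches each epoch. Here is a hypothetical NumPy sketch of just that logic, with toy stand-ins for the MNIST arrays (torchvision's DataLoader adds loading, transforms, and workers on top):

```python
import numpy as np

def minibatches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) mini-batches for one epoch."""
    idx = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        take = idx[start:start + batch_size]
        yield images[take], labels[take]

# Toy stand-ins for MNIST: 100 "images" of 28x28 with labels 0..9.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 28, 28))
y = rng.integers(0, 10, size=100)

batches = list(minibatches(X, y, batch_size=32, rng=rng))
print(len(batches), batches[0][0].shape)  # 4 batches; the first is (32, 28, 28)
```

The last batch is smaller (4 samples here); real loaders let you choose whether to drop it.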

Now we can start training both our $MLP$ model and $CNN$ model using the standard SGD training loop.
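The shape of the loop is the same regardless of the model. As a hypothetical stand-in for the real MNIST loop, this NumPy sketch trains a one-layer softmax classifier on toy two-blob data with plain SGD (data, sizes, and learning rate are all made up):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs in 2D, labeled 0 and 1.
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

W = np.zeros((2, 2))  # one linear layer: 2 features -> 2 classes
b = np.zeros(2)
lr = 0.1

for epoch in range(20):
    for i in rng.permutation(len(X)):           # SGD: one sample at a time
        z = W @ X[i] + b
        p = np.exp(z - z.max()); p /= p.sum()   # softmax probabilities
        grad = p.copy(); grad[y[i]] -= 1.0      # d(cross-entropy)/d(logits)
        W -= lr * np.outer(grad, X[i])          # gradient step on W
        b -= lr * grad                          # gradient step on b

preds = np.argmax(X @ W.T + b, axis=1)
print((preds == y).mean())  # training accuracy on the toy data
```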

Finally, we can test the model accuracy with similar code.
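The test-time metric is just the fraction of samples whose argmax over the logits matches the label; a minimal sketch (the logits and labels below are made up):

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    return float((np.argmax(logits, axis=1) == labels).mean())

logits = np.array([[2.0, 0.1, 0.0],   # predicts class 0 (correct)
                   [0.0, 3.0, 1.0],   # predicts class 1 (correct)
                   [0.5, 0.2, 0.1]])  # predicts class 0 (label is 2)
labels = np.array([0, 1, 2])
print(accuracy(logits, labels))  # 2 of 3 correct -> 0.666...
```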