Writing a Custom Layer in PyTorch
Background
By now you may have come across the position paper, PyTorch: An Imperative Style, High-Performance Deep Learning Library, presented at the 2019 Conference on Neural Information Processing Systems (NeurIPS). The paper promotes PyTorch as a Deep Learning framework that balances usability with pragmatic performance, sacrificing neither. Read the paper and judge for yourself.
Here, I highlight just one aspect: the ease of creating your own custom Deep Learning layer as part of a neural network (NN) model. Typically, you'll reuse the existing layers, but every so often you'll need to create your own. I show how straightforward that is. Once you have defined the model, there's plenty of work ahead of you, such as the choice of optimizer, the learning rate (and many other hyper-parameters), and your scale-up (GPUs per node) and scale-out (number of nodes) strategy. All of these need your attention. Thankfully, PyTorch makes the task of model creation natural and intuitive. Here, I build on the code snippets used in the paper and create a working example out of them.
Neural network models started out as simple sequences of feed-forward layers, i.e., directed acyclic graphs (DAGs), but have evolved into "code" often composed of loops and recursive functions. A simple expression of a recurrent model (in PyTorch) is shown below.
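Here is a minimal sketch of what such a model can look like. The class name SimpleRNN and its internals are my own illustration (a real application would likely use torch.nn.RNN); the point is that the recurrence is just a Python loop:

```python
import torch
from torch import nn

class SimpleRNN(nn.Module):
    """A recurrent model whose time-step recurrence is a plain Python loop."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, inputs):
        # inputs has shape (seq_len, batch, input_size)
        h = torch.zeros(inputs.size(1), self.hidden_size)
        for x in inputs:  # an ordinary for-loop drives the recurrence
            h = torch.tanh(self.i2h(torch.cat((x, h), dim=1)))
        return h
```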
PyTorch preserves the imperative programming model of Python. As shown above, the order of the operations is defined in the code, and the computation graph is built (or conditionally rebuilt) at run time. Note that the code performing the forward-pass computations also creates the data structure needed for back-propagation, so your custom layer must be differentiable (it goes without saying, yet it is important to keep in mind). With this code-as-a-model approach, PyTorch ensures that virtually any new neural network architecture can be implemented easily with Python classes. I illustrate this concept with a simple example.
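To make the point about the forward pass recording the backward graph concrete, here is a toy snippet of my own (not from the paper): the forward computation records the graph, and backward() traverses it.

```python
import torch

x = torch.ones(2, requires_grad=True)
y = (3 * x).sum()  # the forward pass builds the autograd graph
y.backward()       # the recorded graph is traversed to compute gradients
print(x.grad)      # prints: tensor([3., 3.])
```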
A neural network model (in any framework) is usually represented as a class composed of individual layers. As is idiomatic in Python, a PyTorch class's constructor creates and initializes the model parameters, and its forward method processes the input in the forward direction.
The Custom Layer
Below we define MyLinearLayer, a custom layer used as a building block for our model, called BasicModel. In reality, MyLinearLayer is our own version of the library-provided Linear layer. Note, this is meant to be an example implementation that highlights how simple and natural it is to create a custom layer. You can substitute the library-provided version and expect the same result.
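A sketch along these lines (the exact code lives in the gist linked below and may differ in detail; the initialization here merely mimics what torch.nn.Linear does by default, and it assumes the imports listed at the end of the article):

```python
class MyLinearLayer(nn.Module):
    """Our own version of the library-provided Linear layer: y = x @ W.T + b."""

    def __init__(self, size_in, size_out):
        super().__init__()
        self.size_in, self.size_out = size_in, size_out
        # nn.Parameter registers each tensor as a trainable model parameter
        self.weights = nn.Parameter(torch.empty(size_out, size_in))
        self.bias = nn.Parameter(torch.empty(size_out))
        # Kaiming-style initialization, similar to torch.nn.Linear's default
        nn.init.kaiming_uniform_(self.weights, a=math.sqrt(5))
        bound = 1 / math.sqrt(size_in)
        nn.init.uniform_(self.bias, -bound, bound)

    def forward(self, x):
        # every operation here is differentiable, so autograd can
        # back-propagate through this layer with no extra work
        return torch.mm(x, self.weights.t()) + self.bias
```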
The Model
Below, the MyLinearLayer custom class (from above) is used as a building block for a simple yet complete neural network model.
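A minimal sketch (the layer sizes are my choice for illustration and need not match the gist):

```python
class BasicModel(nn.Module):
    """A small model mixing a library-provided layer with our custom layer."""

    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(8, 4)      # library-provided layer
        self.linear2 = MyLinearLayer(4, 2)  # our custom layer

    def forward(self, x):
        x = torch.relu(self.linear1(x))
        return self.linear2(x)
```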
The Main Program
Below you see the steps to instantiate the model, create a sample input, and apply the model's forward pass to the input.
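Continuing the sketch above (the shapes match the sizes I chose for BasicModel):

```python
model = BasicModel()
x = torch.randn(3, 8)  # a sample batch: 3 inputs with 8 features each
y = model(x)           # calling the model invokes forward() under the hood
print(y.shape)         # prints: torch.Size([3, 2])
```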
The complete working version is available on GitHub at this gist.
The Obligatory Imports
As a footnote, I’m adding the obligatory imports.
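For the sketches above, these are:

```python
import math
import torch
from torch import nn
```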