A fully connected layer in a neural network is a linear combination of the outputs of the preceding layer. Technically, this is all that it is. However, people generally refer to this linear combination plus a subsequent activation function as a “linear layer,” even though PyTorch and TensorFlow implement the activation function as a separate layer. Hence a linear layer is defined as

$$\mathbf{y} = \sigma\left(W^{\top}\mathbf{x} + \mathbf{b}\right),$$

where

  • $\mathbf{x} \in \mathbb{R}^m$ is a vector containing the output states of the preceding layer;
  • $\sigma$ is an activation function, applied element-wise;
  • $W \in \mathbb{R}^{m \times n}$ is a matrix whose column vectors are the weights of each neuron in the layer; and
  • $\mathbf{b} \in \mathbb{R}^n$ is a vector whose elements are the bias for each neuron.
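The definition above can be sketched directly in NumPy. This is a minimal illustration, not a framework implementation; the shapes and the choice of `tanh` as a stand-in for a generic $\sigma$ are assumptions for the example.

```python
import numpy as np

def linear_layer(x, W, b, activation=np.tanh):
    """One linear layer: sigma(W^T x + b).

    x : (m,)    outputs of the preceding layer
    W : (m, n)  column j holds the weights of neuron j
    b : (n,)    one bias per neuron
    Returns the n outputs of the layer.
    """
    return activation(W.T @ x + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)       # m = 4 inputs from the preceding layer
W = rng.standard_normal((4, 3))  # n = 3 neurons in this layer
b = rng.standard_normal(3)

y = linear_layer(x, W, b)
print(y.shape)  # (3,)
```

Because each neuron's weights occupy one column of $W$, the product $W^{\top}\mathbf{x}$ computes all $n$ dot products at once.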

This is equivalent to extending the definition of a single neuron to $n$ neurons. When at least two linear layers (including the output layer) are joined, the network is called a multi-layer perceptron.
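Joining layers amounts to feeding one layer's output vector into the next. A minimal sketch of such a multi-layer perceptron (NumPy, with assumed layer sizes and `tanh` as the hidden activation):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(4)  # outputs of the preceding (input) layer

# Hidden layer: 4 inputs -> 5 neurons; output layer: 5 inputs -> 2 neurons.
W1, b1 = rng.standard_normal((4, 5)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((5, 2)), rng.standard_normal(2)

h = np.tanh(W1.T @ x + b1)  # first linear layer plus activation
y = W2.T @ h + b2           # output layer, left without an activation here
print(y.shape)  # (2,)
```

Without the nonlinearity between them, the two layers would collapse into a single linear map, which is why the activation function matters.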