A fully connected layer in a neural network is a linear combination of the outputs of the preceding layer. Technically, this is all that it is. In practice, however, people usually refer to this linear combination plus a subsequent activation function as a "linear layer," even though PyTorch and TensorFlow implement the activation function as a separate layer. Hence a linear layer is defined as

\[ \mathbf{y} = \varphi\left(W^\top \mathbf{x} + \mathbf{b}\right) \]

where \(\mathbf{x}\) is a vector containing the output states of the preceding layer; \(\varphi\) is an activation function; \(W\) is a matrix whose column vectors are the weights of each neuron in the layer; and \(\mathbf{b}\) is a vector whose elements are the bias of each neuron.
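As a minimal sketch in NumPy (the function and variable names here are my own, assuming the column-vector weight convention described above):

```python
import numpy as np

def linear_layer(x, W, b, activation=np.tanh):
    """One fully connected layer: activation(W^T x + b).

    x: shape (n_in,)  -- output states of the preceding layer
    W: shape (n_in, n_out) -- column j holds the weights of neuron j
    b: shape (n_out,) -- one bias per neuron
    """
    return activation(W.T @ x + b)

# Illustrative sizes (arbitrary): 3 inputs feeding a layer of 2 neurons.
x = np.array([0.5, -1.0, 2.0])
W = np.array([[0.1, 0.4],
              [0.2, 0.5],
              [0.3, 0.6]])
b = np.array([0.01, -0.02])
y = linear_layer(x, W, b)  # shape (2,)
```

Passing `activation=lambda z: z` (the identity) recovers the bare linear combination, which matches how PyTorch's `nn.Linear` behaves before a separate activation layer is applied.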
This is equivalent to extending the definition