“Softmax cross-entropy loss” is literally just applying softmax to the classifier’s final-layer outputs (the logits), then computing cross-entropy loss on the resulting probabilities. There’s nothing more to it than that.
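To make that concrete, here’s a minimal NumPy sketch of the two steps (the function names `softmax` and `cross_entropy` are just for illustration, not any library’s API):

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability; this
    # doesn't change the result because softmax is shift-invariant.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, label):
    # Cross-entropy with a one-hot target reduces to the
    # negative log-probability of the true class.
    return -np.log(probs[label])

logits = np.array([2.0, 1.0, 0.1])
label = 0

probs = softmax(logits)           # step 1: softmax on the logits
loss = cross_entropy(probs, label)  # step 2: cross-entropy on the result
print(round(loss, 4))
```

In practice, frameworks fuse the two steps into one op (e.g. working directly with log-probabilities) for numerical stability, but the math is exactly this composition.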