The paper that introduced residual networks (“ResNets”) also published pre-trained models at various depths (18, 34, 50, 101, 152 layers). These models were all trained on ImageNet, which has over a million images spanning 1,000 classes. As such, they can be used to produce embeddings at various points along the performance–latency tradeoff curve. Although pre-trained ViT models have since far surpassed these ResNets in accuracy, ResNets remain relevant due to their low deployment cost.
You will see references to these pre-trained ResNets as, e.g., “ResNet101” or “ResNet18,” where the number denotes the depth.