https://noisrucer.github.io/posts/vit/
[Paper] ViT: An Image is worth 16x16 words with Full PyTorch Implementation - noisrucer