https://datta0.github.io/posts/transformer-showdown/
Transformer showdown MHA vs MLA vs nGPT vs Differential Transformer - Datta's Blog