https://tfruan2000.github.io/posts/llm-inference-optim/
LLM Inference Optimization - Ruan Tingfeng