https://ishikura-a.github.io/posts/DRO-For-LM/
Distributionally Robust Optimization For Language Modeling - Zihao Tang