https://ishikura-a.github.io/posts/SimPO/
SimPO: Simple Preference Optimization with a Reference-Free Reward - Zihao Tang