Token-level Direct Preference Optimization - Zihao Tang
https://ishikura-a.github.io/posts/TDPO/