https://mpost.io/vi/openai-new-process-supervised-reward-modeling-improves-ai-reasoning/