Generalized Preference Optimization: A Unified Approach to Offline Alignment
attributed to: Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot
Offline preference optimization allows fine-tuning large models directly from
offline data, and has proved effective in recent alignment practices. We
propose generalized preference optimization (GPO), a family of offline losses
parameterized by a general class of convex functions. GPO enables a unified
view over preference optimization, encompassing existing algorithms such as
DPO, IPO and SLiC as special cases, while naturally introducing new variants.
The GPO framework also sheds light on how offline algorithms enforce
regularization, through the design of the convex function that defines the
loss. Our analysis and experiments reveal the connections and subtle
differences between the offline regularization and the KL divergence
regularization intended by the canonical RLHF formulation. In all, our results
offer new algorithmic toolkits and empirical insights for alignment
practitioners.
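
Below is a minimal sketch of the loss family the abstract describes: GPO minimizes the expectation of a convex function f applied to the scaled difference of policy-vs-reference log-ratios between the preferred and dispreferred responses. The function and variable names here are illustrative rather than taken from the paper, and the convex functions recover DPO, IPO, and SLiC up to constants and parameterization.

```python
import torch
import torch.nn.functional as F

def gpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, f, beta=0.1):
    """Generalized preference optimization loss for one batch (sketch).

    logp_w, logp_l         : log pi_theta(y | x) for preferred / dispreferred
                             responses under the trained policy
    ref_logp_w, ref_logp_l : the same quantities under the frozen reference
                             policy
    f                      : convex function selecting the GPO variant
    beta                   : regularization strength
    """
    # Difference of log-ratios between preferred and dispreferred responses.
    rho = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return f(beta * rho).mean()

# Convex functions recovering known special cases (up to parameterization):
dpo_f  = lambda x: F.softplus(-x)                 # logistic loss -> DPO
ipo_f  = lambda x: (x - 1.0) ** 2                 # squared loss  -> IPO
slic_f = lambda x: torch.clamp(1.0 - x, min=0.0)  # hinge loss    -> SLiC
```

Swapping f changes the tail behavior of the loss and hence the strength and shape of the implicit offline regularization, which is the design axis the GPO framework makes explicit.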
Vulnerabilities & Strengths