Full Abstract: As algorithms become increasingly data-driven, the greatest lever we have left to make them robustly beneficial to mankind lies in the design of their objective functions. Robust alignment aims to address this design problem. Arguably, the growing importance of social media recommender systems makes it an urgent problem, for instance to adequately automate hate speech moderation. In this paper, we propose a preliminary research program for robust alignment. This roadmap aims to decompose the end-to-end alignment problem into numerous more tractable subproblems. We hope that each subproblem is sufficiently orthogonal to the others to be tackled independently, and that combining the solutions to all such subproblems may yield a solution to alignment.