Q: What part of the alignment problem does this plan aim to solve?
A: None directly. It aims to empower alignment researchers and accelerate their research.
Q: Why has that part of the alignment problem been chosen?
A: It leverages steadily increasing LLM capabilities to assist researchers without automating the researchers themselves, since LLMs cannot yet be trusted with that.
Q: How does this plan aim to solve the problem?
A: Create a human-in-the-loop system in which a human hands over safe tasks to an LLM. Cyborgism provides tools and friendly user interfaces, such as LOOM, to get the best out of LLMs, and Cyborgist writings have historically deconfused large parts of the community about how to think about LLMs.
Q: What evidence is there that the methods will work?
A: Cyborgist tools have not had clear success stories yet, but Cyborgist posts have been popular and are referenced in works dealing with LLMs.
Q: What are the most likely causes of this not working?
A: Cyborgism does not currently privilege alignment research, so any acceleration that alignment research gets will likely also shorten the time until transformative, agentic AGI arrives.
attributed to: Accelerate Alignment Research
https://www.alignmentforum.org/posts/bxt7uCiHam4QXrQAA/cyborgism
This post proposes a strategy for safely accelerating alignment research. The plan is to set up human-in-the-loop systems that empower human agency rather than outsource it, and to use those systems to differentially accelerate progress on alignment.