The “science of designing constitutions for AI” is a real direction worth pursuing, but in an important sense it is also just “philosophy.”
Andy Hall (@ahall_research)
I'm convinced we need a whole science of how to design constitutions for AI.
How should we positively align AI so that it helps us pursue the "good" when we don't all agree on what is good?
Historically, alignment discussions have often hand-waved and said we'll align to "human values" without confronting the fact that there are many values we don't all share. This paper has some really interesting thoughts (particularly in section 5) on how to design governance for AI in a way that addresses this reality.
The core of the governance problem is this: how do you write a constitution that binds the model globally but doesn't impose values that aren't universally held?
This was exactly the same problem we dealt with in social media. And it's striking that many of the solutions the paper proposes were first tried in social media, such as:
--Decentralize wherever you can. Avoid global rules as much as possible, committing hard to only the rules that you're totally willing to stand behind. This is the idea behind subreddits, etc.
--Where you have global rules, try to write them democratically. We did the first pilot of a "citizens' assembly" for Meta years ago, and it's been cool to see these adapted for AI. I'm a bit skeptical that they'll ever get to the point where they write binding rules, but it's a valuable experiment. We could also imagine an electoral version where users elect representatives who write the constitution for them.
--Encourage a marketplace of options. This was the idea of "middleware" for social media, which was wiped out by GDPR and privacy regs. We see nascent signs of the companies competing on constitutional vibes, but it would probably be good to see a marketplace of options within each model, too.
Ultimately, the clash occurs where some people think a rule should be imposed globally, as a baked-in feature of overall "alignment," and others want customization (what this paper calls pluralism). In social media, this was the misinformation battle: the left wanted centrally imposed informational rules, and the right wanted to see what they wanted to see.
In AI, the same thing is going to happen. Some people will push hard to bake centralized rules into the constitution on matters that other people will want left to their own free choice. We saw hints of this with the Anthropic-DoW battle, but there's a lot more to come, on many more issues.
It's good that we're thinking about these things now before those big blow-ups start to happen more regularly!
— https://nitter.net/ahall_research/status/2054237900934676708#m