We formalize AI alignment as a multi-objective optimization problem called $\langle M,N,\varepsilon,\delta\rangle$-agreement, in which a set of $N$ agents (including humans) must reach approximate ($\varepsilon$) agreement across $M$ candidate objectives, with probability at least $1-\delta$.
Analyzing communication complexity, we prove an information-theoretic lower bound showing that once either $M$ or ...
more >>>
Given a random permutation $f: [N] \to [N]$ as a black box and $y \in [N]$, we want to output $x = f^{-1}(y)$. Supplementary to our input, we are given classical advice in the form of a pre-computed data structure; this advice can depend on the permutation but \emph{not} on ... more >>>