We show that if Arthur-Merlin protocols can be derandomized, then there is a Boolean function computable in deterministic exponential-time with access to an NP oracle, that cannot be computed by Boolean circuits of exponential size. More formally, if $\mathrm{prAM}\subseteq \mathrm{P}^{\mathrm{NP}}$ then there is a Boolean function in $\mathrm{E}^{\mathrm{NP}}$ that requires circuits of size $2^{\Omega(n)}$. prAM is the class of promise problems that have Arthur-Merlin protocols, $\mathrm{P}^{\mathrm{NP}}$ is the class of functions that can be computed in deterministic polynomial-time with an NP oracle and $\mathrm{E}^{\mathrm{NP}}$ is its exponential analogue. The lower bound in the conclusion of our theorem suffices to construct very strong pseudorandom generators.
We also show that the same conclusion holds if the problem of approximate counting the number of accepting paths of a nondeterministic Turing machine up to multiplicative factors can be done in nondeterministic polynomial-time. In other words, showing nondeterministic fully polynomial-time approximation schemes for $\sharp\mathrm{P}$-complete problems require proving exponential-size circuit lower bounds.
A few works have already shown that if we can find efficient deterministic solutions to some specific tasks (or classes) that are known to be solvable efficiently by randomized algorithms (or proofs), then we obtain lower bounds against certain circuit models. These lower bounds were only with respect to polynomial-size circuits even if full derandomization is assumed. Thus they only implied fairly weak pseudorandom generators (if at all).
A key ingredient in our proof is a connection between computational learning theory and exponential-size lower bounds. We show that the existence of deterministic learning algorithms with certain properties implies exponential-size lower bounds, where the complexity of the hard function is related to the complexity of the learning algorithm.