Matrix powering, and more generally iterated matrix multiplication, is a fundamental linear-algebraic primitive with myriad applications in computer science. Of particular interest is the problem's space complexity, as it constitutes the main route towards resolving the $\mathbf{BPL}$ vs. $\mathbf{L}$ problem. The seminal work of Saks and Zhou (FOCS 1995) gives a deterministic algorithm for approximating the product of $n$ stochastic matrices of dimension $w \times w$ in space $O(\log^{3/2}n + \sqrt{\log n} \cdot \log w)$. The first improvement upon Saks-Zhou was achieved by Hoza (RANDOM 2021), who gave a logarithmic improvement in the $n = \mathrm{poly}(w)$ regime, attaining $O(\frac{1}{\sqrt{\log\log n}} \cdot \log^{3/2}n)$ space.
We give the first polynomial improvement over Saks-Zhou. Our algorithm achieves space complexity of
$$
\widetilde{O}\left(\log n + \sqrt{\log n}\cdot \log w\right).
$$
In particular, in the regime $\log n > \log^2 w$, our algorithm runs in nearly-optimal $\widetilde{O}(\log n)$ space, improving upon the previous best $O(\log^{3/2}n)$.
To obtain our result for the special case of matrix powering, we incorporate recent machinery from time- and space-bounded Laplacian solvers into the Saks-Zhou framework, and devise an intricate precision-alternating recursive scheme. This enables us to bypass the bottleneck of paying $\log n$ space per recursion level. The general case of iterated matrix multiplication poses several additional challenges, the most substantial of which is handled by devising an improved shift-and-truncate mechanism. The new mechanism is made possible by a novel use of Richardson iteration.
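To illustrate the precision-boosting role Richardson iteration plays in this line of work, here is a minimal, self-contained NumPy sketch (not the paper's space-bounded implementation): given a coarse approximate inverse $B_0 \approx A^{-1}$, the iterate $B_k = \sum_{i=0}^{k} (I - B_0 A)^i B_0$ satisfies $I - B_k A = (I - B_0 A)^{k+1}$, so the error contracts geometrically. The function name and test matrices are illustrative choices, not from the paper.

```python
import numpy as np

def richardson_inverse(A, B0, k):
    """Boost a coarse approximate inverse B0 of A via Richardson iteration.

    Returns B_k = sum_{i=0}^{k} (I - B0 @ A)^i @ B0, whose residual
    satisfies I - B_k @ A = (I - B0 @ A)^(k+1): a crude inverse with
    constant error becomes exponentially accurate in k.
    """
    n = A.shape[0]
    E = np.eye(n) - B0 @ A   # initial residual
    P = np.eye(n)            # running power E^i
    B = B0.copy()
    for _ in range(k):
        P = P @ E
        B = B + P @ B0
    return B

# Illustrative usage: A is a small perturbation of the identity, so the
# identity itself is a coarse approximate inverse; a few Richardson
# steps drive the residual down geometrically.
rng = np.random.default_rng(0)
A = np.eye(4) + 0.05 * rng.standard_normal((4, 4))
B = richardson_inverse(A, np.eye(4), 10)
residual = np.linalg.norm(np.eye(4) - B @ A)
```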