We study the arithmetic complexity of iterated matrix multiplication. We show that any multilinear homogeneous depth $4$ arithmetic formula computing the product of $d$ generic matrices of size $n \times n$, IMM$_{n,d}$, has size $n^{\Omega(\sqrt{d})}$ as long as $d \leq n^{1/10}$. This improves the result of Nisan and Wigderson (Computational Complexity, 1997) for depth $4$ set-multilinear formulas.
We also study $\Sigma\Pi^{(O(d/t))}\Sigma\Pi^{(t)}$ formulas, which are depth $4$ formulas with the stated bounds on the fan-ins of the $\Pi$ gates. A recent depth reduction result of Tavenas (MFCS, 2013) shows that any $n$-variate degree $d = n^{O(1)}$ polynomial computable by a circuit of size poly$(n)$ can also be computed by a depth $4$ $\Sigma\Pi^{(O(d/t))}\Sigma\Pi^{(t)}$ formula of top fan-in $n^{O(d/t)}$. We show that any such formula computing IMM$_{n,d}$ has top fan-in $n^{\Omega({d/t})}$, proving the optimality of Tavenas' result. This also strengthens a result of Kayal, Saha, and Saptharishi (ECCC, 2013) which gives a similar lower bound for an explicit polynomial in VNP.