In this paper, we initiate the study of the computational power of adaptive and non-adaptive monotone decision trees: decision trees in which each query is a monotone function of the input bits. In the most general setting, the monotone decision tree height (or size) can be viewed as a measure of the non-monotonicity of a given Boolean function. We also study a restriction of the model in which the monotone functions that can be queried at each node are limited in terms of circuit complexity. This naturally leads to complexity classes of the form $DT(mon-\mathcal{C})$ for any circuit complexity class $\mathcal{C}$, where the height of the tree is $O(\log n)$ and the query functions can be computed by monotone circuits in the class $\mathcal{C}$. In this context, we prove the following characterizations and bounds.
$\bullet$ For any Boolean function $f$, we show that the minimum monotone decision tree height can be exactly characterized, in both the adaptive and the non-adaptive versions of the model, in terms of its alternation $alt(f)$, defined as the maximum number of times the function value changes along any chain in the Boolean lattice. We also characterize the non-adaptive decision tree height in terms of a natural generalization of the certificate complexity of a function. Similarly, we determine the complexity of non-deterministic and randomized variants of monotone decision trees in terms of $alt(f)$.
$\bullet$ We show that $DT(mon-\mathcal{C}) = \mathcal{C}$ if $\mathcal{C}$ contains monotone circuits for the threshold functions. For $\mathcal{C} = AC_0$, we show that any function in $AC_0$ can be computed by a monotone decision tree of sub-linear height whose queries have monotone $AC_0$ circuits.
$\bullet$ To understand the logarithmic-height case for $AC_0$, i.e., $DT(mon-AC_0)$, we show that for any $f$ (on $n$ bits) in $DT(mon-AC_0)$ and any positive constant $\epsilon \le 1$, there is an $AC_0$ circuit for $f$ with $O(n^\epsilon)$ negation gates.
En route to our main proofs, we study the monotone variant of the decision list model and prove corresponding characterizations in terms of $alt(f)$. As a consequence, we derive that $DT(mon-\mathcal{C}) = DL(mon-\mathcal{C})$ if $\mathcal{C}$ has appropriate closure properties (where $DL(mon-\mathcal{C})$ is defined analogously to $DT(mon-\mathcal{C})$, but for decision lists).
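As a small illustration of the two central notions above, the following Python sketch computes $alt(f)$ by brute force for a function on a few bits and evaluates a monotone decision tree. It is only a sketch under stated assumptions: inputs are encoded as integer bitmasks, a tree is encoded as nested tuples, and the names `alternation` and `evaluate` are hypothetical choices made for this example, not notation from the paper. The alternation routine relies on the fact that every chain extends to a maximal chain from $0^n$ to $1^n$ without losing any change of the function value.

```python
from typing import Callable


def alternation(f: Callable[[int], int], n: int) -> int:
    """Brute-force alt(f): the maximum number of value changes of f along
    any chain of the Boolean lattice {0,1}^n (inputs encoded as bitmasks)."""
    # dp[x] = maximum number of changes along a chain from 0^n up to x.
    dp = [0] * (1 << n)
    for x in range(1, 1 << n):
        best = 0
        for i in range(n):
            if (x >> i) & 1:
                y = x & ~(1 << i)  # immediate predecessor of x in the lattice
                best = max(best, dp[y] + (f(x) != f(y)))
        dp[x] = best
    return dp[(1 << n) - 1]


def evaluate(tree, x: int) -> int:
    """Evaluate a decision tree on input x. A tree is either a leaf (0 or 1)
    or a triple (query, subtree_if_0, subtree_if_1). The tree is a monotone
    decision tree when every query is a monotone function of the input."""
    while not isinstance(tree, int):
        query, if0, if1 = tree
        tree = if1 if query(x) else if0
    return tree


# Example functions on bitmask inputs.
def parity(x: int) -> int:
    return bin(x).count("1") % 2


def or2(x: int) -> int:   # monotone query on 2 bits
    return int(x & 0b11 != 0)


def and2(x: int) -> int:  # monotone query on 2 bits
    return int(x & 0b11 == 0b11)


# A height-2 monotone decision tree for XOR of two bits:
# if OR is 0 output 0; otherwise output 1 exactly when AND is 0.
xor_tree = (or2, 0, (and2, 1, 0))
```

For instance, `alternation(parity, 3)` evaluates to 3 and `alternation(and2, 2)` to 1, while `evaluate(xor_tree, x)` agrees with `parity(x)` on all four 2-bit inputs, even though XOR itself is not monotone.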