We show that a DNF formula that has a CNF representation that contains
at least one ``$1/poly$-heavy''
clause with respect to a distribution $D$ is weakly learnable
under this distribution. So DNF that are not weakly
learnable under the distribution $D$ do not have
any ``$1/poly$-heavy'' clauses in any of their CNF representations.
We then show that $\tau$-CDNF, a DNF $f$ that has a CNF representation that
contains $poly(n)$ clauses that $\tau$-approximates
$f$
according to a distribution $D$,
is weakly ${\tau}+\epsilon$-PAC-learnable
with membership queries under the distribution $D$.
We then show how to change our algorithm to a parallel
algorithm that runs in polylogarithmic time
with a polynomial number of processors.
In particular, decision trees are (strongly) PAC-learnable with
membership queries under any distribution in parallel
in polylogarithmic time with a polynomial number of processors.
Finally we show that no efficient parallel exact learning
algorithm exists for decision trees.