Producing a small DNF expression consistent with given data is a classical problem in computer science that occurs in a number of forms and has numerous applications. We consider two standard variants of this problem. The first is two-level logic minimization, that is, finding a minimum DNF formula consistent with a given complete truth table (TT-MINDNF). This problem was formulated by Quine in 1952 and has since been one of the key problems in logic design. It was proved NP-complete by Masek in 1979.
The best known polynomial-time approximation algorithm is based on a reduction to the SET-COVER problem and produces a DNF formula of size $O(d\cdot \mbox{OPT})$, where $d$ is the number of variables. We prove that TT-MINDNF is NP-hard to approximate within $d^\gamma$ for some constant $\gamma > 0$, establishing the first inapproximability result for the problem.
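The reduction to SET-COVER mentioned above can be illustrated with a minimal sketch, which is not taken from the paper. It assumes the truth table is a dictionary `f` mapping each $d$-bit tuple to 0 or 1; the names `implicants` and `greedy_dnf` are hypothetical. The elements to cover are the inputs on which `f` is 1, the candidate sets are the terms that cover no 0 of `f`, and the standard greedy set-cover rule picks terms. Since there are at most $2^d$ elements, the usual logarithmic greedy guarantee gives a DNF of size $O(d\cdot \mbox{OPT})$.

```python
from itertools import product


def implicants(f, d):
    """Yield (term, covered) for every term that covers no 0 of f.

    A term is a tuple over {0, 1, None}; None means the variable is absent.
    """
    points = list(product((0, 1), repeat=d))
    for term in product((0, 1, None), repeat=d):
        covered = frozenset(
            x for x in points
            if all(t is None or t == b for t, b in zip(term, x))
        )
        if all(f[x] for x in covered):
            yield term, covered


def greedy_dnf(f, d):
    """Greedy set cover: pick the implicant covering the most uncovered 1s."""
    uncovered = {x for x, value in f.items() if value}
    candidates = list(implicants(f, d))
    dnf = []
    while uncovered:
        term, covered = max(candidates, key=lambda tc: len(tc[1] & uncovered))
        dnf.append(term)
        uncovered -= covered
    return dnf


# Example: the OR of two variables.
f_or = {x: int(x[0] or x[1]) for x in product((0, 1), repeat=2)}
print(greedy_dnf(f_or, 2))
```

The sketch enumerates all $3^d$ terms, so its running time is polynomial in the truth-table size $2^d$ but not in $d$. The loop always terminates because every uncovered input is covered by its own full-length term.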
The other DNF minimization problem we consider is PAC learning of DNF expressions when the learning algorithm must output a DNF expression as its hypothesis (referred to as proper learning). We prove that DNF expressions are NP-hard to PAC learn properly even when the learner has access to membership queries, thereby answering a long-standing open question of Valiant (1984). Finally, we provide a concrete connection between these two variants of the DNF minimization problem. Specifically, we prove that inapproximability of TT-MINDNF implies hardness results for restricted proper learning of DNF expressions with membership queries, even when learning with respect to the uniform distribution only.