Revision #1 Authors: Vitaly Feldman

Accepted on: 10th September 2008 00:00

Downloads: 2732

Keywords:

Producing a small DNF expression consistent with given data is a classical problem in computer science that occurs in a number of forms and has numerous applications. We consider two standard variants of this problem. The first one is two-level logic minimization or finding a minimum DNF formula consistent with a given complete truth table (TT-MINDNF). This problem was formulated by Quine in 1952 and has been since one of the key problems in logic design. It was proved NP-complete by Masek in 1979.

The best known polynomial approximation algorithm is based on a reduction to the SET-COVER problem and produces a DNF formula of size $O(d\cdot \mbox{OPT})$, where $d$ is the number of variables. We prove that TT-MINDNF is NP-hard to approximate within $d^\gamma$ for some constant $\gamma > 0$, establishing the first inapproximability result for the problem.

The other DNF minimization problem we consider is PAC learning of DNF expressions when the learning algorithm must output a DNF expression as its hypothesis (referred to as proper learning). We prove that DNF expressions are NP-hard to PAC learn properly even when the learner has access to membership queries, thereby answering a long-standing Valiant's open question (1984). Finally, we provide a concrete connection between these variants of DNF minimization problem. Specifically, we prove that inapproximability of TT-MINDNF implies hardness results for restricted proper learning of DNF expressions with membership queries even when learning with respect to the uniform distribution only.

Producing a small DNF expression consistent with given data is a

classical problem in computer science that occurs in a number of forms and

has numerous applications. We consider two standard variants of this

problem. The first one is two-level logic minimization or finding a minimal

DNF formula consistent with a given complete truth table (MinDNF). This

problem was formulated by Quine in 1952 and has been since one of the key

problems in logic design. It was proved NP-complete by Masek in 1979.

The best known polynomial approximation algorithm is based on a reduction to

the SET-COVER problem and produces a DNF formula of size $O(d \cdot OPT)$,

where $d$ is the number of variables. We prove that MinDNF is NP-hard to

approximate within $d^\gamma$ for some constant $\gamma > 0$, establishing

the first inapproximability result for the problem.

The other DNF minimization problem we consider is PAC learning of DNF

expressions when the learning algorithm must output a DNF expression as its

hypothesis (referred to as proper learning). We prove that DNF expressions

are NP-hard to PAC learn properly even when the learner has access to

membership queries, thereby answering a long-standing open question due to

Valiant \cite{Valiant:84}. Finally, we observe that inapproximability of

MinDNF implies hardness results for restricted proper learning of DNF

expressions with membership queries even when learning with respect to the

uniform distribution only.