TR15-025 Authors: Shay Moran, Amir Shpilka, Avi Wigderson, Amir Yehudayoff

Publication: 22nd February 2015 07:06

Downloads: 2157

Keywords:

In this work we study the quantitative relation between VC-dimension and two other basic parameters related to learning and teaching. We present relatively efficient constructions of {\em sample compression schemes} and

for classes of low VC-dimension. Let $C$ be a finite boolean concept class of VC-dimension $d$. Set $k = O(d 2^d \log \log |C|)$.

We construct sample compression schemes of size $k$ for $C$, with additional information of $k \log(k)$ bits. Roughly speaking, given any list of $C$-labelled examples of arbitrary length, we can retain only $k$ labeled examples in a way that allows to recover the labels of all others examples in the list.

We also prove that there always exists a concept $c$ in $C$ with a teaching set (i.e. a list of $c$-labelled examples uniquely identifying $c$) of size $k$. Equivalently, we prove that the recursive teaching dimension of $C$ is at most $k$.

The question of constructing sample compression schemes for classes of small VC-dimension was suggested by Littlestone and Warmuth (1986), and the problem of constructing teaching sets for classes of small VC-dimension was suggested by Kuhlmann (1999). Previous constructions for general concept classes yielded size $O(\log |C|)$ for both questions, even when the VC-dimension is constant.