ECCC-Report TR05-088https://eccc.weizmann.ac.il/report/2005/088Comments and Revisions published for TR05-088en-usTue, 16 Aug 2005 17:10:18 +0300
Paper TR05-088
| Learning Juntas in the Presence of Noise |
Jan Arpe
https://eccc.weizmann.ac.il/report/2005/088The combination of two major challenges in machine learning is investigated: dealing with large amounts of irrelevant information and learning from noisy data. It is shown that large classes of Boolean concepts that depend on a small number of variables---so-called juntas---can be learned efficiently from random examples corrupted by random attribute and classification noise.
To accomplish this goal, a two-phase algorithm is presented that copes with several problems arising from the presence of noise: firstly, a suitable method for approximating Fourier coefficients in the presence of noise is applied to infer the relevant variables. Secondly, as one cannot simply read off a truth table from the examples as in the noise-free case, an alternative method to build a hypothesis is established and applied to the examples restricted to the relevant variables.
In particular, for the class of monotone juntas depending on $d$ out of $n$ variables, the sample complexity is polynomial in $\log(n/\delta)$, $2^d$, $\gamma_a^{-d}$, and $\gamma_b^{-1}$, where $\delta$ is the confidence parameter and $\gamma_a,\gamma_b>0$ are noise parameters bounding the noise rates away from $1/2$. The running time is bounded by the sample complexity times a polynomial in $n$.
So far, all results hold for the case of uniformly distributed examples---the only case that (apart from side notes) has been studied in the literature yet. We show how to extend our methods to non-uniformly distributed examples and derive new results for monotone juntas.
For the attribute noise, we have to assume that it is generated by a product distribution since otherwise fault-tolerant learning is in general impossible: we construct a noise distribution $P$ and a concept class $\C$ such that it is impossible to learn $\C$ under $P$-noise.
Tue, 16 Aug 2005 17:10:18 +0300https://eccc.weizmann.ac.il/report/2005/088