Given a set of $n$ random $d$-dimensional boolean vectors with the promise that two of them are $\rho$-correlated with each other, how quickly can one find the two correlated vectors? We present a surprising and simple algorithm which, for any constant $\epsilon>0$, runs in expected time $d \cdot n^{\frac{3\omega}{4}+\epsilon} \cdot \mathrm{poly}(\frac{1}{\rho}) < d \cdot n^{1.8} \cdot \mathrm{poly}(\frac{1}{\rho})$, where $\omega<2.4$ is the exponent of matrix multiplication. This is the first subquadratic-time algorithm for this problem with polynomial dependence on $\frac{1}{\rho}$, and it improves upon the $O(d n^{2-O(\rho)})$ runtime given by Paturi et al., the `Locality Sensitive Hashing' approach of Gionis et al., and the `Bucketing Codes' approach of Dubiner.
This basic algorithm also applies to the adversarial setting, and extensions of it yield improved algorithms for several other problems, including learning sparse parities with noise and learning juntas both with and without noise.
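The abstract does not describe the algorithm itself, but the basic trick underlying this line of work is vector aggregation: sum the vectors within groups, compare all pairs of group sums with a single matrix multiplication, and brute-force only inside the flagged pair of groups. The sketch below illustrates that idea only; it is not the paper's full algorithm (which applies fast matrix multiplication and further amplification to reach the stated exponent). The function name and the `group_size` parameter are ours, and the sketch assumes $\pm 1$ vectors with $d$ large relative to $(\texttt{group\_size}/\rho)^2$ so that the correlated pair's contribution $\approx \rho d$ dominates the noise in the group-sum inner products.

```python
import numpy as np

def find_correlated_pair(X, group_size):
    """Illustrative sketch (hypothetical helper, not the paper's algorithm).

    X: (n, d) array of +/-1 vectors, two rows rho-correlated.
    Assumes d >> (group_size / rho)^2 so the signal stands out, and for
    simplicity ignores the remainder rows and the case where both
    correlated vectors land in the same group.
    """
    n, d = X.shape
    m = n // group_size
    # Sum the vectors within each group: an (m, d) matrix of group sums.
    S = X[:m * group_size].reshape(m, group_size, d).sum(axis=1)
    # All pairwise group-sum inner products via one matrix multiplication.
    G = S @ S.T
    np.fill_diagonal(G, 0)  # ignore within-group mass on the diagonal
    i, j = np.unravel_index(np.argmax(np.abs(G)), G.shape)
    # Brute-force over the two flagged groups: group_size^2 inner products.
    A = X[i * group_size:(i + 1) * group_size]
    B = X[j * group_size:(j + 1) * group_size]
    P = A @ B.T
    a, b = np.unravel_index(np.argmax(np.abs(P)), P.shape)
    return i * group_size + a, j * group_size + b
```

With $m$ groups, the aggregation step costs one $m \times m$ product of group sums instead of all $\binom{n}{2}$ pairwise comparisons, which is where the subquadratic savings come from in this simplified view.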