We give polynomial time approximation schemes for the problem
of partitioning an input set of n points into a fixed number
k of clusters so as to minimize the sum over all clusters of
the total pairwise distances in a cluster. Our algorithms work
for arbitrary metric spaces as well ...
more >>>
A standard property testing algorithm is required to determine
with high probability whether a given object has property
P or whether it is \epsilon-far from having P, for any given
distance parameter \epsilon. An object is said to be \epsilon-far
from having ...
more >>>
We consider the $K$-clustering problem with the $\ell_2^2$
distortion measure, also known as the problem of optimal
fixed-rate vector quantizer design. We provide a deterministic
approximation algorithm which works for all dimensions $d$ and
which, given a data set of size $n$, computes in time
more >>>
The present work studies clustering from an abstract point of view
and investigates its properties in the framework of inductive inference.
Any class $S$ considered is given by a numbering
$A_0,A_1,...$ of nonempty subsets of the natural numbers
or the rational k-dimensional vector space as a hypothesis space.
A clustering ...
more >>>
We prove the computational hardness of three k-clustering problems using an (almost) arbitrary Bregman divergence as dissimilarity measure: (a) The Bregman k-center problem, where the objective is to find a set of centers that minimizes the maximum dissimilarity of any input point towards its closest center, and (b) the Bregman ... more >>>
We consider the following clustering with outliers problem: Given a set of points $X \subset \{-1,1\}^n$, such that there is some point $z \in \{-1,1\}^n$ for which at least $\delta$ of the points are $\epsilon$-correlated with $z$, find $z$. We call such a point $z$ a $(\delta,\epsilon)$-center of X.
In ... more >>>
k-median and k-means are the two most popular objectives for clustering algorithms. Despite intensive effort, a good understanding of the approximability of these objectives, particularly in $\ell_p$-metrics, remains a major open problem. In this paper, we significantly improve upon the hardness of approximation factors known in literature for these objectives ... more >>>