Abstract: A parametric cluster model is a statistical model providing
geometric insight into the points defining a cluster.
The {\em spherical cluster model} (SC) approximates a
finite point set $P\subset \mathbb{R}^d$ by a sphere $S(c,r)$ as follows. Taking $r$
as a fraction $\eta\in(0,1)$ (hyper-parameter) of the standard deviation of
distances between the center $c$ and the data points, the cost of the
SC model is the sum over all data points lying outside the sphere $S$
of their power distance with respect to $S$.
The center $c$ of the SC model is the point minimizing this cost.
Note that the limiting case $\eta=0$ yields the celebrated center of mass used in
KMeans clustering.
We show that fitting a spherical cluster leads to a strictly
convex but non-smooth combinatorial optimization problem, and we
develop an exact solver based on the Clarke gradient of non-smooth
functionals over a suitable stratified cell complex induced by an
arrangement of hyperspheres. To the best of our knowledge, our method
is the first practical application of the theory of semiflows of
convex maps, which generalizes the gradient flows of smooth maps. We
present experiments on a variety of datasets ranging in
dimension from $d=9$ to $d=10{,}000$, with two main observations.
First, our exact algorithm is orders of magnitude faster than
BFGS-based heuristics for datasets of small/intermediate dimension and
small values of $\eta$, and for high dimensional datasets (say
$d>100$) whatever the value of $\eta$. Second, the center of the SC
model behaves as a parameterized high-dimensional median.
The SC model is of direct interest for high-dimensional multivariate data analysis,
and holds promise for the design of mixtures.
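
As an illustration of the cost described in the abstract, the following minimal sketch evaluates it for a given candidate center. It rests on assumptions not stated above: the radius is taken literally as $\eta$ times `np.std` of the point-to-center distances, the power distance of a point $p$ with respect to $S(c,r)$ is taken to be $\|p-c\|^2 - r^2$, and the function name `sc_cost` is hypothetical. It only evaluates the cost; it is not the exact solver developed in the paper.

```python
import numpy as np

def sc_cost(points, center, eta):
    """Evaluate a spherical-cluster cost at a candidate center (illustrative sketch).

    Assumptions (not taken from the paper): the radius is eta times the
    standard deviation of the point-to-center distances, and the power
    distance of p with respect to S(c, r) is ||p - c||^2 - r^2.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    # Euclidean distance from each data point to the candidate center.
    dists = np.linalg.norm(points - center, axis=1)
    # Radius as a fraction eta of the spread of those distances (assumed reading).
    radius = eta * np.std(dists)
    # Power distance of each point with respect to the sphere.
    power = dists**2 - radius**2
    # Only points lying strictly outside the sphere contribute to the cost.
    return float(np.sum(power[dists > radius]))
```

With `eta = 0` the radius vanishes and the cost reduces to the sum of squared distances to the center, which is consistent with the remark that this case recovers the center of mass.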
Beyond Pdf: zip
Submission Type: Beyond PDF submission (pageless, webpage-style content)
Changes Since Last Submission:
# Section: General
## QUESTION.
The four reviews raise similar concerns about:
* The motivation of the work,
* The positioning in machine learning at large and the targeted audience,
* The definition of the spherical cluster model.
## ANSWER.
To begin with, we wish to stress two points.
First, we believe our contribution is correctly positioned, as we do not advertise a clustering algorithm
but a cluster model and its connexions to clustering, cluster models, and centerpoints.
Second, while the audience may not be as broad as that of a generic
clustering algorithm, we believe the connexions to the three
aforementioned topics ensure that the paper will appeal to a diverse
readership spanning machine learning, statistical learning,
computational geometry, and optimization.
To make these points clearer, we have revised the paper as follows:
* The Introduction has been revamped, with the addition of a paragraph on cluster models.
* The paragraph on Contributions has been rewritten.
* A new section *Spherical clusters: discussion* has been added, to
comment on the use of the (quadratic) power distance and the role of
the hyper-parameter $\eta$ in particular.
# Section: specific answers to the four reviews
Detailed answers have been appended at the end of the supporting information.
Assigned Action Editor: ~Yuheng_Jia1
Submission Number: 7728