Wasserstein Archetypal Analysis

  • Katy Craig
  • Braxton Osting
  • Dong Wang
  • Yiming Xu

Research output: Contribution to journal › Article › peer-review

Abstract

Archetypal analysis is an unsupervised machine learning method that summarizes data using a convex polytope. In its original formulation, for fixed k, the method finds a convex polytope with k vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared Euclidean distance between the data and the polytope is minimal. In the present work, we consider an alternative formulation of archetypal analysis based on the Wasserstein metric, which we call Wasserstein archetypal analysis (WAA). In one dimension, there exists a unique solution of WAA and, in two dimensions, we prove the existence of a solution, as long as the data distribution is absolutely continuous with respect to the Lebesgue measure. We discuss obstacles to extending our result to higher dimensions and general data distributions. We then introduce an appropriate regularization of the problem, via a Rényi entropy, which allows us to obtain the existence of solutions of the regularized problem for general data distributions, in arbitrary dimensions. We prove a consistency result for the regularized problem, ensuring that if the data are iid samples from a probability measure, then as the number of samples is increased, a subsequence of the archetype points converges to the archetype points for the limiting data distribution, almost surely. Finally, we develop and implement a gradient-based computational approach for the two-dimensional problem, based on the semi-discrete formulation of the Wasserstein metric. Detailed numerical experiments are provided to support our theoretical findings.
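As a point of reference for the original formulation described above, the following is a minimal, illustrative sketch (not the authors' method) of classical archetypal analysis: archetypes are constrained to convex combinations of the data, and the mean squared Euclidean distance from the data to the polytope they span is reduced by projected gradient descent. All function names, the step sizes, and the optimization scheme are assumptions made for illustration.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of each row of v onto the probability simplex,
    # via the standard sort-and-threshold algorithm.
    u = np.sort(v, axis=1)[:, ::-1]
    css = np.cumsum(u, axis=1) - 1.0
    ind = np.arange(1, v.shape[1] + 1)
    rho = (u - css / ind > 0).sum(axis=1)
    theta = css[np.arange(v.shape[0]), rho - 1] / rho
    return np.maximum(v - theta[:, None], 0.0)

def archetypal_analysis(X, k, n_iter=1000, lr=1e-2, seed=0):
    # Classical AA sketch: minimize ||A B X - X||_F^2 / n over
    # row-stochastic A (n, k) and B (k, n).  The archetypes Z = B X are
    # convex combinations of data points, so the polytope conv(Z) lies
    # inside the convex hull of the data by construction.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_simplex(rng.random((n, k)))
    B = project_simplex(rng.random((k, n)))
    for _ in range(n_iter):
        Z = B @ X                                   # current archetypes
        R = A @ Z - X                               # reconstruction residual
        A = project_simplex(A - lr * (R @ Z.T))     # gradient step in A
        B = project_simplex(B - (lr / n) * (A.T @ R @ X.T))  # step in B
    Z = B @ X
    loss = np.sum((A @ Z - X) ** 2) / n
    return Z, A, loss
```

Because the rows of B lie on the simplex, the returned archetypes automatically sit in the convex hull of the data; the fit is the mean squared distance from each point to its best convex combination of archetypes.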

Original language: English
Article number: 36
Journal: Applied Mathematics and Optimization
Volume: 90
Issue number: 2
State: Published - Oct 2024

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Funding

K. Craig acknowledges partial support from NSF DMS 1811012, NSF DMS 2145900, and a Hellman Faculty Fellowship. K. Craig also acknowledges the support from the Simons Center for Theory of Computing, at which part of this work was completed. B. Osting acknowledges partial support from NSF DMS 17-52202. D. Wang is partially supported by National Natural Science Foundation of China (Grant No. 12101524), Guangdong Basic and Applied Basic Research Foundation (Grant No. 2023A1515012199), Shenzhen Science and Technology Innovation Program (Grant No. JCYJ20220530143803007, RCYX20221008092843046), Guangdong Provincial Key Laboratory of Mathematical Foundations for Artificial Intelligence (2023B1212010001), and Hetao Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone Project (No.HZQSWS-KCCYB-2024016).


Keywords

  • 49Q22
  • 62G07
  • 62H12
  • 65K10
  • Archetypal analysis
  • Multivariate data summarization
  • Optimal transport
  • Unsupervised learning
  • Wasserstein metric

ASJC Scopus subject areas

  • Control and Optimization
  • Applied Mathematics
