Kernel Matrix Estimation of a Determinantal Point Process from a Finite Set of Samples: Properties and Algorithms
Abstract: Determinantal point processes (DPPs) on finite sets have recently gained popularity because of their ability to promote diversity among the elements of the subsets they select. The probability distribution of a DPP is defined through the determinant of a positive semi-definite, real-valued matrix. When estimating the DPP parameter matrix, it is often more convenient to express the maximum likelihood criterion using the framework of L-ensembles. However, the resulting optimization problem is non-convex and NP-hard to solve.
In this paper, we establish conditions under which the maximum likelihood criterion has a well-defined optimum for a given finite set of samples. We demonstrate that regularization is generally beneficial for ensuring a proper solution. To solve the resulting optimization problem, we propose a proximal algorithm that minimizes a penalized criterion. Through simulations, we compare our algorithm with previously proposed approaches, illustrating their differing behaviors and providing empirical support for our theoretical findings.
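For readers less familiar with the L-ensemble formulation, the following minimal sketch (our notation and helper names, not code from the paper) spells out the likelihood being maximized: an L-ensemble with kernel L draws a subset A with probability det(L_A) / det(L + I).

```python
# Minimal sketch of the L-ensemble log-likelihood (our notation, not the
# paper's code). For samples A_1, ..., A_n and an N x N PSD kernel L:
#   f(L) = (1/n) * sum_i log det(L_{A_i}) - log det(L + I).
import numpy as np

def log_likelihood(L, samples):
    """Average L-ensemble log-likelihood of `samples` (lists of item indices)."""
    N = L.shape[0]
    _, logdet_norm = np.linalg.slogdet(L + np.eye(N))       # normalization constant
    total = 0.0
    for A in samples:
        _, logdet_sub = np.linalg.slogdet(L[np.ix_(A, A)])  # principal submatrix L_A
        total += logdet_sub
    return total / len(samples) - logdet_norm
```

Maximizing f over positive semi-definite matrices L is the non-convex, NP-hard problem referred to above.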
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We thank the reviewers for their careful reading of the manuscript and their high-quality reviews. Their independent evaluations largely converge, and we found all of their comments relevant and justified. We have therefore made changes to take them into account (shown in blue in the supplementary material PDF file).
We provide a unified answer to the following three questions/requested changes, which are common to the three reviewers.
1. **computational complexity**
- (reviewer 2ckl) "discuss the computational trade-offs of their proximal
method, comparing its per-iteration cost with that of simpler but unstable
fixed-point iteration."
- (reviewer VE63) "How does the algorithm behave when N is very large?"
- (reviewer bWjR) "For both the proposed method and existing approaches,
computational efficiency remains a primary challenge, as these methods
typically have O(N^3) complexity. While this does not weaken the
contribution of the paper, it would be valuable if the authors could
discuss potential directions for improving the computational efficiency or
scalability of the algorithm."
- *Our answer =>* All reviewers mentioned that computational cost/complexity is a
fundamental issue, especially for large N. Following these recommendations,
we slightly clarified Section 4.4 and proposed some potential directions
for improvement; an illustrative sketch of the baseline fixed-point
iteration and its O(N^3) per-iteration cost is given after this list.
[modifications on pp.8-9]
2. **better motivation**
- (reviewer 2ckl) "motivate regularization more clearly and highlight why the
non-coercive property is a more immediate, practical barrier to DPP
learning than the known barriers for NP-hardness."
- (reviewer VE63) "The authors claim that, as stated in the title, the
contributions are the properties and algorithm. However, are these
contributions leading to a better solution to the DPP. Since the
regularized criterion is already studied in the literature and the proximal
algorithm is also a standard approach for solving the regularized
optimization problem, the author is suggested to clarify how this paper can
push forward the frontier."
- *Our answer =>* As suggested by reviewers (2ckl) and (VE63), we improved the
motivation of our work in the introduction. First, we better emphasized
that non-coercivity constitutes a fundamental issue: without regularization,
the maximum likelihood criterion may not attain its optimum. We also
mentioned that proximal methods offer many possibilities. We hope that
making these two facts clear will contribute to pushing forward the
frontier of DPP kernel estimation; the generic form of the penalized
criterion is recalled after this list. [modifications on p.2]
3. **choice of the parameters**
- (reviewer 2ckl) "a sensitivity analysis in the experiments to show how the
choice of the regularization parameter λ affects stability, convergence
speed, or final kernel accuracy"
- (reviewer VE63) "How to choose the regularization parameter of the
regularized criterion?"
- (reviewer bWjR) "In the experimental section, it would be helpful if the
authors could include additional experiments or discussion regarding the
tuning of hyperparameters μ, ε, ν, to provide guidance on their
practical selection."
- *Our answer =>* All reviewers asked for additional elements concerning the choice of
the different parameters μ, ε, ν. Our values were tuned manually, and this
is now mentioned in the text. To provide some guidance, we added a short
simulation (Figure 2) illustrating the choice of μ, together with
associated comments. Finally, we illustrated in Table 1 that the returned
solution is only slightly perturbed for small values of the parameter ε; a
hypothetical held-out tuning loop is also sketched below.
[modifications on pp.10-11]
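For question 1, here is an illustrative sketch of the simpler but unstable fixed-point baseline mentioned by reviewer 2ckl (written in the spirit of the Picard iteration of Mariet & Sha; the helper names are ours, and this is not the paper's proximal algorithm). The N x N inverse and the two N x N products make each iteration cost O(N^3), which is the per-iteration cost at stake:

```python
# Sketch of a Picard-style fixed-point step for DPP maximum likelihood
# (illustrative baseline, NOT the paper's proximal algorithm). Each step
# costs O(N^3): one N x N inverse plus two N x N matrix products.
import numpy as np

def fixed_point_step(L, samples):
    """One update L <- L + L @ grad @ L, where grad is the log-likelihood gradient."""
    N = L.shape[0]
    grad = -np.linalg.inv(L + np.eye(N))                   # d/dL of -log det(L + I)
    for A in samples:
        idx = np.ix_(A, A)
        grad[idx] += np.linalg.inv(L[idx]) / len(samples)  # (1/n) * L_A^{-1}, zero-padded
    return L + L @ grad @ L                                # naive update; may be unstable in practice
```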
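For question 2, the penalized criterion has the generic form below (our notation; we assume μ denotes the regularization weight, the parameter reviewer 2ckl writes as λ, and ψ a penalty whose exact form is specified in the paper):

$$
\min_{L \succeq 0} \;\; -\frac{1}{n}\sum_{i=1}^{n} \log\det\!\left(L_{A_i}\right) \;+\; \log\det(L + I) \;+\; \mu\,\psi(L).
$$

Without the penalty, the first two terms are non-coercive, which is the fundamental issue now emphasized in the introduction.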
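Finally, for question 3, a purely hypothetical tuning loop (not the protocol of the paper) shows how the regularization weight μ could be screened on held-out log-likelihood, reusing the `log_likelihood` helper sketched under the abstract and a `fit_kernel(samples, mu)` stand-in for the proximal algorithm:

```python
# Hypothetical tuning loop (not the paper's protocol): screen the
# regularization weight mu on held-out log-likelihood. `fit_kernel` stands
# in for the proximal algorithm; `log_likelihood` is sketched above.
def select_mu(train, held_out, fit_kernel, mus=(1e-3, 1e-2, 1e-1, 1.0)):
    scores = {mu: log_likelihood(fit_kernel(train, mu), held_out) for mu in mus}
    return max(scores, key=scores.get), scores  # best mu and all scores
```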
Below are two remarks from reviewer (VE63):
+ "In the abstract, the author has the wording "to address this challenge." What
is the challenge?"
- *Our answer =>* Thank you for pointing out this unclear sentence: we modified it and
now write "to solve the resulting optimization problem".
[modifications on p.1]
+ "Please add more details or discussions for Theorem 1."
- *Our answer =>* The paragraph below Theorem 1 has been rewritten to better emphasize
our comments on the theorem. A reference has also been added to justify that
the considered functions satisfy the Kurdyka-Łojasiewicz (KL) assumption, by
considering an o-minimal structure.
[modifications on pp.6-7]
Assigned Action Editor: ~Sylvain_Le_Corff1
Submission Number: 6225