ArXiv

Penalty-Based First-Order Methods for Bilevel Optimization with Minimax and Constrained Lower-Level Problems

Authors
Yiyang Shen, Yutian He, Weiran Wang...
Categories
math.OC, cs.LG, stat.ML
arXiv
https://arxiv.org/abs/2605.08006v1
PDF
https://arxiv.org/pdf/2605.08006v1

Brief

Bilevel minimax optimization with minimax lower-level problems is addressed via penalty-based first-order methods that remove the need for lower-level strong convexity. Based on the abstract, the deterministic method attains an ε-KKT point in tilde O(ε^{-4}) oracle calls, the constrained lower-level case improves prior tilde O(ε^{-7}) to tilde O(ε^{-4}) via Lagrangian duality, and a stochastic variant achieves near-ε-KKT with tilde O(ε^{-9}).

Why it matters

Yiyang Shen, Yutian He, Weiran Wang, and Qihang Lin (arXiv 2026-05-08) propose penalty-based first-order methods for bilevel problems where both upper- and lower-level problems are minimax; their deterministic algorithm finds an ε-KKT point with tilde O(ε^{-4}) oracle complexity.

Key details

  • They show convex constrained lower-level minimization can be reformulated via Lagrangian duality as a special case of their framework, yielding a tilde O(ε^{-4}) complexity that improves on the prior tilde O(ε^{-7}) bound.
  • The paper extends to the stochastic setting (only stochastic gradient oracles) and proves a stochastic method finds a near-ε-KKT point with tilde O(ε^{-9}) stochastic oracle complexity.
Source evidence

Abstract

We study a class of bilevel optimization problems in which both the upper- and lower-level problems have minimax structures. This setting captures a broad range of emerging applications. Despite the extensive literature on bilevel optimization and minimax optimization separately, existing methods mainly focus on bilevel optimization with lower-level minimization problems, often under strong convexity assumptions, and are not directly applicable to the minimax lower-level setting considered here. To address this gap, we develop penalty-based first-order methods for bilevel minimax optimization without requiring strong convexity of the lower-level problem. In the deterministic setting, we establish that the proposed method finds an $ε$-KKT point with $\tilde{O}(ε^{-4})$ oracle complexity. We further show that bilevel problems with convex constrained lower-level minimization can be reformulated as special cases of our framework via Lagrangian duality, leading to an $\tilde{O}(ε^{-4})$ complexity bound that improves upon the existing $\tilde{O}(ε^{-7})$ result. Finally, we extend our approach to the stochastic setting, where only stochastic gradient oracles are available, and prove that the proposed stochastic method finds a nearly $ε$-KKT point with $\tilde{O}(ε^{-9})$ oracle complexity.