ArXiv

Optimal Policy Learning under Budget and Coverage Constraints

Authors
Giovanni Cerulli
Categories
stat.ML, cs.LG
arXiv
https://arxiv.org/abs/2605.12235v1
PDF
https://arxiv.org/pdf/2605.12235v1

Brief

Cerulli (May 2026) studies optimal policy learning under combined budget and minimum-coverage constraints, framing it as a knapsack-type allocation problem. The paper shows that the optimal rule is an affine threshold in the budget and coverage shadow prices, and that the LP relaxation has an O(1) integrality gap. It then analyzes two implementable algorithms, Greedy-Lagrangian (GLC) and rank-and-cut (RC). Monte Carlo evidence supports both GLC's near-optimal finite-sample performance and the specific conditions under which RC misallocates.

Why it matters

Characterization (Cerulli, 2026-05-12): the optimal policy under combined budget and minimum-coverage constraints has a knapsack-type structure and is given by an affine threshold rule in the budget and coverage shadow prices. The LP relaxation has an O(1) integrality gap, which implies asymptotic equivalence with the optimal discrete allocation: in large samples, solving the relaxation loses little relative to the exact combinatorial problem.
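
One way to read this characterization is sketched below. The notation is an assumption made for illustration and is not taken from the paper: tau_i is the estimated benefit of treating unit i, c_i its cost, B the budget, K the minimum number of treated units, and lambda and mu the shadow prices of the budget and coverage constraints. The paper may define coverage as a share rather than a count and may use different symbols.

```latex
% Assumed notation (not from the paper): \tau_i benefit, c_i cost,
% B budget, K minimum coverage, d_i \in \{0,1\} the treatment decision.
\begin{align*}
\max_{d \in \{0,1\}^n} \quad & \sum_{i=1}^{n} \tau_i d_i \\
\text{s.t.} \quad & \sum_{i=1}^{n} c_i d_i \le B, \qquad \sum_{i=1}^{n} d_i \ge K .
\end{align*}

% Lagrangian with multipliers \lambda \ge 0 (budget) and \mu \ge 0 (coverage):
\[
\mathcal{L}(d, \lambda, \mu)
  = \sum_{i=1}^{n} \left( \tau_i - \lambda c_i + \mu \right) d_i + \lambda B - \mu K .
\]

% Maximizing over each d_i separately gives a threshold that is affine in c_i:
\[
d_i^{*} = \mathbf{1}\left\{ \tau_i \ge \lambda c_i - \mu \right\} .
\]
```

Under this reading, the threshold on tau_i has slope lambda in the unit's cost and intercept -mu. A binding coverage constraint (mu > 0) lowers the bar for every unit, while a tighter budget (larger lambda) raises it most for expensive units. In a relaxation with only two linking constraints, a basic optimal solution has at most two fractional entries, which gives some intuition for a bounded gap; the paper's own argument may differ.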

Key details

  • Algorithms: the paper analyzes two implementable procedures, Greedy-Lagrangian (GLC) and rank-and-cut (RC).
  • GLC: closely approximates the optimal solution and achieves near-optimal performance in finite samples.
  • RC: approximately optimal whenever the coverage constraint is slack or costs are homogeneous; misallocation arises only when cost heterogeneity interacts with a binding coverage constraint.
  • Empirical results: Monte Carlo evidence supports these findings.
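
The contrast described above can be illustrated on a toy problem. The following Python sketch is not the paper's GLC or RC procedure: `brute_force`, `threshold_rule`, and `naive_rank_and_cut` are hypothetical helpers written for illustration, the data are made up, and the shadow prices passed to `threshold_rule` are chosen by hand rather than computed by any algorithm.

```python
from itertools import product


def brute_force(benefit, cost, budget, min_cov):
    """Exact benchmark: enumerate all 0/1 allocations (only viable for tiny n)."""
    best_value, best_alloc = None, None
    for alloc in product((0, 1), repeat=len(benefit)):
        spent = sum(c for c, d in zip(cost, alloc) if d)
        if spent > budget or sum(alloc) < min_cov:
            continue  # violates the budget or the minimum-coverage constraint
        value = sum(b for b, d in zip(benefit, alloc) if d)
        if best_value is None or value > best_value:
            best_value, best_alloc = value, alloc
    return best_value, best_alloc


def threshold_rule(benefit, cost, lam, mu):
    """Treat unit i iff benefit_i - lam * cost_i + mu > 0 (assumed notation)."""
    return [1 if b - lam * c + mu > 0 else 0 for b, c in zip(benefit, cost)]


def naive_rank_and_cut(benefit, cost, budget):
    """Hypothetical rank-and-cut variant: rank by benefit alone, ignoring cost,
    and treat down the list until the next unit no longer fits the budget."""
    order = sorted(range(len(benefit)), key=lambda i: -benefit[i])
    alloc = [0] * len(benefit)
    spent = 0.0
    for i in order:
        if spent + cost[i] > budget:
            break  # the "cut"
        alloc[i] = 1
        spent += cost[i]
    return alloc


if __name__ == "__main__":
    benefit = [5.0, 4.0, 3.0, -1.0]  # made-up treatment benefits
    hetero = [4.0, 2.0, 2.0, 1.0]    # heterogeneous costs
    homog = [1.0, 1.0, 1.0, 1.0]     # homogeneous costs

    # Heterogeneous costs with a binding coverage requirement of 3 units.
    print("exact      :", brute_force(benefit, hetero, 5.0, 3))
    print("threshold  :", threshold_rule(benefit, hetero, lam=2.5, mu=4.0))
    print("naive RC   :", naive_rank_and_cut(benefit, hetero, 5.0))

    # Homogeneous costs: ranking by benefit is enough.
    print("exact (hom):", brute_force(benefit, homog, 3.0, 3))
    print("naive RC   :", naive_rank_and_cut(benefit, homog, 3.0))
```

On this toy data, the hand-picked shadow prices reproduce the exact allocation, which treats units 1, 2, and 3. The naive ranking spends most of the budget on the expensive unit 0 and treats only that unit, so it misses the coverage requirement. With homogeneous costs the naive ranking and the exact solution coincide. This mirrors the qualitative pattern the summary reports for RC, but it says nothing about the magnitude of the effect in the paper's simulations.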

Source evidence

Abstract

We study optimal policy learning under combined budget and minimum coverage constraints. We show that the problem admits a knapsack-type structure and that the optimal policy can be characterized by an affine threshold rule involving both budget and coverage shadow prices. We establish that the linear programming relaxation of the combinatorial solution has an O(1) integrality gap, implying asymptotic equivalence with the optimal discrete allocation. Building on this result, we analyze two implementable approaches: a Greedy-Lagrangian (GLC) and a rank-and-cut (RC) algorithm. We show that the GLC closely approximates the optimal solution and achieves near-optimal performance in finite samples. By contrast, RC is approximately optimal whenever the coverage constraint is slack or costs are homogeneous, while misallocation arises only when cost heterogeneity interacts with a binding coverage constraint. Monte Carlo evidence supports these findings.