ArXiv

A Recursive Decomposition Framework for Causal Structure Learning in the Presence of Latent Variables

Authors
Zheng Li, Feng Xie, Shenglan Nie...
Categories
cs.LG, cs.AI, stat.ML
arXiv
https://arxiv.org/abs/2605.10651v1
PDF
https://arxiv.org/pdf/2605.10651v1

Brief

The paper introduces DiCoLa, a recursive decomposition framework that extends divide-and-conquer causal discovery to settings with latent variables. It recursively splits the global learning task into smaller subproblems and integrates their solutions through a principled reconstruction step to recover the global structure. The authors prove soundness and completeness of the framework, report significantly improved computational efficiency across a range of causal discovery algorithms on synthetic data, and illustrate its practical effectiveness on a real-world dataset. The target is the conditional independence (CI) testing bottleneck of constraint-based methods in high-dimensional settings.
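To give a rough sense of why splitting the task can pay off, the sketch below counts the distinct CI queries (X, Y | S) that exist for a given number of variables and a cap on the conditioning-set size. This is a worst-case counting argument, not DiCoLa's cost model: the function name `max_ci_tests`, the 20-variable problem, and the split into two 12-variable subproblems are all hypothetical, and the cost of the reconstruction step is ignored.

```python
from math import comb


def max_ci_tests(n_vars: int, max_cond_size: int) -> int:
    """Count distinct CI queries (X, Y | S) with |S| <= max_cond_size.

    This is an upper bound on what a constraint-based method could test;
    real algorithms prune many of these queries.
    """
    pairs = comb(n_vars, 2)  # unordered pairs {X, Y}
    # Conditioning sets S are drawn from the remaining n_vars - 2 variables.
    cond_sets = sum(comb(n_vars - 2, s) for s in range(max_cond_size + 1))
    return pairs * cond_sets


# Hypothetical global problem: 20 variables, conditioning sets of size <= 3.
global_cost = max_ci_tests(20, 3)

# Hypothetical decomposition: two overlapping subproblems of 12 variables each.
split_cost = 2 * max_ci_tests(12, 3)

print(global_cost, split_cost)  # prints "187720 23232"
```

Under these assumptions the subproblems together admit roughly an eighth of the candidate queries, because the number of conditioning sets grows combinatorially with the number of variables.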

Why it matters

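Constraint-based causal discovery relies heavily on conditional independence (CI) testing, which becomes computationally expensive in high-dimensional settings. Most divide-and-conquer frameworks that reduce this cost assume causal sufficiency, i.e., no latent variables. DiCoLa shows that the divide-and-conquer strategy can be generalized to settings with latent variables by decomposing the global task into smaller subproblems and reconstructing the global structure from their solutions.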

Key details

  • The paper (Li et al., arXiv 2026-05-11) proves soundness and completeness of DiCoLa and reports significant computational-efficiency gains on synthetic benchmarks plus successful application to a real-world dataset.

Source evidence

Abstract

Constraint-based causal discovery is widely used for learning causal structures, but heavy reliance on conditional independence (CI) testing makes it computationally expensive in high-dimensional settings. To mitigate this limitation, many divide-and-conquer frameworks have been proposed, but most assume causal sufficiency, i.e., no latent variables. In this paper, we show that divide-and-conquer strategies can be theoretically generalized beyond causal sufficiency to settings with latent variables. Specifically, we propose a recursive decomposition framework, termed DiCoLa, that enables divide-and-conquer causal discovery in the presence of latent variables. It recursively decomposes the global learning task into smaller subproblems and integrates their solutions through a principled reconstruction step to recover the global structure. We theoretically establish the soundness and completeness of the proposed framework. Extensive experiments on synthetic data demonstrate that our approach significantly improves computational efficiency across a range of causal discovery algorithms, while experiments on a real-world dataset further illustrate its practical effectiveness.
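
For readers unfamiliar with CI testing, the following is a minimal sketch of one common test for continuous data, the Fisher z-test on a partial correlation, written with the Python standard library only. It is not taken from the paper, which does not specify a test in the abstract. The helper names (`pearson`, `partial_corr`, `fisher_z_pvalue`) and the toy data (a common cause Z driving both X and Y) are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def partial_corr(x, y, z):
    """Partial correlation of x and y given a single conditioning variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_pvalue(r, n, cond_size):
    """Two-sided p-value for H0: (partial) correlation is zero."""
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher z-transform
    stat = math.sqrt(n - cond_size - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))


# Toy data (assumed): Z is a common cause of X and Y, with no direct X-Y link.
random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

r_marginal = pearson(x, y)          # X and Y are correlated marginally
r_partial = partial_corr(x, y, z)   # close to zero once Z is conditioned on
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_partial = fisher_z_pvalue(r_partial, n, 1)
print(r_marginal, r_partial, p_marginal, p_partial)
```

In this toy setup, conditioning on Z removes the dependence between X and Y. If Z were latent, only the marginal dependence would be observable, which is one way to see why dropping the causal sufficiency assumption makes the problem harder.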