Statistica Sinica 35 (2025), 479-504
Abstract: Directed acyclic graph (DAG) models are widely used to represent causal relations among a collection of nodes. This paper proposes an efficient and consistent method to learn DAGs with a general causal dependence structure, in sharp contrast to most existing methods, which assume linear causal relations. To facilitate DAG learning, the proposed method leverages the concept of topological layers and connects nonparametric DAG learning with kernel ridge regression in a smooth reproducing kernel Hilbert space (RKHS) and with gradient learning, by showing that the topological layers of a nonparametric DAG can be exactly reconstructed via kernel-based estimation and that the parent-child relations can be obtained directly from the estimated gradient function. The developed algorithm is computationally efficient in that it solves a convex optimization problem with an analytic solution, and the gradient functions can be computed directly via the derivative reproducing property of the smooth RKHS. The asymptotic properties of the proposed method are established in terms of exact DAG recovery without requiring any explicit model specification. Its superior performance is also supported by a variety of simulated examples and a real-life example.
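The abstract's computational claim, that kernel ridge regression admits an analytic solution and that gradients of the fitted function are available in closed form through the kernel's derivative, can be sketched as follows. This is a minimal illustration only, assuming a Gaussian kernel and illustrative hyperparameters (`sigma`, `lam`); it is not the paper's actual layer-reconstruction estimator.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel matrix between rows of A and rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma**2))

def krr_fit(X, y, lam=1e-3, sigma=1.0):
    # Analytic kernel ridge regression solution: alpha = (K + n*lam*I)^{-1} y.
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_gradient(x, X, alpha, sigma=1.0):
    # Gradient of the fitted function f(x) = sum_i alpha_i k(x, X_i),
    # using the closed-form kernel derivative
    # grad_x k(x, X_i) = -(x - X_i)/sigma^2 * k(x, X_i).
    k = gaussian_kernel(x[None, :], X, sigma)[0]          # shape (n,)
    return (alpha * k) @ (-(x[None, :] - X) / sigma**2)   # shape (d,)

# Toy illustration: y depends only on the first coordinate, so the
# estimated gradient should be much larger in that coordinate.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = 2.0 * X[:, 0]
alpha = krr_fit(X, y, lam=1e-4, sigma=1.0)
grads = np.array([krr_gradient(x, X, alpha) for x in X])
avg_abs_grad = np.abs(grads).mean(axis=0)
```

Screening variables by the magnitude of these estimated partial derivatives is the intuition behind recovering parent-child relations from the gradient function.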
Key words and phrases: Causality, exact DAG recovery, learning gradients, nonparametric DAG, RKHS.