Statistica Sinica 36 (2026), 1043-1081

CAUSAL AND COUNTERFACTUAL VIEWS OF
MISSING DATA MODELS

Razieh Nabi*, Rohit Bhattacharya, Ilya Shpitser and James M. Robins

Emory University, Williams College, Johns Hopkins University and Harvard University

Abstract: It is often said that the fundamental problem of causal inference is a missing data problem—the comparison of responses to two hypothetical treatment assignments is made difficult because for every experimental unit only one potential response is observed. In this paper, we consider the implications of the converse view: that missing data problems are a form of causal inference. We make explicit how the missing data problem of recovering the complete data law from the observed law can be viewed as identification of a joint distribution over counterfactual variables corresponding to values had we (possibly contrary to fact) been able to observe them. Drawing analogies with causal inference, we show how identification assumptions in missing data can be encoded in terms of graphical models defined over counterfactual and observed variables. We review recent results in missing data identification from this viewpoint. In doing so, we note interesting similarities and differences between missing data and causal identification theories.

Key words and phrases: Causal graphs, causal inference, missing not at random.

