Statistica Sinica 35 (2025), 1881-1898
Abstract: The availability of at-home COVID-19 test kits in many countries has greatly compromised the tracking of Omicron infections. An alternative way to monitor the spread of COVID-19 in a population is to survey viral loads in sewage systems. In a city, hundreds of sewage manholes form a network of candidate sampling sites that are connected to one another in a complex way. Because resources are limited, it is not feasible in practice to sample wastewater from every manhole. The central scientific question is how to select the manholes that are most relevant to predicting confirmed cases of infection in a specific community. In this paper, we develop a supervised learning paradigm for time-series transitional models, solved with a mixed integer programming optimizer, to determine the important sampling sites on which to build a cost-effective monitoring system. We establish the key theoretical guarantee of selection consistency for the proposed methodology. We also propose a novel multi-compartment dynamic model that simulates viral loads in the wastewater system from the evolution of the pandemic in the population, and we use it to evaluate the performance of the proposed model and algorithm. The methodology is illustrated with a real-world data analysis example.
Key words and phrases: COVID-19, GUROBI, L0 penalization, mixed integer optimization, transitional model.
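
As a rough illustration of the selection problem described in the abstract, the sketch below picks k sampling sites by L0-constrained least squares in a simple transitional model, where today's case count is regressed on yesterday's case count and the viral loads at the chosen sites. It is only a toy under stated assumptions: the model form, the synthetic data, and the names `solve_least_squares` and `best_subset` are hypothetical, and exhaustive enumeration stands in for the mixed integer optimization solver (GUROBI) named in the key words, which is what makes the search practical for hundreds of manholes.

```python
from itertools import combinations
import random


def solve_least_squares(X, y):
    """Fit y ~ X by solving the normal equations with Gauss-Jordan elimination.

    Returns (coefficients, residual sum of squares).
    """
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for c in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    rss = sum(
        (yi - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, yi in zip(X, y)
    )
    return beta, rss


def best_subset(loads, cases, k):
    """Choose the k sites whose loads best predict cases[t] given cases[t-1].

    loads[t][j] is the viral load at site j on day t (hypothetical layout).
    Enumerates every size-k subset, so it is only feasible for small problems.
    """
    n_sites = len(loads[0])
    y = cases[1:]
    best = None
    for subset in combinations(range(n_sites), k):
        # Design row: intercept, lagged response, loads at the chosen sites.
        X = [
            [1.0, cases[t - 1]] + [loads[t][j] for j in subset]
            for t in range(1, len(cases))
        ]
        beta, rss = solve_least_squares(X, y)
        if best is None or rss < best[1]:
            best = (subset, rss, beta)
    return best


# Synthetic data (an assumption for illustration): 6 sites, 60 days,
# with only sites 1 and 4 actually driving the case counts.
random.seed(0)
n_days, n_sites = 60, 6
loads = [[random.random() for _ in range(n_sites)] for _ in range(n_days)]
cases = [1.0]
for t in range(1, n_days):
    cases.append(
        0.5 * cases[t - 1]
        + 2.0 * loads[t][1]
        + 3.0 * loads[t][4]
        + random.gauss(0.0, 0.01)
    )

selected, rss, beta = best_subset(loads, cases, k=2)
print("selected sites:", selected)
```

On this synthetic data the search should recover the two planted sites. Enumeration costs one regression per subset, which grows combinatorially with the number of candidate sites; that cost is the reason a mixed integer formulation is attractive at city scale.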