Statistica Sinica 29 (2019), 1155-1180
Abstract: Gathering information about forest variables is an expensive and arduous activity. Therefore, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next-generation collection initiatives for remotely sensed light detection and ranging (LiDAR) data are specifically aimed at producing complete-coverage maps over large spatial domains. Given that LiDAR data and forest characteristics are often strongly correlated, it is possible to use the former to model, predict, and map forest variables over regions of interest. This entails dealing with high-dimensional (~102) spatially dependent LiDAR outcomes over a large number of locations (~105 - 106). With this in mind, we develop the spatial factor nearest neighbor Gaussian process (SF-NNGP) model, which we embed in a two-stage approach that connects the spatial structure found in LiDAR signals with forest variables. We provide a simulation experiment that demonstrates the inferential and predictive performance of the SF-NNGP, and use the two-stage modeling strategy to generate complete-coverage maps of the forest variables, with associated uncertainty, over a large region of boreal forests in interior Alaska.
Key words and phrases: Forest outcomes, LiDAR data, nearest neighbor Gaussian processes, spatial prediction.