Statistica Sinica 30 (2020), 673-693
Abstract: As one of the most popular classification methods, the logistic regression model has been studied extensively. Essentially, the model assumes that an individual’s class label is influenced by a set of predictors. However, with the rapid advance of social network services, social network data are becoming increasingly available. As a result, incorporating this additional network structure to improve classification accuracy has become an important research problem. To this end, we propose a network-based logistic regression (NLR) model that takes the network structure into consideration. Four interesting scenarios are used to investigate the link formation of the network structure under the NLR model. Furthermore, we determine the impact of the network structure on classification by deriving the asymptotic properties of the prediction rule under different levels of network sparsity. Lastly, simulation studies are conducted to demonstrate the finite-sample performance of the proposed method, and a real Sina Weibo data set is analyzed for illustrative purposes.
Key words and phrases: Classification, logistic regression, network structure.