Statistica Sinica 32 (2022), 961-982
Xin Liu1, Qingle Zheng1, Xiaotong Shen2 and Shaoli Wang1
Abstract: In semi-supervised learning, a training sample comprises both labeled and unlabeled instances from each class under consideration. In practice, an important, yet challenging issue is the detection of novel classes that may be absent from the training sample. Here, we focus on the binary situation in which labeled instances come from the positive class, and unlabeled instances come from both classes. In particular, we propose a semi-supervised large-margin classifier to learn the negative (novel) class based on pseudo-data generated iteratively using an estimated model. Numerically, we employ an efficient algorithm to implement the proposed method using the hinge loss and ψ-loss functions. Theoretically, we derive a learning theory for the new classifier in order to quantify the misclassification error. Finally, a numerical analysis demonstrates that the proposed method compares favorably with its competitors on simulated examples, and is highly competitive on benchmark examples.
Key words and phrases: Biased SVM, iterative algorithm, large-margins, PU learning.