Abstract
We propose to estimate a parametric regression built on the mode value for truncated data, where the dependent variable is subject to left truncation by another random variable. We construct a kernel mode-based objective function with a constant bandwidth for estimation and suggest a modified mode expectation-maximization algorithm to estimate the model numerically. The asymptotic normality of the proposed estimator is derived under mild conditions. To construct confidence intervals for the resulting estimator efficiently, we develop a mode-based empirical likelihood method and show that the empirical log-likelihood ratio asymptotically follows a chi-square distribution. Furthermore, by combining the kernel mode-based objective function with the SCAD penalty, we introduce a variable selection procedure for the parameters and establish its oracle property. Monte Carlo simulations and a real data analysis related to the housing market are presented to show the finite-sample performance of the developed estimation and variable selection procedures.
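As a point of reference for the estimation step, the following is a minimal Python sketch of the standard mode expectation-maximization iteration for untruncated modal linear regression with a Gaussian kernel and a constant bandwidth. It is not the modified algorithm proposed in the paper, and it does not handle left truncation. The function name `mem_modal_linear_regression`, the bandwidth `h = 0.5`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def mem_modal_linear_regression(X, y, h, n_iter=200, tol=1e-8):
    """Illustrative MEM iteration for untruncated modal linear regression.

    Maximizes (1/n) * sum_i K_h(y_i - x_i' beta) with a Gaussian kernel K
    and a constant bandwidth h. Hypothetical helper, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from the ordinary least squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta
        # E-step: normalized Gaussian kernel weights of the current residuals.
        w = np.exp(-0.5 * (resid / h) ** 2)
        w /= w.sum()
        # M-step: weighted least squares using those weights.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated example (assumed setup): the conditional mode is 1 + 2x, while a
# minority of large positive errors pulls the conditional mean upward.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, size=n)
noise = rng.normal(0.0, 0.2, size=n)
shifted = rng.random(n) < 0.15
noise[shifted] += 5.0
y = 1.0 + 2.0 * x + noise
X = np.column_stack([np.ones(n), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_mode = mem_modal_linear_regression(X, y, h=0.5)
```

In this simulated setting `beta_mode` should track the modal line, whereas `beta_ols` is shifted by the contaminated errors. Because the kernel objective can have several local maxima, the result may depend on the starting value and on the bandwidth.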
Information
| Preprint No. | SS-2023-0288 |
|---|---|
| Manuscript ID | SS-2023-0288 |
| Complete Authors | Tao Wang, Weixin Yao |
| Corresponding Authors | Tao Wang |
| Emails | taow@uvic.ca |
References
- Amemiya, T. (1973). Regression Analysis When the Dependent Variable is Truncated Normal. Econometrica, 41, 997-1016.
- Chen, S. X. and Van Keilegom, I. (2009). A Review on Empirical Likelihood Methods for Regression. TEST, 18, 415-447.
- Fan, J. and Li, R. (2001). Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties. Journal of the American Statistical Association, 96, 1348-1360.
- Hausman, J. A. and Wise, D. A. (1977). Social Experimentation, Truncated Distributions, and Efficient Estimation. Econometrica, 45, 919-938.
- He, S. and Yang, G. L. (1998). Estimation of the Truncation Probability in the Random Truncation Model. The Annals of Statistics, 26, 1011-1027.
- He, S. and Yang, G. L. (2003). Estimation of Regression Parameters with Left Truncated Data. Journal of Statistical Planning and Inference, 117, 99-122.
- Kemp, G. C. R. and Santos Silva, J. M. C. (2012). Regression towards the Mode. Journal of Econometrics, 170, 92-101.
- Lee, M. J. (1989). Mode Regression. Journal of Econometrics, 42, 337-349.
- Lee, M. J. (1993). Quadratic Mode Regression. Journal of Econometrics, 57, 1-19.
- Owen, A. B. (1988). Empirical Likelihood Ratio Confidence Intervals for a Single Functional. Biometrika, 75, 237-249.
- Owen, A. B. (1990). Empirical Likelihood Ratio Confidence Regions. The Annals of Statistics, 18, 90-120.
- Stute, W. (1993). Almost Sure Representations of the Product-Limit Estimator for Truncated Data. The Annals of Statistics, 21, 146-156.
- Su, Y.-R. and Wang, J.-L. (2012). Modeling Left-Truncated and Right-Censored Survival Data with Longitudinal Covariates. The Annals of Statistics, 40, 1465-1488.
- Ullah, A., Wang, T., and Yao, W. (2021). Modal Regression for Fixed Effects Panel Data. Empirical Economics, 60, 261-308.
- Ullah, A., Wang, T., and Yao, W. (2022). Nonlinear Modal Regression for Dependent Data with Application for Predicting COVID-19. Journal of the Royal Statistical Society Series A, 185, 1424-1453.
- Ullah, A., Wang, T., and Yao, W. (2023). Semiparametric Partially Linear Varying Coefficient Modal Regression. Journal of Econometrics, 1001-1026.
- Wang, M. C. (1989). A Semiparametric Model for Randomly Truncated Data. Journal of the American Statistical Association, 84, 742-748.
- Wang, K. and Li, S. (2021). Robust Distributed Modal Regression for Massive Data. Computational Statistics & Data Analysis, 160, 107225.
- Wang, T. (2024). Nonlinear Kernel Mode-Based Regression for Dependent Data. Journal of Time Series Analysis, 45, 189-213.
- Woodroofe, M. (1985). Estimating a Distribution Function with Truncated Data. The Annals of Statistics, 13, 163-177.
- Yao, W. and Li, L. (2014). A New Regression Model: Modal Linear Regression. Scandinavian Journal of Statistics, 41, 656-671.
- Zhou, W. (2011). A Weighted Quantile Regression for Randomly Truncated Data. Computational Statistics & Data Analysis, 55, 554-566.
- Zhou, Y. and Yip, P. S. (1999). A Strong Representation of the Product-Limit Estimator for Left Truncated and Right Censored Data. Journal of Multivariate Analysis, 69, 261-280.

a. Department of Economics and Department of Mathematics and Statistics (by courtesy), University of Victoria, Victoria, BC V8W 2Y2, Canada.
Acknowledgments
We are deeply grateful to the Co-Editor Yi-Hau Chen, the Associate Editor, and two anonymous referees for their constructive comments, which led to substantial improvements in the paper. We would also like to thank Bo Honoré, Aman Ullah, and the seminar participants at UC Riverside, the University of Washington, and the University of Iowa for their helpful comments. Tao Wang’s research is supported by an SSHRC-IDG grant (430-2023-00149) and a UVic-SSHRC Explore grant (2023-2024), and Weixin Yao’s research is supported by an NSF grant (DMS-2210272).
Supplementary Materials
The supplementary file contains additional numerical and technical results.