Abstract
Decision trees are among the most widely used nonparametric methods
for regression and classification. In the existing literature, decision tree-based methods have been used to estimate continuous or piecewise-constant
functions. However, they are not flexible enough to estimate the complex shapes
of jump location curves (JLCs) in two-dimensional regression functions. In this
article, we explore the Oblique-axis Regression Tree (ORT) and propose a method
to efficiently estimate piecewise continuous functions in a general finite dimension with fixed design points. The central idea is to cluster the local pixel
intensities by recursive tree partitioning and to use local leaf-only averaging
to estimate the regression function at a given pixel. The proposed method
preserves the complex shapes of the JLCs well in a finite-dimensional regression
function.
Because the assumptions on the underlying regression function differ from
those in the literature on regression trees, the overall framework of the proofs
differs as well. Theoretical analysis and numerical results,
particularly on image denoising, indicate that the proposed method effectively
preserves complicated edge structures while efficiently removing noise from piecewise continuous regression surfaces.
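The central idea above (recursive partitioning with oblique splits, followed by averaging within a leaf) can be illustrated with a minimal sketch. It is not the authors' ORT procedure: it fits a single global tree on a two-dimensional grid instead of clustering pixels in local neighborhoods, it searches only a fixed set of candidate directions, and every name and parameter (`best_oblique_split`, `fit_predict`, `demo`, `n_angles`, `min_leaf`, `depth`) is a hypothetical choice made for illustration.

```python
import numpy as np


def best_oblique_split(X, y, n_angles=16, min_leaf=5):
    """Search candidate directions w and thresholds t for the split
    {x : w.x <= t} that minimizes the within-child sum of squared errors.
    Returns (sse, w, t), or None if no admissible split exists."""
    best = None
    n = len(y)
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        w = np.array([np.cos(theta), np.sin(theta)])
        proj = X @ w
        order = np.argsort(proj)
        p, ys = proj[order], y[order]
        csum, csum2 = np.cumsum(ys), np.cumsum(ys ** 2)
        for k in range(min_leaf, n - min_leaf + 1):
            if p[k] - p[k - 1] < 1e-12:  # no gap between tied projections
                continue
            sse_left = csum2[k - 1] - csum[k - 1] ** 2 / k
            sse_right = (csum2[-1] - csum2[k - 1]) - (
                csum[-1] - csum[k - 1]
            ) ** 2 / (n - k)
            sse = sse_left + sse_right
            if best is None or sse < best[0]:
                best = (sse, w, 0.5 * (p[k - 1] + p[k]))
    return best


def fit_predict(X, y, depth=3, min_leaf=5, n_angles=16):
    """Fitted values at the design points: each point receives the mean
    of the observations in its own leaf (leaf-only averaging)."""
    fitted = np.full(len(y), y.mean())
    if depth == 0 or len(y) < 2 * min_leaf:
        return fitted
    split = best_oblique_split(X, y, n_angles, min_leaf)
    if split is None:
        return fitted
    _, w, t = split
    left = X @ w <= t
    fitted[left] = fit_predict(X[left], y[left], depth - 1, min_leaf, n_angles)
    fitted[~left] = fit_predict(X[~left], y[~left], depth - 1, min_leaf, n_angles)
    return fitted


def make_grid(n):
    """Fixed design points: centers of an n-by-n pixel grid on [0, 1]^2."""
    g = (np.arange(n) + 0.5) / n
    xx, yy = np.meshgrid(g, g, indexing="ij")
    return np.column_stack([xx.ravel(), yy.ravel()])


def demo(n=24, sigma=0.1, seed=0):
    """Denoise a toy surface with one diagonal jump location curve."""
    rng = np.random.default_rng(seed)
    X = make_grid(n)
    truth = np.where(X[:, 0] + X[:, 1] > 1.02, 1.0, 0.0)
    noisy = truth + sigma * rng.normal(size=truth.shape)
    fitted = fit_predict(X, noisy)
    mse_noisy = np.mean((noisy - truth) ** 2)
    mse_fitted = np.mean((fitted - truth) ** 2)
    return mse_noisy, mse_fitted


if __name__ == "__main__":
    print(demo())
```

Because the toy jump runs along a diagonal, an axis-aligned tree would need a staircase of splits to approximate it, whereas a single oblique split can separate the two regions. This is the intuition behind using oblique splits to preserve JLCs; the toy example does not reflect the paper's theoretical assumptions or its tuning choices.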
Information
| Preprint No. | SS-2025-0120 |
|---|---|
| Manuscript ID | SS-2025-0120 |
| Complete Authors | Subhasish Basak, Anik Roy, Partha Sarathi Mukherjee |
| Corresponding Authors | Partha Sarathi Mukherjee |
| Emails | psmukherjee.statistics@gmail.com |
References
- Abdelhamed, A., M. Afifi, R. Timofte, M. Brown, Y. Cao, Z. Zhang, W. Zuo, X. Zhang, J. Liu, W. Chen, C. Wen, M. Liu, S. Lv, Y. Zhang, Z. Pan, B. Li, T. Xi, Y. Fan, X. Yu, and V. Kumar (2020, 06). NTIRE 2020 challenge on real image denoising: Dataset, methods and results. pp. 2077–2088.
- Breiman, L. (1984). Classification and Regression Trees. (The Wadsworth statistics / probability series). Wadsworth International Group.
- Breiman, L. (2001). Random forests. Machine Learning 45, 5–32.
- Buades, A., B. Coll, and J.-M. Morel (2011). Non-local means denoising. Image Processing On Line 1, 208–212.
- Cattaneo, M. D., R. Chandak, and J. M. Klusowski (2024). Convergence rates of oblique regression trees for flexible function libraries. The Annals of Statistics 52(2), 466–490.
- Chaudhuri, A. and S. Chatterjee (2023). A cross-validation framework for signal denoising with applications to trend filtering, dyadic CART and beyond. The Annals of Statistics 51(4), 1534–1560.
- Chu, C., I. Glad, F. Godtliebsen, and J. Marron (1998). Edge-preserving smoothers for image processing. Journal of the American Statistical Association 93(442), 526–541.
- Donoho, D. L. (1997). CART and best-ortho-basis: a connection. The Annals of Statistics 25(5), 1870–1911.
- Froment, J. (2014). Parameter-free fast pixelwise non-local means denoising. Image Processing On Line 4, 300–326.
- Gonzalez, R. and R. Woods (2018). Digital Image Processing (4th ed.). USA: Pearson.
- Gramacy, R. B. and H. K. H. Lee (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association 103(483), 1119–1130.
- Kang, Y. and P. Qiu (2024). DRIP: Discontinuous Regression and Image Processing. R package version 2.3.
- Kang, Y., Y. Shi, Y. Jiao, W. Li, and D. Xiang (2021). Fitting jump additive models. Computational Statistics & Data Analysis 162, 107266.
- Konomi, B. A., H. Sang, and B. K. Mallick (2014). Adaptive Bayesian nonstationary modeling for large spatial datasets using covariance approximations. Journal of Computational and Graphical Statistics 23(3), 802–829.
- Li, P., S. Wang, T. Li, J. Lu, Y. HuangFu, and D. Wang (2020). A large-scale CT and PET/CT dataset for lung cancer diagnosis (Lung-PET-CT-Dx). https://www.cancerimagingarchive.net/collection/lung-pet-ct-dx/.
- Mukherjee, P. S. and P. Qiu (2011). 3-d image denoising by local smoothing and nonparametric regression. Technometrics 53(2), 196–208.
- Qiu, P. (2005). Image Processing and Jump Regression Analysis. Wiley Series in Probability and Statistics. New York: Wiley.
- Qiu, P. (2009). Jump-preserving surface reconstruction from noisy data. Annals of the Institute of Statistical Mathematics 61, 715–751.
- Roy, A. and P. S. Mukherjee (2024). Image comparison based on local pixel clustering. Technometrics 66(4), 495–506.
- Roy, A. and P. S. Mukherjee (2025). Upper quantile-based CUSUM-type control chart for detecting small changes in image data. Journal of Applied Statistics 52(11), 2156–2171.
- Zamir, S. W., A. Arora, S. Khan, M. Hayat, F. S. Khan, and M.-H. Yang (2022). Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5728–5739.
- Zhan, H., Y. Liu, and Y. Xia (2024). Consistency of oblique decision tree and its boosting and random forest.
Author Affiliations and Contact Details: Subhasish Basak, Indian Statistical Institute, Kolkata, India,
Acknowledgments
The authors thank the editor, the associate editor, and two anonymous reviewers for their comments and suggestions, which greatly improved the quality of this paper.