CLC number: O21

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2008-12-29

Journal of Zhejiang University SCIENCE A 2009 Vol.10 No.6 P.909-921

http://doi.org/10.1631/jzus.A0820140


Outlier detection by means of robust regression estimators for use in engineering science


Author(s):  Serif HEKIMOGLU, R. Cuneyt ERENOGLU, Jan KALINA

Affiliation(s):  Department of Geodesy and Photogrammetry Engineering, Yildiz Technical University, Istanbul 34349, Turkey

Corresponding email(s):   hekim@yildiz.edu.tr, ceren@yildiz.edu.tr

Key Words:  Linear regression, Outlier, Mean success rate (MSR), Leverage point, Least median of squares (LMS), Least trimmed squares (LTS)


Serif HEKIMOGLU, R. Cuneyt ERENOGLU, Jan KALINA. Outlier detection by means of robust regression estimators for use in engineering science[J]. Journal of Zhejiang University Science A, 2009, 10(6): 909-921.

@article{Hekimoglu2009,
title="Outlier detection by means of robust regression estimators for use in engineering science",
author="Serif HEKIMOGLU and R. Cuneyt ERENOGLU and Jan KALINA",
journal="Journal of Zhejiang University Science A",
volume="10",
number="6",
pages="909--921",
year="2009",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.A0820140"
}

%0 Journal Article
%T Outlier detection by means of robust regression estimators for use in engineering science
%A Serif HEKIMOGLU
%A R. Cuneyt ERENOGLU
%A Jan KALINA
%J Journal of Zhejiang University SCIENCE A
%V 10
%N 6
%P 909-921
%@ 1673-565X
%D 2009
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A0820140

TY  - JOUR
T1  - Outlier detection by means of robust regression estimators for use in engineering science
A1  - Serif HEKIMOGLU
A1  - R. Cuneyt ERENOGLU
A1  - Jan KALINA
JO  - Journal of Zhejiang University Science A
VL  - 10
IS  - 6
SP  - 909
EP  - 921
SN  - 1673-565X
Y1  - 2009
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.A0820140
ER  -


Abstract: 
This study compares the ability of different robust regression estimators to detect and classify outliers. Well-known estimators with high breakdown points were compared using simulated data. The mean success rate (MSR) of each estimator was computed and used as the comparison criterion. The results showed that the least median of squares (LMS) and least trimmed squares (LTS) were the most successful methods for data that included leverage points, masking and swamping effects, or critical and concentrated outliers. We recommend using LMS and LTS as diagnostic tools to classify outliers, because they remain robust even when applied to models that are heavily contaminated or that have a complicated structure of outliers.
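
To make the two recommended estimators concrete, the following is a minimal Python sketch for a straight-line model, not the authors' implementation. It approximates LMS and LTS by searching only the lines through pairs of data points (an elemental-subset search, so the result is approximate rather than exact). The LTS coverage h = n//2 + 1, the scale factor 1.4826, the cutoff 2.5, the simulated model y = 2 + 3x, and the definition of a "success" as every planted outlier being flagged are all assumptions made for illustration; the paper's simulation design and MSR definition may differ. All function names are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import median


def squared_residuals(x, y, intercept, slope):
    """Squared residuals of the line y = intercept + slope * x."""
    return [(yk - intercept - slope * xk) ** 2 for xk, yk in zip(x, y)]


def best_line_through_pairs(x, y, criterion):
    """Try the line through every pair of points; keep the one that
    minimises `criterion` applied to the sorted squared residuals."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid regression candidate
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        value = criterion(sorted(squared_residuals(x, y, intercept, slope)))
        if best is None or value < best[0]:
            best = (value, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


def lms_fit(x, y):
    """Approximate LMS: minimise the median of the squared residuals."""
    return best_line_through_pairs(x, y, median)


def lts_fit(x, y, h=None):
    """Approximate LTS: minimise the sum of the h smallest squared residuals.
    The default h = n // 2 + 1 is an assumption of this sketch."""
    h = h if h is not None else len(x) // 2 + 1
    return best_line_through_pairs(x, y, lambda sq: sum(sq[:h]))


def flag_outliers(x, y, intercept, slope, cutoff=2.5):
    """Flag points whose absolute residual exceeds cutoff * robust scale.
    Scale = 1.4826 * sqrt(median squared residual), an assumed convention."""
    sq = squared_residuals(x, y, intercept, slope)
    scale = 1.4826 * sqrt(median(sq))
    return {k for k, r2 in enumerate(sq) if sqrt(r2) > cutoff * scale}


def mean_success_rate(fit, trials=200, n=20, n_outliers=4, seed=0):
    """Share of simulated samples in which every planted outlier is flagged.
    The simulation design here is illustrative, not the paper's."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        x = [rng.uniform(0, 10) for _ in range(n)]
        y = [2 + 3 * xk + rng.gauss(0, 0.5) for xk in x]
        planted = set(rng.sample(range(n), n_outliers))
        for k in planted:
            y[k] += rng.uniform(5, 10)  # shift the planted outliers upward
        intercept, slope = fit(x, y)
        if planted <= flag_outliers(x, y, intercept, slope):
            successes += 1
    return successes / trials


# Small deterministic example: ten points near y = 2 + 3x, with the last
# two (which are also the extreme x values) shifted by +20.
x_demo = list(range(10))
noise = [0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1]
y_demo = [2 + 3 * xk + e for xk, e in zip(x_demo, noise)]
y_demo[8] += 20
y_demo[9] += 20

for name, fit in (("LMS", lms_fit), ("LTS", lts_fit)):
    b0, b1 = fit(x_demo, y_demo)
    print(name, b0, b1, sorted(flag_outliers(x_demo, y_demo, b0, b1)))
```

Calling `mean_success_rate(lms_fit)` and `mean_success_rate(lts_fit)` gives one MSR value per estimator under these assumed settings. Searching all pairs costs on the order of n³ operations, which is acceptable for the small samples used here but would need random subsampling for larger data sets.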
