Detecting atypical data in air pollution studies by using shorth intervals for regression

Published online by Cambridge University Press:  15 November 2005

Cécile Durot
Affiliation:
Université Paris Sud, Bâtiment 425, 91405 Orsay Cedex, France; [email protected]
Karelle Thiébot
Affiliation:
Université Paris Sud, Bâtiment 425, 91405 Orsay Cedex, France; [email protected] ; Air Pays de la Loire, 2 rue A. Kastler, BP 30723, 44307 Nantes Cedex 3, France.

Abstract

To validate pollution data, subject-matter experts at Airpl (an organization that maintains a network of air pollution monitoring stations in western France) perform daily visual examinations of the data and check their consistency. In this paper, we describe these visual examinations and propose a formalization of the problem. The examinations consist of comparisons of so-called shorth intervals, so we build a statistical test that compares such intervals in a nonparametric regression model. This makes it possible to detect atypical data. A practical application of the test is given.
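For readers unfamiliar with the term, the following is a minimal sketch of the classical shorth interval of a one-dimensional sample, taken here as the shortest interval containing at least a given fraction of the observations. It is not the regression-based test built in the paper; the function name `shorth_interval`, the `coverage` parameter, and the rounding rule are assumptions made for illustration.

```python
import math


def shorth_interval(values, coverage=0.5):
    """Return (low, high): the shortest interval containing at least
    ceil(coverage * n) of the n sample points.

    Illustrative only; the rounding convention is an assumption.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must lie in (0, 1]")
    # Number of observations the interval must contain.
    k = max(1, math.ceil(coverage * n))
    # Slide a window over k consecutive order statistics and keep the
    # narrowest one (the first such window if several tie).
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Hypothetical readings: most values cluster between 10 and 13.
readings = [0, 10, 11, 12, 13, 30, 50]
low, high = shorth_interval(readings)
print(low, high)
```

With these made-up readings, the interval covers the tight cluster of values and leaves out the isolated ones, which is the kind of summary that can then be compared across series.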

Type
Research Article
Copyright
© EDP Sciences, SMAI, 2005

