Introduction
Nearest-neighbours (NN) avalanche forecasting compares data describing past avalanche and non-avalanche days with current or forecast data. In NN a distance between days in the dataset and the forecast day is defined to identify previous days which are most “similar” to the forecast day (the nearest neighbours). The nature of events on the nearest neighbours is then used to build hypotheses about the likely resulting avalanches (Buser, 1983, 1989).
Statistically, NN is a non-parametric pattern classification technique which arranges data in a multi-dimensional space and applies a distance metric (usually Euclidean) to define the distance between past and present data (Ripley, 1996).
Various NN forecast techniques are currently used operationally in local avalanche forecasting. All assume that similar events are likely to exhibit similar precursors and that snow and weather factors and/or snowpack factors can be extrapolated over the geographic forecast area (e.g. Buser, 1983, 1989; Gassner and others, 2001; McCollister and others, 2002; Mérindol and others, 2002; Purves and others, 2002).
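The neighbour search described above can be sketched as follows. This is a minimal illustration, not the implementation used in NXD or Cornice: the function name, the feature vectors and the assumption that all variables have already been scaled to comparable ranges are hypothetical.

```python
import math

def nearest_neighbours(history, forecast_day, k=10):
    """Return the indices of the k past days closest to forecast_day.

    history: list of feature vectors, one per past day (assumed pre-scaled).
    forecast_day: feature vector for the day being forecast.
    """
    # Euclidean distance from the forecast day to every past day.
    distances = [(math.dist(day, forecast_day), i) for i, day in enumerate(history)]
    distances.sort()
    return [i for _, i in distances[:k]]

# Made-up, scaled feature vectors (e.g. new snow, wind, temperature).
history = [(0.1, 0.2, 0.5), (0.9, 0.8, 0.1), (0.2, 0.1, 0.4), (0.7, 0.9, 0.2)]
today = (0.85, 0.85, 0.15)
neighbours = nearest_neighbours(history, today, k=2)
```

In this toy example the two selected neighbours are the past days at indices 1 and 3, which lie closest to the forecast day in the three-dimensional feature space.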
In this paper, the forecasted event is defined as a day with one or more recorded avalanches in the forecast region (an avalanche day). When such an avalanche day is found amongst the nearest neighbours selected by NN, it is called a positive neighbour.
The nearest neighbours can be interpreted in a number of ways, with the three most common interpretations being:
Categorial forecast: Here decision boundaries are used to classify days into a number of forecast categories. Often these categories are dichotomous (avalanches forecast or not), and an avalanche day is forecast when the number of positive neighbours is greater than or equal to some defined decision boundary. Brabec and Meister (2001) have used NN in multi-categorial form to predict the five categories of the European avalanche-hazard scale.
Probability forecast: The probability of the event is estimated (e.g. “Avalanches are expected today with a probability of 10%”). A probability forecast relies on the ability of NN to produce an estimation of the a posteriori probability of an event. This posterior probability is then used in the forecast as the prior probability of the event. In practice, the number of positive neighbours divided by the total number of nearest neighbours is used to estimate the probability of an avalanche day. McCollister and others (2002) have used such an approach at Jackson Hole ski area, Wyoming, U.S.A.
Descriptive forecast: A detailed list of events and all associated, individual observations recorded in the past are provided by NN to the forecaster. This description is then used by the forecaster as an aide-mémoire characterizing the nature of the associated avalanche days. This information is combined with other available information, and further interpreted by the forecaster. This descriptive scheme, based on hypothesis testing as described by LaChapelle (1980), has been recommended by Buser (1983, 1989) and Purves and others (2002). It has been practised by users of NXD (Gassner and others, 2001), Cornice (Purves and others, 2002) and Astral (Mérindol and others, 2002).
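The categorial and probability interpretations both reduce to simple operations on the number of positive neighbours. The sketch below illustrates this under stated assumptions: the function names and the example flags are hypothetical, and the dichotomous rule follows the "greater than or equal to the decision boundary" definition given above.

```python
def positive_neighbours(neighbour_is_avalanche_day):
    """Count the nearest neighbours that were avalanche days."""
    return sum(1 for flag in neighbour_is_avalanche_day if flag)

def categorial_forecast(n_positive, decision_boundary):
    """Dichotomous forecast: True means an avalanche day is forecast."""
    return n_positive >= decision_boundary

def probability_forecast(n_positive, n_neighbours):
    """Estimated probability of an avalanche day."""
    return n_positive / n_neighbours

# Made-up avalanche-day flags for the 10 nearest neighbours of a forecast day.
flags = [True, False, False, True, False, False, True, False, False, False]
n_pos = positive_neighbours(flags)                 # 3 positive neighbours
forecast = categorial_forecast(n_pos, decision_boundary=3)
probability = probability_forecast(n_pos, len(flags))
```

With three positive neighbours out of ten, a decision boundary of 3 yields an avalanche-day forecast and the probability estimate is 0.3. The descriptive interpretation has no such reduction, since it passes the full neighbour records to the forecaster.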
Each interpretation requires adequate verification. The verification is intended to indicate the positive and negative aspects of differing interpretations of NN and to examine the possible influences of different datasets. The latter question was addressed using two datasets with different purposes utilized in operational avalanche forecasting.
The first dataset was used to forecast daily avalanche risk to roads, railway and settlement areas in a region of Valais, Switzerland, where the forecaster must decide whether roads or railways must be closed or endangered habitation evacuated.
In the second dataset, the model was used in Lochaber, Scotland, by avalanche forecasters responsible for provision of back-country avalanche forecasts to mountaineers. These forecasts describe the current snow and avalanche conditions and their likely evolution over 24 hours and utilize the European avalanche-hazard scale to describe the degree of hazard.
In both cases, the forecasters utilize the descriptive interpretation of the 10 nearest neighbours. In the Swiss case the NN rule is performed by NXD (Gassner and others, 2001) and in the Scottish case by Cornice (Purves and others, 2002).
Characteristics of the Datasets
Although both datasets are used to describe avalanche events, they differ a great deal in the purpose of the forecasting being carried out and therefore in the nature and frequency of occurrence of the recorded events.
In the Swiss case, only large avalanches which may reach traffic lines or settlements are recorded. These avalanches often occur in conditions of High or Extreme avalanche hazard, and most are triggered naturally. Avalanches with no hazard potential to roads, railways or habitation are not recorded. The base rate (i.e. the fraction of all days in the dataset when avalanches were recorded) is 7%. In the Scottish case, a mountaineer might be dislodged or buried by even a small avalanche. Given that most events involving victims are triggered by those victims, then human-triggered avalanches are of particular importance to forecasters. Such conditions often equate to Moderate or Considerable hazard of avalanches on the European avalanche-hazard scale. The base rate is 20% for this dataset. Table 1 summarizes characteristics of each dataset.
Verification Methods
Neither the quality of the Scottish and Swiss forecasts nor the NXD and Cornice models themselves are compared, since neither the underlying datasets nor the forecast purposes match. Indeed, Murphy (1991) has shown that comparative verification of two forecast systems is a complex and high-dimensional problem compared to the absolute verification considered here.
Verification of the categorial forecast
The measures-oriented verification of dichotomous categorial forecasts can be divided into finding accuracy measures and skill measures. Such measures can be obtained from the joint distribution of observations and forecasts (Table 2), and a selection of such measures is introduced in Table 3. More detail on such measures can be found in Doswell and others (1990) and Wilks (1995, p. 238–250).
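Tables 2 and 3 are not reproduced in this text, so the sketch below assumes the standard 2 × 2 contingency-table definitions of the measures named later in the paper (POD, SR, HR, KSS, HSS), as commonly given in the verification literature. The function name, argument names and the example counts are hypothetical, and the paper's own Table 3 may define some measures differently.

```python
def verification_measures(hits, false_alarms, misses, correct_negatives):
    """Accuracy and skill measures from a 2 x 2 contingency table.

    hits: event forecast and observed
    false_alarms: event forecast but not observed
    misses: event observed but not forecast
    correct_negatives: event neither forecast nor observed
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    return {
        "POD": a / (a + c),                  # probability of detection
        "SR": a / (a + b),                   # success ratio (1 - false-alarm ratio)
        "HR": (a + d) / n,                   # hit rate (proportion correct)
        "KSS": a / (a + c) - b / (b + d),    # Kuipers skill score
        "HSS": 2 * (a * d - b * c)
               / ((a + c) * (c + d) + (a + b) * (b + d)),  # Heidke skill score
    }

# Made-up counts for illustration only.
measures = verification_measures(hits=20, false_alarms=10, misses=5, correct_negatives=65)
```

Because HR counts correct negatives, it tends to be high whenever non-events dominate the dataset, which is why it is usually read alongside the POD/SR pair and the skill scores.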
Verification of the probability forecast
Probability forecasts are best verified and interpreted by factorizing the joint probability distribution of observations and forecasts into conditional and marginal distributions, called distributions-oriented verification (Murphy and Winkler, 1986). Various aspects of forecast quality can be described by factorization. In this paper the following are examined:
Reliability: also called calibration or conditional bias, it is quantified by the weighted average of the squared differences of forecast probabilities and the relative frequencies of the events in each subsample (Wilks, 1995, p. 262).
Resolution: the ability to discern days with different avalanche-day probability.
Bias: the general tendency to under- or over-forecast.
Furthermore, additional aspects of the forecast, such as skill, sharpness, discrimination and uncertainty can be deduced from other factorizations (Wilks, 1995, p. 258–272).
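The reliability measure defined above, together with a resolution term, can be sketched as follows. The code assumes the usual decomposition in which days are grouped into subsamples by their forecast probability; the resolution formula (weighted squared difference between each subsample's relative frequency and the base rate) is an assumption taken from that standard decomposition rather than from this paper, and all names and data are hypothetical.

```python
from collections import defaultdict

def reliability_resolution(forecast_probs, observed):
    """Return (reliability, resolution, base_rate).

    forecast_probs: forecast probability for each day
    observed: 1 if the day was an avalanche day, else 0
    """
    # Subsamples: all days that received the same forecast probability.
    groups = defaultdict(list)
    for p, o in zip(forecast_probs, observed):
        groups[p].append(o)

    n = len(observed)
    base_rate = sum(observed) / n
    reliability = sum(
        len(obs) * (p - sum(obs) / len(obs)) ** 2 for p, obs in groups.items()
    ) / n
    resolution = sum(
        len(obs) * (sum(obs) / len(obs) - base_rate) ** 2 for obs in groups.values()
    ) / n
    return reliability, resolution, base_rate

# Made-up example: four days forecast at 0.0 and four days forecast at 0.5.
probs = [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]
obs = [0, 0, 0, 0, 1, 0, 1, 0]
reliability, resolution, base_rate = reliability_resolution(probs, obs)
```

A reliability of zero corresponds to points lying exactly on the perfect-reliability line of an attributes diagram, while a larger resolution indicates that the subsamples differ more strongly from the base rate.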
Verification of the descriptive forecast
If the description (event list and associated details) provided by the NN rule is intended to be used by the forecaster, then some adequate verification of this description is required. This verification should characterize the description with respect to its ability to provide the forecaster with meaningful information.
In this paper, a first approach is presented, whereby the forecaster of the Swiss dataset was asked to perform a critical, subjective post-rating of each day when avalanches occurred in his region. Emphasis was laid on rating the value of the information provided by NN, not the quality of his final forecast. The NN description of each forecast day was rated as one of five ordinal categories: “severe misfit”, “misleading”, “unhelpful” (i.e. neither positive nor negative), “useful” or “very useful”.
Results
Here the results obtained from the verification of the three interpretation schemes are presented.
Categorial forecast
A measures-oriented verification was carried out to examine how the accuracy and the skill of the forecasts varied for a range of decision boundaries between 1 and 10 positive neighbours (Figure 1a and b).
No results are given for decision boundaries above 6 in Figure 1a and above 9 in Figure 1b. No data with these numbers of positive neighbours were available in the respective datasets.
Probability forecast
A distributions-oriented verification was carried out to examine how well NN was able to produce a probability forecast, especially with regard to reliability and resolution as presented in the attributes diagrams (Fig. 2a and b).
Descriptive forecast
A summary of the subjective post-rating of the value of information provided by NN is presented in Figure 3. The histogram bars show the relative frequencies of the classes defined by how helpful the information was on days when avalanches occurred.
Discussion
Categorial forecast
Figure 1 describes the dependency of accuracy (POD, SR, HR) and skill (KSS, HSS) on the choice of decision boundary (k). Various criteria may be used to specify the value of the decision boundary, such as POD(k) = SR(k), max[KSS(k)] or max[HSS(k)]. While Figure 1 is helpful in quantifying the dependency, the choice of decision boundary should be case-dependent and take account of human factors such as appreciation of risk and the consequences of unforecasted events and false alarms (McClung, 2002).
Despite Murphy’s comments on the difficulties of comparison between datasets (Murphy, 1991), some simple comparisons between the Swiss and Scottish datasets can still be drawn. The Swiss data (Fig. 1a; base rate 7%) exhibit a higher HR than the Scottish (Fig. 1b; base rate 20%) while their POD/SR pair is less accurate. It appears that these differences are driven chiefly by the base rate.
Probability forecast
The distributions on the attributes diagrams in Figure 2a and b exhibit several interesting features (Wilks, 1995, p. 266). The Swiss dataset (Fig. 2a) displays “unsteady” behaviour for days with over four positive neighbours, due to insufficient data. These data points result from only 15 out of 1048 days. This suggests that on a dataset with a base rate as low as 7%, 1048 data points still constitute an insufficient database for a definitive verification over the entire range up to ten neighbours. The Scottish dataset is also not entirely sufficient (Fig. 2b). Indeed, the attributes diagram exhibits a decrease in resolution for days with over five positive neighbours, indicated by the flattening of the curve to the right whereby the probability remains constant for an increasing number of positive neighbours.
Next, points with sufficient data are considered: days with zero to four positive neighbours in Figure 2a and with zero to six in Figure 2b. The closer the data come to line iii indicating perfect reliability, the better the forecast in this respect. Both forecasts exhibit good reliability which is a positive feature of NN. Both forecasts also exhibit little bias as shown by the equal distribution of data points above and below line iii.
Descriptive forecast
On 64% of forecasted avalanche days, the descriptive information provided by the NN rule was a posteriori judged useful or very useful by the forecaster. Severe misfits were exceptional and limited to 2% of the forecast days, while 12% of the descriptions were misleading and 22% unhelpful (Fig. 3). This indicates that the detailed description of the events in the nearest neighbours provides forecasters with valuable information.
Positive and negative aspects of interpretations of NN
All three interpretations provide some useful information content, but this is dependent on the intended application and the underlying data.
Categorial forecasting provides no room for interpretation by the forecaster: no information on the uncertainty of a forecast is available. Thus, if forecasters wish to utilize categorial forecasting it is key that they understand the implications of the POD/SR pair and the human factors related to false alarms and unforecasted events.
Probability forecasts may be helpful when used in a suitable context, but given that the definition of events in this case study is very broad (from a single avalanche to many in a given area and on a given day), a probability value on its own may be of limited use to the forecaster. Defining the events more precisely will inevitably produce less reliable forecasts due to the reduction of the base rate. This is a serious dilemma in avalanche forecasting, where the requirement is often to produce more precise forecasts (in terms of space, time or avalanche type).
Descriptive forecasts provide the most flexibility for the forecaster to interpret the nearest neighbours and associated avalanches. This interpretation, like any other part of the conventional avalanche-forecasting process, requires considerable knowledge and skill from the forecaster.
Conclusion and Further Work
Measures-oriented verification quantifies the skill and accuracy of forecasts but does not allow comparative verification. Distributions-oriented verification of forecasts leads to valuable information on the sufficiency of the database, the reliability of the forecast, its resolution and its bias.
NN apparently produces reliable, unbiased probability forecasts, but this must be verified case-by-case. Forecasters may find difficulty making decisions based only on probability forecasts. A low base rate is a serious limiting factor on the reliability and skill of an NN forecast.
The descriptive interpretation produces useful and interpretable forecasts, and an initial verification is presented in this paper. Many aspects of the value of information using descriptive NN remain unknown and will be investigated in further work.
Acknowledgements
We are very grateful to our editor B. Jamieson and to the reviewers for their many comments leading to significant improvement of the text. We would like to thank M. Volorio for carrying out a tremendous task in assessing the subjective value of information contained in the descriptions of each avalanche day, and G. Moss of the sportScotland Avalanche Information Service.