Introduction
Freshwater is a critically important resource that supports agriculture, recreation, cultural practices, and many other productive activities. At the same time, nutrients and sediments are impairing the world’s waterbodies: only 60% of countries report having ambient freshwater quality that is “good” or better, and 44% of the world’s household waste is not treated before entering the water system (UN-Water, 2021). Cost–benefit analysis (CBA) is one tool available to inform efficient water management and policy decisions, but in many countries the valuation estimates that CBA needs for benefit transfer are lacking. Using New Zealand as a case study, we carried out a stated preference (SP) analysis with the aim of providing results that can be used to inform future policy.
New Zealand has put forth many national and regional efforts aimed at improving water quality (PCE, 2016) and received international attention in 2017 by declaring the Whanganui River a legal person (Warne, 2019). However, the quality of New Zealand’s rivers and lakes continues to decline. More than two-thirds of rivers exceed the government’s nitrogen or phosphorus limits (MFE and Stats NZ, 2019), and in a recent national survey, the public ranked the condition of rivers and lakes as the lowest among New Zealand’s environmental amenities (Booth et al., 2022). To inform substantive change, the government needs tools to analyze trade-offs in water quality policy options. Section 32 of New Zealand’s Resource Management Act (1991) requires an identification and assessment of the benefits and costs of environmental policies and rules, but the existing literature is not well suited to analyzing national policy, especially for water quality.
Although there are a few studies on water quality valuation in New Zealand, including several SP studies (see Marsh and Mkwara [2013] and Harris et al. [2016]), applying this existing literature to government policies has been difficult. These difficulties arise from several issues, including scaling up local estimates to the national level, studies using large water quality changes that are not representative of actual policy changes, and the use of subjective or aggregate water quality variables that cannot be linked to policy-relevant measures. These are common issues found throughout the international literature on water quality valuation (Moran and Dann, 2008; Griffiths et al., 2012; Newbold et al., 2018).
We designed and implemented a national SP survey with explicit attention given to the use of the results for future benefit transfer. Our discrete choice experiment utilizes three water quality parameters – nutrients, water clarity, and E. coli levels – chosen to align with government policy levers and to be relevant and salient to the public. The choice experiment presents policy changes at the regional council level, which corresponds to the administrative unit for most environmental policies in New Zealand (footnote 1). The policy changes presented to respondents are also more in line with the outcomes of actual policies than those in many past SP studies, which pose unrealistically large changes in the environmental commodity (Newbold et al., 2018). Furthermore, the borders of New Zealand’s regional councils are based around watersheds and catchments, so there are fewer cross-boundary pollution concerns than for administrative units in other countries. The results of this study are particularly useful for CBA within New Zealand, and the methods provide an example for studies in other countries to better align with policy.
We find that people are willing to pay for improvements in all three water quality parameters and identify respondent characteristics that drive observed heterogeneity in willingness to pay (WTP). Accounting for such heterogeneity allows the results to be tailored to subnational areas in a benefit transfer. At the same time, we also find and control for significant unobserved heterogeneity in WTP. We apply our results to a recent water quality policy proposed by New Zealand’s Ministry for the Environment (MFE) to reduce sediment runoff (Neverman et al., 2019). Benefit transfer based on our survey results suggests nationwide annual benefits of about NZ$144 million (2018 NZD), or approximately US$99 million. This benefit transfer exercise also illuminates notable differences between regional councils. We compare our results to a recent municipal vote on an Auckland property tax designed to raise over NZ$500 million in the next 10 years for water quality improvements. The vote was successful, with the resulting tax applied to commercial and residential buildings, and exemplifies the large values residents have for water quality.
Background
Freshwater resources are an integral part of the cultural heritage, economic development, and national character of New Zealand (NZ) (Ambrey et al., 2017; Awatere et al., 2017). The indigenous Māori culture is important in NZ, and from a Māori world view, the separation of Ranginui (sky father) and Papatūānuku (earth mother) produced freshwater, emphasizing both the importance and connection of freshwater to the Māori people and NZ population more broadly (MFE and Stats NZ, 2020). Given this importance, there are several existing water quality valuation studies, but many are unpublished, in the gray literature, or from government (Miller, 2014; Phillips, 2014; Tait et al., 2016) or consultant reports that do not yield original estimates (Marsh and Mkwara, 2013). Other studies focus on only one region of NZ (Tait et al., 2011; Marsh and Phillips, 2015). Transferring the results of these case studies to policies and populations in other areas or at the national level is questionable and could result in large errors (Smith and Pattanayak, 2002).
The NZ-based SP literature is similar to the international literature (Van Houtven et al., 2007) in that it has examined several different water quality indicators. Some of these indicators are related to agricultural practices, such as riparian buffer restoration (Cullen et al., 2006) or nutrient leaching (Baskaran et al., 2009; Takatsuka et al., 2009). Tait et al. (2011) value the ecological condition of waterbodies using poor, fair, and good quality categories. These categories are described by the type of weeds present, percent algae cover, and the types of insects and fish species present. Swimming suitability indicators of waterbodies have also been used (Marsh, 2012; Miller, 2014; Miller et al., 2015) to represent recreation and health impacts. Marsh and Phillips (2015) used several different indicators of water quality alongside a qualitative swimming suitability measure, including ecological health, salmon and trout condition, and tributary water quality, which were presented qualitatively as good, satisfactory, not satisfactory, or poor. Translating many of these indicators and qualitative categories to marginal changes, as often predicted from policy projections, is difficult and generally inappropriate. Tait et al. (2017) use qualitative indicators like poor, moderate, and good for water clarity and ecological quality. However, they directly link each of their attributes to objective ranges in the underlying water quality parameters. For instance, poor ecological quality is defined as a Macroinvertebrate Community Index score less than 80, and poor clarity is defined as visibility of less than 1.1 m. It is not clear, however, if those quantified ranges were presented to respondents.
Many of these water quality indicators are used in other studies internationally (US EPA, 2015; Johnston and Bauer, 2020). For valuation work on many rules, the US EPA commonly uses a Water Quality Index (WQI), which combines several underlying parameters, including nutrients, pH, temperature, clarity, dissolved oxygen, and sediment (Walsh and Wheeler, 2013). EPA valuation relies on meta-analysis benefit transfer, which translates studies that use other water quality indicators into the WQI (Johnston et al., 2017). The European Water Framework Directive aims to improve waterbodies to “good” status, as defined by several underlying water quality variables. SP studies there have focused on both individual parameters like nutrients and the bad/poor/good status of waterbodies (Ferrini et al., 2014; Anciaes, 2022).
Johnston et al. (2012) provide guidelines for including ecological content in SP surveys. They note that less structured treatment of attributes can cause problems with subsequent welfare estimation. Respondents’ internal conceptualization of the commodity may be different from that presented or intended by the researchers. This can be a complicated balancing exercise with water quality because the commodity itself has multiple dimensions that can be difficult to communicate to survey participants. For instance, Milon and Scrogin (2006) explored values associated with wetland restoration by presenting respondents with either functional attributes like water levels or structural attributes like species abundance, with the latter yielding significantly different (higher) WTP. In their paper on contemporary SP guidance, Johnston et al. (2017) recommend that “the change being valued be based on how respondents tend to perceive the good.”
The size of the change presented to respondents creates another issue with applying previous SP estimates for benefit transfer. Miller et al. (2015), for example, have respondents compare policies that result in 0%, 20%, or 40% improvements in the percent of sites suitable for swimming. There are few plausible policies that could improve water quality (or reduce nutrient inputs) by that much. Nonetheless, such large changes, on the order of 20% to 50%, are often applied in SP surveys (e.g., Baskaran et al. [2009]). A meta-analysis of 140 observations from 51 SP studies of water quality (US EPA, 2015), where quality was represented on a scale of 0–100 with 100 representing pristine waterbodies, found that fewer than 10% of the observations used water quality index changes under 10 (Figure 1; footnote 2).
Survey design and implementation
We considered all the aforementioned gaps when designing the SP survey instrument used in this study, including identifying water quality measures and changes that matter to the general public, that could be accurately understood by respondents, that are realistic, and that could be directly linked to objective policy-induced changes. The ultimate commodity being valued in the SP survey is an improvement in the quality of rivers and streams in the regional council where a respondent resides. The survey was implemented in 2018 and 2019. There are two versions of the survey, one for the North Island and one for the South Island of New Zealand. The surveys are identical except for bar graphs illustrating current attribute levels for each region on that island.
The survey instrument was developed and refined using focus groups and cognitive interviews (footnote 3). Six focus groups were conducted in total during May and June 2018 at three different locations: two focus groups each at two urban locations (Auckland and Wellington) and two in a rural area (Hawke’s Bay). Input from the focus groups was used to refine the survey text and questions, identify relevant water quality attributes, and improve communication and presentation. To further refine the survey instrument, 10 cognitive interviews were conducted. Eight of the cognitive interviews were in Wellington and two were held in the rural Wairarapa area (footnote 4).
Depending on where a respondent lives, the survey begins with a map of the North or South Island that includes the regions and the major rivers on that island. To emphasize consequentiality (Carson and Groves, 2007) and credibility of the survey, the instructions remind respondents that their “… answers will help inform policy makers” and that the baseline data are provided by the MFE and regional council governments. Respondents are then asked questions on recreational use and visitation to rivers and streams in their region, followed by introductory text defining each water quality attribute. Respondents are then shown figures depicting the current baseline levels in their regional council area for each attribute, as well as for the other regions of their island. Questions about awareness of the attributes are also presented. See the full survey example in Appendix D for details.
The policy scenarios, provision mechanism, and payment vehicle for public programs to improve water quality in a respondent’s region are then introduced. To minimize hypothetical bias and enforce consequentiality (Johnston et al., 2017; Vossler and Zawojska, 2020), respondents are reminded to act as though their household is actually facing the costs presented and that their responses could influence future policies and programs, as well as costs to their household. A generic regional policy is described as the provision mechanism.
The payment vehicle is specified as a permanent increase in a household’s general cost of living. More specifically, respondents are told that their monthly cost of living would change due to increases in home maintenance costs, utility bills, rent, and/or the price of food and other goods.
The survey then presents respondents with an example choice question, followed by three separate discrete choice questions. Each choice scenario includes a status quo option and two policy alternatives. In the status quo option, water quality attributes remain at their current levels, while the two policy options offer improvements in one or more of the water quality attributes together with an associated permanent increase in monthly living costs. Respondents are instructed to consider each choice question independently.
The final survey instrument includes three water quality parameters: water clarity, nutrients, and E. coli. These parameters appear in several upcoming NZ freshwater policies and are well known to the public (MFE, 2020). Statistics NZ includes these parameters in their list of central water quality-tracking indicators (see https://www.stats.govt.nz/indicators/). The representation of each parameter is also chosen to match policy levers and is informed by input from the focus groups and cognitive interviews. Water clarity is expressed as the average visibility for rivers and streams in a respondent’s region and is measured as Secchi disk depth. Nutrients are measured as the percent of rivers and streams in the region that have nutrient levels considered acceptable for aquatic life by official nutrient limits (footnote 5). Similarly, E. coli is measured as the percent of rivers and streams in the region where concentrations are low enough to be considered suitable for swimming, wading, and fishing (footnote 6). Each of these attributes was introduced with several sentences of explanation. For instance, with nutrients, respondents were told that although nitrogen and phosphorus are naturally occurring, too much can lead to excessive algal growth that harms underwater habitat (the full descriptions appear in the appendix). The text in the explanations was refined through the focus groups and cognitive interviews.
While the three water quality attributes described are correlated and the ecological end points that individuals directly care about may relate to more than one attribute in some cases (Tait et al., 2011), distinctions are drawn in the survey as to what each attribute primarily reflects. Excessive levels of nutrients are described as adversely impacting aquatic ecosystems, although reduced esthetics are also mentioned. Water clarity is described as how “murky or cloudy” the water visually appears. E. coli is described in terms of how it affects the health of people who swim, wade, or fish in the water. Based on the past literature and our own focus group and cognitive interview findings, these categories reflect what the general public find to be the most relevant end points related to surface water quality.
The sizes of the water quality changes presented to respondents are based on the magnitude of changes in national targets from the National Policy Statement for Freshwater Management (NPSFM; MFE, 2020, p. 64). For example, Table 1 shows the national targets for improvements in primary contact suitability across different waterbody categories, with red being the lowest quality rivers and blue being the highest quality rivers (the full NZ Government figure this is drawn from appears in Appendix C). Between 2017 and 2030, the goal is to have a five percentage point reduction in rivers in the worst category (red) and a three percentage point increase in rivers classified in the highest category (blue). These changes are smaller than the scenarios presented in many previous SP surveys (US EPA, 2015).
Each attribute and the posited changes in the attribute levels in the survey are presented in Table 2. A Bayesian efficient design was developed for the choice questions using Ngene software (ChoiceMetrics, 2018). Although there is not much information on the priors for each coefficient, the sign of each parameter was informed by past NZ-based literature (Marsh et al., 2011; Tait et al., 2011, 2017; Marsh and Phillips, 2015). One advantage of Bayesian efficient designs is that they are more robust than other designs to mis-specification of the priors (ChoiceMetrics, 2018).
The SP survey module concludes with a series of questions to gauge the respondent’s perceived consequentiality of their responses and to flag individuals potentially exhibiting protest and warm-glow behaviors. Such questions are used to screen out respondents exhibiting potentially biasing behaviors and to assess the robustness of our results. The broader survey includes socioeconomic questions that allow us to examine preference heterogeneity and to account for such heterogeneity when extrapolating benefit estimates to the broader population.
An example choice question from the survey appears in Figure 2.
The survey was administered online by Horizon Research as a separate module in a broader survey on environmental preferences in NZ that is implemented every few years by Lincoln University (Hughey et al., 2019). Horizon Research maintains an internet panel of approximately 7000 people. The survey was open from March to April 2019, and 2007 respondents participated. When compared to population data from the Census, our survey sample overrepresents individuals 60–69 years of age, those with a tertiary education, and urban populations, and underrepresents individuals 18–19 years of age, people with only high school qualifications, and rural populations (footnote 7).
A key assumption when extrapolating survey results is that the estimates are representative of the general population. We attempt to bolster this assumption in two ways in order to provide estimates that can be more defensibly applied in benefit transfer exercises. First, we weight responses by regional council area population totals to better reflect the spatial distribution of the population across New Zealand, as well as any systematic differences in those populations and their preferences (see section “Data” for details). Second, in our more comprehensive models, we parametrically control for preference and income heterogeneity and then use the population data from the Census to predict more representative average WTP estimates (see sections “Results” and “Policy illustration” for details). Despite our best efforts, it is nonetheless important to keep this key assumption in mind when interpreting and extrapolating our results.
Methodology
We employ a random utility model (RUM) framework to analyze the data from this discrete choice experiment. In these models, utility is divided into a deterministic component and a random component, represented by v(.) and ϵ, respectively. The utility that household i receives from alternative j is

$${U_{ij}} = v({a_j},{I_i} - {C_j}) + {\epsilon _{ij}} \qquad (1)$$
This specification assumes that the first component of utility is a function of the group of attributes defining each alternative, ${a_j}$, along with numeraire consumption (${I_i} - {C_j}$), which is the difference between household income ${I_i}$ and the cost of the alternative ${C_j}$.
In the empirical models, we also add a status quo constant ($sq{c_i}$), which represents respondents’ preferences for or against the status quo option in general, irrespective of the attributes defining the alternative policy options. We allow for unobserved heterogeneity in $sq{c_i}$ by estimating it as a normally distributed random parameter in a mixed logit framework. In doing so, we accommodate respondents who have a bias either toward or against the status quo (Moore et al., 2018).
We assume a linear specification for v(.). RUMs are often estimated as conditional or mixed logit specifications (Greene, 2000; Haab and McConnell, 2002). The conditional probability that household i would choose alternative j appears in equation (2) (footnote 8):

$${P_{ij}} = \frac{\exp \left( sq{c_i}{D_j} + \beta '{a_j} + \delta {C_j} \right)}{\sum\nolimits_n \exp \left( sq{c_i}{D_n} + \beta '{a_n} + \delta {C_n} \right)} \qquad (2)$$
In this formulation, n refers to alternative options in a given choice occasion, D is an indicator variable denoting the status quo alternative, $\beta$ is a vector of coefficients to be estimated, and δ is the coefficient on the cost attribute. δ can be interpreted as the negative of the marginal utility of income. We explore individual-level preference heterogeneity in two ways. First, we include several interaction terms between the main choice attributes and observed household characteristics, including household-specific socioeconomic variables like income and household size, recreational user-related variables, and baseline regional water quality. Second, we explore possible unobserved preference heterogeneity by allowing $\beta$ to vary as a random coefficient across respondents, with each element of $\beta$ following an independent normal distribution (Mariel and Artabe, 2020; footnote 9). The cost parameter δ is held fixed to ensure MWTP has defined moments (Layton and Brown, 2000; Revelt and Train, 2001; Daly et al., 2012). Alternative approaches estimate models in WTP space, in which the distribution of welfare is modeled directly (Train and Weeks, 2005; Scarpa et al., 2008). However, the latter approach is not always found to fit the data as well, and there are computational challenges with estimation (Johnston et al., 2017; footnote 10). The mixed logit models are estimated, and subsequent calculations performed, using Stata statistical software (StataCorp, 2021).
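The choice probabilities described above can be illustrated with a minimal Python sketch. It assumes linear utility with a status quo constant, attribute coefficients, and a fixed cost coefficient, and it approximates the mixed logit probability by averaging conditional logit probabilities over independent normal draws. The function names, the dictionary layout, and every numeric value are hypothetical and chosen only for illustration; the estimates in this study come from Stata, not from this code.

```python
import math
import random


def conditional_logit_probabilities(alternatives, beta, delta, sqc):
    """Logit probabilities for one choice occasion, given fixed coefficients.

    alternatives: list of dicts with 'attrs' (attribute levels), 'cost',
    and 'status_quo' (True for the status quo option). Hypothetical layout.
    """
    utilities = [
        sqc * (1.0 if alt["status_quo"] else 0.0)
        + sum(b * a for b, a in zip(beta, alt["attrs"]))
        + delta * alt["cost"]
        for alt in alternatives
    ]
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities)
    exp_u = [math.exp(u - u_max) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def simulated_mixed_logit_probabilities(alternatives, beta_mean, beta_sd, delta,
                                        sqc_mean, sqc_sd, n_draws=500, seed=0):
    """Average conditional probabilities over independent normal draws."""
    rng = random.Random(seed)
    sums = [0.0] * len(alternatives)
    for _ in range(n_draws):
        beta = [rng.gauss(m, s) for m, s in zip(beta_mean, beta_sd)]
        sqc = rng.gauss(sqc_mean, sqc_sd)
        probs = conditional_logit_probabilities(alternatives, beta, delta, sqc)
        sums = [t + p for t, p in zip(sums, probs)]
    return [t / n_draws for t in sums]


# Hypothetical choice occasion: status quo plus two policy options.
# Attributes are changes in clarity, nutrients, and E. coli; cost is per month.
example = [
    {"attrs": [0.0, 0.0, 0.0], "cost": 0.0, "status_quo": True},
    {"attrs": [0.2, 2.0, 0.0], "cost": 10.0, "status_quo": False},
    {"attrs": [0.0, 4.0, 3.0], "cost": 20.0, "status_quo": False},
]
print(simulated_mixed_logit_probabilities(
    example, [1.0, 0.1, 0.1], [0.5, 0.05, 0.05], -0.05, -0.5, 1.0))
```

Holding the cost coefficient fixed across draws mirrors the specification described in the text; only the attribute coefficients and the status quo constant vary.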
Welfare measures can be inferred from the estimated parameters. For example, under the linear specification in equation (2), the vector of household marginal willingness to pay (MWTP) estimates can be calculated as:

$$MWTP = - \frac{\beta }{\delta } \qquad (3)$$
Given a projected policy change in the attribute levels from the baseline of ${a^0}$ to ${a^1}$, and based on our assumed linear functional form, we can calculate the nonmarginal welfare change for a household as:

$$WTP = - \frac{\beta '\left( {a^1} - {a^0} \right)}{\delta } \qquad (4)$$
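As a worked illustration of these welfare calculations, the sketch below assumes the usual ratio form under linear utility: MWTP for an attribute is the negative of its coefficient divided by the cost coefficient, and the nonmarginal welfare change sums MWTP times the change in each attribute. The function names and the coefficient values are hypothetical and are not estimates from this study.

```python
def marginal_wtp(beta, delta):
    """MWTP per attribute: -beta_k / delta, where delta is the cost coefficient."""
    if delta >= 0:
        # A non-negative cost coefficient would not yield a meaningful WTP.
        raise ValueError("cost coefficient is expected to be negative")
    return [-b / delta for b in beta]


def nonmarginal_wtp(beta, delta, a0, a1):
    """WTP for moving attribute levels from baseline a0 to a1 (linear utility)."""
    mwtp = marginal_wtp(beta, delta)
    return sum(m * (x1 - x0) for m, x0, x1 in zip(mwtp, a0, a1))


# Hypothetical coefficients for clarity, nutrients, and E. coli, plus cost.
beta = [1.0, 0.5, 0.25]
delta = -0.25
baseline = [1.0, 50.0, 60.0]   # hypothetical current attribute levels
policy = [1.2, 52.0, 63.0]     # hypothetical levels under a policy
print(marginal_wtp(beta, delta))
print(nonmarginal_wtp(beta, delta, baseline, policy))
```

Because the payment vehicle is a monthly cost-of-living increase, values computed this way would be in dollars per household per month. The status quo constant does not enter either function, which matches its exclusion from the welfare calculations.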
Notice that we exclude the $sq{c_i}$ estimates from our welfare calculations. This status quo term captures a respondent’s tendency to favor or disfavor the status quo option irrespective of the improvements and costs defining the alternative policy options. The status quo term could therefore be capturing other omitted variable biases that would otherwise confound the welfare parameter estimates of interest (Johnston et al., 2017; Moore et al., 2018). For example, $sq{c_i}$ could be capturing the “warm-glow” associated with choosing environmental action. That choice, in and of itself, however, is not directly related to the proposed policy change or attribute levels. Alternatively, this term could be partially capturing legitimate preferences for or against a policy and in that sense could be included in welfare calculations. We speculate that $sq{c_i}$ likely captures both effects in most applications.
Nonetheless, the appropriateness of including $sq{c_i}$ in welfare calculations for benefit analysis remains unclear for two reasons. First, what respondents perceive to result from a policy option outside of the specified attribute changes is unobserved to us as the researchers. Respondents could be considering changes in end points that are in no way related to policies of interest. Second, our primary objective is to provide estimates for future benefit transfer. As is the intent here, government agencies often transfer primary study estimates to numerous, iteratively implemented policies (Petrolia et al., 2021). This is due to the high costs of conducting appropriate original studies and a desire to streamline benefits analyses. As is the case in most of the SP literature, respondents in our study are asked to independently evaluate choice occasions where only a single policy, at most, would be implemented. Even if $sq{c_i}$ captured only legitimate preferences against the status quo, or for a policy – an assumption we do not believe would ever hold in reality – any WTP calculations that include $sq{c_i}$ could not be validly transferred to subsequent policy changes (footnote 11). To be conservative and ensure that the welfare calculations are as unbiased as possible, we exclude $sq{c_i}$ from the welfare calculations.
Data
Of the 2007 respondents that took the survey, 1736 completed all three choice questions (86%), 26 completed two (1.3%), and 7 respondents completed only one (<1%). The remaining 238 respondents (12%) did not respond to the choice questions and are excluded from the analysis. Among the 1769 respondents that answered at least one choice question, 73% are from the North Island (especially Auckland, Bay of Plenty, Waikato, and Wellington), 24% of the respondents are from the South Island (and in particular, Canterbury), and 3% did not provide their region (Table 3).
Note: Among the n = 1769 respondents, 59 did not provide information on the region where they live and are excluded from the above table.
To reduce the potential influence of biasing behaviors sometimes associated with SP methods, we screen the sample based on a combination of responses to the choice and debriefing questions. Based on the criteria below, we identify and flag respondents as potentially exhibiting the following behaviors:
- Consideration of other waters omitted from the choice experiment: Respondents who disagreed with the statement that they only considered rivers and streams in their region.
- Hypothetical bias due to warm-glow: Respondents who always chose the highest cost option in each choice question they were presented, and who agreed with the statement that it is important to improve water quality no matter how high the costs.
- Treated responses as nonconsequential: Respondents who disagreed with the statement that they made their choices as if the presented water quality improvements and increased costs would actually be experienced.
- Protest response: Respondents who always chose the status quo option and who agreed with one of the following statements: (i) that they value water quality improvements but their household should not have to pay for it, or (ii) that they are against more regulations and government spending.
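The four screening criteria above can be expressed as simple boolean rules. The sketch below is illustrative only: the field names are hypothetical, and it assumes each debriefing response has already been reduced to an agree or disagree indicator, which simplifies whatever response scale the survey actually used.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical indicators derived from choice and debriefing responses.
    disagreed_only_regional_waters: bool   # disagreed they only considered their region
    always_chose_highest_cost: bool        # picked the costliest option every time
    agreed_improve_at_any_cost: bool       # improve water quality no matter the cost
    disagreed_choices_as_real: bool        # disagreed they chose as if changes were real
    always_chose_status_quo: bool          # picked the status quo every time
    agreed_household_should_not_pay: bool  # values improvements but should not pay
    agreed_against_regulation: bool        # against more regulation and spending


def screening_flags(r):
    """Return the four behavior flags for one respondent."""
    return {
        "other_waters": r.disagreed_only_regional_waters,
        "warm_glow": r.always_chose_highest_cost and r.agreed_improve_at_any_cost,
        "nonconsequential": r.disagreed_choices_as_real,
        "protest": r.always_chose_status_quo
        and (r.agreed_household_should_not_pay or r.agreed_against_regulation),
    }


def is_fully_screened_in(r):
    """True when the respondent triggers none of the flags."""
    return not any(screening_flags(r).values())
```

Keeping the flags separate, rather than returning a single keep or drop decision, makes it possible to apply them in different combinations when checking the robustness of the results.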
Table 4 shows how the sample size changes as we screen out respondents exhibiting responses that one would expect to bias MWTP upward (going from left to right), and that could bias MWTP downward (going from top to bottom). The diagonal displays the sample sizes as we treat potential upward and downward biases symmetrically (Banzhaf et al., 2006; Moore et al., 2018). The upper-left corner shows the full sample size of 1769 respondents who answered at least one choice question, and the bottom-right corner shows that 1364 respondents remain after fully screening out those who were flagged as potentially exhibiting biasing behaviors.
When estimating the regression models, observations are weighted to account for differences in sampling intensity, response rates, and sample screening across regions. This weighting ensures that the sample appropriately represents the population across the regions, which in turn allows interpretation of the estimates as national averages. The weight assigned to each respondent is the total population in their council region divided by the region-specific sample size after screening.
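The weighting rule described above amounts to one division per region. A minimal Python sketch follows; the function name and the population figures in the example are hypothetical placeholders rather than Census values.

```python
from collections import Counter


def region_weights(screened_regions, region_population):
    """Weight per region: regional population / screened sample size in that region.

    screened_regions: one region label per respondent remaining after screening.
    region_population: dict mapping region label to total population.
    """
    counts = Counter(screened_regions)
    return {region: region_population[region] / n for region, n in counts.items()}


# Hypothetical example: three screened respondents in two regions.
weights = region_weights(["A", "A", "B"], {"A": 1000, "B": 300})
print(weights)
```

Each respondent then receives the weight for their own region, so regions that are thinly sampled relative to their population count for more in the estimation.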
The survey also included several questions about respondents’ recreational activities in rivers and streams and respondents’ awareness of existing water quality levels in their region. Respondents were asked about activities they did at rivers and streams in their regional council area in the last 12 months and could choose multiple options from the following categories:
- Swimming or wading
- Fishing
- Boating, including sailing and motor boating
- Water skiing, jet skiing, or kayaking
- Actively viewing nature (e.g., bird watching)
- Biking or walking on trails/paths alongside the water
- I didn’t visit rivers or streams in my regional council area in the last 12 months
The responses to these questions were aggregated into three user categories: contact users (including water skiing, jet skiing, or kayaking, swimming, or wading), non-contact users (including fishing, sailing, or motor boating), and passive users (those actively viewing nature, biking, or walking). Respondents can fall into more than one user category. After respondents were presented with baseline graphs and explanations of each water quality parameter, they were asked if they were aware of the characteristics or impacts of nutrients and E. coli, and whether clarity levels met their expectations. Table 5 summarizes the percent of respondents that fall into the user categories and percent of respondents who were aware of existing water quality levels.
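The aggregation of activity responses into user categories can be sketched as a lookup table. The activity keys below are hypothetical short labels for the survey options; the grouping follows the description above, and a respondent who reports no visits falls into no category.

```python
# Hypothetical activity labels mapped to the three user categories.
CATEGORY_BY_ACTIVITY = {
    "swimming_or_wading": "contact",
    "water_skiing_jet_skiing_or_kayaking": "contact",
    "fishing": "non_contact",
    "boating": "non_contact",
    "viewing_nature": "passive",
    "biking_or_walking": "passive",
}


def user_categories(activities):
    """Return the set of user categories implied by the selected activities."""
    return {
        CATEGORY_BY_ACTIVITY[activity]
        for activity in activities
        if activity in CATEGORY_BY_ACTIVITY
    }


print(user_categories(["swimming_or_wading", "fishing"]))
print(user_categories(["did_not_visit"]))
```

Returning a set reflects the fact that respondents can belong to more than one user category.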
There is some noticeable variation in how respondents use the rivers and streams in their regions (Table 5). For example, Nelson and Marlborough are areas known for their beaches and coastal amenities and so it is no surprise that a high proportion of respondents engage in water contact recreation in rivers and streams as well. Respondents in the West Coast Region also had very high participation in recreation, although the number of respondents there was small (see Table 3).
Results
Regression results
Results from the econometric models estimated using the fully screened sample of respondents are presented in Table 6. The first column shows the results from our base model that includes only the water quality attributes, the cost parameter, and a status quo constant (SQC), with standard errors appearing in parentheses. In this model and each of the subsequent variations, the coefficients corresponding to the water quality variables are treated as random parameters. In models (2)–(5), additional variables are interacted with the water quality variables. The coefficients on those interaction terms are held fixed. In essence, the interaction terms capture observed heterogeneity by shifting the distributions of the random coefficients, which capture any unobserved preference heterogeneity.
Note: Standard errors appear in parentheses. ***, **, and * denote significance at the 99%, 95%, and 90% levels, respectively.
The positive and statistically significant coefficients corresponding to the water quality attributes in Model (1) suggest that respondents are more likely to choose an option with larger improvements in water clarity, and higher proportions of waters meeting the government standards for nutrient and E. coli levels. The coefficient corresponding to the cost attribute is negative and significant, suggesting that respondents are less likely to choose an option as costs increase, which is consistent with a positive marginal utility of income. Finally, the SQC is negative and statistically significant, suggesting a tendency for respondents, on average, to favor a policy option over the status quo, irrespective of the improvements and costs defining those policy options. Such potentially biasing tendencies are controlled for by the inclusion of the SQC and are not included in subsequent welfare calculations. The large and statistically significant standard deviation estimate for the SQC suggests significant heterogeneity across respondents. Additional models (Appendix B) were also estimated that included interactions between regional council dummies and the SQC. The majority of these interactions were insignificant. The statistically significant standard deviation terms for the water quality attributes in Model (1) suggest significant unobserved heterogeneity in preferences for water quality across respondents. In the subsequent models, we add interaction terms to try to better explain some of this preference heterogeneity.
Model (2) adds interaction terms between each water quality attribute and (i) the corresponding region-specific baseline level of that attribute and (ii) a measure of the quantity (total length) of rivers in the regional council area. Both attributes were presented to respondents in the survey information (see Appendix). The nutrients improvement interaction with river km is positive and significant, while the clarity interaction with river km is negative. The clarity result goes against initial expectations that WTP would increase with the quantity of waters that experience an improvement, but it may also reflect the importance of substitutes. Perhaps respondents do not care about clarity improvements as much if they live in areas where there is an abundance of rivers to choose from. On the other hand, this finding may also reflect differences in preferences between urban and rural areas. Two of the three largest cities in New Zealand are in the Auckland and Wellington regions, which have comparatively low total lengths of rivers (see Appendix). The positive coefficient corresponding to the nutrients and river km interaction term provides some evidence of scope sensitivity: that is, respondents’ WTP is increasing for improvements that occur to a greater quantity of waters. The interaction terms with baseline quality levels in Model (2) are generally statistically insignificant. The one exception is the positive and marginally significant coefficient for the E. coli interaction. This may reflect a desire to maintain quality in already relatively pristine areas and/or capture systematically different preferences across regions (i.e., people who value water quality more highly tend to live in areas with better quality). Dissanayake and Ando (Reference Dissanayake and Ando2014) find a similar result of potential locational sorting in their SP study of grassland restoration.
Models (3)–(5) include additional interactions with the user-related variables. Across these models, non-contact users, or people who fish and boat, are willing to pay less for improvements in each water quality parameter (relative to nonusers, the omitted category). Although excess nutrients are generally bad for aquatic environments (especially at high levels), some fisherwomen and men may believe that more nutrients equal more fish. While some species do benefit from additional nutrients, those benefits stop after a certain point (National Research Council, 2000). The negative coefficient may also reflect more general differences in preferences between contact, non-contact, and nonusers. The positive coefficient estimates on E. coli*Passive and Clarity*Passive suggest that passive users have a greater preference for reductions in E. coli contamination and improvements in clarity relative to nonusers (all else constant). These results are not completely robust across models (3)–(5), however.
Model (4) includes interaction terms between each water quality attribute and (i) an indicator for achieving at least a bachelor’s degree and (ii) variables describing respondents’ awareness of the negative effects of elevated nutrient and E. coli levels, and of current clarity levels.Footnote 12 The results from the previous models are robust. We find a positive and significant interaction with Bachelors and Clarity, suggesting that more educated respondents value improvements in clarity more. Otherwise, there is no evidence of preference heterogeneity with respect to education.
The coefficients corresponding to the Nutrients*Aware and Clarity*Aware interaction terms are positive and significant, while the E. coli*Aware coefficient is significant and negative. These findings suggest that respondents who are aware of the negative impacts of nutrients, and those whose priors for clarity matched current levels, are willing to pay more, while those informed of the negative effects of E. coli are willing to pay less. The descriptive statistics (Table 5) show that awareness of E. coli’s negative effects was much lower than the other measures in every region, with less than 50% in all but three regions.
Model (5) includes all the previous variables and an interaction term between a dummy variable denoting high-income earners and the cost parameter. That interaction term is insignificant (as was a low-income interaction in an alternate model), indicating that the impact of policy cost does not vary across respondents of different income levels. The results from Model (5) are mostly consistent with the earlier models in terms of the signs and significance of coefficients. However, the Clarity*Passive and Clarity*Baseline variables are now significant at the 10% level.
At the bottom of the table, the estimated standard deviation terms remain statistically significant and similar in magnitude across all the models, suggesting that there is still unobserved preference heterogeneity across respondents despite our best efforts to identify and control for the sources of such heterogeneity. Comparisons of the AIC and BIC criteria across all models support Model (5), the most complex model in terms of included covariates, as the best overall model in fitting the data.Footnote 13
WTP estimates
To illustrate the practical implications of the econometric results, Table 7 contains the marginal WTP estimates for Model (1). That model did not include interaction variables, so (assuming the weighted sample of respondents is representative of the population) the calculated marginal WTP values represent national household averages. Results indicate that people are willing to pay up to $25.30 annually ($NZ, 2018) for a one-percentage-point increase in regional council rivers that meet nutrient standards and are thus considered acceptable in terms of ecological health. Results also suggest that respondents hold an average annual marginal WTP (MWTP) of $12.10 for a 10-cm increase in average river water clarity. Finally, we see a $26.25 annual MWTP for a one-percentage-point increase in the quantity of rivers within a region that meet E. coli standards and are therefore deemed safe for swimming.
Notes: *p < 0.10, ** p < 0.05, *** p < 0.01. Standard errors in parentheses.
The other estimated models include several interaction terms with variables that allow the MWTP to vary across regions. The region-specific values for baseline water quality and quantity levels can be plugged directly into the parameterized model to predict region-specific MWTP estimates. In later models, regional population averages (or proportions) based on the NZ census are entered in for the sociodemographic characteristics. Finally, sample proportions of respondents in each region falling under the different user and awareness categories are used to estimate the region-specific population percentages and are in turn plugged into the parameterized model.
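The plug-in calculation described above can be sketched as follows, under the standard assumption that MWTP is the negative ratio of the attribute's marginal utility to the cost coefficient, with the mean of the random attribute coefficient shifted by the fixed interaction terms. All coefficient values and regional covariate values below are hypothetical and are not taken from Table 6; the function name is likewise illustrative.

```python
def marginal_wtp(beta_attribute, beta_cost, interactions=()):
    """MWTP = -(attribute coefficient + interaction shifts) / cost coefficient.

    `interactions` is a sequence of (interaction coefficient, regional value) pairs.
    """
    shift = sum(coef * value for coef, value in interactions)
    return -(beta_attribute + shift) / beta_cost


# Hypothetical coefficients, NOT taken from Table 6.
beta_nutrients = 0.05   # mean of the random coefficient on the nutrients attribute
beta_cost = -0.002      # cost coefficient (assumed fixed here)

# Hypothetical (interaction coefficient, regional average or proportion) pairs,
# e.g. a user-category share and an awareness share for one region.
region_terms = [(0.01, 0.4), (-0.02, 0.25)]

base_mwtp = marginal_wtp(beta_nutrients, beta_cost)
regional_mwtp = marginal_wtp(beta_nutrients, beta_cost, region_terms)
print(round(base_mwtp, 2), round(regional_mwtp, 2))
```

With no interaction terms the function reduces to the national-average calculation used for Model (1); supplying a region's covariate values yields that region's MWTP.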
The “average” MWTP values for each region and water quality attribute based on estimates from Model (5) appear in Figure 3 along with their 95% confidence intervals. For example, the circles in the first panel show that a region-wide average one-percentage point improvement in rivers meeting the nutrient criteria is valued in the range of (a statistically insignificant) $2.98 in Nelson to $31.31 in Canterbury. The second panel in Figure 3 shows the MWTP values for a region-wide 10-cm increase in average clarity. These values range from $83.04 to $291.51. The final graph in the figure depicts the MWTP estimates for a percentage point increase in regional waterbodies meeting their E. coli criteria, ranging from $1.09 to $17.93.
Overall, the results show that people are willing to pay positive amounts for improvements in water quality, on average, with notable differences across regions and parameters. There are some statistically significant differences between regions, such as between Canterbury’s MWTP for nutrients and Marlborough and Nelson’s MWTP.Footnote 14 However, there is no statistically significant difference between the MWTPs that are closer to the middle of the range, like Auckland and Bay of Plenty. The MWTP in a region can also vary across the different water quality attributes. Canterbury, for instance, has the highest value for nutrients, but one of the lowest values for clarity.
Policy illustration
To demonstrate how these values might be applied in a policy setting, we perform a benefit transfer on a simulated national water quality improvement that was previously modeled by the National Institute of Water and Atmospheric Research (NIWA) (Hicks et al., Reference Hicks, Greenwood, Clapcott, Davies-Colley, Dymond, Hughes, Shankar and Walter2016, Reference Hicks, Semadeni-Davies, Haddadchi, Shankar and Plew2019). Sediment was identified as a high-priority freshwater contaminant to manage. The National Policy Statement for Freshwater Management (NPSFM) did not previously have sediment as a target, so the MFE was interested in identifying the impact of proposed catchment sediment load limits (MFE, 2020). Catchment load limits could be achieved through land use conversions (such as converting erodible pasture into forestry) and other erosion best management practices aimed at preventing sediment from reaching waterbodies. Both in-stream sediment criteria and clarity criteria were formulated that would meet nationwide “bottom lines” in each of these four primary state variables.Footnote 15 We use the NIWA modeling data on clarity that project feasible improvements in clarity as a result of catchment load limits. The modeling identifies streams and rivers with a median clarity that is below the threshold and simulates the potential improvement from the practices aimed at reducing catchment sediment loads.
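The water quality improvement for each stream/river reach or segment s is weighted by its length ($Length_{sr}$), and the weighted improvements are then added together and divided by total length to get the reach-weighted average clarity change for each regional council area r, as in the following equation, where ${N_r}$ denotes the total number of river segments in region r:

$$\overline{\Delta Clarity}_r = \frac{\sum_{s=1}^{N_r} Length_{sr}\,\Delta Clarity_{sr}}{\sum_{s=1}^{N_r} Length_{sr}} \qquad (5)$$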
A summary of these average clarity improvements for each regional council area appears in Figure 4. Most of the regional councils see a small average change in clarity, of under 0.1 m, with the largest change in Waikato, at 0.154 m. These changes are proportionally smaller than the changes desired by the national policy statement, pictured in Appendix C, with some changes smaller than those presented in our choice experiment questions. This exercise further illustrates the difficulty in achieving long-term goals for water quality.
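The reach-weighted averaging described above can be sketched as a length-weighted mean over a region's segments. The segment lengths and clarity changes below are hypothetical, and the function name is illustrative rather than part of the NIWA modeling.

```python
def reach_weighted_mean_change(segments):
    """Length-weighted mean clarity change for one region.

    `segments` is a list of (length_km, clarity_change_m) pairs.
    """
    total_length = sum(length for length, _ in segments)
    if total_length <= 0:
        raise ValueError("total segment length must be positive")
    weighted_sum = sum(length * change for length, change in segments)
    return weighted_sum / total_length


# Hypothetical segments: (length in km, simulated clarity change in m).
region_segments = [(2.0, 0.1), (6.0, 0.3)]
mean_change = reach_weighted_mean_change(region_segments)
print(round(mean_change, 3))
```

Longer reaches therefore count for more in the regional average, which is why a few large improvements on short segments can still produce a small region-wide change.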
Although we have data on changes in clarity, we also need changes in nutrients and E. coli. This is a common problem with monetizing water quality policy: the need to convert between different parameters (Walsh and Wheeler, Reference Walsh and Wheeler2013; US EPA, 2015). It is likely that the policies used to improve sediment or clarity will also improve E. coli and nutrients. For instance, to achieve sediment load targets, Neverman et al. (Reference Neverman, Djanibekov, Soliman, Walsh, Spiekermann and Basher2019) simulate the impact of whole farm planning and afforestation, which will also reduce E. coli and nutrient leaching to waterways.
To calculate the subsequent changes in E. coli and nutrients associated with the clarity improvements, we use data from Stats NZ, which publishes modeled segment-level data on several water quality parameters, resulting in almost 600,000 observations for each parameter.Footnote 16 For E. coli, total nitrogen, and total phosphorus, we use a regression to model the reduced-form relationship between each indicator (WQ) and clarity, as shown in equation (6). Several control variables are included in X: elevation and dummies for stream order and the dominant surrounding land cover. The regression also includes regional council fixed effects, and the E. coli regression includes dummies for the baseline “letter grade” of the stream (Appendix C). Note that to properly model the ecological relationship between these variables, a more in-depth approach should be used. However, for the purposes of this benefit transfer, these reduced-form regressions establish a reasonable relationship:
Regression results appear in Appendix E and exhibit significant negative relationships between the natural log of clarity and each indicator. The estimated relationship with E. coli is slightly lower than Davies-Colley et al. (Reference Davies-Colley, Valois and Milne2018); however, they use a simple correlation coefficient of −0.54. Our estimated coefficients are used to translate the change in clarity into the other indicators. Using the government thresholds referenced above, we can then determine which changes result in a segment moving from exceeding the threshold to not exceeding. For instance, with the E. coli government criteria, river and stream segments are assigned a letter grade from A to E, with D and E being unsafe for swimming.Footnote 17 If the forecast E. coli change bumps the waterbody from unsafe for swimming to safe, it is counted as no longer exceeding the unsafe threshold. The TP and TN results are combined into a nutrients indicator so that if either exceeds its threshold the waterbody is still counted as exceeding the acceptable limit for nutrients. The projected increase in rivers meeting the safe/acceptable E. coli and nutrient criteria appears in Figure 5. The values in Figures 4 and 5 highlight the difficulty in achieving meaningful water quality changes. Only 3–5 regional councils in each graph see water quality changes that are within the scope of the attributes presented in our survey (see Table 2).
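The translation and threshold steps above can be illustrated with a short sketch. It assumes, purely for illustration, a log-log functional form so that the clarity coefficient acts as an elasticity; the paper's exact specification is given in its equation (6) and Appendix E, and may differ. The coefficient, baseline values, and limits below are hypothetical.

```python
def translate_indicator(baseline_value, clarity_before, clarity_after, elasticity):
    """Predicted indicator level after a clarity change.

    Assumption: ln(WQ) is linear in ln(clarity), so WQ scales with
    (clarity_after / clarity_before) ** elasticity, holding controls fixed.
    """
    return baseline_value * (clarity_after / clarity_before) ** elasticity


def meets_nutrient_criteria(tn, tp, tn_limit, tp_limit):
    """A segment meets the combined nutrients indicator only if both TN and TP are within limits."""
    return tn <= tn_limit and tp <= tp_limit


def newly_meeting(before_ok, after_ok):
    """True when a segment moves from exceeding a threshold to not exceeding it."""
    return (not before_ok) and after_ok


# Hypothetical segment: a negative elasticity means the indicator falls as clarity rises.
predicted = translate_indicator(baseline_value=100.0, clarity_before=1.0,
                                clarity_after=2.0, elasticity=-1.0)
hypothetical_limit = 60.0
improved = newly_meeting(100.0 <= hypothetical_limit, predicted <= hypothetical_limit)
print(predicted, improved)
```

Only segments that cross a threshold contribute to the percentage-point changes in rivers meeting the E. coli or nutrient criteria; improvements that stay on the same side of the threshold do not.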
The average clarity changes are monetized using the results of our preferred Model (5), the regional council level averages of the relevant interacted variables for each model, and the number of households from Stats NZ.Footnote 18 Based on equation (4), the annual benefits for regional council area r in year t are calculated for each water quality attribute, using clarity as an example in equation (6), where the middle term is the mean clarity (or E. coli or nutrients) change from equation (5):
where q denotes the corresponding water quality attributes (clarity, nutrients, or E. coli), and $H{H_{rt}}$ is the number of households in regional council area r in year t. The estimated average annual benefits at the household level from the change in each quality attribute appear in Table 8, with regional council-level benefits in Table 9. Waikato had the largest estimated clarity changes (Figure 4), which translate into the highest household-level benefits for two of three water quality attributes (Table 8). The household-level nutrient benefits for Waikato are notably higher than others due to the large MWTP for nutrients and the largest nutrients policy change. At the regional council level (Table 9), the largest total benefits accrue in Waikato, with approximately $46.6 million in benefits in our preferred model. Marlborough had the lowest annual benefits at $45 thousand.
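The benefit calculation described above, the product of a regional MWTP, the mean change in the attribute, and the number of households, can be sketched as follows. The regions, MWTP values, changes, and household counts are hypothetical; the MWTP and the change must be expressed in the same units (for example, dollars per percentage point and percentage points).

```python
def annual_benefit(mwtp_per_unit, mean_change_in_units, households):
    """Annual regional benefit for one attribute: MWTP x mean change x households."""
    return mwtp_per_unit * mean_change_in_units * households


# Hypothetical inputs for two regions and a single attribute.
regions = {
    "Region A": {"mwtp": 20.0, "change": 0.5, "households": 1000},
    "Region B": {"mwtp": 10.0, "change": 0.2, "households": 5000},
}

regional_benefits = {
    name: annual_benefit(r["mwtp"], r["change"], r["households"])
    for name, r in regions.items()
}
national_benefit = sum(regional_benefits.values())
print(regional_benefits, national_benefit)
```

Repeating the calculation for each attribute and summing across attributes and regions gives a national total of the kind reported in Table 9.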
The estimated national annual benefits of this policy change are approximately $115 million (Table 9). To illustrate the sensitivity of our overall estimates to model choices, Table 10 contains the total national benefits based on each model. The table shows the highest benefit estimates from the more parsimonious models (1) and (2). Our preferred model, the model that best fit the data based on BIC and AIC, yields total benefit estimates squarely in the middle of all our specifications.Footnote 19
Discussion
Overall, our estimated choice model results show consistent positive values for several dimensions of water quality, and a benefit transfer exercise demonstrated substantial benefits from even small water quality improvements. It is worth noting that although we calibrated the attribute levels in our choice sets based on official government targets, many of the regional council-level changes in our policy simulation were still below those levels. This further highlights the difficulty in achieving policy-relevant changes in water quality in practice.
To further put these results into context, our estimates can be compared to recent New Zealand policy action on water quality. Auckland Council recently implemented a vote on additional taxes to improve water quality in that region.Footnote 20 The goal of the additional taxes was to raise $400 million over 10 years through taxes on residential and business properties, as summarized in Table 11. The funds would be used for new stormwater infrastructure and other policies and programs dedicated to reducing wastewater, sediment, and other pollution.Footnote 21 The vote passed with approximately 65% of people voting for the rates. The vote presented residents with a choice of the status quo versus a water quality tax where both water quality and household costs would increase. This revealed preference setting overlaps with our SP discrete choice experiment. Many of the same water quality issues apply, such as reduced beach closures, reduced septic tank overflows, reduced fecal contamination, reduced sediment contamination, rehabilitation of urban and rural streams, and better stormwater infrastructure. In material distributed from the council about the targeted rate, the council noted that an average valued home would pay an additional amount of $66 per year.Footnote 22 That vote directly illustrates a positive WTP for water quality improvements in the Auckland Region and reinforces the plausibility of our estimates. Our policy illustration suggested the average Auckland resident is willing to pay NZ $74.53 ($49.94 USD) per year for moderate improvements in clarity, nutrients, and E. coli. Without details on projected water quality improvements to result from the Auckland tax, we cannot carry out a formal test for convergent validity, but this comparison does lend credibility to our SP-based estimates.
Conclusion
This paper reports the results of a choice experiment that focused on three water quality parameters: nutrients, clarity, and E. coli. The choice experiment was administered to a national sample of respondents with the goal of instituting a rigorous study aimed at future benefit transfer. Several aspects of our approach should serve as a guide for future studies with a similar goal. First, the study used water quality measures that are not only salient and understandable to respondents, but that can be directly linked to policy for analysis. This was done by focusing on parameters that people care about, are relatively straightforward to communicate, and are relevant policy levers. Each of the three measures in our survey is targeted by the New Zealand government’s water quality goals. Our experimental design also posed water quality changes that are more in line with the size of improvements experienced or projected from actual policy. Many previous studies have analyzed large changes in water quality that would be difficult to achieve in reality. Finally, the choice experiment focused on water quality improvements in freshwater rivers and streams at the regional council level. Since regional councils are typically responsible for implementing water quality policies passed by the central New Zealand government, this represents a realistic management unit. Furthermore, New Zealand is unique in that its regional council borders are aligned with catchment boundaries, so there is very little cross-border pollution.
Across several model specifications, we find significant and positive values for improvements in all three water quality parameters. Our results also suggest that WTP varies with the types of recreation that a user engages in and across regions, as well as education and existing knowledge about water quality. Although not always the case (e.g., Breffle and Morey, Reference Breffle and Morey2000), the environmental SP literature has generally suggested that experience and familiarity with a commodity are associated with more stable or well-defined preferences, as well as higher WTP values (Boyle et al., Reference Boyle, Welsh and Bishop1993; Adamowicz, Reference Adamowicz1994; Whitehead et al., Reference Whitehead, Blomquist, Hoban and Clifford1995; Cameron and Englin, Reference Cameron and Englin1997; Hanley et al., Reference Hanley, Kriström and Shogren2009; Dissanayake and Ando, Reference Dissanayake and Ando2014; Czajkowski et al., Reference Czajkowski, Hanley and LaRiviere2015). Some of our findings are consistent with the latter; for example, our results suggest that marginal utilities for reduced nutrients and improved clarity increase with awareness of and familiarity with these attributes. We also find that passive users (i.e., those that enjoy nature viewing, biking, or walking) have a higher value for reduced E. coli contamination and improved water clarity compared to less experienced nonusers. But at the same time, we also find that awareness of the negative effects of E. coli is associated with a lower marginal utility for improvements. Despite our best attempts to parametrically control for experience, familiarity, and other potential sources of heterogeneity, our random parameter specifications reveal significant unobserved preference heterogeneity that is accounted for but remains unexplained.
The utility of the results is demonstrated using a policy simulation of water clarity improvements based on recent government modeling (Hicks et al., Reference Hicks, Semadeni-Davies, Haddadchi, Shankar and Plew2019) aimed at achieving catchment-level sediment load targets. This exercise highlighted the difficulty in specifying the size of the water quality changes on a survey, as several of the simulated regional council-level improvements in water quality were still lower than the changes specified in our experimental design. We estimate the changes in clarity, E. coli, and nutrients associated with those sediment reductions across New Zealand and apply our results in a benefit transfer exercise. The estimated annual average national benefits of a fully implemented policy are approximately NZ $115 million ($77 million USD) using our preferred model. Although we do not have estimates of the costs for those improvements, the monetized benefits should serve as a useful comparison. The benefit transfer exercise we demonstrate is straightforward and should be applicable to many upcoming policy proposals put forth by the central and regional council governments in New Zealand. The water quality indicators used here are also targeted by other governments, so the results should be more broadly applicable. For instance, both the US and European Union target nutrients, clarity, and E. coli using standards through EPA rules (US EPA, 2010, 2015) and the Water Framework Directive (Perni et al., Reference Perni, Barreiro-Hurlé and Martínez-Paz2020).
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/age.2023.20
Data availability statement
The data that support the findings of this study were obtained in a confidential survey administered by Manaaki Whenua. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the authors only with the permission of Manaaki Whenua.
Acknowledgments
We would like to thank Geoff Kerr, Ronlyn Duncan, and Chris Moore for comments on earlier versions of the manuscript.
Funding statement
This work was partially funded by Ministry of Business, Innovation, and Employment-funded program “Smarter Targeting of Erosion Control” (grant contract # CO9X1804), as well as Manaaki Whenua-Landcare Research funding. The views expressed in this paper are those of the author(s) and do not necessarily represent those of the US Environmental Protection Agency (EPA). In addition, although the research described in this paper may have been funded entirely or in part by the US EPA, it has not been subjected to the Agency’s required peer and policy review. No official Agency endorsement should be inferred.
Competing interests
The authors do not have any competing interests.