Introduction
Large wild herbivores are primary consumers that play an important role in ecosystems and provide a substantial economic resource for many communities. However, human land use has caused fragmentation, degradation and loss of habitat for these species (Ceballos & Ehrlich, Reference Ceballos and Ehrlich2002). Furthermore, as a result of more effective hunting techniques, overexploitation has become the most important threat after habitat destruction for the survival of large herbivores (Groom, Reference Groom2006). Consequently many ungulates in Asia are confined to protected areas and are limited to small populations (Baskin & Danell, Reference Baskin and Danell2003). Thus conservation actions are required to ensure the long-term survival of these animals.
One of the most important factors in successfully managing and conserving a species is accurate identification of its distribution (Boitani et al., Reference Boitani, Sinibaldi, Corsi, De Biase, Carranza and Ravagli2008). To accomplish this we must understand how environmental factors determine the habitat of a species. Advances in habitat suitability modelling techniques combined with geographical information systems provide accessible tools for identifying suitable habitats and predicting potential species distribution (Anderson et al., Reference Anderson, Lew and Peterson2003; Gavashelishvili & Javakhishvili, Reference Gavashelishvili and Javakhishvili2010). However, predictions of species distribution can vary significantly with different modelling techniques (Thuiller et al., Reference Thuiller, Araújo, Pearson, Whittaker, Brotons and Lavorel2004; Pearson et al., Reference Pearson, Thuiller, Araújo, Martinez-Meyer, Brontons and McClean2006) and it is difficult to select the most realistic. Araújo & New (Reference Araújo and New2007) advocated the use of ensemble forecasting, which often generates a more robust prediction than a single model (Araújo & New, Reference Araújo and New2007; Marmion et al., Reference Marmion, Parviainen, Luoto, Heikkinen and Thuiller2009; Thuiller et al., Reference Thuiller, Lafourcade, Engler and Araújo2009; Oppel et al., Reference Oppel, Meirinho, Ramírez, Gardner, O'Connell, Miller and Louzao2012).
The sambar deer Rusa unicolor is a large ungulate distributed throughout South and South-East Asia (Leslie, Reference Leslie2011). Although this species became a pest after introduction to countries such as Australia (Gormley et al., Reference Gormley, Forsyth, Griffioen, Lindeman, Ramsey, Scroggie and Woodford2011), populations in its native range have declined, with many local-level extinctions as a result of extensive hunting and habitat loss (Timmins et al., Reference Timmins, Steinmetz, Sagar Baral, Samba Kumar, Duckworth and Anwarul Islam2008). It remains regionally abundant only in well-secured (i.e. protected or remote) areas. The sambar deer is categorized as Vulnerable on the IUCN Red List (Timmins et al., Reference Timmins, Steinmetz, Sagar Baral, Samba Kumar, Duckworth and Anwarul Islam2008). Although some countries, such as Thailand, have banned hunting, the recovery rate of sambar deer populations remains slow, requiring further evaluation of population distributions and dynamics (Steinmetz et al., Reference Steinmetz, Chutipong, Seuaturien, Chirngsaard and Khaengkhetkarn2009).
The Formosan sambar deer R. unicolor swinhoii is a subspecies endemic to Taiwan (Wilson & Reeder, Reference Wilson and Reeder2005). It is categorized as a rare and valuable species in the List of Protected Species in Taiwan (Forest Bureau of Taiwan, 2009). In the decades before the 1990s there was a decline in the number and geographical distribution of the Formosan sambar deer in Taiwan, reflecting similar trends in other areas of the sambar deer's range. However, in the mid 1990s there was a slight increase in Formosan sambar deer populations in Taiwan (Timmins et al., Reference Timmins, Steinmetz, Sagar Baral, Samba Kumar, Duckworth and Anwarul Islam2008). Surveys indicated that Formosan sambar deer were mainly distributed in the Central Mountain Range (Wang et al., Reference Wang, Wang, Kuo, Wu, Chen and Tsai2002; Fig. 1a), although the surveys were limited in effort relative to the size of the island.
Several studies in India (Porwal et al., Reference Porwal, Roy and Chellamuthu1996; Kushwaha et al., Reference Kushwaha, Khan, Habib, Quadri and Singh2004) and Australia (Forsyth et al., Reference Forsyth, McLeod, Scroggie and White2009; Gormley et al., Reference Gormley, Forsyth, Griffioen, Lindeman, Ramsey, Scroggie and Woodford2011) have described the distribution patterns and habitat selection of sambar deer. In Taiwan suitable habitat for the species lies above 1,000 m (Wang et al., Reference Wang, Wang, Kuo, Wu, Chen and Tsai2002). In addition, the climate and vegetation types are subject to variation along the elevation gradient and are different to those of other countries. Furthermore, Taiwan has high population and road densities, which have caused extensive habitat destruction. Therefore investigations of habitat selection by sambar deer in highly disturbed areas are required. Our aims in the study reported here were to evaluate habitat suitability and to map the distribution pattern of Formosan sambar deer throughout Taiwan. We use the results to identify the most important sites for the management of this threatened subspecies.
Study area
The 35,801 km2 island of Taiwan lies off the south-eastern coast of mainland China. Topography is high and steep, with five mountain ranges (Fig. 1a), and altitudes of 0–3,952 m. The vegetation changes with altitude, from broadleaf forests to coniferous forests and then to scrub (Su, Reference Su and Peng1992). Two-thirds of the island is covered by forested mountains. Most of the coastal plains are occupied by human settlements. There are 89 protected areas, covering a total of c. 6,951 km2 (Fig. 1b). The climate is tropical marine, with warm and humid weather (mean annual temperature in the lowlands is c. 23 °C; mean annual precipitation is > 2,500 mm; Central Weather Bureau of Taiwan, 2013). However, the weather is cold in the high mountains, where it snows during winter. Snow cover duration is short, from several days to 2 months, depending on height and latitude.
Methods
Data collection
The locational data were primarily assimilated from our field observations of sambar deer (2008–2012). Data from field studies at other sites were also included (Pei et al., Reference Pei, Chou, Chen, Guo and Liu2002, Reference Pei, Chen and Chen2003; Pei, Reference Pei2004; Wu & Shi, Reference Wu and Shi2006; Lee et al., Reference Lee, Lin and Chi2007; Wu & Yao, Reference Wu and Yao2007; C.Y. Lin, pers. comm.). Although the collection of multiple datasets using different techniques prevented a standardized evaluation, it allowed the incorporation of data from many sites and environments. Two different survey techniques were used. The first involved the use of line transects to obtain data on deer presence. Absence data were not collected because it is difficult to confirm the absence of a species in such surveys. Transect lines were located on hiking and hunting trails. At each study site we surveyed several transect lines, to cover the various environments of that site. We recorded 1,582 coordinates of sambar deer tracks and signs (i.e. sightings, vocalizations, scats, footprints, tree rubbing and shed antlers) using a global positioning system. The second technique involved the use of 258 camera traps to obtain deer presence–absence data. Camera traps were laid 10–100 m away from the transect lines on animal trails. Camera-trap photographs of sambar deer were classified as presence data. If a camera trap operated for > 20 days without any photograph of sambar deer, this was regarded as a record of absence. In total 1,840 records of sambar deer were gathered, comprising 1,645 presence records and 195 absence records.
These records were transformed to a resolution of 1 km2; i.e. records within the same grid were incorporated into a single presence or absence record. A total of 361 grid cells of 1 km2 were sampled, representing 1% of the total land area of Taiwan. Line transect records that overlapped with grid cells categorized as ‘absent’ were re-examined, and five of these grid cells were recategorized as ‘presence’. Overall, 241 of the grid cells had records of presence and 120 of absence (Fig. 1b).
Environmental variables
We used 10 environmental variables with potential importance for sambar deer habitat suitability (Kushwaha et al., Reference Kushwaha, Khan, Habib, Quadri and Singh2004; Forsyth et al., Reference Forsyth, McLeod, Scroggie and White2009; Gormley et al., Reference Gormley, Forsyth, Griffioen, Lindeman, Ramsey, Scroggie and Woodford2011): mean elevation, standard deviation of elevation, distance to water body, annual mean temperature, annual precipitation, vegetation type, forest area, road density, distance to road, and human settlement cover (Table 1). All variables were obtained from the ecological and environmental geographical information system database for Taiwan (Lee et al., Reference Lee, Liao, Lee, Pan, Fu and Chen1997) and transformed to a resolution of 1 × 1 km using ArcGIS 9.3 (ESRI, Redlands, USA) and IDRISI Andes (Clark Labs, Worcester, USA).
Statistical analyses and development of models
We divided the data into training and testing data. We used the heuristic method provided by Huberty (Reference Huberty1994) to determine the ratio of testing data to the complete data set:
where p is the number of environmental variables. We used 10 environmental variables, therefore the ratio of training to testing data should be 3:1. We analysed the distribution of the sambar deer using logistic regression, discriminant analysis, ecological-niche factor analysis, genetic algorithm for rule-set production, and maximum entropy (see Appendix for descriptions of each). Logistic regression and discriminant analysis are presence–absence models, and the other three are presence-only models. These models are regularly used to predict species distributions (Teixeira et al., Reference Teixeira, Ferrand and Arntzen2001; Hirzel et al., Reference Hirzel, Hausser, Chessel and Perrin2002; Brotons et al., Reference Brotons, Thuiller, Araújo and Hirzel2004; Lee et al., Reference Lee, Lue and Wu2006).
We used different methods for each model to determine the importance of each environmental variable for the distribution of sambar deer. In the analysis of maximum likelihood estimates we used Wald χ 2 statistics; in discriminant analysis we used standardized canonical discriminant function coefficients; for ecological-niche factor analysis we used the factor scores; and for the maximum entropy model we used (1) jack-knife analysis of the mean gain with the training and test data, in addition to the area under the receiver-operating-characteristic curve (AUC), and (2) the mean percentage contribution of each environmental variable (Phillips et al., Reference Phillips, Anderson and Schapire2006). We were unable to evaluate the importance of variables for the genetic algorithm for rule-set production because of software limitations.
To evaluate the performance of each model, we used the AUC (Fielding & Bell, Reference Fielding and Bell1997), plotting the true-positive fraction against the false-positive fraction for all test points across all possible probability thresholds. The area under the curve measurement takes values between 0 and 1, with a value of 0.5 indicating that a model is no better than random. It is independent of prevalence and is considered an effective measure of the performance of ordinal score models (Manel et al., Reference Manel, Williams and Ormerod2001; McPherson et al., Reference McPherson, Jetz and Rogers2004).
For conservation purposes it is usually desirable to distinguish suitable from unsuitable areas by setting a threshold. If the predicted probability of occurrence is larger than the threshold then it is considered to be a prediction of presence (Pearson et al., Reference Pearson, Dawson and Liu2004). We calculated kappa statistics under different probabilities of occurrence and selected the probability that generated the maximum kappa statistic as the threshold for each model (Freeman & Moisen, Reference Freeman and Moisen2008).
To obtain the most robust prediction map we used ensemble forecasting, as described by Araújo & New (Reference Araújo and New2007). We calculated ensemble forecasting as a weighted mean by weighting each model based on its area under the curve (Araújo & New, Reference Araújo and New2007; Marmion et al., Reference Marmion, Parviainen, Luoto, Heikkinen and Thuiller2009; Thuiller et al., Reference Thuiller, Lafourcade, Engler and Araújo2009; Oppel et al., Reference Oppel, Meirinho, Ramírez, Gardner, O'Connell, Miller and Louzao2012). The habitat suitability indices of ensemble forecasting ranged from 0 to 1. We summarized the areas of suitable habitat and optimal habitat using the suitability index thresholds 0.33 and 0.67, respectively. We selected these thresholds based on our knowledge of the Formosan sambar deer in Taiwan.
Results
The five habitat suitability models had areas under the curve of 0.894, 0.885, 0.807, 0.777 and 0.908. Each model predicted a different distribution pattern for the sambar deer (Fig. 2) but all except the genetic algorithm for rule-set production indicated that distance to road and the mean elevation are the most important factors predicting habitat suitability for sambar deer (Table 2). A composite map was produced by ensemble forecasting (Fig. 3). The results showed that ensemble forecasting performed better than any of the individual models (AUC = 0.921).
There were 7,865 grid cells categorized as suitable habitat for the sambar deer, of which 4,464 were regarded as optimal habitat. The most suitable deer habitat is in the Central Mountain Range and Xue Mountain Range; c. 70% (5,355 of 7,865 km2) of suitable habitat lies in protected areas. The ensemble model indicated that sambar deer prefer habitat at medium to high elevation (> 1,500 m) and areas that lie away from roads. The mean elevation of suitable habitat is 2000 ± 600 m, with the predicted distribution including all areas > 3,000 m. The mean distance of suitable habitats to roads is 8.5 ± 4.6 km. In general, the suitability of habitat for sambar deer increases with increasing elevation and distance from roads.
There are five main patches of suitable habitat for sambar deer in Taiwan (Fig. 3). Two of these patches are in the Xue Mountain Range and the others are in the Central Mountain Range and Yu Mountain Range. These patches are separated by three major highways: the Central Cross-Island Highway, the Southern Cross-Island Highway and Highway No. 7A (Fig. 3). In addition, three small patches of suitable habitat are located in the Ali Mountain Range, the Coastal Mountain Range and the Chatianshan Nature Reserve (Figs 1 & 3). Suitable habitat in areas of low elevation is scarce.
Discussion
The sambar deer has not been recorded in all of the 7,865 km2 of habitat in Taiwan predicted to be suitable. For example, deer have not been detected in the Ali Mountain Range (Lin, Reference Lin1997) or Chatianshan Nature Reserve (Wang, Reference Wang1994). There are large areas of suitable habitat in the Xue Mountain Range but the population of sambar deer there is small (Fig. 1b; Yen, unpubl. data). Absence or sparse occurrence of the sambar deer is probably a result of high hunting pressure. An aboriginal tribe at Chatianshan and in the Xue Mountain Range has a long and prevalent hunting tradition, and local wildlife resources are often overexploited. Reintroduction (e.g. Klar et al., Reference Klar, Fernandez, Kramerschadt, Herrmann, Trinzen, Buttner and Niemitz2008; Kuemmerle et al., Reference Kuemmerle, Perzanowski, Chaskovskyy, Ostapowicz, Halada and Bashta2010) of the sambar deer to suitable unoccupied habitats would not necessarily be appropriate because the species is not under immediate threat of extinction in Taiwan, and hasty introductions could cause unanticipated damage to the local environment (Côté et al., Reference Côté, Rooney, Tremblay, Dussault and Waller2004). We believe that monitoring the expansion of sambar deer populations and any associated environmental impacts is a more appropriate management technique at present. Of the 7,865 km2 of habitat predicted to be suitable, 30% is located outside nature reserves. The largest patch (c. 260 km2) is on Mt Baigu, and other patches are located to the north-east of Taroko National Park, to the east of Yuli Wildlife Refuge, to the east and west of Guanshan Wildlife Refuge and to the west of Yushan National Park (Figs 1 & 3). We recommend evaluating the establishment of a nature reserve at Mt Baigu and expansion of other wildlife refuges and national parks. The sambar deer is a flagship species for conservation in Taiwan and the protection of its habitats would benefit other large mammals such as Reeves’ muntjac Muntiacus reevesi, Formosan serow Capricornis swinhoei and black bear Ursus thibetanus. Our finding that distance to road and mean elevation are the most important factors determining habitat suitability is similar to the findings of Kushwaha et al. (Reference Kushwaha, Khan, Habib, Quadri and Singh2004), who suggested that sambar deer avoid direct contact with humans, preferring areas of higher elevation. Our map derived from ensemble forecasting shows that three highways separate potential habitat into five main patches. These highways were constructed c. 40–50 years ago. Traffic, human settlements, lights, noise, dogs and the presence of tourists along the roads cause disturbance to wild animals (Debeljak et al., Reference Debeljak, Dzeroski, Jerina, Kobler and Adamic2001; Klar et al., Reference Klar, Fernandez, Kramerschadt, Herrmann, Trinzen, Buttner and Niemitz2008), and areas near the roads are vulnerable to poaching activity. Such human disturbance interrupts connectivity between patches separated by roads, and we hypothesize that gene flow between these patches has been limited in recent decades. The division of a species into small populations results in genetic characteristics being strongly influenced by inbreeding and genetic drift (Frankham, Reference Frankham1996). It may therefore be important to establish connections, such as bridges or tunnels at main crossing points along roads, between patches (Kuemmerle et al., Reference Kuemmerle, Perzanowski, Chaskovskyy, Ostapowicz, Halada and Bashta2010; Monterrubio-Rico et al., Reference Monterrubio-Rico, Renton, Ortega-Rodríguez, Pérez-Arteaga and Cancino-Murillo2010). We recommend that a number of suitable habitat sites that are in close proximity to the three highways should be selected to monitor the population expansion.
Elevation is another important determinant of sambar deer habitat suitability; the species prefers areas of medium to high elevation. A study by Podchong et al. (Reference Podchong, Schmidt-Vogt and Honda2009) also indicated that geographical parameters affect sambar deer distribution. We suggest that the preference for higher elevations may be attributable in part to land exploitation at lower elevations. The sambar deer formerly occurred down to 300 m in Taiwan (Kano, Reference Kano1940), and the bones of sambar deer have been found at low-elevation archaeological sites (Chen, Reference Chen2000). Most areas of low elevation are now exploited and, because of dense human populations, there is no intact habitat available for the species in these areas. Forest cover and annual precipitation have been found to be important determinants of habitat suitability for the sambar deer (Kushwaha et al., Reference Kushwaha, Khan, Habib, Quadri and Singh2004; Gormley et al., Reference Gormley, Forsyth, Griffioen, Lindeman, Ramsey, Scroggie and Woodford2011) but these two variables did not influence the delimitation of habitat suitability in our modelling, possibly because their occurrence is correlated with elevation. Slope aspect is a predictor of the abundance of sambar deer (Forsyth et al., Reference Forsyth, McLeod, Scroggie and White2009) but we did not use this variable in our modelling because it is not suitable for use at a resolution of 1 km2 grid cells. In our modelling of the habitat potentially suitable for the sambar deer, ensemble forecasting performed better than the five individual models. Although the value of AUC for the maximum entropy model was only 0.013 lower than that for ensemble forecasting, the prediction of ensemble forecasting was more comprehensive. Ensemble forecasting can produce more robust predictions of species characteristics (Araújo & New, Reference Araújo and New2007; Marmion et al., Reference Marmion, Parviainen, Luoto, Heikkinen and Thuiller2009; Thuiller et al., Reference Thuiller, Lafourcade, Engler and Araújo2009; Oppel et al., Reference Oppel, Meirinho, Ramírez, Gardner, O'Connell, Miller and Louzao2012).
The location data used in this study were assimilated from many independent field surveys because we wanted to analyse potential deer habitats across the maximum possible range of areas. Use of the sampling methods of Forsyth et al. (Reference Forsyth, McLeod, Scroggie and White2009) and Gormley et al. (Reference Gormley, Forsyth, Griffioen, Lindeman, Ramsey, Scroggie and Woodford2011) would provide data for more robust prediction models. However, the complex topography and low road density in the mountainous areas in Taiwan limit the application of such sampling methods, which would need to be modified before they could be used in any future studies. Our map of the potential distribution of the sambar deer in Taiwan and our indication of areas that are priorities for increased monitoring and/or protection provide a baseline for further research, and our modelling approach could be used elsewhere in the species’ range. As a result of our studies and recommendations the national parks’ administration is beginning a project to extensively monitor sambar deer population dynamics and habitat use.
Acknowledgements
We thank P.F. Lee for providing constructive comments at an early stage of the paper and for support with the environmental variables database. We thank C.Y. Lin for providing unpublished data, W. McShea for helpful discussions on the manuscript, and the reviewers for their useful comments. We are also grateful to the Taroko National Park and the Forestry Bureau of Taiwan for their funding of the field data collection.
Appendix
Descriptions of mathematical models
Logistic regression is a tool for analysing the effects of one or several independent variables, either discrete or continuous, on a dichotomic (presence/absence) or polychotomic dependent variable (Hosmer & Lemeshow, Reference Hosmer and Lemeshow1989). Logistic regression takes the following form:
where π (x) represents the probability of occurrence of the target species and g (x) is obtained using a regression equation of the form
where β 0 is a constant and β 1, β 2,…, β p are the coefficients of independent variables x 1, x 2,…, x p, respectively (Hosmer & Lemeshow, Reference Hosmer and Lemeshow1989). Logistic regression analyses were performed using SAS v. 9.0 (SAS Institute, Cary, USA).
Discriminant analysis is used to classify a set of observations into predefined classes that are based on a set of variables (McLachlan, Reference McLachlan2004). It constructs a set of linear functions of the environmental variables, known as discriminant functions, whereby
where b 1, b 2,…, b n are discriminant coefficients, x 1, x 2,…, x n are the environmental variables and c is a constant. These discriminant functions are used to predict the class of a new observation with an unknown class. For a k class problem, k discriminant functions are constructed. Given a new observation, all of the k discriminant functions are evaluated and the observation is assigned to class i if the ith discriminant function has the highest value.
Ecological-niche factor analysis compares the distributions of the environmental variables in the presence dataset with those in the whole study area (Hirzel et al., Reference Hirzel, Hausser, Chessel and Perrin2002). This technique summarizes environmental variables into a few uncorrelated factors that explain most of the information. The output includes eigenvalues and factor scores. The first factor is the marginality factor, which describes the difference between the mean habitat in the study area and the species mean. The remaining factors are the specialization factors, which describe how specialized the species is with reference to the available habitat range in the study area (Hirzel et al., Reference Hirzel, Hausser, Chessel and Perrin2002). We performed this analysis using Biomapper 4.0 (Hirzel et al., Reference Hirzel, Hausser and Perrin2007). After computing the factor scores we used the algorithm of the medians to draw a habitat suitability map for sambar deer (Hirzel et al., Reference Hirzel, Hausser, Chessel and Perrin2002).
The genetic algorithm for rule-set production creates ecological niche models for species (Stockwell & Peters, Reference Stockwell and Peters1999). The models describe environmental conditions under which a species should be able to maintain populations. The algorithm searches iteratively for non-random correlations between presence and environmental variables by using four types of rules: atomic, logistic regression, bioclimatic envelope and negated bioclimatic envelope. Predicted presence is defined by these rules. We used Desktop GARP 1.1.6 (University of Kansas Center for Research, Kansas, USA) and followed the normal procedure for implementation. The output is a binary map; hence, we applied a modification of the best-subsets procedure described by Anderson et al. (Reference Anderson, Lew and Peterson2003). We ran 200 models and selected the 20 that had the highest predicted accuracy. The final prediction was produced by summing the 20 selected models, which yielded prediction values in the range 0–20.
Maximum entropy is a machine-learning technique that is based on the principle of maximum entropy (Pearson et al., Reference Pearson, Dawson and Liu2004). It estimates the probability distribution of maximum entropy for each environmental variable across the study area with presence-only data (Pearson et al., Reference Pearson, Dawson and Liu2004, Reference Pearson, Thuiller, Araújo, Martinez-Meyer, Brontons and McClean2006). This distribution is calculated with the constraint that the expected value of each environmental variable under this estimated distribution matches its empirical mean (Pearson et al., Reference Pearson, Thuiller, Araújo, Martinez-Meyer, Brontons and McClean2006). Habitat suitability maps were calculated by applying Maxent models to all grids in the study area, using a logistic link function to yield probability values ranging from 0 to 1. Maximum entropy performs well with small sample sizes (Elith et al., Reference Elith, Graham, Anderson, Dudík, Ferrier and Guisan2006; Hernandez et al., Reference Hernandez, Graham, Master and Albert2006). We developed our models using Maxent v. 3.3.1 (Princeton University, Princeton, USA).
Biographical sketches
Shih-Ching Yen studies the behaviour, ecology and management of sambar deer and sika deer. He is currently studying the habitat selection and space use of sambar deer, using radiotelemetry. Ying Wang studies the ethology of birds and mammals and is involved in wildlife management and training aboriginal hunters as ecotourism guides. Heng-You Ou studies species distribution modelling and spatial statistics.