Introduction
Geographical analysis and planning of population-level health and health services has a long history, for example, in monitoring equity in health outcomes and access to health services. Applications include the estimation of general practice (GP) catchment areas (eg, Jenkins and Campbell, 1996), population accessibility to primary health care facilities (Field, 2000; Luo and Wang, 2003; Jordan et al., 2004a; Saxena et al., 2007; Bottle et al., 2008), monitoring socio-demographic variations in health screening uptake (Muggli et al., 2000) and monitoring of primary health care inequalities (Sigfrid et al., 2006; Strong et al., 2006). Indeed, much of the literature appearing in mainstream medical, epidemiological and health services journals is well within the scope of health geography (eg, Damiani et al., 2005; Bowling et al., 2006; Congdon, 2006). The last ten to fifteen years have seen substantial advances in data availability, methods and relevant technologies for these purposes, along with corresponding analytical boons and pitfalls.
Monitoring locality health profiles and planning population-level health care require routine, aggregated datasets, which allow the mapping and analysis of population variations in health status and use of health services. In the UK, there is a wide array of relevant published data sources, including Hospital Episode Statistics (HES), GP registers, census data, deprivation indicators and socio-economic classification schemes. The numerous sources of health and health care data that can be used for population studies have been described in depth elsewhere (Thiru et al., 2003; Kalthenthaler et al., 2004; Millett et al., 2005; Gnani and Majeed, 2006). Hence, we focus on non-medical data sources that can be combined with such datasets to monitor locality health profiles and plan primary health care services effectively. Table 1 provides guidance on where to obtain data, products and services. In this review paper, recent, current and future developments are discussed, with an emphasis on data issues, to provide not only an introduction for those new to the field but also a professional update for those already working in the domain.
CAS = Census Area Statistics; GPRD = General Practitioner Research Database; NS-SEC = National Statistics Socio-economic Classification; OA = Output Area; SOA = Super Output Area; GIS = geographic information systems; OAC = Output Area Classification; WIMD = Welsh Index of Multiple Deprivation; SIMD = Scottish Index of Multiple Deprivation; NIMDM = Northern Ireland Multiple Deprivation Measure; IMD = Index of Multiple Deprivation; ESRI = Environmental Systems Research Institute.
Census data
The population census is the most commonly used source of socio-demographic data. In the UK, simultaneous decennial censuses are conducted by the three national statistical agencies, the Office for National Statistics (ONS) in England and Wales, the General Register Office for Scotland and the Northern Ireland Statistics and Research Agency (NISRA); similar censuses are carried out in a number of countries such as Australia, the Republic of Ireland and USA, though some countries including Finland and the Netherlands rely on administrative data to derive population statistics rather than carrying out a periodic census (Martin, 2006). The three UK censuses cover a broad range of demographic, social and economic topics, with most questions being consistent across the UK. Census results are made available as a variety of output products, which include individual responses aggregated across geographical areas. For the most recent census, that of 2001, the most geographically detailed products are the Key Statistics and Census Area Statistics produced for census units called Output Areas (OAs), each containing a mean of around 300 people; these OAs are aggregated by each respective census agency to form larger census units in a geographical hierarchy moving from the local to the national (Martin, 2002). Cross-tabulations of census results, known as Standard Tables, are available for the various tiers of government organisation from wards upwards. Since 2004, data based on the 2001 censuses have been published on daily commuting patterns at the local authority level, including net flows by age band, enabling estimation of daytime population numbers at that level. While net commuter patterns are not relevant to many health-related contexts, they are relevant to some issues, for example, environmental exposures relating to respiratory diseases.
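The daytime population estimation mentioned above can be illustrated with a minimal sketch. It assumes that the daytime population of a local authority is approximated as the resident count plus the net commuter inflow; the function name, area names and figures are hypothetical and are not drawn from the published census outputs.

```python
def daytime_population(residents, commuters_in, commuters_out):
    """Approximate daytime population as residents plus net commuter inflow.

    This is an illustrative simplification, not the census agencies' method.
    """
    return residents + (commuters_in - commuters_out)


# Hypothetical local authorities: (residents, commuters in, commuters out)
areas = {
    "Authority A": (200_000, 90_000, 30_000),  # net importer of commuters
    "Authority B": (150_000, 10_000, 45_000),  # net exporter of commuters
}

daytime = {name: daytime_population(*counts) for name, counts in areas.items()}
```

In an exposure study, such daytime counts (rather than resident counts) might serve as denominators for areas with large commuter inflows.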
The 2001 censuses also included two new questions that have relevance to health services planning, adding to the existing one regarding long-term illness: one on general health and another on the provision of unpaid care for people with limiting long-term illness and disabled people (Dixie and Dorling, 2002).
Although the ten years between censuses means that data are often out of date towards the end of the period, the highly detailed census counts are unmatched by any other data source, both in terms of small geographical areas and numerous cross-tabulated variables. It is this characteristic that causes census data to remain an important foundation for government policy and resource allocation in preference to more recent but less comprehensive sources. The national census offices of the UK also produce mid-year population estimates annually by modelling population change from the most recent census as a base point and making use of the results of postcensus revisions. The mid-2007 estimates are the most recent; for example, the most recent small area estimates for Scotland are the Mid-2007 Population Estimates Scotland (General Register Office for Scotland, 2008).
The problem of census under-enumeration varies geographically as well as by age and sex. The census is considered to cover 98% of the total population of the UK but undercounting is marked amongst young men living in inner city areas; some estimates put under-enumeration among men aged between 25 and 29 years in the major metropolitan areas at around 20% (Office of Population Censuses and Surveys, 1994). The published 2001 census results represent the actual counts, combined with a sophisticated estimation methodology known as the ‘One Number Census’ (Steele et al., 2002) intended to impute additional individuals and households most likely to be missing from each area. While coverage across the UK is considered to be good, particular enumeration problems were recognised in certain areas; for instance, further corrections had to be applied after the 2001 census to the counts for 15 local authorities in England, most notably Westminster and Manchester. The census agencies of the UK are also making efforts to provide population estimates for small areas in the inter-censal period, making use of the results of postcensus revisions.
Preparations are now underway for the 2011 censuses (Martin, 2007) and a 2008 Census White Paper (Office for National Statistics, 2008) sets out the current proposals for England and Wales following an extended period of consultation on future question topics and field testing. England and Wales tested their pilot proposals in May 2007, covering 100 000 households in five local authority areas. Scotland piloted its proposals in April 2006, covering 50 000 households in five areas, and on 29th March 2009 conducted a full census rehearsal. Of particular significance to health and healthcare research and development is the potential inclusion of a question on income, a variable that has been shown in a number of studies outside of the UK to be one of the strongest socio-economic indicators of health outcomes (Carstairs, 2000: 62–63; Gatrell, 2002: 126–127). However, in piloting for the 2011 census, the income question was deemed to have too great a negative impact on overall response rates among the 2006–07 test populations in England, and the question has been dropped in England and Wales. Intriguingly, the inclusion of the same question in the US census has not been a matter of public controversy and income data have been used in both epidemiological and healthcare research for many years (Lynch et al., 1998; Sanmartin et al., 2003; Subramanian and Kawachi, 2006). It is also interesting to note that the inclusion of the same question in the 2006 test in Scotland did not result in a reduced response rate and an income question may still feature in the 2011 Scottish census (General Register Office for Scotland, 2007).
All of the 2011 UK census proposals retain the 2001 questions on self-reported general health and limiting long-term illness, and also add new topics of relevance to aspects of locality health profiling. For example, the proposals include questions on citizenship and national identity, as well as on second residences.
Socio-economic and behavioural indicators
Deprivation indices
There have been several area-based indices developed for measuring community deprivation using combinations of census variables, such as the Carstairs, Jarman and Townsend indices (Senior, 2002). The Carstairs index, for example, is a well-established gauge of material deprivation and uses four variables (overcrowding, the proportion of people in low social classes, lack of car ownership and unemployment rates). It should be noted that the Jarman Underprivileged Area index, the first deprivation index produced in the UK and incidentally originally created to predict GP workloads in London for primary health care research purposes rather than to measure material deprivation, is now no longer used due to criticisms regarding its limited applicability outside of inner metropolitan areas.
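As an illustration of how a census-based composite such as the Carstairs index can be assembled from its four variables, the following minimal sketch standardises each variable across areas and sums the results. It assumes the commonly described formulation of an unweighted sum of z-scores; the function names and area percentages are hypothetical and are not taken from any published index.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to mean 0 and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def carstairs_like_scores(indicators):
    """Sum the z-scores of each indicator, area by area.

    indicators: dict mapping indicator name -> list of area percentages,
    with every list in the same area order (an assumption of this sketch).
    """
    standardised = [z_scores(vals) for vals in indicators.values()]
    return [sum(per_area) for per_area in zip(*standardised)]


# Hypothetical percentages for three small areas
indicators = {
    "overcrowding": [2.0, 5.0, 11.0],
    "low_social_class": [10.0, 20.0, 30.0],
    "no_car": [15.0, 30.0, 60.0],
    "unemployment": [3.0, 6.0, 12.0],
}

scores = carstairs_like_scores(indicators)  # higher score = more deprived
```

Because each variable is standardised before summing, the scores are relative to the set of areas analysed and are not comparable across different study regions or census years.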
A newer generation of indices uses a combination of census and non-census data derived from government administrative information, of which the latest are the Indices of Deprivation (Neighbourhood Renewal Unit, 2004). Each part of the UK has its own variant tailored to its regional context (latest versions in brackets): England (Index of Multiple Deprivation, 2007), Scotland (Scottish Index of Multiple Deprivation, 2006), Wales (Welsh Index of Multiple Deprivation, 2005) and Northern Ireland (Northern Ireland Multiple Deprivation Measure, 2005). Each of these indices is a composite of domains derived from constituent variables. For example, the English Index of Multiple Deprivation 2007 is composed of seven Domains (barriers to housing and services, crime, education, employment, environment, income and health) derived from 37 constituent variables, including a Health Deprivation and Disability Domain comprising four community health indicators. The inclusion of non-census data offers an advantage over traditional census-based measures of deprivation, which become less accurate as time passes since the previous census. Examples of non-census datasets included in the current indices of deprivation are employment information (eg, number of persons on incapacity benefit) and educational information (eg, proportion of school-leavers aged sixteen), most of which are updated much more frequently than census data.
A key application of area deprivation data when combined with health and health care datasets is monitoring inequalities in primary health care delivery and quality. An illustrative case is Sigfrid et al. (2006) in which the relationships between Quality and Outcomes Framework diabetes exception reporting, Diabetes UK data and ID 2004 scores for Lower Super Output Areas (LSOAs) in the Brighton and Hove City Primary Care Trust (PCT) were statistically analysed. The findings suggested that although there was socio-economic equity in the achievement of diabetes management targets, exception reporting was higher in areas with a higher deprivation score. It should be noted that with the additional use of geodemographics data (discussed later in this review) in such analyses a geographically more precise identification of priority areas is possible.
The domains of the current generation of indices can be used individually or in combination, and thus give great flexibility. Research also suggests that this flexible type of index is more effective than the traditional indices at measuring deprivation in both rural and urban areas (Asthana et al., 2002), particularly due to the inclusion of accessibility variables. Previous indices have been more effective at identifying deprivation variables most relevant to health outcomes in urban areas than in rural areas (Cox, 1998; Barnett et al., 2001), and although the current range of indices incorporates variables more relevant to rural contexts, present indices are still more effective in the context of health care research and development for urban areas than for rural areas (Jordan et al., 2004b; Niggebrugge et al., 2005). The current deprivation indices are calculated and made available for groups of OAs termed LSOAs. It should be noted that, when used for health analyses, different deprivation indices have shown a high degree of correlation with each other. When choosing measures of deprivation, both conceptual and practical considerations should be taken into account. For instance, the English Index of Multiple Deprivation 2007 includes a ‘health and disability’ domain, which means that analyses using the index may overestimate associations between deprivation and health (Adams and White, 2006), although the health domain can be excluded or individual domains used instead. There are also caveats to using deprivation indices in general. A notable obstacle has been the lack of suitable population estimates for LSOAs (Morgan and Baker, 2006).
As a result, small area analyses, particularly for rare health conditions, can be problematic in localities with transient populations such as inner city areas, resulting in a lack of robustness in denominator population estimates. To counteract these problems the ONS has now released an experimental series of small area population estimates, which are available at the LSOA level. The new dataset is based on a ratio-change methodology, which seeks to update census-based initial counts with change data drawn from multiple administrative sources, including the National Health Service (NHS) central register (Bates, 2006). However, while this methodological development is of great importance to small area studies, it is essential to note that these inter-censal small area estimates are synthetic and are not actual census small area population counts.
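The general idea of a ratio-change update can be sketched as follows. This is a generic illustration only, not the ONS procedure: it assumes that each census-based count is scaled by the ratio of a current administrative count to the administrative count at the census date, and that the results are optionally constrained to sum to a known higher-level total. All names and figures are hypothetical.

```python
def ratio_change_estimates(census_counts, admin_base, admin_now, control_total=None):
    """Update census counts by the proportional change in an administrative source.

    census_counts, admin_base, admin_now: dicts keyed by area code.
    control_total: optional known total (eg, a local authority estimate)
    to which the small area estimates are rescaled.
    """
    raw = {
        area: census_counts[area] * admin_now[area] / admin_base[area]
        for area in census_counts
    }
    if control_total is None:
        return raw
    scale = control_total / sum(raw.values())
    return {area: value * scale for area, value in raw.items()}


# Hypothetical LSOAs: census counts and patient-register counts at two dates
census = {"LSOA1": 1500, "LSOA2": 1600}
register_at_census = {"LSOA1": 1400, "LSOA2": 1500}
register_now = {"LSOA1": 1540, "LSOA2": 1500}

unconstrained = ratio_change_estimates(census, register_at_census, register_now)
constrained = ratio_change_estimates(
    census, register_at_census, register_now, control_total=3300
)
```

The constraining step illustrates why such estimates are synthetic: they inherit any error in both the administrative source and the higher-level control total.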
Socio-economic classifications
Mapping and analysing the geographic distribution of socio-economic groups is an alternative and complementary approach to the use of deprivation indices. The traditional methods of classifying people into socio-economic groups were the Registrar General’s Social Class (SC) based on occupation and the less commonly used Socio-economic Groups (SEG). Limitations have been found in both measures by academics and policy makers due to changes over the decades in the composition of the workforce and the range of occupations available. This has resulted in the ONS producing the National Statistics Socio-economic Classification (NS-SEC), a more contemporary system to reflect present-day society (Rose and O’Reilly, 1998). One of the prime motivations for the structuring of the NS-SEC was to classify society into groups that better predict health outcomes than either the SC or the SEG did. In particular, small areas with a significant component of the NS-SEC class 8 also have high levels of healthcare utilisation. Table 2 compares the SC and NS-SEC approaches.
SC = social class based on occupation; ONS = Office for National Statistics; NS-SEC = National Statistics Socio-economic Classification.
Geodemographic datasets
Whereas deprivation indices seek to place every small area on a single deprivation scale, there has also been extensive interest in area classifications that allocate each small area to a neighbourhood type, termed geodemographic classifications (Brown and Batey, 1994; Birkin and Clarke, 1998; Birkin et al., 2002; Harris et al., 2005). Geodemographics is in effect an area-based form of segmentation, to use social marketing terminology, in which the NHS is showing increasing interest (eg, Abbas et al., 2009). While geodemographics has its roots in academic geography, it has been of particular interest to the marketing industry, which has made use of census and non-census data derived from credit-checking agencies, retail behaviour and consumer surveys, amongst other sources, to create ‘lifestyle’ segments. As a result, the majority of current geodemographic classifications have been produced in the commercial sector; examples include ACORN, MOSAIC, PinPoint and Super Profiles. The use of these commercial datasets is growing in the public health sector (eg, Aveyard et al., 2002; Birkin et al., 2005; Powell et al., 2007). Nevertheless, there are non-commercial classification systems; for example, following the 2001 censuses ONS produced an entirely new, census-based geodemographic classification of OAs, which is freely available (Vickers and Rees, 2007). The ONS produced this classification, termed the National Statistics Output Area Classification (OAC), for public sector research and planning, and it has its own user support group (see Table 1).
However, it should be noted that this dataset, though available at a finer spatial scale than deprivation indices, is still at a coarser scale than commercially available geodemographic classifications.
All of these neighbourhood classifications, both commercial and non-commercial, provide a complementary resource base for health needs assessment, giving indirect insight into geographic patterns in health-related behaviours and attitudes (eg, socio-economic gradients in diet and smoking). Geodemographic data can be combined with health information for mapping disease risk and for monitoring health profiles and access to services; for example, Webber (2004) examined HES admissions by neighbourhood using MOSAIC. An advantage of commercial systems using non-census data is that their socio-demographic classifications can be estimated for areas even smaller than census OAs, and hence theoretically capture diversity within small areas; additionally, they are continually updated using a variety of both commercial and administrative data sources (Muggli et al., 2000; Longley, 2005). An example of the application of geodemographic datasets to monitoring locality health profiles is Powell et al. (2007), in which a commercial geodemographic system was used to identify highly localised pockets of populations at high risk of developing diabetes mellitus within a PCT’s boundaries. As Powell et al. (ibid: 33–34) note, the identification of such pockets can aid in the social marketing of behavioural interventions.
However, there is much disagreement over the value of the use of geodemographic datasets in health contexts. Much of this relates to methodological issues, with some geographers considering that, because the indicators used in commercial geodemographic datasets were developed for the purpose of analysing retail behaviour and not for identifying health-related behaviours, their use is theoretically unsound (eg, Twigg et al., 2000: 1110), and other researchers considering that intra-group variation is greater than inter-group variation in published, commercially available geodemographic classifications (Voas and Williamson, 2001). There is also the fact that commercially available geodemographic classification systems sold to healthcare organisations have an underlying commercial motivation behind them, and primary health care researchers and developers should always take this factor into account when evaluating the merits of purchasing such classifications. The ‘jury is still out’ on the exact contribution that geodemographics has to make not only to primary health care but also to public health in general. Nevertheless, its potential merits further research and development, as the OAC has shown. It is also important to note that, with the current emphasis on lifestyles and healthy choices, geodemographics is likely to play an increasing role in primary health care research and development (Department of Health, 2004).
Area and rural–urban classifications
For those performing analyses requiring an understanding of the overall socio-economic character of areas, the national statistics offices of the UK provide two forms of classification: the first summarising the overall economic and demographic character of areas and the second classifying areas on a rural to urban spectrum. The former is termed the National Statistics 2001 Area Classification and has been produced to cover all of the UK, the OAC discussed in the previous section forming its foundations (see Table 1). It is available at different levels of geographic aggregation including Health Areas in the form of primary care organisations. These Health Areas summarise area socio-demographic character and are structured into three geographic levels of classification: supergroup, group and subgroup. Example classes include manufacturing towns, coastal and countryside and industrial hinterlands. Unsurprisingly, such classifications are of particular interest to human geographers and sociologists, but they are also of relevance to health sector analysis, as Doran et al. (2006) have shown in their comparison of health outcomes across local authorities in England.
The latter type of area classification comprises rural–urban classifications, which are especially relevant to capturing ‘rurality’ effects on health outcomes, an aspect in which even current deprivation indices have thus far been found slightly wanting, as discussed in the section on deprivation indices. The latest rural–urban classifications for the UK are the ONS Rural and Urban Area Classification 2004 for England and Wales, the NISRA Urban–Rural Classification 2005 and the Scottish Executive’s Urban–Rural Classification 2003–04 (Bibby and Shepherd, 2004; Northern Ireland Statistics and Research Agency, 2005; Scottish Executive, 2008). Again, as with the National Statistics 2001 Area Classification, there are geographic tiers to the rural–urban classifications. An example of the use of a rural–urban classification in health care research is Jordan et al. (2004a), a study which included analysis of accessibility to GP services in southwest England.
Neighbourhood statistics
A recent development in the availability of socio-demographic data in the UK is the provision of Neighbourhood Statistics by the national statistics agencies covering all geographical scales in the UK from the lowest socio-demographic units to the national level. The neighbourhood statistics datasets include a range of information, including those covered in the census and the Indices of Deprivation, which characterise both population and environmental characteristics of areas. The provision of a range of socio-demographic indicators in a single dataset customisable to a given locality is a great boon and is especially valuable for integrating primary health care provision within urban and rural planning, a topic returned to later in this review.
Boundary data
Administrative, census and postcode locations are fundamental to mapping health, both to visualise geographic patterns and to analyse spatial clustering. The geographic hierarchy of boundaries within the UK is complex, with differences of detail between the countries in question and within them. Essentially, boundaries are available for government administrative units at every level down to census OAs. However, this neighbourhood statistics geography is intentionally static to allow the production of non-disclosive and comparable data series, while the units of primary health care organisation change repeatedly and their boundaries are not necessarily aligned with the hierarchy built on OAs as its foundation. These OAs were, wherever possible, built from whole unit postcodes (the smallest element in the postcode hierarchy) and represent consistent-sized populations of around 300 persons. They were intended as flexible statistical spatial blocks rather than analysis units in their own right (Majeed et al., 1995; Scrivener and Lloyd, 1995; Martin, 2004). To aid this goal further, a lower tier of super output areas, LSOAs (‘Datazones’ in Scotland), has been produced by aggregating 2001 OAs to produce areas with a mean population size of 1500, together with a middle tier (MSOAs, Middle Super Output Areas) with a mean population size of 7500. Digital boundary data are freely available for 2001 OAs and ONS is proposing to retain a stable OA geography as far as possible for the 2011 census.
In addition to data published for these geographical areas, there is extensive use of the postcode as a geographical reference and Royal Mail, Ordnance Survey and ONS work together to produce the Office for National Statistics Postcode Directory (http://www.ons.gov.uk/about-statistics/geography/products/geog-products-postcode/nspd/index.html), a listing of all postcodes giving their Ordnance Survey grid reference and their membership of a very large number of geographical areas, including all levels of the neighbourhood statistics geography and also current and past primary health care organisations.
A present, though perhaps only short-term, problem with boundary datasets as applied to primary health care contexts is that the current PCT boundaries have not been published and only pre-2005 boundaries are available, except for London where the boundaries are unchanged. This situation is a result of the number of PCTs in England having been reduced from 303 to 152 during 2006, as part of a move to make PCT boundaries coterminous with those of local authorities to facilitate integrated urban and rural planning. Those needing to use current boundaries outside London are obliged to construct them themselves, though considering the fluctuations in PCT boundaries over the last decade it is our recommendation to perform analyses at the local authority level to permit the study of change over time. A separate limitation that the reader may also have noted is that socio-demographic boundaries in general are inflexible and do not reflect any ‘real world’ physical boundaries. While for most planning and evaluation purposes this is not a significant issue, for those requiring more refined, complex analyses that take into account the problems posed by inflexible boundaries, the use of spatial smoothing techniques is paramount. A discussion of the application of spatial statistics in health geography is beyond the scope of this data sources review and instead readers are directed towards Elliott et al. (2000) and Cromley and McLafferty (2002).
Linking datasets
Combining clinical data with those of other types (eg, census, geodemographic) not only enables services planning and research but also potentially creates new, value-added datasets. For instance, by combining geographic data on lifestyles and behaviours with those on health, geographically based health profiles and healthcare needs assessment datasets can be produced (Dedman et al., 2006). Examples of this are the locality health profiles produced by the Association of Public Health Observatories (Pencheon, 2008). Of key relevance to linking existing datasets is the National Statistics Postcode Directory: where a postcode or recognised area code is available in each dataset, the directory can be used to link records directly or to reweight data from one area unit to another. Researchers within UK academic settings can take advantage of the GeoConvert facility to undertake such linkage (Mimas GeoConvert at http://geoconvert.mimas.ac.uk/). Where direct lookup in this way is not possible, the researcher must perform GIS analysis to determine the best match between the available data.
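The two uses of a postcode directory described above, direct record linkage and reweighting between area units, can be sketched as follows. The lookup table, area codes, postcodes and apportionment shares are all hypothetical; the sketch does not reproduce the directory's actual file format or the GeoConvert interface.

```python
from collections import defaultdict

# Hypothetical directory extract: unit postcode -> LSOA code
postcode_to_lsoa = {
    "AB1 2CD": "LSOA_X",
    "AB1 2CE": "LSOA_X",
    "AB3 4EF": "LSOA_Y",
}

# Hypothetical health records that carry only a postcode
records = [
    {"postcode": "AB1 2CD"},
    {"postcode": "AB1 2CE"},
    {"postcode": "AB3 4EF"},
    {"postcode": "ZZ9 9ZZ"},  # not present in the lookup
]


def count_by_area(records, lookup):
    """Link records to areas via postcode; return area counts and unmatched total."""
    counts = defaultdict(int)
    unmatched = 0
    for rec in records:
        area = lookup.get(rec["postcode"])
        if area is None:
            unmatched += 1
        else:
            counts[area] += 1
    return dict(counts), unmatched


def reweight(source_values, shares):
    """Apportion source-area values to target areas.

    shares: dict mapping (source_area, target_area) -> fraction of the
    source area's value allocated to that target (fractions per source sum to 1).
    """
    target_values = defaultdict(float)
    for (source, target), share in shares.items():
        target_values[target] += source_values[source] * share
    return dict(target_values)


lsoa_counts, unmatched = count_by_area(records, postcode_to_lsoa)

# Hypothetical shares, eg, proportion of each LSOA's postcodes in each PCT
shares = {
    ("LSOA_X", "PCT_1"): 1.0,
    ("LSOA_Y", "PCT_1"): 0.25,
    ("LSOA_Y", "PCT_2"): 0.75,
}
pct_counts = reweight(lsoa_counts, shares)
```

Reporting the number of unmatched records alongside the linked counts is a simple check on linkage quality, since outdated or mistyped postcodes will otherwise silently drop out of the analysis.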
The socio-demographic data sources discussed in this review can also be linked with transport and health datasets to identify localities with both high health care needs and limited geographical access to service provision, and indeed much of this work has focused on primary health care (eg, Luo and Wang, Reference Luo and Wang2003; Guagliardo, Reference Guagliardo2004; Luo, Reference Luo2004; Wang and Luo, Reference Wang and Luo2005; Mobley et al., Reference Mobley, Root, Anselin, Lozano-Gracia and Koschinsky2006). Further to this forecasting future demand for primary health care services in areas currently undergoing neighbourhood regeneration is increasingly becoming an integrated part of planning wider social infrastructures especially in spearhead groups (Blackman, Reference Blackman2006) and relates directly to the information toolkits at the disposal of health care planners. It is in this respect that neighbourhood statistics are invaluable. A current example of this is the London Thames Gateway Social Infrastructure Framework (LTG-SIF). This urban regeneration programme covers much of east London is funded by the London Development Agency and local PCTs, and is coordinated by the National Health Service London Healthy Urban Development Unit in part aims to effectively plan future health services in the area by working closely with local urban planners to model the implications of new housing developments for health care demand and supply (National Health Service London Healthy Urban Development Unit, 2006a; 2006b). In data terms this means integrating population growth models based on census information with morbidity trends, deprivation scores and transport datasets in order to optimally plan new educational centres, green spaces, GP clinics and other facilities in an integrated framework to produce sustainable communities.
A simple example of a data linkage between socio-demographic and health care data in a GIS environment, using spatial regression, is provided in Figure 1. It illustrates the strength of the association between limiting long-term illness prevalence and NS-SEC class 8 (never worked and long-term unemployed; see Table 2) for LSOAs across southwest England, with PCT boundaries overlain. Notice the markedly higher regression residuals in largely rural Cornwall in comparison to highly metropolitan Greater London, demonstrating the weaker association in rural areas between long-term morbidity and a key indicator of area socio-economic composition. There are clear implications for relying solely upon socio-economic indicators when making resource allocation decisions for rural areas, as discussed in the section on deprivation indices. The figure also shows the value of looking at socio-demographic data sources from a geographical perspective.
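The idea of inspecting residuals by area can be sketched in a few lines. The example below is an assumption-laden simplification: it fits an ordinary least squares line, not the spatial regression used for Figure 1, and the area labels and percentages are invented rather than taken from any dataset discussed in this review. The function name `ols_residuals` is hypothetical.

```python
def ols_residuals(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, residuals), where each residual is the
    observed value minus the fitted value for that area.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


if __name__ == "__main__":
    # Invented illustration: percentage of residents in NS-SEC class 8 (x)
    # and percentage reporting limiting long-term illness (y) per area.
    areas = ["urban_1", "urban_2", "urban_3", "rural_1", "rural_2"]
    class8_pct = [2.0, 4.0, 6.0, 1.5, 2.5]
    llti_pct = [14.0, 18.0, 22.0, 19.0, 12.0]

    slope, intercept, residuals = ols_residuals(class8_pct, llti_pct)
    # Large absolute residuals flag areas where the socio-economic
    # indicator explains morbidity poorly.
    for area, resid in sorted(zip(areas, residuals), key=lambda p: -abs(p[1])):
        print(f"{area}: residual {resid:+.2f}")
```

Ranking areas by absolute residual is one simple way to identify localities where reliance on a single socio-economic indicator would be least appropriate; a spatial model would additionally account for the similarity of neighbouring areas.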
Nevertheless, making use of the full range of datasets discussed in this review is not always straightforward. For instance, geodemographic datasets derived from consumer questionnaires and administrative sources cannot at present match the population coverage and quality assurance levels achieved by the decennial censuses. The substantial increase in information available for small areas is a welcome development; however, this trend in turn raises questions about statistical robustness and interpretation. For example, local factors, such as the presence of nursing homes in some areas, must be taken into account in the interpretation of geographic patterns. Additionally, the limitations of aggregated socio-demographic datasets can in some circumstances compound the variable data quality of clinical and health care datasets; hence thoughtful use of these sources, especially when linking them with health-related data, is imperative.
Currency of data also plays a role in limiting the effectiveness of these datasets. Geographical planning and evaluation of primary health care does not require real-time data, unlike, say, emergency response services. Nevertheless, the socio-demographic data sources discussed in this review vary significantly in how current they are, which makes analysis more complex. For example, census data, even with annual mid-year estimates, become increasingly inaccurate in terms of demographic structure the further one goes from the year of the census. Localities with a significant turnover of population are especially problematic for monitoring locality health profiles and planning health care services (Hennell, Reference Hennell2004), particularly with respect to forecasting future demand for health care. By contrast with the census situation, some variables in the current indices of deprivation are updated annually. As a result there is inevitably a certain, perhaps unquantifiable, amount of temporal mismatch between the different datasets. This complicates small area analyses, especially considering that the census provides unrivalled coverage and quality assurance in comparison to other datasets such as the frequently updated geodemographic classifications available commercially.
A key issue in monitoring equity in health care is that, given the available data sources, identifying spatial variation in the need for health services is more complex for rural areas than for urban settings. While a new generation of flexible deprivation indices, geodemographic sources and the NS-SEC collectively provide a more effective information toolkit for monitoring and planning rural health care, other variables such as access to proximate out-of-hours services, distance to health care centres and public transport infrastructure often play a role in the effectiveness of health services delivery (Baird et al., Reference Baird, Donnelly, Miscampell and Wemyss2000). The situation is made more complex by the way access is measured. Although each of the Indices of Deprivation includes an access to services domain, that domain uses straight line distances rather than road network distances, and the former has been found to correlate only moderately with the latter in many localities (Jordan et al., Reference Jordan, Roderick, Martin and Barnett2004a; Reference Jordan, Roderick and Martin2004b; Niggebrugge et al., Reference Niggebrugge, Haynes, Jones, Lovett and Harvey2005). This is a problem for monitoring inequalities in health care accessibility.
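The comparison between straight line and road network distances can be sketched as follows. This is an illustrative example only: the coordinates are assumed to be projected eastings and northings in metres, the road distances are invented stand-ins for the output of a network analysis, and the function names are hypothetical. It does not reproduce the method of any of the studies cited above.

```python
import math


def straight_line_km(origin, destination):
    """Euclidean distance in km between two (easting, northing) points in metres."""
    return math.hypot(destination[0] - origin[0], destination[1] - origin[1]) / 1000.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    surgery = (250000, 50000)  # invented GP surgery location
    # Invented population centroids and road distances (km) to the surgery.
    centroids = [(253000, 54000), (251000, 50500), (258000, 44000), (249000, 62000)]
    road_km = [7.9, 1.4, 16.5, 13.0]

    line_km = [straight_line_km(c, surgery) for c in centroids]
    for straight, road in zip(line_km, road_km):
        # A ratio well above 1 indicates a circuitous road route.
        print(f"straight {straight:5.2f} km, road {road:5.2f} km, ratio {road / straight:.2f}")
    print(f"correlation: {pearson(line_km, road_km):.3f}")
```

Reporting the ratio of road to straight line distance for each area alongside the overall correlation helps to show where the straight line measure is most misleading, which a single correlation coefficient can conceal.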
It should be noted that the aggregated data sources discussed, while well suited to resource allocation, are not always fit for purpose in other health-related analyses, for example aetiological studies. We stress that household-level or individual-level data need to be used in conjunction with aggregate, contextual datasets for researching associations in health and behavioural outcomes in individuals and groups (eg, Nieuwenhuijsen, Reference Nieuwenhuijsen2000). Obtaining data on individuals from health care records, questionnaires and/or interviews requires prior ethical approval but can provide valuable individual-level information to complement aggregated datasets; see, for example, Blackman et al. (Reference Blackman, Harvey, Lawrence and Simon2001) and Day (Reference Day2008).
In the longer term, the 2008 Treasury Sub-committee of Inquiry on Counting the population has recommended that, following 2011, ONS move towards the replacement of the conventional census by data sourced from administrative records (House of Commons Treasury Committee, 2008). Such development is currently at a very early stage but would have major implications for the entire data landscape presented in this review, and all involved in primary health care research and development should seek to engage in the forthcoming debate. In the more immediate future, there is currently a consultation programme for Census 2011 covering many issues relevant to the health care sector, for example debate regarding proposed new census questions such as that on multi-residency (Cabinet Office, 2008). Of notable interest to primary health care researchers and developers will be the consultation over whether to collect data on numbers of legal short-term migrants, due to take place in April and May 2010. This potential short-term migrant dataset could have great value for services planning; however, there will inevitably be question marks over its accuracy. Further details on how to become engaged in the Census 2011 consultation can be obtained by visiting the ONS website (http://www.ons.gov.uk/census/2011-census/consultations/index.html). The consultation process is now underway and will continue up until autumn and winter 2010.
Conclusions
Developing routine, published health and geographic data sources to study the distribution of socioeconomic and health inequalities may help local and population-based health intervention strategies to be targeted more efficiently. The advent of OAs/SOAs, a new generation of deprivation measures, the NS-SEC and progress in GIS have delivered a powerful suite of data, tools and methods, together with an associated body of literature, for primary health care research and development. Geodemographic profiling in particular is likely to play an increasing role in local strategies.
Acknowledgement
The authors thank Allan Baker of the Office for National Statistics (ONS) Mortality Statistics Team for valuable comments on drafts of this review paper.
Declarations
Funding: Dr Saxena holds an NIHR postdoctoral fellowship.
Competing interests: The authors declared no competing interests.
Ethical approval: The authors declared that ethical approval is not required.