
Development of primary care assessment tool–adult version in Tibet: implication for low- and middle-income countries

Published online by Cambridge University Press:  01 July 2019

Wenhua Wang*
Affiliation:
Department of Family Medicine, McGill University, Montreal, Canada
Jeannie Haggerty
Affiliation:
Department of Family Medicine, McGill University, Montreal, Canada
* Author for correspondence: Wenhua Wang, Hayes Pavilion, Suite 4764, 3830 Avenue Lacombe, Montreal, Quebec, Canada H3T 1M5. E-mail: [email protected]

Abstract

Aim:

To conduct advanced psychometric analysis of the Primary Care Assessment Tool (PCAT) in Tibet and identify avenues for improving its metric performance.

Background:

Measuring progress toward high-performing primary health care can contribute to the achievement of sustainable development goals. The adult version of the PCAT is an instrument for measuring patient experience with key elements of primary care. It has been extensively used and validated internationally. However, little information is available on its psychometric properties as assessed by advanced analysis.

Methods:

We used data collected from 1386 primary care users in two prefectures in Tibet. First, iterative confirmatory factor analysis examined the fit of the primary care construct in the original tool. Then item response theory analysis evaluated how well the questions and individual response options perform at different levels of patient experience. Finally, multiple logistic regression modeling examined the predictive validity of primary care domains against patient satisfaction.

Findings:

The best final structure for the PCAT-Tibetan includes 7 domains and 27 items. Confirmatory factor analysis suggests good fit for a unidimensional model for items within each domain but does not support a unidimensional model for the entire instrument with all domains. Non-parametric and parametric item response theory models show that for most items, the favorable response option (4 = definitely) is overwhelmingly endorsed, the discriminability parameter is over 1, and the difficulty parameters are all negative, suggesting that the items are most sensitive and specific for patients with poor primary care experience. Ongoing care is the strongest predictor of patient satisfaction. These findings suggest the need for some principles in adapting the tool to different health system contexts, more items measuring excellent primary care experience, and an update of the four-point response options.

Type
Research
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s) 2019

Introduction

The contribution of primary care to health system performance has been widely examined nationally and internationally (Starfield et al., 2005; Kringos et al., 2013). A systematic review of 36 studies in low- and middle-income countries showed that strong primary care leads to improved and more equitable health outcomes, especially in infants and children (Macinko et al., 2009). Another study of 31 high-income countries in Europe showed that a strong primary care system is associated with improved population health outcomes, reduced socioeconomic inequality in health outcomes, fewer unnecessary hospitalizations, and slower increases in overall health-care expenditures (Kringos et al., 2013). Recognizing the value and effectiveness of primary care, many countries including China have identified primary care transformation as a major component of health reform.

Measuring progress towards high-performing primary health care can contribute to the achievement of sustainable development goals. Measuring the quality of primary care from the patients’ perspective can provide actionable and comprehensive performance information to guide primary care reform efforts. Patients are the best evaluators of key aspects of their health care, including accessibility, continuity, interpersonal communication, respectfulness, family-centered care, whole-person care, and cultural sensitivity (Haggerty et al., 2007). To support the building of people-centered, integrated primary care systems, there is increasing global interest in patient experience measurement (Kruk et al., 2017). Many instruments have been developed, such as the Primary Care Assessment Tool (PCAT; Shi et al., 2001).

The PCAT, developed by Barbara Starfield, has been extensively used internationally (Shi et al., 2001). Inspired by the World Health Organization definition of primary care, the PCAT was originally developed in the United States to measure the extent to which primary care is achieved from the user’s perspective. Seven primary care domains were included in the original English PCAT-adult version: first contact utilization and access, ongoing care, coordination with specialists, comprehensiveness of service available and provided, family centeredness, community orientation, and cultural competency (Shi et al., 2001).

The original English PCAT-adult version has been translated into many languages. The PCAT validation studies were mostly conducted in the following countries: China (Wang et al., 2014; Yang et al., 2013; Wang et al., 2015a; Wei et al., 2015; Mei et al., 2016), Canada (Haggerty et al., 2011b), South Korea (Lee et al., 2009), Argentina (Berra et al., 2013; Vazquez Pena et al., 2017), Spain (Pasarin et al., 2007; Berra et al., 2011), Brazil (Macinko et al., 2007), Japan (Aoki et al., 2016), Vietnam (Hoa et al., 2018), Turkey (Lağarlıa et al., 2014), and South Africa (Bresick et al., 2015). These validation studies suggest reasonable psychometric properties in the specific country context, but some common problems emerge, especially in factor structure and with response options.
For example, factor analytic models do not support the underlying theoretical domains in several language versions (Lee et al., 2009; Yang et al., 2013; Mei et al., 2016). Response distributions tend to skew toward more favorable answers in most language versions, including English, which violates the formal statistical assumptions of most psychometric analyses (Shi et al., 2001; Macinko et al., 2007; Pasarin et al., 2007; Lee et al., 2009; Berra et al., 2011; 2013; Haggerty et al., 2011a; 2011b; Yang et al., 2013; Lağarlıa et al., 2014; Wang et al., 2014; Bresick et al., 2015; Aoki et al., 2016; Mei et al., 2016; Vazquez Pena et al., 2017).

Some of these issues were also found in the Tibetan version of the PCAT, which was translated from one (Yang et al., 2013) of the three available Chinese versions (Yang et al., 2013; Wei et al., 2015; Wang et al., 2015a). The Chinese version developed by Yang et al. (2013) was adapted from the long adult version of the PCAT. The initial PCAT-Tibetan included six domains: first contact, continuity, coordination, comprehensiveness, family-centeredness, and community orientation. The Tibet Autonomous Region is located in southwestern China. It has an average elevation of 4000 m and covers about one eighth of China’s area, with a population of 3 million scattered across this large region. Since the national health reform initiated in China in 2009 (Yip et al., 2012), township health centers in Tibet have been identified as the main primary care provider, and they have received strong support from regional and national governments, including more health staff and salary increases, need-based training, and capital investments in infrastructure and necessary equipment. Actionable information about the effectiveness of these programs was needed to direct resource allocation and to address specific weaknesses in primary care service delivery.

The initial validation of the PCAT-Tibetan suggested departures from the original factor structure and confirmed the skewed responses seen in other studies (Wang et al., 2014). For example, three original PCAT domains (first contact, continuity, and coordination) were split across five domains in the PCAT-Tibetan, and the original comprehensiveness domain was represented by two domains (Wang et al., 2014). These psychometric issues, which are shared with validation studies of other PCAT versions, suggest a need for advanced psychometric analysis to examine the appropriateness of domains and items relative to the original PCAT. To further examine the construct validity and reliability of the PCAT-Tibetan version, this article reports the results of confirmatory factor analysis, which tests the congruence between the theoretical primary care domains and the empirical results in Tibet, and of item response theory analysis, which examines item performance and the appropriateness of item response options.

Methods

This was a further analysis of the initial PCAT-Tibetan validation study, which was conducted among 1386 patients who visited their primary care providers in three different types of health facilities in two prefectures in 2013. The survey was administered through face-to-face interviews. Detailed information about sampling and data collection can be found in our previous publication (Wang et al., 2014).

A maximum likelihood (ML) imputation method was used to replace missing values, with age, sex, education, and self-rated health status as matching variables; 295 respondents were excluded because of missing values for the matching variables, leaving 1091 respondents in subsequent analyses. Those excluded were more likely to be female and less educated. To examine the robustness of our conclusions, we excluded all respondents who had at least one missing value on any item (listwise deletion) and repeated all data analyses (n = 729), which did not alter any of the general conclusions. Values were also imputed for the ‘not sure/don’t remember’ response option as an alternative to attributing the pre-set value of 2.5. In general, estimates produced with listwise deletion are less efficient than those from other methods of handling missing data (Enders, 2001). Therefore, we report only results from the database with ML imputation.

In user-evaluation research, it is common to treat report and rating values as quasi-cardinal (DeVellis, 1991). Although the items measuring patient experience in this study are strictly ordinal, we treated them as interval level, consistent with previous validation studies of the original PCAT version in the United States (Shi et al., 2001) and of PCAT versions in other languages (Lee et al., 2009; Haggerty et al., 2011b; Yang et al., 2013; Aoki et al., 2016; Mei et al., 2016). This approach was also used in the validation study of the World Health Organization’s health system responsiveness survey (Valentine et al., 2007). Therefore, the four-point Likert response scale in the PCAT-Tibetan version was treated as a continuous variable.

First, inter-item correlation and exploratory factor analysis (principal component analysis) were conducted by domain to flag items with low correlations (Pearson <0.20) or low factor loadings (<0.30) for potential deletion. We repeated the exploratory factor analysis after deleting each item with a low factor loading until all retained items had a factor loading of at least 0.30. The results of exploratory factor analysis guided subsequent confirmatory factor analysis.
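The correlation screen described above can be sketched in a few lines. This is only an illustration, not the SPSS procedure used in the study: the response data are invented, the function names are hypothetical, and the rule of flagging an item when its average correlation with the other items in the domain falls below 0.20 is an assumption, since the exact flagging rule is not spelled out.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length score lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def flag_low_correlation_items(items, threshold=0.20):
    """Flag items whose mean correlation with the other items is below threshold.

    Assumption: the mean inter-item correlation is used as the screening statistic.
    """
    flagged = []
    for name, scores in items.items():
        others = [pearson(scores, s) for other, s in items.items() if other != name]
        if sum(others) / len(others) < threshold:
            flagged.append(name)
    return flagged

# Invented four-point responses from six respondents (not study data).
domain = {
    "item_a": [1, 2, 3, 4, 4, 3],
    "item_b": [1, 2, 3, 4, 3, 3],
    "item_c": [2, 3, 3, 4, 4, 4],
    "item_d": [4, 1, 3, 2, 4, 1],  # unrelated to the other three
}
print(flag_low_correlation_items(domain))  # only item_d is flagged
```

In practice the flagged items would then be reviewed alongside their factor loadings before any deletion.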

Then confirmatory factor analysis using structural equation modeling was used to test the goodness of fit of the items to the theoretical domains of the original PCAT. Subsequently, confirmatory factor models were adjusted iteratively based on fit and judgment until the goodness-of-fit statistics were optimized. We also assessed the entire instrument including all domains in our data analysis. First, we included all domains in confirmatory factor analysis. Then we only included the four core domains (first contact, continuity, coordination, and comprehensiveness) in confirmatory factor analysis. The following goodness-of-fit statistics were used: normed fit index (NFI) ≥0.9 indicating good fit, comparative fit index (CFI) ≥0.9 indicating good fit, standardized root mean square residual (SRMR) ≤0.05 indicating acceptable fit.
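As a small illustration of how these cutoffs are applied, the sketch below checks a set of fit indices against the thresholds stated above. The function name is hypothetical, and the SRMR cutoff is left as a parameter because the methods text gives ≤0.05 while the footnote to Table 2 gives ≤0.08. The example values are those reported in the Results for the comprehensiveness subscale before and after item removal.

```python
def assess_fit(nfi, cfi, srmr, srmr_cutoff=0.05):
    """Return pass/fail flags for each goodness-of-fit index.

    NFI and CFI pass at >= 0.9; SRMR passes at <= srmr_cutoff
    (0.05 in the methods text, 0.08 in the Table 2 footnote).
    """
    return {
        "NFI": nfi >= 0.9,
        "CFI": cfi >= 0.9,
        "SRMR": srmr <= srmr_cutoff,
    }

# Comprehensiveness subscale: eight-item version, then final four-item version.
before = assess_fit(nfi=0.27, cfi=0.27, srmr=0.19)
after = assess_fit(nfi=0.98, cfi=0.99, srmr=0.02)
print(all(before.values()), all(after.values()))
```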

For each domain confirmed as being unidimensional, we examined the distribution of response options of individual items within each domain based on domain performance using nonparametric item response theory analysis. However, this approach does not yield exact parameter estimates of item performance. To obtain these, two parameters (discriminability and difficulty) were subsequently estimated using Samejima’s (1969) graded response model.
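For readers unfamiliar with the graded response model, the sketch below shows how the discriminability parameter (a) and the difficulty parameters (b) translate into response probabilities. In this model, the probability of responding in category k or higher at experience level θ is 1 / (1 + exp(−a(θ − b_k))), and the probability of each category is the difference between adjacent cumulative probabilities. The parameter values are hypothetical and chosen only to mimic the pattern of negative difficulty parameters; they are not estimates from this study, and the function is not the MULTILOG implementation.

```python
from math import exp

def grm_category_probabilities(theta, a, thresholds):
    """Category probabilities under Samejima's graded response model.

    theta      : latent level of primary care experience
    a          : discriminability parameter
    thresholds : increasing difficulty parameters b_1 < b_2 < ... (one fewer
                 than the number of response options)
    """
    # Cumulative probability of responding in category k or higher,
    # padded with 1 (lowest category or higher) and 0 (above the highest).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + exp(-a * (theta - b))) for b in thresholds]
    cumulative += [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical item with a four-point scale and all-negative difficulties.
a, b = 1.5, [-2.0, -1.2, -0.5]
for theta in (-3.0, -1.0, 0.0, 1.0):
    probs = grm_category_probabilities(theta, a, b)
    print(theta, [round(p, 3) for p in probs])
```

With difficulties like these, the highest option already dominates at θ = 0 (average experience), which is the kind of pattern that leaves little room to distinguish average from excellent care.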

We estimated the correlations between domains to examine how domains were related and whether they demonstrated distinctiveness.

Finally, we used logistic regression modeling to examine how primary care domains were associated with patients’ satisfaction with service attitude (one item) and perceived technical quality (one item) of their primary care provider. The five-point Likert response scale for the two items was dichotomized to indicate satisfaction (very satisfied or satisfied) versus dissatisfaction. All domains were put in the same model, and age, sex, education, and self-rated health status were included in the model as covariates. Education level was categorized into three groups: illiterate, primary school, junior high school and above. Self-rated health status was measured by a visual analogue scale with end points of 0 and 100, where 0 corresponds to ‘the worst health status’, and 100 corresponds to ‘the best health’.

Descriptive, correlation, and exploratory factor analyses were conducted with SPSS 22.0. ML imputation and confirmatory factor analysis were conducted with LISREL 9.1. Nonparametric and parametric item response theory analyses were conducted with SPSS 22.0 and MULTILOG 7.03.

Results

Table 1 summarizes the response distribution and descriptive statistics of each item in the initial PCAT-Tibetan. The percentage of true missing values for each item ranged from 0.9% to 4.4%. Most items are slightly negatively skewed, with over 50% of respondents choosing the most favorable response option ‘4 = Definitely’ on 22 of 36 items. The percentage of respondents choosing ‘not sure/don’t remember’ is higher for items in first contact access, coordination, and community orientation, suggesting that patients may not be the best information source for these domains or may not have direct experience with all aspects elicited.

Table 1. Statements and descriptive statistics by items in PCAT-Tibetan version a

a n = 1386. Mean, median, and standard deviation were calculated using the database with missing values imputed by the maximum likelihood method (n = 1091). Item labels that are struck through (e.g., FCA5) were deleted from the final version.

NS/DNK = not sure/don’t know.

For first contact access, two of the six items were deleted because of unacceptable goodness-of-fit statistics (FCA5 and FCA6), and two items fit better in first contact utilization (FCA1 and FCA4), leaving only two items in first contact access and four in first contact utilization. In the eight-item ongoing care domain, two items were deleted because of unacceptable goodness-of-fit statistics (OC1 and OC7), and a further item (OC8) was removed to improve the goodness-of-fit statistics, leaving a five-item ongoing care scale. The eight-item comprehensiveness subscale showed unacceptable goodness-of-fit statistics on a single factor (NFI = 0.27, CFI = 0.27, SRMR = 0.19); after removing the four poorest fitting items, a final four-item comprehensiveness subscale demonstrated good fit (NFI = 0.98, CFI = 0.99, SRMR = 0.02). Items in the other domains were unchanged. The detailed iterative process to finalize the factor structure is reported in Supplemental Table S1.

Table 2 shows the goodness-of-fit statistics of confirmatory factor analysis for each domain in the final structural model. The fit statistics of first contact utilization and coordination demonstrate a moderate fit. For ongoing care, comprehensiveness, family centeredness, and community orientation, all four confirmatory factor analysis models demonstrate a good fit. The empirical results suggest a best final structure for the PCAT-Tibetan of 7 domains and 27 items instead of 36. At least three items are needed for confirmatory factor analysis. Because first contact access includes only two items, no confirmatory factor analysis results are available for this domain.

Table 2. Summary of results from final model in confirmatory factor analysis for each domain and internal consistency a , b

a n = 1091. At least three items are needed for confirmatory factor analysis: first contact access only includes two items, so we did not have confirmatory factor analysis results.

b Weighted least square estimation method was used for confirmatory factor analysis using LISREL version 9.1.

c SRMR = standardized root mean square residual; ≤0.08 indicating acceptable fit.

d NFI = normed fit index; ≥0.9 indicating good fit.

e CFI = comparative fit index; ≥0.9 indicating good fit.

However, both the model including all domains (NFI = 0.46, CFI = 0.46, SRMR = 0.23) and the model including only the four core domains (NFI = 0.55, CFI = 0.56, SRMR = 0.14) show unacceptable goodness-of-fit statistics on a single factor. This suggests that the domains included in the original PCAT may not measure a common single construct in the Tibetan context and that it is not appropriate to report a total score.

For each domain, the mean score is lower than the median and the distribution is negatively skewed, indicating that most patients chose favorable responses. The first contact utilization score is the highest (3.66±0.48), while the first contact access score is the lowest (2.94±0.95). Cronbach α is over 0.70, indicating good internal consistency of items, for all domains except first contact access (0.66) and ongoing care (0.66).
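For reference, Cronbach α for a domain of k items is k/(k − 1) × (1 − Σ item variances / variance of the summed score). The sketch below computes it with the Python standard library; the function name and the response data are invented for illustration and are not the SPSS routine or the data used in the study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    All items must cover the same respondents in the same order.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]  # summed score per respondent
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Invented responses from six respondents to a three-item domain.
domain_items = [
    [4, 3, 4, 2, 4, 1],
    [4, 3, 3, 2, 4, 2],
    [3, 4, 4, 1, 4, 2],
]
print(round(cronbach_alpha(domain_items), 2))
```

A value above the conventional 0.70 cutoff would be read, as in the text, as good internal consistency.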

Non-parametric item response theory graphs were modeled on each unidimensional domain to provide further insight into item performance and reliability. In most items, the option characteristic curve for the response option ‘2 = Probably not’ is overshadowed by other options (see example in Figure 1a), indicating that nowhere along the primary care experience continuum was this option more likely to be chosen than other options, raising the question of the appropriateness of a four-point response scale. Only a few items perform optimally, such that the probability of choosing each response option is highest in a unique zone of primary care experience continuum, reflecting clearly ordinal response behavior appropriate to the assigned value for each option. Figure 1b shows a well-performing item from community orientation. A problem common to all items is that the extreme response option ‘4 = definitely’ covers a large area of primary care experience continuum and is most likely to be endorsed, even at below-average primary care experience level, suggesting that additional response options may be desirable.

Figure 1. Response graph by non-parametric item response theory analysis contrasting poorly and well-performing items. (a) Option characteristic curves for item CD4 in coordination ‘Was your primary care provider interested in the quality of care there?’ are modeled as a function of total scores on these measures (bottom axis). Results show difficulties with some options from this item. The probability of endorsing option 2 is relatively small compared to other options. (b) Option characteristic curves for item CO1 in community orientation ‘Does your primary care provider do survey in the community to find out about health problems he or she should know about?’ are modeled as a function of total scores on these measures (bottom axis). Results show that this item performs well relative to other items.

The results from parametric item response theory analysis provide further evidence to confirm the findings from non-parametric item response theory analysis (Table 3). The discriminability parameter is over 1 for all items except one item on ongoing care, indicating that response options discriminate well between low and high levels of primary care experience. However, the difficulty parameters for almost all items are negative, indicating that positive ratings (b2, b3) are endorsed at less than average performance, reinforcing the pattern observed in Figure 1a. Likewise the information curves show that the majority of items are most informative in the negative zone of the underlying construct. Only CO1, illustrated in Figure 1b, shows that each response option corresponds to a distinct zone of the domain, including the most positive experience. Together these suggest that the items are most sensitive and specific for patients with poor primary care experience.

Table 3. Item performance for each item within its domain, showing discriminability (a) and difficulty (b) parameters a

a n = 1091. Domains were scored by averaging the value of individual items. Item response scale ranged from 1 to 4. Higher score meant better patient experience.

b IRT parameter estimates were generated using Samejima’s (1969) graded response model with the IRT software MULTILOG 7.03.

a = discriminability; value >1 indicating minimal discriminability. b = difficulty; highest probability of endorsing.

The Pearson correlations between the domains indicate the distinctiveness of each domain. Correlation coefficients between domains range from 0.23 to 0.61 and are lower than Cronbach α of each domain. Coordination is most highly correlated with family centeredness (0.61) and ongoing care (0.60). First contact utilization is also highly correlated with family centeredness (0.54) and ongoing care (0.55). First contact access, comprehensiveness, and community orientation have lower correlation with other domains.

Table 4. Odds ratios (95% confidence intervals) of patient satisfaction associated with each unit increase in primary care domain score after adjusting for sex, age, education, and health status in logistic model

Finally, Table 4 shows the extent to which a unit increase in each domain score increases the odds of patient satisfaction. Although most patients are satisfied with the service attitude (82.6%) and the technical quality (80.2%) of their primary care provider, a higher PCAT score generally is associated with higher satisfaction. For example, every unit increase in ongoing care score increases the odds of being satisfied with service attitude by 2.65 times.
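Because odds ratios are easy to misread as changes in probability, a short worked conversion may help. The sketch below assumes that the 2.65 figure for ongoing care is an odds ratio and, purely for illustration, uses the overall proportion satisfied with service attitude (82.6%) as the starting probability. The model-adjusted baseline is not reported here, so the result is indicative only, and the function name is hypothetical.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability implied by multiplying the baseline odds by odds_ratio."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)

# Illustrative only: 82.6% satisfied overall, odds ratio 2.65 per unit of ongoing care.
p_new = apply_odds_ratio(0.826, 2.65)
print(round(p_new, 3))  # about 0.926
```

Under these assumptions, a one-unit increase in the ongoing care score would move the probability of satisfaction from about 83% to about 93%, a much smaller relative change than the odds ratio alone might suggest.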

Discussion

This advanced psychometric analysis of the PCAT-Tibetan version provides further insight into some of the problematic psychometric properties found in the initial validation analysis, and it suggests avenues to improve the metric performance of the tool. Despite the metric problems, the PCAT-Tibetan domains of first contact, ongoing care, and coordination with specialists are associated with an increased likelihood of patient satisfaction. This illustrates the potential of the PCAT and underlines the importance of improving the tool to address some of the metric problems. Some of the problems, such as the skewness of the response distribution found in many items, are shared with the original PCAT and other versions, and our results suggest some solutions that could improve performance. Others may be specific to the PCAT-Tibetan version, such as the non-optimal resolution of items relating to first contact constructs, and these suggest the need for some principles in adapting the tool to different health system contexts.

The widespread use of the PCAT to evaluate primary care in many countries provides an opportunity to compare primary care across different contexts and to support a worldwide movement to improve primary care. Our results show the association of the PCAT-Tibetan with patient satisfaction, and other analyses have demonstrated its capacity to distinguish between health-care organizations in Tibet and other regions in China (Wang et al., 2013; McCollum et al., 2014; Wang et al., 2015b; Hu et al., 2016; Feng et al., 2017). Other studies have shown the capacity of different versions of the PCAT to differentiate between delivery models. For instance, in the United States, community health centers have been shown to provide better quality primary care than health maintenance organizations, especially in continuity, coordination, and comprehensiveness (Shi et al., 2003). Patients mainly receiving care from private general practitioners in Hong Kong reported better primary care experiences than those mainly receiving care from public general outpatient clinics, especially in accessibility and interpersonal relationships (Wong et al., 2010). In South Korea, among four types of primary care clinics staffed by family physicians, health cooperative clinics displayed the best primary care performance, while public health center clinics showed the worst performance (Sung et al., 2010).

The negative skewness of the response distribution in many items was noted in the original validation of the long PCAT version (Shi et al., 2001) and has been found in most other validation studies (Macinko et al., 2007; Lee et al., 2009; Berra et al., 2011; Haggerty et al., 2011b; Berra et al., 2013; Yang et al., 2013; Lağarlıa et al., 2014; Aoki et al., 2016; Mei et al., 2016), but it may be more extreme in contexts such as China where low literacy requires face-to-face administration (Haggerty et al., 2011a). The item response theory analysis shows that the PCAT is most reliable in identifying negative experience of care, but the low information yield in the above-average range of the domains means that it will have limited sensitivity for detecting improvements in care. Some researchers have suggested that new response categories should be developed to minimize the favored response (Yang et al., 2013), but this would be challenging with the current response scale, as it is difficult to imagine an intermediate category between ‘probably yes’ and ‘definitely yes’. The acceptable discriminability parameters for most items reflect adequate capacity to discriminate between poor and average performance on domains even though the four-point response values are not optimal.
This suggests that some form of dichotomous scoring could be applied to each item to give greater weight to the more informative negative responses rather than averaging across all response options. This approach was used in a Brazilian study (Macinko et al., 2007) and is also used in the Consumer Assessment of Healthcare Providers and Systems (CAHPS) (Dyer et al., 2012). Finally, to increase discriminability and improve the potential for sensitivity to improvement, it would be good to develop more items to measure excellent primary care experience. The community orientation item about home visits (CO1) is an example of an item that discriminates clearly between average and good primary care.
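To make the contrast between the two scoring approaches concrete, the sketch below scores the same set of item responses by averaging (the approach used for the domain scores in this study) and by a dichotomous rule. The cutpoint is an assumption: here a response counts as favorable only if it is ‘4 = definitely’, so every lower response is treated alike, but other cutpoints could equally be argued for. Function names and responses are illustrative.

```python
def mean_score(responses):
    """Domain score as the average of item responses on the 1 to 4 scale."""
    return sum(responses) / len(responses)

def dichotomous_score(responses, favorable_from=4):
    """Proportion of items answered at or above the favorable cutpoint.

    Assumption: favorable_from=4 counts only 'definitely' as favorable.
    """
    return sum(1 for r in responses if r >= favorable_from) / len(responses)

# Two invented respondents whose mean scores are identical.
respondent_a = [4, 4, 4, 1]  # one clearly negative experience
respondent_b = [3, 3, 3, 4]  # mostly 'probably' answers
for responses in (respondent_a, respondent_b):
    print(mean_score(responses), dichotomous_score(responses))
```

The two respondents share a mean score of 3.25 but receive dichotomous scores of 0.75 and 0.25, showing that the choice of scoring rule, and of cutpoint, changes which response patterns are distinguished.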

Another metric issue results from offering the ‘not sure/don’t remember’ option. A high rate of endorsing this response option is common in many language versions, especially Asian versions including Korean, Japanese, and Chinese (Lee et al., 2009; Yang et al., 2013; Wang et al., 2014; Aoki et al., 2016; Mei et al., 2016). Although respondents appreciate having such a response option (Haggerty et al., 2011a), its management is analytically challenging. Although the PCAT scoring manual suggests attributing a value of 2.5, this was not supported for all items in a previous item response theory analysis of English and French versions of the PCAT (Haggerty et al., 2011b). A more usual practice would be to treat the values as missing and impute values using ML methods. However, the approach used (treating these responses as missing values) will affect the factor analysis and the subsequent conclusions about the validity of a PCAT version. Again, this seems to call for further work and international collaboration on response options that accord with patient experience in different contexts.

Collaborative international work could also address principles for measuring and/or comparing domains that are affected by health system specificities, for instance, the access domain. First contact in the PCAT-Tibetan fails to meet optimal psychometric standards of construct validity and internal consistency. Similar problems were also found in other PCAT validation studies in China (Yang et al., 2013; Mei et al., 2016) and in other countries in Asia (Lee et al., 2009; Lağarlıa et al., 2014; Aoki et al., 2016). Two studies of the PCAT-Chinese version found that Cronbach α was only 0.38 (Mei et al., 2016) and 0.48 (Yang et al., 2013) for first contact utilization; similar values were found in the PCAT-Turkish (Lağarlıa et al., 2014). The validation of the PCAT-Korean concluded that first contact could not be assessed using a traditional scale with multiple correlated items, and it was treated as a composite domain consisting of five independent single-item subscales (Lee et al., 2009). The exploratory factor analysis of the PCAT-Japanese collapsed first contact utilization and first contact access into one scale (Aoki et al., 2016). These results are not surprising, given that access reflects the fit between how services are organized and the perceived need of the population (Penchansky and Thomas, 1981; Khan and Bhardwaj, 1994). Consequently, the access dimension will be sensitive to contextual differences in how services are organized.
Although these subscales perform well in studies in the North American context (Shi et al., 2001; Haggerty et al., 2011c), the first contact items in the original PCAT may not be appropriate in other countries with different health system organization and patient expectations. Making appointments in advance and gatekeeping by primary care workers do not pertain to many countries. For example, geographical accessibility is a major constraint on local people’s access to health-care services in Tibet, but it is not addressed in the PCAT. Context-specific items should be explored based on the operational definitions of first contact. Similar issues are likely to pertain to comprehensiveness and coordination with specialists.
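For readers less familiar with the internal consistency statistic cited for first contact in the studies above, Cronbach α for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following Python sketch computes it from raw item scores; the function name and the toy data are illustrative assumptions and do not reproduce any values reported in this or the cited studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Transpose to get one tuple of scores per item.
    items = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: two items that move together versus two items that do not.
consistent = [[1, 1], [2, 2], [3, 3], [4, 4]]
inconsistent = [[1, 4], [2, 1], [3, 3], [4, 2]]

print(cronbach_alpha(consistent))
print(cronbach_alpha(inconsistent))
```

Items that do not covary, as may happen when first contact items capture unrelated features of how services are organized, yield low or even negative values of α.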

In contrast, the domain of ongoing care most strongly predicts patient satisfaction with the primary care provider in Tibet. This is consistent with a previous systematic review showing that the most important determinants of satisfaction are interpersonal relationships and their related aspects of care (Crow et al., 2002). Recognizing the benefit of continuity of care, China is developing a family doctor contract service model to build a long-term, trusting doctor–patient relationship. Under this model, residents can sign a contract with a family doctor working at a community health center and become eligible for a service contract package including basic medical care, public health, and health management services. Although countries differ in the organizational support of continuing provider–patient relationships, our study and studies in other countries point to the value of a stable long-term relationship and mutual interactive communication between patients and primary care workers (Baker et al., 2003; Paddison et al., 2015). Nonetheless, measures may be improved by accounting for cultural specificities in interpersonal communication and therapeutic relationships that affect both patient experience and outcomes.

This study contributes to the growing international work supporting the relevance of, and need for, valid and reliable measures of the patient experience of health care. Several lessons from the PCAT-Tibetan validation study may be shared with colleagues in low- and middle-income countries. First, the items or content of instruments from developed countries may not be appropriate in other countries. Some specific features of the local primary care system may not be reflected in the translated instruments. For example, no items in the PCAT reflect geographical accessibility, which is a major concern in Tibet, and some items in the comprehensiveness domain are not appropriate in Tibet. Therefore, instead of adapting an existing instrument directly, qualitative research is needed first to understand the local population’s preferences regarding primary care. Second, more items measuring excellent primary care experience should be developed. Most existing items have adequate capacity to discriminate between poor and average performance on different primary care domains. However, items that discriminate clearly between average and good primary care are needed. Further research is required to explore the characteristics of exemplars with good primary care experience and, from their narratives, what good primary care is. Third, response categories should be chosen carefully. The current four-point response scale and its wording in the PCAT may not be appropriate in some countries. Several factors could be considered when exploring appropriate response categories, such as the literacy level, response tendencies, and judgment making of the local population. This could be done through qualitative research.

Finally, a summary score of overall primary care experience including all domains, which is often the most used metric when assessing a health system, is not supported by our analysis of the PCAT-Tibetan version. However, this psychometrically validated 27-item Tibetan version of the PCAT will be useful in monitoring and evaluating the performance of the primary care system in Tibet in specific areas, especially accessibility, continuity, and coordination, which are the priorities of current health reform efforts in Tibet. Health services research is underdeveloped in Tibet, and no instrument measuring patient experience was available when this study was conducted. We hope this study will draw more researchers’ attention to primary care performance evaluation in Tibet. We also recognize that different policy interventions to achieve primary care functions are inter-related, but each policy has its own priorities. For example, the family doctor contract service model is being developed and expanded to improve performance in accessibility and continuity, and the transformation of the tiered health service delivery system aims to promote collaboration between different health-care providers and to improve coordination. In this context, the PCAT-Tibetan version is a potentially useful instrument to evaluate the effectiveness of these policy interventions.

Acknowledgements

We thank Fatima Bouharaoui for statistical advice.

Financial Support

W.W. received salary support to conduct the analysis from McGill Research Chair in Family and Community Medicine at St. Mary’s Hospital and Steinberg Global Health Postdoctoral Fellowship, McGill University, Canada.

Conflict of Interest

None.

Ethical Standards

The Ethics Committee of Tibet Autonomous Regional Health and Family Planning Commission approved the study.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S1463423619000239.

References

Aoki, T, Inoue, M and Nakayama, T (2016) Development and validation of the Japanese version of Primary Care Assessment Tool. Family Practice 33, 112–17.
Baker, R, Mainous, AG 3rd, Gray, DP and Love, MM (2003) Exploration of the relationship between continuity, trust in regular doctors and patient satisfaction with consultations with family doctors. Scandinavian Journal of Primary Health Care 21, 27–32.
Berra, S, Hauser, L, Audisio, Y, Mantaras, J, Nicora, V, De Oliveira, MM, Starfield, B and Harzheim, E (2013) [Validity and reliability of the Argentine version of the PCAT-AE for the evaluation of primary health care]. Revista Panamericana de Salud Pública 33, 30–39.
Berra, S, Rocha, KB, Rodriguez-Sanz, M, Pasarin, MI, Rajmil, L, Borrell, C and Starfield, B (2011) Properties of a short questionnaire for assessing primary care experiences for children in a population survey. BMC Public Health 11, 285.
Bresick, G, Sayed, AR, Le Grange, C, Bhagwan, S and Manga, N (2015) Adaptation and cross-cultural validation of the United States Primary Care Assessment Tool (expanded version) for use in South Africa. African Journal of Primary Health Care & Family Medicine 7, e1–e11.
Crow, R, Gage, H, Hampson, S, Hart, J, Kimber, A, Storey, L and Thomas, H (2002) The measurement of satisfaction with healthcare: implications for practice from a systematic review of the literature. Health Technology Assessment 6, 1–244.
DeVellis, RF (1991) Scale development: theory and applications, vol. 26 (applied social research methods series). London: Sage Publications.
Dyer, N, Sorra, JS, Smith, SA, Cleary, PD and Hays, RD (2012) Psychometric properties of the Consumer Assessment of Healthcare Providers and Systems (CAHPS(R)) Clinician and Group Adult Visit Survey. Medical Care 50 (Suppl), S28–34.
Enders, CK (2001) A primer on maximum likelihood algorithms available for use with missing data. Structural Equation Modeling 8, 128–41.
Feng, S, Shi, L, Zeng, J, Chen, W and Ling, L (2017) Comparison of primary care experiences in village clinics with different ownership models in Guangdong Province, China. PLoS One 12, e0169241.
Haggerty, J, Burge, F, Levesque, JF, Gass, D, Pineault, R, Beaulieu, MD and Santor, D (2007) Operational definitions of attributes of primary health care: consensus among Canadian experts. The Annals of Family Medicine 5, 336–44.
Haggerty, JL, Beaulieu, C, Lawson, B, Santor, DA, Fournier, M and Burge, F (2011a) What patients tell us about primary healthcare evaluation instruments: response formats, bad questions and missing pieces. Healthcare Policy 7, 66–78.
Haggerty, JL, Burge, F, Beaulieu, MD, Pineault, R, Beaulieu, C, Levesque, JF, Santor, DA, Gass, D and Lawson, B (2011b) Validation of instruments to evaluate primary healthcare from the patient perspective: overview of the method. Healthcare Policy 7, 31–46.
Haggerty, JL, Levesque, JF, Santor, DA, Burge, F, Beaulieu, C, Bouharaoui, F, Beaulieu, MD, Pineault, R and Gass, D (2011c) Accessibility from the patient perspective: comparison of primary healthcare evaluation instruments. Healthcare Policy 7, 94–107.
Hoa, NT, Tam, NM, Peersman, W, Derese, A and Markuns, JF (2018) Development and validation of the Vietnamese primary care assessment tool. PLoS One 13, e0191181.
Hu, R, Liao, Y, Du, Z, Hao, Y, Liang, H and Shi, L (2016) Types of health care facilities and the quality of primary care: a study of characteristics and experiences of Chinese patients in Guangdong Province, China. BMC Health Services Research 16, 335.
Khan, AA and Bhardwaj, SM (1994) Access to health care. Evaluation and the Health Professions 17, 60–76.
Kringos, DS, Boerma, W, Van der zee, J and Groenewegen, P (2013) Europe’s strong primary care systems are linked to better population health but also to higher health spending. Health Affairs (Millwood) 32, 686–94.
Kruk, M, Pate, M and Mullan, Z (2017) Introducing The Lancet Global Health Commission in High-Quality Health Systems in the SDG era. The Lancet Global Health 5, e480–1.
Lağarlia, T, Eserb, E and Baydurc, H (2014) Psychometric properties of the Turkish adult consumer version of the Primary Care Assessment Tool (PCAT-TR). Turkish Journal of Public Health 12, 162–77.
Lee, JH, Choi, YJ, Sung, NJ, Kim, SY, Chung, SH, Kim, J, Jeon, TH, Park, HK and Korean Primary Care Research G (2009) Development of the Korean primary care assessment tool – measuring user experience: tests of data quality and measurement performance. International Journal for Quality in Health Care 21, 103–11.
Macinko, J, Almeida, C and De Sa, PK (2007) A rapid assessment methodology for the evaluation of primary care organization and performance in Brazil. Health Policy Plan 22, 167–77.
Macinko, J, Starfield, B and Erinosho, T (2009) The impact of primary healthcare on population health in low- and middle-income countries. The Journal of Ambulatory Care Management 32, 150–71.
Mccollum, R, Chen, L, Chenxiang, T, Liu, X, Starfield, B, Jinhuan, Z and Tolhurst, R (2014) Experiences with primary healthcare in Fuzhou, urban China, in the context of health sector reform: a mixed methods study. The International Journal of Health Planning and Management 29, e107–26.
Mei, J, Liang, Y, Shi, L, Zhao, J, Wang, Y and Kuang, L (2016) The development and validation of a rapid assessment tool of primary care in China. BioMedical Research International 2016, 6019603.
Paddison, CAM, Abel, GA, Roland, MO, Elliott, MN, Lyratzopoulos, G and Campbell, JL (2015) Drivers of overall satisfaction with primary care: evidence from the English General Practice Patient Survey. Health Expectations 18, 1081–92.
Pasarin, MI, Berra, S, Rajmil, L, Solans, M, Borrell, C and Starfield, B (2007) [An instrument to evaluate primary health care from the population perspective]. Atención Primaria 39, 395–401.
Penchansky, R and Thomas, JW (1981) The concept of access: definition and relationship to consumer satisfaction. Medical Care 19, 127–140.
Samejima, F (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement 34, (4, Pt. 2), 100.
Shi, L, Starfield, B and Xu, J (2001) Validating the adult primary care assessment tool. The Journal of Family Practice 50, 161–175.
Shi, L, Starfield, B, Xu, J, Politzer, R and Regan, J (2003) Primary care quality: community health center and health maintenance organization. The Southern Medical Journal 96, 787–95.
Starfield, B, Shi, L and Macinko, J (2005) Contribution of primary care to health systems and health. The Milbank Quarterly 83, 457–502.
Sung, NJ, Suh, SY, Lee, DW, Ahn, HY, Choi, YJ, Lee, JH and Korean Primary Care Research G (2010) Patient’s assessment of primary care of medical institutions in South Korea by structural type. International Journal for Quality in Health Care 22, 493–9.
Valentine, N, Bonsel, GJ and Murray, CJL (2007) Measuring quality of health care from the user’s perspective in 41 countries: psychometric properties of WHO’s questions on health systems responsiveness. Quality of Life Research 16, 1107–25.
Vazquez Pena, F, Harzheim, E, Terrasa, S and Berra, S (2017) [Psychometric validation in Spanish of the Brazilian short version of the Primary Care Assessment Tools-users questionnaire for the evaluation of the orientation of health systems towards primary care]. Atención Primaria 49, 69–76.
Wang, HH, Wong, SY, Wong, MC, Wei, XL, Wang, JJ, Li, DK, Tang, JL, Gao, GY and Griffiths, SM (2013) Patients’ experiences in different models of community health centers in southern China. The Annals of Family Medicine 11, 517–26.
Wang, W, Shi, L, Yin, A, Lai, Y, Maitland, E and Nicholas, S (2014) Development and validation of the Tibetan primary care assessment tool. BioMed Research International 2014, 308739.
Wang, W, Shi, L, Yin, A, Mao, Z, Maitland, E, Nicholas, S and Liu, X (2015) Primary care quality between traditional Tibetan Medicine and Western Medicine Hospitals: a pilot assessment in Tibet. International Journal for Equity in Health 14, 45.
Wang, HH, Wong, SY, Wong, MC, Wang, JJ, Wei, XL, Li, DK, Tang, JL and Griffiths, SM (2015a) Attributes of primary care in community health centres in China and implications for equitable care: a cross-sectional measurement of patients’ experiences. QJM 108, 549–60.
Wang, W, Shi, L, Yin, A, Mao, Z, Maitland, E, Nicholas, S and Liu, X (2015b) Primary care quality among different health care structures in Tibet, China. BioMedical Research International 2015, 206709.
Wei, X, Li, H, Yang, N, Wong, SY, Chong, MC, Shi, L, Wong, MC, Xu, J, Zhang, D, Tang, J, Li, DK, Meng, Q and Griffiths, SM (2015) Changes in the perceived quality of primary care in Shanghai and Shenzhen, China: a difference-in-difference analysis. Bulletin of the World Health Organization 93, 407–16.
Wong, SY, Kung, K, Griffiths, SM, Carthy, T, Wong, MC, Lo, SV, Chung, VC, Goggins, WB and Starfield, B (2010) Comparison of primary care experiences among adults in general outpatient clinics and private general practice clinics in Hong Kong. BMC Public Health 10, 397.
Yang, H, Shi, L, Lebrun, LA, Zhou, X, Liu, J and Wang, H (2013) Development of the Chinese primary care assessment tool: data quality and measurement properties. International Journal for Quality in Health Care 25, 92–105.
Yip, WC, Hsiao, WC, Chen, W, Hu, S, Ma, J and Maynard, A (2012) Early appraisal of China’s huge and complex health-care reforms. The Lancet 379, 833–42.

Table 1. Statements and descriptive statistics by items in PCAT-Tibetan version

Table 2. Summary of results from final model in confirmatory factor analysis for each domain and internal consistency

Figure 1. Response graph by non-parametric item response theory analysis contrasting poorly and well-performing items. (a) Option characteristic curves for item CD4 in coordination ‘Was your primary care provider interested in the quality of care there?’ are modeled as a function of total scores on these measures (bottom axis). Results show difficulties with some options from this item. The probability of endorsing option 2 is relatively small compared to other options. (b) Option characteristic curves for item CO1 in community orientation ‘Does your primary care provider do survey in the community to find out about health problems he or she should know about?’ are modeled as a function of total scores on these measures (bottom axis). Results show that this item performs well relative to other items.

Table 3. Item performance for each item within its domain, showing discriminability (a) and difficulty (b) parameters

Table 4. Odds ratios (95% confidence intervals) of patient satisfaction associated with each unit increase in primary care domain score after adjusting for sex, age, education, and health status in logistic model
