Collection of high-quality dietary data in large populations is a challenging priority in nutritional epidemiology, in both aetiological research and surveillance studies. Bias due to measurement error of dietary factors is now widely acknowledged because no instrument to assess dietary intake is perfectly accurate( Reference Kipnis, Midthune and Freedman 1 , Reference Schatzkin, Subar and Moore 2 ). Beyond unreliable descriptions of usual intakes, estimates of relationships between diet and disease may be attenuated or biased towards the null, and measurement error causes a loss of power to detect significant associations( Reference Kipnis, Midthune and Freedman 3 ).
The main dietary tools used in nutritional epidemiology are either contemporaneous dietary records (DR) or retrospective instruments such as multiple 24 h recalls or FFQ. Until recently, repeated 24 h recalls or records on non-consecutive days were not used as main instruments for assessing diet in many cohort studies because of the substantial costs of repeated assessment to ensure reliable usual intake estimation. Instead, dietary exposure was mostly assessed through FFQ( Reference Subar, Kipnis and Troiano 4 ), despite evidence that repeated 24 h recalls, taking into account the day-to-day variation, outperform FFQ in the accurate assessment of individual usual intake( Reference Carroll, Midthune and Subar 5 – Reference Schatzkin, Kipnis and Carroll 7 ).
The development of new technologies has led to an increasing number of innovative assessment tools, including online options, which are promising for applying in large-scale epidemiological studies( Reference Hercberg 8 , Reference Illner, Freisling and Boeing 9 ). In this context, Web-based self-administered tools for DR or 24 h dietary recalls could allow for accessing accurate dietary data on large samples with substantial resource savings. However, it is first necessary to validate such tools against objective markers of dietary intake.
‘Recovery biomarkers’ such as urinary N, K and Na are likely to closely reflect true dietary intake of these nutrients, and errors in measuring intake and urinary biomarkers are likely to be independent of each other( Reference Kipnis, Subar and Midthune 10 ). This contrasts with ‘concentration biomarkers’, such as plasma vitamins or fatty acids, which are subject to metabolic regulation and do not always correlate closely with intakes of their corresponding nutrients( Reference Freedman, Kipnis and Schatzkin 11 ). Recovery biomarkers have been used in various dietary instrument validation studies (FFQ and 24 h recalls), including the OPEN (Observing Protein and Energy Nutrition) Study( Reference Subar, Kipnis and Troiano 4 , Reference Lissner, Troiano and Midthune 12 ), the EFCOVAL (European Food Consumption Validation) Study( Reference Crispim, de Vries and Geelen 13 ), the Women's Health Initiative Nutritional Biomarker Study( Reference Neuhouser, Tinker and Shaw 14 ) and the AMPM (Automated Multiple-pass Method) validation study( Reference Rhodes, Murayi and Clemens 15 ), in which the difference between reported and measured intakes could be estimated, as well as correlations between intakes and biomarker values.
NutriNet-Santé is the first Web-based, prospective cohort study that aims to investigate the relationship between nutrition and health( Reference Hercberg, Castetbon and Czernichow 16 ). Diet is assessed by three non-consecutive days of records at baseline and, again, at each year of follow-up. The dietary recording is self-administered through a specific Web-based tool, which has shown high agreement with an interview conducted by a dietitian, with median intra-class correlation and Pearson's correlation coefficients of 0·7–0·8( Reference Touvier, Kesse-Guyot and Mejean 17 ). However, this comparison study could not estimate the ability of the tool to assess true intake.
In the present study, we aimed to investigate the validity of a Web-based, self-administered DR tool for assessing protein, K and Na intakes, based on three non-consecutive DR days, against two non-consecutive measures of 24 h urinary biomarkers (24 h U) of these nutrients.
Materials and methods
Study population and ethics statement
The study participants were volunteers who participated in the NutriNet-Santé Study, an ongoing Web-based cohort study launched in France in May 2009, whose aims and methods have been described elsewhere( Reference Hercberg, Castetbon and Czernichow 16 ). Briefly, using a dedicated web site, adult volunteers (aged >18 years) were followed for at least 10 years (recruitment still ongoing). Informed consent was obtained electronically from all participants. All procedures were approved by the International Research Board of the French Institute for Health and Medical Research (IRB Inserm no. 0000388FWA00005831) and the French National Information and Citizen Freedom Committee ‘CNIL’ (no. 908450 and 909216). At inception, participants completed a set of questionnaires assessing demographic, socio-economic and lifestyle factors, dietary intake measurements (three non-consecutive DR days), physical activity, anthropometry and health status. Dietary intake was evaluated again annually, and questionnaires on health status are sent on a regular basis.
A randomly selected sample of 1400 NutriNet-Santé Study participants living in Paris and its greater area (for logistical reasons), stratified by sex, age ( < 45, >45 years) and educational level (primary and secondary up to some college, university graduate), were invited by e-mail to take part in the Dietary Validation Study. The objective was to recruit 200 participants. Since recovery biomarkers have been shown to be robust markers of dietary intake in individuals who are weight-stable and not experiencing illness( Reference Bingham 18 ), exclusion criteria were as follows: self-reported metabolic disease (diabetes, heart failure, kidney failure or intestinal malabsorption, e.g. Crohn's disease); adherence to a weight-loss diet with observed weight loss >1·5 kg/week over the past 4 weeks; current pregnancy or breast-feeding.
To ensure the validity of biomarkers derived from 24 h urine collections using para-amino benzoic acid (PABA), allergy to PABA was also an exclusion criterion. Participants were already enrolled in the NutriNet-Santé Study, and, thus, all had at least basic computer knowledge and no difficulty in understanding or reading French. The protocol was approved by the Consultative Committee Protecting Participants in Biomedical Research, Saint-Louis, Paris (no. 2011/22) and the ‘CNIL’ (DR-2012-467). Participants who completed the study received 100€ as compensation for the burdensome protocol.
Study design
Recruitment was carried out between October 2012 and April 2013. Interested subjects responded by e-mail and were subsequently contacted by telephone to check eligibility and to schedule their clinic visits and dates of DR and 24 h U. The study consisted of two visits at the clinical centre (Hôtel Dieu hospital, Paris), both in a fasting state (minimum of 6 h). At the first visit, clinical measurements were taken (blood pressure and heart pulse, height and weight). Participants were given instructions for the 24 h U collection and a physical activity questionnaire on occupational, transport and leisure time physical activity during the last 4 weeks to fill in at home (paper, self-administered) before the second visit. To complete the three DR days, a specific login and password was given to the participants. The second visit was scheduled approximately 3 weeks later. Between the two visits, three DR on non-consecutive days were self-administered through the specific Web-based tool. Two 24 h urine samples were collected per participant, covering the same 24 h periods as the first and the third DR days, with a time lag of approximately 2 weeks between the first and third DR. This scheme corresponds to the design participants follow in the NutriNet-Santé Study: three DR days randomly allocated for 2 weeks.
Dietary data collection
The Web-based tool was designed for self-administration and based on a secured user-friendly interface, designed by Medical Expert Systems©. Participants report all foods and beverages (type and quantity) consumed during all eating occasions during 24 h from midnight to midnight. Participants first enter a list of every food item consumed at all eating occasions that they can recall via one of the following two ways: a food browser (foods are grouped by category) or a search engine that accepts spelling errors. Participants then estimate portion sizes of foods with the help of photographs, derived from a previously validated picture booklet that represents more than 250 generic foods( Reference Le Moullec, Deheeger and Preziosi 19 ), corresponding to more than 2000 specific food items, presented in three different portion sizes. Along with the two intermediate and two extreme quantities, there are seven choices of amounts. Participants can also directly enter the quantity of foods consumed in grams or a measure of volume, use purchased units or describe intake in standard household units (e.g. teaspoons and tablespoons). Finally, after all food items and quantities have been entered, a summary is provided for review, and participants indicate whether additional salt was consumed and, if so, in what quantity (household units or grams). For each participant, daily nutrient intakes were calculated using the ad-hoc NutriNet-Santé composition table( 20 ). An intake below 2092 kJ/d (500 kcal/d) for women or 3347 kJ/d (800 kcal/d) for men was considered implausible and excluded( Reference Willett 21 ), and the final analyses included only participants with at least two valid DR. Two DR were collected on weekdays and one on a weekend day.
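The plausibility screen described above can be sketched in a few lines. This is only an illustration, not the NutriNet-Santé implementation: the function names and the data layout (a list of daily energy intakes in kcal per participant) are hypothetical, and keeping a value that falls exactly on the cut-off is an assumption.

```python
# Hypothetical sketch of the energy plausibility screen; names and data
# layout are illustrative assumptions, not the study's actual code.

# Cut-offs from the text: below 500 kcal/d (women) or 800 kcal/d (men)
# is considered implausible.
MIN_KCAL = {"F": 500.0, "M": 800.0}
MIN_VALID_RECORDS = 2  # participants need at least two valid DR days


def valid_records(daily_kcal, sex):
    """Return the DR days whose energy intake is not implausibly low."""
    threshold = MIN_KCAL[sex]
    # Assumption: a value exactly equal to the cut-off is kept.
    return [kcal for kcal in daily_kcal if kcal >= threshold]


def is_included(daily_kcal, sex):
    """True when the participant retains at least two valid DR days."""
    return len(valid_records(daily_kcal, sex)) >= MIN_VALID_RECORDS
```

For example, a woman with records of 450, 1800 and 2100 kcal would keep two valid days and remain in the analysis, whereas a man with 700, 750 and 2500 kcal would not.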
24 h Urine collections and recovery biomarkers
At the first clinic visit, participants received instructions, materials (containers, four PABA pills) and a questionnaire for each 24 h urine collection. They were instructed to discard the first urine of the day of collection, and then to collect all urine passed during the next 24 h, up to the first urine passed on the next morning, which was also collected. During the day of collection, the container was kept at room temperature with the instruction to keep it in a dark place. To verify the collection samples, participants were asked to take two 100 mg PABA tablets on the day of collection and were informed that this process was to check the completeness of the collection as it may aid the collection of accurate samples( Reference Subar, Midthune and Tasevska 22 ). On the questionnaire, participants had to provide the times when collection started and finished (the following day), the time at which PABA pills were taken, any missing void and medications taken on that day. Urine samples were processed straight after collection the following morning: they were weighed, carefully mixed and aliquoted into 1 ml samples and stored at − 80°C. In May 2013, all samples were transported to appropriate laboratories.
Urinary N concentration was measured by pyrochemoluminescence on an Antek 9000 analyzer, which produces results very well correlated with the reference method (the Kjeldahl technique)( Reference Neveux, David, Cynober and Cynober 23 ), at Cochin Hospital, Université Paris Descartes. K and Na concentrations were measured by ion-selective electrodes (Siemens Dimension Vista) at the laboratory of Nutrition Hormonology in the CHU (Centre Hospitalier Universitaire) of Grenoble. Creatinine concentration, used as a marker to check for validity of urine collection, was measured by alkaline picrate kinetic (Siemens Dimension Vista) also in Grenoble. The CV of these analyses (intra-assay precision) was < 3 %.
Covariate assessment
Height of the participants was measured without shoes to the nearest 0·5 cm by a trained technician, using a wall-mounted stadiometer( Reference Lohman, Roche and Martorell 24 ). Weight (to the nearest 0·1 kg) of the participants (wearing only underwear) was measured with a calibrated impedance body composition analyzer (BC-418MA; Tanita©). BMI was calculated as the weight (kg) divided by the squared height (m2). Dietary supplement use, frequency and type were determined by a questionnaire.
Statistical analysis
Characteristics of the study participants were described as means and standard deviations or as n and %, and were compared between men and women using the Kruskal–Wallis test (when normality was not met) or the t test for continuous variables and the χ2 test for categorical variables.
Assuming that approximately 81 % of N is excreted via urine in 24 h and that proteins contain 16 % of N( Reference Bingham 25 ), and that 77 % of K( Reference Tasevska, Runswick and Bingham 26 ) and 86 % of Na( Reference Rhodes, Murayi and Clemens 15 ) are excreted in 24 h, we calculated the biomarker-based intakes as follows: protein intake = 6·25 × (urinary N/0·81); K intake = urinary K/0·77; Na intake = urinary Na/0·86.
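The recovery assumptions stated above translate directly into a small conversion routine. The sketch below is illustrative only: the function names are hypothetical, and the inputs are assumed to be 24 h urinary excretion expressed in the same mass unit as the desired intake.

```python
# Illustrative conversion from 24 h urinary excretion to biomarker-based
# intake, using the recovery fractions quoted in the text.
# Function and constant names are hypothetical.

N_RECOVERY = 0.81          # share of ingested N excreted in 24 h urine
PROTEIN_N_FRACTION = 0.16  # protein is assumed to contain 16 % N
K_RECOVERY = 0.77          # share of ingested K excreted in 24 h urine
NA_RECOVERY = 0.86         # share of ingested Na excreted in 24 h urine


def protein_intake_from_urinary_n(urinary_n):
    """Estimated protein intake from 24 h urinary N (same mass unit)."""
    return urinary_n / N_RECOVERY / PROTEIN_N_FRACTION


def potassium_intake_from_urine(urinary_k):
    """Estimated K intake from 24 h urinary K."""
    return urinary_k / K_RECOVERY


def sodium_intake_from_urine(urinary_na):
    """Estimated Na intake from 24 h urinary Na."""
    return urinary_na / NA_RECOVERY
```

Dividing by 0·16 is the same as multiplying by 6·25, the conventional N-to-protein conversion factor.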
24 h Urine collections were considered valid if they met all of the following criteria: collection time between 22 and 26 h; urine volume ≥ 500 ml( Reference Rhodes, Murayi and Clemens 15 ); reported missing urine not exceeding 5 % of total volume (estimated volume of missed voids); and creatinine >10 mg/kg for women or >15 mg/kg for men( Reference Stein 27 ). If one or more of the listed criteria was not met, then the 24 h U collection was considered invalid. The following sensitivity analyses were conducted: (1) exclusion of urine samples with >1 reported missing void, because people admitting one missing void might actually be more diligent or have missed only a small volume compared with those reporting more than one missing void( Reference Subar, Midthune and Tasevska 22 ); (2) exclusion of participants with one invalid urine measure.
All intake and excretion values were log-transformed to improve normality. Intra-class correlation coefficients between U1 and U2 (using the mean of two measurements), and between the three DR (using the mean of three measurements), were calculated with the SAS macro %ICC9( Reference Spiegelman 28 ). Mean protein, K and Na intakes based on up to 3 d of DR ($$R_{ij}$$ for an individual i on a day j), and excretion on up to 2 d of 24 h U ($$M_{ij}$$), were calculated on the log-transformed values and exponentiated to obtain geometric means and 95 % CI. For each individual, the log-ratio $$\log(\overline{R}_{i}/\overline{M}_{i})$$ was calculated, where $$\overline{R}_{i}$$ is the individual mean of up to three DR and $$\overline{M}_{i}$$ is the mean of up to two 24 h U. After exponentiation of the sample mean log-ratio, with a ratio of 1 representing no difference between intake and excretion, we expressed the distance to the reference in per cent, e.g. a ratio of 0·90 (90 %) is equivalent to a relative difference of − 10 %. Misreporting refers to the presence of a significant difference.
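The relative difference described above can be illustrated with a short sketch. It assumes, following the footnote to Table 2, that each individual's mean is a geometric mean across available days; the function names and the input layout (one pair of lists per participant) are hypothetical.

```python
import math

# Illustrative computation of the mean relative difference between
# reported intake (DR) and biomarker-based intake (24 h U).
# Names and data layout are assumptions made for this sketch.


def geometric_mean(values):
    """Geometric mean: exponentiated mean of the log-transformed values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def individual_log_ratio(reported_days, biomarker_days):
    """log(R_i / M_i) for one participant with up to 3 DR and 2 urine days."""
    return math.log(geometric_mean(reported_days) / geometric_mean(biomarker_days))


def mean_percent_difference(participants):
    """participants: list of (reported_days, biomarker_days) pairs."""
    log_ratios = [individual_log_ratio(r, m) for r, m in participants]
    mean_log_ratio = sum(log_ratios) / len(log_ratios)
    # Exponentiate to a ratio (reference = 1), then express as per cent.
    return 100.0 * (math.exp(mean_log_ratio) - 1.0)
```

With every participant reporting 90 % of the biomarker-based intake, the function returns a relative difference of approximately − 10 %, matching the example in the text.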
A ratio below 70 % indicated the presence of severe under-reporting, between 70 and 80 % moderate under-reporting, between 80 and 120 % correct reporting, and above 120 % over-reporting bias( Reference Arab, Tseng and Ang 29 ).
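These reporting categories amount to a simple lookup on the individual intake-to-biomarker ratio. The sketch below is illustrative; the function name is hypothetical, and because the text does not say how values falling exactly on a boundary are handled, the boundary assignments here are assumptions.

```python
# Illustrative classification of an individual's reported/biomarker ratio.
# Boundary handling (which side 0.70, 0.80 and 1.20 fall on) is an
# assumption; the text only gives the ranges.


def classify_reporting(ratio):
    """Map a ratio (1.0 = perfect agreement) to a reporting category."""
    if ratio < 0.70:
        return "severe under-reporter"
    if ratio < 0.80:
        return "moderate under-reporter"
    if ratio <= 1.20:
        return "normo-reporter"
    return "over-reporter"
```

A ratio of 0·75, for instance, would be labelled as moderate under-reporting.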
We calculated the ratio across age categories ( ≤ 45 and >45 years) and across BMI categories ( < 25, 25–29·9, and ≥ 30 kg/m2) and compared them using ANOVA after assumptions were checked.
To assess the validity of the DR tool, we calculated Pearson's correlation coefficients and their confidence interval using the Fisher's Z transformation, both unadjusted and adjusted for age, BMI, physical activity and energy intake (by the residual method( Reference Willett 21 )).
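For readers unfamiliar with the Fisher's Z approach, the unadjusted case can be sketched as follows. The analyses in the study were run in SAS; this Python version is only an illustration with hypothetical function names, it takes log-transformed intake and excretion values as inputs, and it does not cover the adjusted (partial) correlations.

```python
import math

# Illustrative Pearson correlation with a Fisher's Z confidence interval.
# Not the study's SAS code; names are assumptions made for this sketch.

Z_CRIT_95 = 1.959963984540054  # two-sided 95 % standard normal quantile


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_interval(r, n, z_crit=Z_CRIT_95):
    """Confidence interval for r from n pairs (requires n > 3 and |r| < 1)."""
    z = math.atanh(r)               # Fisher's Z transformation
    se = 1.0 / math.sqrt(n - 3)     # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

Because the interval is built on the Z scale and then back-transformed, it is asymmetric around r and always stays within − 1 and 1.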
To examine the structure of the measurement errors, a complex measurement error model was assumed( Reference Kipnis, Subar and Midthune 10 ). It is described in online Supplementary material. This allowed for the calculation of the correlations between reported and true intakes on the same given day (which assess whether the instrument measures what it is supposed to measure), and correlation coefficients between usual reported intake and true intake, as well as attenuation factors (λ)( Reference Kipnis, Subar and Midthune 10 ). Attenuation factors represent the attenuation of the strength of the relationship between nutrient intake and a disease of interest; a value closer to 1 means that there is less attenuation (with 1 representing no attenuation at all). Although no exact cut-off exists to interpret correlation and attenuation coefficients, a value of at least 0·40 would avoid needing hugely inflated sample sizes to observe significant diet–disease relationships( Reference Freedman, Commins and Moler 30 ); hence, values ≥ 0·40 were deemed acceptable/fair, ≥ 0·60 as high and < 0·40 as low.
All analyses were performed using SAS version 9.3 (SAS Institute, Inc.). Tests were two-sided, and the significance level was set at P= 0·05.
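To make the interpretation of these coefficients concrete, the sketch below encodes the cut-offs given above and two standard approximations that are not spelled out in the text and are therefore assumptions here: the observed log relative risk is roughly the true log relative risk multiplied by λ, and the sample size needed to detect an association grows roughly as 1/ρ², where ρ is the correlation between reported and true intake. Function names are hypothetical.

```python
# Illustrative helpers for interpreting attenuation factors and
# correlations with true intake. The two formulas are common
# approximations, used here as assumptions rather than the study's method.


def interpret_coefficient(value):
    """Apply the cut-offs from the text: >= 0.60 high, >= 0.40 fair, else low."""
    if value >= 0.60:
        return "high"
    if value >= 0.40:
        return "acceptable/fair"
    return "low"


def attenuated_relative_risk(true_rr, attenuation):
    """Observed RR when the log RR is multiplied by the attenuation factor."""
    return true_rr ** attenuation


def sample_size_inflation(correlation):
    """Approximate factor by which the required sample size increases."""
    return 1.0 / correlation ** 2
```

Under these approximations, a true relative risk of 2·0 measured with λ = 0·5 would be observed as about 1·41, and a correlation of 0·5 with true intake would require roughly four times as many participants.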
Results
Subject characteristics
Of the 1400 individuals contacted by e-mail, 237 (16·9 %) responded. Of these, seven (3 %) were ineligible and thirty-one (13 %) were not able to attend the planned clinic visits; hence, 199 participants were included in the study.
A total of 398 24 h U specimens were available. Both 24 h U measurements were invalid for four female participants and one male participant; hence, these five participants were excluded from the analysis. One man had one invalid 24 h U and two implausible DR, and was thus excluded. Finally, 193 subjects were included in the analysis. Twenty-five subjects had data for only one 24 h U because fourteen (7·3 %) first 24 h U and eleven (5·7 %) second 24 h U were considered invalid.
Participant characteristics are presented in Table 1. Women made up 47·7 % of the sample and did not differ from men in terms of age (50·5 (sd 16·4) years) or BMI (24·0 (sd 3·5) kg/m2). Obesity (BMI ≥ 30 kg/m2) was more common in women than in men (12 v. 3 %); however, overweight (25 ≤ BMI < 30 kg/m2) was more common in men (36 v. 18 %). Women had a higher frequency of dietary supplement use (36 v. 24 %). Men had higher energy intake (10 000 kJ in men v. 7172 kJ in women). Energy from protein was lower for men than for women; however, energy from fat and carbohydrates was not appreciably different.
MET, metabolic equivalents of task; LTPA, leisure time physical activity.
* The difference between men and women was estimated using t test and χ2 tests as appropriate.
† Mean intake was calculated from three non-consecutive dietary record days.
‡ Percentage of energy intake (excluding alcohol).
Intakes of protein, potassium and sodium and misreporting
Intakes of protein, K and Na based on three DR days and two 24 h U excretion measures are summarised in Table 2. Intra-class correlation coefficients between U1 and U2 were 0·60 for protein, 0·45 for K and 0·36 for Na, and between the three diet records 0·52, 0·54 and 0·47 for protein, K and Na, respectively.
24 h U, 24-h urine collection.
* The difference between men and women was estimated using t test.
† Mean difference in % was calculated from the log ratio of mean reported intake (non-consecutive DR) over mean biomarker intake (24 h U (24 h urinary biomarkers)) following the formula $$100\left[\exp\left(\frac{\sum_{i=1}^{n}\log\left(\frac{\overline{R}_{i}}{\overline{M}_{i}}\right)}{n}\right)-1\right]$$, where $$\overline{R}_{i}$$ is the geometric mean of DR for an individual i across the three measurements; $$\overline{M}_{i}$$ is the geometric mean of 24 h U for an individual i across the two measurements; and n is the number of individuals in the sample. A mean log ratio of zero would represent no difference in reporting compared with the biomarker measure. Exponentiation expresses it as a ratio whose reference value is 1, and we further expressed it as a per cent difference, e.g. a ratio of 0·90 is a per cent difference of − 10 %.
Men and women under-reported their protein intake ( − 14·4 and − 13·9 %, respectively, NS between-sex difference, P= 0·88). Men showed non-significant over-reporting for K and Na intakes, while women under-reported these two nutrients.
Misreporting was greater in women aged >45 years than in those aged ≤ 45 years for intakes of protein ( − 17 v. − 8 %, P= 0·047) and Na ( − 15 v. +3 %, P= 0·04); however, no significant difference across age categories was observed for K, and no misreporting differences were observed for men. By BMI categories, misreporting of Na intake was greater for obese women than for overweight or normal-weight women, although the difference did not reach statistical significance ( − 25 % in obese, − 2 % in overweight and − 7 % in normal weight, P= 0·13).
The frequency of misreporting is summarised in Table 3. The difference between men and women was non-significant; however, a trend was observed for K with more men over-reporting (24·5 %) than women (20·9 %), and for Na with more women severely under-reporting (29·7 %) than men (16·7 %).
* Based on the log ratio of mean reported intake (non-consecutive DR) over mean biomarker intake (24 h U (24 h urinary biomarkers)). Ratio < 70 %: severe under-reporter; 70 % < ratio < 80 %: moderate under-reporter; 80 % < ratio < 120 %: normo-reporter; ratio >120 %: over-reporter.
† The difference between men and women was estimated using Fisher's exact test.
Correlations and attenuation
Correlation coefficients between intake (DR) and excretion (24 h U) are summarised in Table 4. Higher correlations were observed for men than for women for all the three nutrients. Crude correlations ranged from 0·45 (Na) to 0·63 (K) for men, and from 0·27 (Na) to 0·54 (protein) for women. Adjusted correlations for age, BMI, level of education and energy intake were higher than the crude coefficients for women, but lower for men.
* Pearson's correlation adjusted for energy intake by the residual method, age, BMI and level of education.
Sensitivity analyses taking into account only the first and third DR, which correspond to the days of 24 h U collection, showed overall similar results for relative differences and correlations; the only notable exception was a lower correlation between Na intake and excretion in men (r 0·17).
Taking into account the complex measurement error model, we calculated the correlations between reported intake by one DR and true intake on the same day (Table 5). These coefficients were higher than crude correlations for women, and similar to those for men.
* Correlation coefficient between DR and true intake on the same given day, as estimated by the model using the recovery biomarkers (24 h U (24 h urinary biomarkers)) as the reference measurement. For more details on calculation, see online Supplementary material.
Finally, correlations between intake of the average of three DR and true usual intake (Table 6) were high for protein in both men and women (>0·60), very high for K in men, while only fair in women, and fair (men) to poor (women) for Na. Attenuation factors ranged from 0·23 (Na, women) to 0·60 (K, men).
* Correlation coefficient between the average of three non-consecutive-day DR and true usual intake, as estimated by the model using the recovery biomarkers (average of two 24 h U (24 h urinary biomarkers)) as the reference measurement.
† Interpretation of attenuation factor: a value closer to 1 indicates lower attenuation of the true relationship between intake and disease. For more detail on calculation, see online Supplementary material.
Discussion
The present validation study is the first to examine the structure of the measurement error with repeated Web-based, self-administered, non-consecutive-day DR, allowing for the estimation of the correlations with true intakes of protein, K and Na. Only a few studies( Reference Subar, Kipnis and Troiano 4 , Reference Crispim, de Vries and Geelen 13 , Reference Arab, Tseng and Ang 29 , Reference Bingham, Gill and Welch 31 – Reference Slimani, Bingham and Runswick 35 ) have assessed the validity of repeated short-term instruments, such as 24 h recalls, against biomarkers, and none has validated Web-based self-administered non-consecutive DR.
Misreporting of protein, potassium and sodium intakes
We found that on average, men under-reported protein but slightly over-reported their K and Na intake, whereas women under-reported protein, K and Na intakes. Correlation coefficients indicated that three non-consecutive 24 h diet records self-administered via the Web-based tool perform well for the estimation of protein and K intakes, and fairly well for estimating Na intake.
The EFCOVAL and the OPEN studies aimed to validate two 24 h recalls, administered by a dietitian, against urinary biomarkers. Results in the French EFCOVAL centre showed under-reporting of − 12·1 % for protein and − 17·1 % for K in men, and of − 12·8 % for protein and − 13·0 % for K in women( Reference Crispim, de Vries and Geelen 13 ). For protein, the results are similar to our findings; however, for K, under-reporting was much more prominent in the EFCOVAL Study than in the present study. In the American OPEN Study, under-reporting of protein was also similar ( − 11 to − 12 %)( Reference Subar, Kipnis and Troiano 4 ). Regarding Na, the United States Department of Agriculture AMPM validation study( Reference Rhodes, Murayi and Clemens 15 ), with two 24 h urine collections covering the same time period as two 24 h recalls, showed greater under-reporting ( − 7 % for men and − 10 % for women) than in the present study. Protein, K and Na have their main sources in very different food groups, and represent different aspects of diet quality, so it is not surprising that dietary misreporting differs across nutrients, as suggested elsewhere( Reference Freedman, Commins and Moler 36 ).
Crude correlation coefficients in EFCOVAL Study were 0·65 (protein) and 0·62 (K) in men and 0·46 (protein) and 0·61 (K) in women, respectively, which is slightly higher than those in the present study. However, correlation coefficients for protein found in the present study are somewhat higher than usually reported in other validation studies including short-term instruments (24 h recalls), such as the OPEN Study (r 0·41 for men and r 0·26 for women)( Reference Subar, Kipnis and Troiano 4 ), the DEARR (Dietary Evaluation and Attenuation of Relative Risk) study (r 0·29)( Reference Shai, Rosner and Shahar 34 ) or the UK arm of EPIC (European Prospective Investigation into Cancer and Nutrition) (r 0·10 for one 24 h recall)( Reference Bingham, Gill and Welch 31 ), and are more similar to the one observed with a 7-d diary (r 0·65)( Reference Bingham, Gill and Welch 31 ).
Greater misreporting and lower correlation coefficients for all the three nutrients (protein, K and Na) were observed in women than in men in the present study, which is fairly consistent with most of the validation studies of short-term instruments for protein( Reference Subar, Kipnis and Troiano 4 , Reference Crispim, de Vries and Geelen 13 ), K( Reference Crispim, de Vries and Geelen 13 ) or Na( Reference Rhodes, Murayi and Clemens 15 ). Although the present study does not allow this aspect to be explored in depth, differences in social desirability are a potential explanation because of the societal pressure placed on women to be slim. Women, more than men, may under-report to prevent being seen as indulging in an undesirable behaviour, such as eating unhealthy food or overeating( Reference Novotny, Rumpler and Riddick 37 , Reference Johnson, Goran and Poehlman 38 ).
We found no significant difference in misreporting of protein, K or Na according to BMI categories. However, for protein, the trend was towards more under-reporting of intake among the overweight or obese individuals than among the normal-weight individuals. Given the very low number of obese men (n 3) in the study, we carried out the analyses between normal-weight (BMI < 25 kg/m2) and overweight/obese (BMI ≥ 25 kg/m2) men and showed the same non-significant trend ( − 18 % in overweight v. − 12 % in normal weight, P= 0·16). This follows the trend observed in the OPEN Study: lower correlation coefficients between reported protein intake (average of two 24 h recalls) and biomarkers in obese than in non-obese men (r 0·217 v. 0·483, P= 0·05)( Reference Lissner, Troiano and Midthune 12 ). For K, BMI classification did not seem to influence misreporting. For Na, the AMPM validation study found that overweight and obese men and women under-reported more than their normal-weight counterparts. This finding is similar to the trend observed in the present study for women. Across age categories, in the AMPM, females under 50 years tended to under-report Na intake more than their older counterparts ( − 15 v. − 5 %), whereas we found the opposite. This may be explained by lower computer literacy among the older participants( Reference Klovning, Sandvik and Hunskaar 39 ), and these results are consistent with those of the comparison study of our tool with a 24 h recall assessment by a dietitian, where the proportion of ‘novice or inexperienced with computer’ was higher among women than men( Reference Touvier, Kesse-Guyot and Mejean 17 ).
Besides, it is known that dietary misreporting (particularly energy under-reporting) is more frequent among the elderly( Reference Bazelmans, Matthys and De 40 , Reference Yannakoulia, Tyrovolas and Pounis 41 ). The present study includes six participants aged ≥ 75 years (three men and three women). When we excluded them from the main analysis, the results remained unchanged. However, among these six participants, we observed greater under-reporting of K ( − 13·4 % in men and − 14·6 % in women), protein for men ( − 19·3 %) and Na for women ( − 36·8 %), although the Kruskal–Wallis test showed no significant difference (all P>0·05), which is likely to be due to a lack of power. These results may imply that extra attention should be paid to the quality of dietary data when studying diet–disease associations among the elderly.
Correlations with true intake and structure of the measurement error
Correlations between reported intake and true intake were not estimated in the EFCOVAL or AMPM studies, but were estimated in the OPEN Study( Reference Schatzkin, Kipnis and Carroll 7 ). It was estimated that four 24 h recalls could lead to a correlation coefficient of 0·508 (men) and 0·440 (women) with true intake of protein. The correlations between the average of three non-consecutive-day records and true intake observed in the present study (0·61 in men and 0·64 in women) are higher and actually outperform the prediction by Schatzkin et al. ( Reference Schatzkin, Kipnis and Carroll 7 ) with a theoretically infinite number of 24 h recalls (0·597 for men and 0·584 for women).
Attenuation factors found in the present study are similar to the estimates from four 24 h recalls in the OPEN Study for protein in men (0·37), and higher in women (0·43 in our study v. 0·32 in OPEN Study)( Reference Schatzkin, Kipnis and Carroll 7 ), a higher value indicating less bias in estimating diet–health relationships. For K, we found a higher attenuation factor for men, i.e. less bias, than in the OPEN Study (0·60 v. 0·32), but a slightly lower factor for women (0·29 v. 0·33)( Reference Subar, Kipnis and Troiano 4 ). No comparison can be made for Na since, to our knowledge, no other study has estimated attenuation factors for this nutrient.
Finally, this is the first study to assess the correlations between Web-based, self-reported and true intake on a given day, which is a method for evaluating how well the instrument measures its target, without penalising the correlation for the fact that dietary intake may exhibit considerable daily variability. The correlation coefficients were high for protein in both sexes, high for K in men and fair in women, and fair for Na in both men and women. Coefficients were lower for women than men, indicating a lower intrinsic validity of the instrument for women than for men.
Methodological considerations
The main strength of the present study is the use of objective biomarkers, namely 24 h U protein, K and Na, collected on the same days as the diet records, in a repeated fashion, which allowed for the estimation of the extent of misreporting, as well as of same-day correlations and of correlations with usual intake using a complex measurement error model. Accuracy – i.e. completeness – of the 24 h urine collections was assessed comprehensively by different criteria: creatinine (five invalid), total volume (one invalid) and self-report of missing voids (twenty-three invalid). Also, although PABA was not assayed, participants were asked to take the PABA pills during the collection, which may have a ‘placebo effect’ that encourages more compliant behaviour( Reference Subar, Midthune and Tasevska 22 ). Results of both sensitivity analyses using different criteria for exclusion were identical for women; however, slightly lower correlation coefficients and attenuation factors were observed for men. This seems to imply that our strategy of excluding invalid urine collections struck an adequate balance between accuracy and statistical power.
Finally, as our strategy of excluding DR days with implausibly low energy intake may introduce bias, we repeated the analyses including the three implausible DR, which did not change the results.
The main limitation of the present study is the lack of a recovery biomarker for energy intake, namely doubly labelled water, which requires a much more costly and burdensome protocol. Hence, although protein intake, given its energy content, can be used as a proxy of energy intake, we cannot extrapolate the results on protein intake to other macronutrients or total energy intake, as suggested by the OPEN Study results( Reference Subar, Kipnis and Troiano 4 , Reference Kipnis, Subar and Midthune 10 ). An important issue in validating dietary assessment tools is the current paucity of valid recovery biomarkers; however, emerging food metabolomics studies may be a promising way to assess nutritional intake through biomarkers( Reference Beckmann, Lloyd and Haldar 42 ).
Caution is advised when extrapolating from the results of the present validation study to the general population because it was carried out on a relatively small sample of subjects. These were volunteers and probably differed in terms of socio-economic, demographic and lifestyle characteristics from the general population. However, we designed our sampling strategy to cover a wide spectrum of age and education level and to include equal numbers of men and women, so that validity could be assessed irrespective of these parameters.
We showed that the Web-based, repeated, non-consecutive-day DR tool used in the NutriNet-Santé cohort study performs well in estimating protein and K intakes and fairly well in estimating Na intake. Furthermore, three repeated DR appear to be valid to estimate usual intakes of protein and K, although caution is advised regarding the generalisability of these findings to other nutrients and to the general population.
Supplementary material
To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S0007114515000057
Acknowledgements
The authors thank Dr Amy Subar from the US National Cancer Institute, Bethesda, MD, for her help in designing the present study protocol and Drs Victor Kipnis, Douglas Midthune and Professor Laurence Freedman for adapting their measurement error model to the present study design and providing SAS codes. They also thank all the staff involved in this study, especially Karine Prevost, technician, the team of dieticians, as well as Mehdi Menai and Rachida Mehroug.
Details of measurement error models are available as online supplementary material with the online posting of this paper.
The present study was funded by the Institut de Veille Sanitaire (InVS) and supported by grants from the Région Ile de France (CORDDIM).
The NutriNet-Santé Study was supported by the following institutions: Ministère de la Santé (DGS), InVS, Institut National de la Prévention et de l'Education pour la Santé (INPES), Fondation pour la Recherche Médicale (FRM), Institut National de la Santé et de la Recherche Médicale (INSERM), Institut National de la Recherche Agronomique (INRA), Conservatoire National des Arts et Métiers (CNAM) and Université Paris 13.
The authors' contributions are as follows: C. L., E. K.-G., K. C., V. D., M. V., P. G. and S. H. were responsible for developing the design and protocol of the study; C. L. conducted the research, carried out data checking and analyses, and was responsible for drafting the manuscript; K. C., V. D., M. V., G. M. C., P. G., S. H., E. K.-G., F. L. and P. F. were involved in interpreting the results and editing of the manuscript; F. L. and P. F. carried out the biomarker analyses. All authors read and approved the final manuscript.
The authors declare that there are no conflicts of interest.