Validation and reproducibility of dietary assessment methods in adolescents: a systematic literature review

Garden Tabacchi; Emanuele Amodio; Maria Di Pasquale; Antonino Bianco; Monèm Jemni; Caterina Mammina

doi:10.1017/S1368980013003157

Published online by Cambridge University Press: 18 November 2013

Garden Tabacchi*: Department of Sciences for Health Promotion and Mother Child Care ‘G. D'Alessandro’, University of Palermo, Via Del Vespro 133, 90127 Palermo, Italy
Emanuele Amodio: Department of Sciences for Health Promotion and Mother Child Care ‘G. D'Alessandro’, University of Palermo, Via Del Vespro 133, 90127 Palermo, Italy
Maria Di Pasquale: Department of Sciences for Health Promotion and Mother Child Care ‘G. D'Alessandro’, University of Palermo, Via Del Vespro 133, 90127 Palermo, Italy
Antonino Bianco: Sport and Exercise Sciences Unit, University of Palermo, Palermo, Italy
Monèm Jemni: School of Science, University of Greenwich at Medway, London, UK
Caterina Mammina: Department of Sciences for Health Promotion and Mother Child Care ‘G. D'Alessandro’, University of Palermo, Via Del Vespro 133, 90127 Palermo, Italy
*Corresponding author: Email [email protected]

Abstract

Objective

The aim of the present work was to determine which dietary assessment method can provide a valid and accurate estimate of nutrient intake by comparison with the gold standard.

Design

A literature review of MEDLINE, EMBASE, ISI Web of Science, Cochrane and related references was conducted on dietary assessment methods for adolescents that reported validity and/or reproducibility values. A study quality assessment of the retrieved FFQ was carried out according to two different scoring systems, judging respectively the quality of FFQ nutrition information and of FFQ validation and calibration.

Setting

The present review considered adolescents attending high schools and recruited in hospitals or at home.

Subjects

The target of the review was the healthy adolescent population in the age range 13–17 years.

Results

Thirty-two eligible papers were included and analysed separately as ‘original articles’ (n 20) and ‘reviews’ (n 12). The majority (n 17) assessed the validation and reproducibility of FFQ. Almost all studies found the questionnaires to be valid and reproducible (r > 0·4), except for some food groups and nutrients. Different design and validation issues were highlighted, such as portion-size estimation, number of food items and statistics used.

Conclusions

The present review offers new insights in relation to the characteristics of assessment methods for dietary intake in adolescents. Further meta-analysis is required although the current review provides important indications on the development of a new FFQ, addressing the need for a valid, reproducible, user-friendly, cost-effective method of accurately assessing nutrient intakes in adolescents.

Keywords

Dietary assessment Adolescents Validation Reproducibility

Type: Assessment and methodology
Information: Public Health Nutrition, Volume 17, Issue 12, December 2014, pp. 2700–2714

DOI: https://doi.org/10.1017/S1368980013003157
Copyright: Copyright © The Authors 2013

Adolescence is a critical period that is characterized by cognitive, emotional and social development and exposure to a significant turnover in lifestyle, including food intake and diet habits. Irregular meals, snacking and meal skipping, which characterize teenagers, often do not allow an accurate dietary assessment⁽ Reference Lietz, Barton and Longbottom ¹ ⁾ and therefore the need to develop valid and reproducible instruments for this purpose is increasing. Different dietary assessment methods among adolescents have been extensively described and validated, such as food records (FR), FFQ, diet histories (DH) and 24 h recalls (24-HR). The FR is not used in large population studies for several reasons⁽ Reference Rockett, Berkeya and Colditz ² ⁾: it can be quite expensive; it requires the participant to be literate and motivated; it involves trained staff; and it needs a computerized program specific to recording diet records. Thus, the FR is preferably used at the individual level and is generally considered a good reference instrument against which to validate other dietary methods to be used at the large population level, together with biomarker measurements⁽ Reference Lampe and Rock ³ ⁾. The most used dietary assessment methods for large-scale surveys are therefore FFQ, 24-HR and DH, which present advantages such as cost-effectiveness, although they are affected by weaknesses⁽ Reference Thompson and Subar ⁴ ^, Reference Ngo, Engelen and Molag ⁵ ⁾ that can produce misreporting. A recent review showed that the major factors influencing under- and over-reporting in recall methods are due to the reliance on respondents’ memory and ability to estimate portion sizes⁽ Reference Poslusna, Ruprich and de Vries ⁶ ⁾. Subjects’ compliance with recording their food intake is often a problem, and this is especially problematic when they are required to keep records for longer periods of time⁽ Reference Gibson ⁷ ⁾. 
Another issue is the time and monetary cost of collecting and processing dietary intake information, which can be overcome by the use of new technologies, such as questionnaires using web-based methods. Some studies state that the web-based computerized assessment represents an element of innovation for data collection, with the advantages of cost-efficiency, reductions in data entry and data coding time, automatic flagging of missing data, accessibility by the entire population, possibility of long-term data collection and simplification of the self-monitoring process, which increases compliance and the validity of self-reported food intake⁽ Reference Kroeze, Werkman and Brug ⁸ ⁾. According to a recent review conducted by the Innovation of Dietary Assessment Methods for Epidemiological Studies and Public Health (IDAMES)⁽ ⁹ ⁾, this method compares reasonably well with more traditional approaches; moreover it is suitable for adolescents, since the age at which a child becomes an accurate self-reporter of his/her own dietary intake has been estimated to be approximately 12 years, although this varies by dietary assessment method⁽ Reference Livingstone, Robson and Wallace ¹⁰ ⁾.

Since dietary methods validated and used for adolescents are different worldwide, a comparison of data is often difficult or unfeasible; standardized surveillance systems are needed, in order to collect valid and accurate estimates of food and nutrient intakes. A standardized and sustainable collection of data on adolescents’ food consumption and lifestyles is useful to understand the diet-related public health problems and implement appropriate actions for the prevention of the related diseases. The ASSO (Adolescents and Surveillance System for the Obesity prevention) Project, funded by the Italian Ministry of Health and supported by different national and international partners, falls within this context, with the purpose of developing a system for a standardized collection of dietary intake and lifestyle data in adolescents. It has the potential to provide the National Health System with a structure that allows a continuous and permanent nutritional surveillance on the school population, and aspires to propose an example of good practice by delivering a tool for an effective nutritional surveillance. In order to establish the best specifically designed tool for the assessment of food and nutrient intakes by comparison with the gold standard measure in large populations of adolescents aged 13–17 years, a systematic literature review on the dietary assessment instruments found to be valid and reproducible was performed within Project ASSO and is described in the present paper.

Methods

Literature search and systematic review

The literature search was conducted on the electronic databases MEDLINE, EMBASE, ISI Web of Science and Cochrane. In the MEDLINE and Cochrane databases, besides free text terms, Medical Subject Headings (MeSH) and MeSH Major Topics were included in the syntax. A sensitivity check was executed by systematically deleting terms from the syntax to see whether important articles were missed with the current syntax. The search was focused on studies published in the 10 years between 2001 and 2011. No restriction was applied for the country, while the language was limited to English, Italian, Spanish and French. Studies that met all of the following inclusion criteria were included in the review: describing dietary assessment methods developed for epidemiological purposes; targeting adolescent populations in the age range 13–17 years; and reporting the validity and/or reproducibility of the method v. one reference method.

Key search terms, used alone and in combination, included the following: terms referred to the type of dietary method (questionnaire, 24-HR, 24 h recall, 24-h recall, FFQ, history, record, diary); terms including diet, nutrition, food, intake; and terms related to the validation and reliability of the methods (validity, validation, reliability, reproducibility, calibration). Additional searches were carried out on websites of national and international organizations (e.g. universities and relevant professional societies or organizations) and the grey literature was also considered.

The retrieved records were imported into EndNote® (version X4·02), where the duplicates were removed.

After this, titles and abstracts were screened against the exclusion criteria. When a title or abstract could not be rejected with certainty, the paper was retained among those assessed for eligibility and its full text was evaluated.

Articles were excluded in the following cases: population age not in the range 13–17 years; non-healthy subjects; hospitalized or not free-living subjects; pregnant adolescent women; refugees; vulnerable populations such as low income or rural; specific ethnicity; overweight/obese subjects; athletes; vegetarians; dietary instrument specific only to certain nutrients (folate, vitamins, calcium, fat, protein, etc.), specific only to certain foods (alcohol, beverages, fruit and vegetables, sugary snacks, seafood, etc.) or specific only to energy and fast-foods consumption; feeding study or intervention study; subjects with eating disorders; study relative to eating or health behaviour; psychometric tests e.g. for craving; subjects with food allergies; study relative to intake of particular substances (acrylamide, etc.); questionnaire only for physical activity assessment; questionnaire only for nutrition knowledge assessment; study aimed at perceptions; study where only parental reporting on the child's diet was considered; study with only food insecurity measurement; and study with only portion-size estimation.

The full texts of the articles assessed for eligibility were examined through a second screening, in order to evaluate the relevance of the papers. Some articles and the relative full version of the questionnaires were obtained through direct contact with the author. Articles were excluded if a relative comparison of validity and/or reproducibility was not made for the dietary instrument.

The reference lists of articles retrieved for inclusion in the review were hand-searched to identify other relevant articles.

If for the same study there was a series of similar articles, they were all screened and considered for analysis, in order to avoid possible data loss. The literature search and the systematic review were conducted by two independent investigators, after a standardization of the procedure. In the case of any incongruity, the two investigators came to an agreement after further analysis and discussion.

Once papers were identified as relevant, data were extracted into an Excel® database.

Study quality assessment

A study quality assessment of the retrieved articles was carried out by two of the investigators independently, according to two different scoring systems. The reduced summary score described by Dennis et al.⁽ Reference Dennis, Snetselaar and Nothwehr ¹¹ ⁾ judges the quality of nutrition information from FFQ in epidemiological studies by applying a priori defined criteria and is based on the following aspects: the number of food items, the administration mode (e.g. interviewer v. self-administered mode) and whether it is a quantitative instrument. The reduced summary score was ranked as ‘high’ or ‘low’ quality, with a tally of 5 or more ranked as ‘high’, for a total possible score of 8 points.

Since the present analysis is focused on the assessment of the quality of validation and calibration studies of FFQ, with the aim of including, excluding or weighting the studies that utilize an FFQ in the current review, we used an additional scoring system proposed more recently by Serra-Majem et al.⁽ Reference Serra-Majem, Frost Andersen and Henríque Sánchez ¹² ⁾. This system considers the following variables: type of sample and sample size of the study; statistics to assess validity (e.g. comparisons between methods’ means, medians or difference; crude, energy-adjusted, de-attenuated or intra-class correlation) and statistics to assess agreement or misclassification; administration mode; seasonality considered in the validation design; and supplements included and validated. According to Serra-Majem et al.⁽ Reference Serra-Majem, Frost Andersen and Henríque Sánchez ¹² ⁾, scores could range from 0 (poorest quality) to a maximum of 7 (highest quality). This allows for the classification of validation studies according to their methodological quality. The summary score was ranked as ‘very good/excellent’ with a tally of 5 or more; ‘good’ with a score between 3·5 and 5; ‘acceptable/reasonable’ with a score between 2·5 and 3·5; or poor with a score of less than 2·5.

For the studies that used semi-quantitative methods other than FFQ, only the scoring system proposed by Serra-Majem et al.⁽ Reference Serra-Majem, Frost Andersen and Henríque Sánchez ¹² ⁾ was applied.
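The ranking rules of the two scoring systems can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the individual scoring items are not reproduced, and the treatment of a score falling exactly on a band boundary (3·5 or 2·5) is an assumption, since the text does not say to which band such a score belongs.

```python
def rank_dennis(score: float) -> str:
    """Rank a reduced summary score (0 to 8 points) as 'high' or 'low'."""
    if not 0 <= score <= 8:
        raise ValueError("reduced summary score must lie between 0 and 8")
    # A tally of 5 or more is ranked as 'high' quality.
    return "high" if score >= 5 else "low"


def rank_serra_majem(score: float) -> str:
    """Rank a validation-study score (0 to 7 points) into four quality bands."""
    if not 0 <= score <= 7:
        raise ValueError("summary score must lie between 0 and 7")
    if score >= 5:
        return "very good/excellent"
    if score >= 3.5:  # assumption: lower bound of each band is inclusive
        return "good"
    if score >= 2.5:  # assumption: lower bound of each band is inclusive
        return "acceptable/reasonable"
    return "poor"
```

For example, a study scoring 4 points on the second system would be ranked as 'good' under these assumptions.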

Results

As shown in Fig. 1, a total of 480 articles were retrieved after duplicates were removed and sixty-eight were included in the review when specific exclusion criteria were applied. A further screening procedure based on the full-text evaluations identified thirty-two eligible papers that were included in the qualitative synthesis and analysed separately as ‘original articles’ (n 20) and ‘reviews’ (n 12; Fig. 1).

Fig. 1 Selection process flow of the original articles and reviews on the validation and/or reproducibility of dietary assessment methods in adolescents

Original articles

General overview

An overview of the retrieved twenty original articles⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Watson, Collins and Sibbritt ³¹ ⁾ is shown in Table 1.

Table 1 Overview of the twenty eligible original articles

NR, not reported; FLVS II, Fleurbaix Laventie Ville Santé Study II; SBNM, School-Based Nutrition Monitoring; EPIC, European Prospective Investigation into Cancer and Nutrition; GUTS, Growing Up Today Study; HBSC, Health Behaviour in School-Aged Children; HELENA, Healthy Lifestyle by Nutrition in Adolescence; ACAES, Australian Child and Adolescent Eating Survey; BKQ, Block Kids Questionnaire; 24-HR, 24 h recall; SNAP™, Synchronized Nutrition and Activity Program™; GAFFQ, Greek Youth Adolescent's FFQ; DH, diet history; AFFQ, FFQ for Adolescents; YANA-C, Young Adolescents’ Nutrition Assessment on Computer; FR, food record; WFR, weighed food record; YAQ, Youth/Adolescent Questionnaire; FBC, food behaviour checklist.

The majority (n 17)⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Papadopoulou, Barboukis and Dalkiranis ²² ^– Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Slater, Philippi and Fisberg ²⁶ ^, Reference Vereecken and Maes ²⁷ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ^– Reference Watson, Collins and Sibbritt ³¹ ⁾ of them were identified as studies assessing the validation and reproducibility of FFQ against reference dietary instruments (Table 2), while the remaining three studies considered questionnaires other than FFQ analysed for their validity and reproducibility against different reference methods⁽ Reference Moore, Ells and McLure ²¹ ^, Reference Sjoberg and Hulthe ²⁵ ^, Reference Vereecken, Covents and Matthys ²⁸ ⁾ (Table 3).

Table 2 Characteristics of the seventeen FFQ validation studies

NR, not reported; PB, paper based; WB, web based; IW, interviewer administered; SA, self-administered; FR, food record; FDA, Food and Drug Administration; FLVS I, Fleurbaix Laventie Ville Santé Study I; MAFF, Ministry of Agriculture, Fisheries, and Foods; NNS, National Nutrition Survey; ABS, Australian Bureau of Statistics; NHANES, National Health and Nutrition Examination Survey; USDA, US Department of Agriculture; WFR, weighed food record; 24-HR, 24 h recall; YAQ, Youth/Adolescent Questionnaire; FBC, food behaviour checklist; YANA-C, Young Adolescents’ Nutrition Assessment on Computer; CC, correlation coefficient; LOA, limits of agreement; κ_w, weighted kappa coefficient; NS, not stated; ICC, intra-class correlation coefficient.

Table 3 Characteristics of the three validation studies of other questionnaires than the FFQ

24-HR, 24 h recall; DH, diet history; NR, not reported; WB, web based; PB, paper based; SA, self-administered; IW, interviewer administered; USDA, US Department of Agriculture; FR, food record; LOA, limits of agreement; CC, correlation coefficient; κ_w, weighted kappa coefficient; NS, not stated.

The outcome in some cases included the values of validity and reproducibility of the instrument to assess both food and nutrient intakes⁽ Reference Cullen, Watson and Zakeri ¹⁶ ^, Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ^, Reference Sjoberg and Hulthe ²⁵ ^, Reference Vereecken, Covents and Matthys ²⁸ ^– Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ⁾, while some studies considered only the food intake⁽ Reference Hoelscher, Day and Kelder ¹⁸ ^, Reference Matthys, Pynaert and De Keyzer ²⁰ ^– Reference Moore, Ells and McLure ²¹ ^, Reference Vereecken and Maes ²⁷ ⁾ and some others only the nutrient intake⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Bertoli, Petroni and Pagliato ¹⁵ ^, Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Papadopoulou, Barboukis and Dalkiranis ²² ^– Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Slater, Philippi and Fisberg ²⁶ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾.

Since healthy adolescents represent the target of the present review, the most common setting where the questionnaires were administered was the school (Table 1). In some studies the setting was a hospital and in these cases only the healthy subjects selected by the author were considered, while in a few other cases the household environment (direct or telephone interview) was chosen (Table 1).

FFQ analysed by their intrinsic characteristics

The seventeen studies where FFQ were found to be reasonably valid and reproducible were analysed on the basis of their intrinsic characteristics: number of food groups and food items; consumption interval; paper-based or web-based format; interview or self-administered mode; portion size estimation; food composition databases used for the nutrient conversion; administration duration; and number of FFQ administered and interval for the retest (Table 2).

The described FFQ were mostly semi-quantitative, whereby the instrument addressed both the frequency and the amount consumed for each food item⁽ Reference Willett ³² ⁾. Considerable variability was found between the studies. Foods were gathered into groups that ranged in number between ten⁽ Reference Lietz, Barton and Longbottom ¹ ⁾ and twenty-four⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ⁾; the number of food items included in the different FFQ ranged between twenty-six⁽ Reference Rockett, Berkey and Colditz ²³ ⁾ and 212⁽ Reference Ambrosini, de Klerk and O'Sullivan ¹³ ⁾, with an average of 104. As an FFQ may not be suitable for recalling diet in the distant past⁽ Reference Fraser, Lindsted and Knutsen ³³ ⁾, the consumption interval reported in the retrieved FFQ was generally the previous week or month, or the previous 6 months or year. Only two of the retrieved articles had validated a web-based FFQ in adolescents⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ⁾ in relation to food and nutrient data, respectively; all the others were paper-based questionnaires. Four of the FFQ were self-administered⁽ Reference Rockett, Berkey and Colditz ²³ ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾, while the rest were partially or fully interviewer-administered. Portion sizes were estimated mostly on the basis of photographs/illustrations, while fewer studies used household measures (e.g. cups, tablespoons), natural units or a combination of them.

The fourteen studies that translated food intakes into nutrient intakes⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ^, Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Papadopoulou, Barboukis and Dalkiranis ²² ^– Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Slater, Philippi and Fisberg ²⁶ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ^– Reference Watson, Collins and Sibbritt ³¹ ⁾ used national or other types of food composition databases, thus resulting in a wide heterogeneity of databases.

In those papers reporting the time needed to complete the FFQ, an average time of 30 min was calculated.

In some studies, the FFQ was administered twice after a time interval ranging from 1 week to 6 months after the first administration, in order to evaluate the reproducibility of the method⁽ Reference Cullen, Watson and Zakeri ¹⁶ ^, Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ^, Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Rockett, Berkey and Colditz ²³ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ^– Reference Watson, Collins and Sibbritt ³¹ ⁾.

FFQ analysed by the validation study characteristics

The characteristics of the validation study were also considered: sample size; reference method (FR or 24-HR); and statistics used to assess the agreement between the two methods and the reproducibility.

Except for one study where the number of participants was very high⁽ Reference Vereecken and Maes ²⁷ ⁾ (n 7072), the sample size ranged from seventeen⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ⁾ to 785⁽ Reference Ambrosini, de Klerk and O'Sullivan ¹³ ⁾ participants (Table 1). Moreover, in some studies the sample was not homogeneous for variables such as sex. Almost all of the studies reported the difference between males and females: in some cases⁽ Reference Shatenstein, Amre and Jabbour ²⁴ ⁾ it was stated that there was a stronger association for girls; in other studies the questionnaire performed better for males in adequately classifying individuals for all nutrients⁽ Reference Ambrosini, de Klerk and O'Sullivan ¹³ ⁾ or according to their total fat and protein intake⁽ Reference Slater, Philippi and Fisberg ²⁶ ⁾, or fibre⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ⁾ and PUFA intake⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ⁾.

With regard to the statistics used in the studies, comparison between methods to assess measurement differences in the validation studies used the mean comparison as a first approach (this is not shown in Table 2). Sometimes Student's t test for paired samples (for normally distributed variables)⁽ Reference Bertoli, Petroni and Pagliato ¹⁵ ^, Reference Slater, Philippi and Fisberg ²⁶ ⁾ or the Wilcoxon signed-rank test (for skewed distributions)⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ⁾ was used.

Although Ambrosini et al.⁽ Reference Ambrosini, de Klerk and O'Sullivan ¹³ ⁾ and others⁽ Reference Chinn ³⁴ ^, Reference Hebert and Miller ³⁵ ⁾ previously showed that the correlation coefficient can be a misleading indicator of agreement, all retrieved studies calculated Pearson's or Spearman's correlation coefficient (Table 2), respectively when the sample distribution was normal or transformed into a normal one, or when it was skewed. In some studies the correlation was considered crude; in some others the presentation of results included the adjustment of nutrients for total energy intake using regression techniques (energy-adjusted values) and/or values de-attenuated from the weakening effect of measurement error.
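The statistics named in this paragraph can be illustrated with a short Python sketch. It is not taken from any of the reviewed studies: the function names are hypothetical, energy adjustment is shown with the regression residual approach, and the de-attenuation step assumes the commonly used correction r × √(1 + λ/n), where λ is the ratio of within- to between-person variance in the reference method and n is the number of replicate reference measurements per subject. Individual studies may have used other variants.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            result[order[j]] = mean_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def energy_adjust(nutrient, energy):
    """Residuals of nutrient regressed on energy, re-centred on the mean intake."""
    n = len(nutrient)
    mean_n, mean_e = sum(nutrient) / n, sum(energy) / n
    slope = (sum((e - mean_e) * (v - mean_n) for e, v in zip(energy, nutrient))
             / sum((e - mean_e) ** 2 for e in energy))
    intercept = mean_n - slope * mean_e
    return [v - (intercept + slope * e) + mean_n for v, e in zip(nutrient, energy)]


def deattenuate(r_observed, variance_ratio, n_replicates):
    """Correct r for within-person variation in the reference method (assumed formula)."""
    return r_observed * math.sqrt(1 + variance_ratio / n_replicates)
```

Under these assumptions, an observed correlation of 0·5 with a variance ratio of 3 and three replicate reference days would be de-attenuated to about 0·71, which illustrates why corrected coefficients tend to be higher than crude ones.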

Other approaches used in the retrieved studies to determine agreement were weighted kappa values, the mean agreement and the limits of agreement (LOA) as a percentage⁽ Reference Bland and Altman ³⁶ ⁾ (Table 2). Weighted kappa values were used in five studies⁽ Reference Arajuo, Massae Yokoo and Alves Pereira ¹⁴ ^, Reference Bertoli, Petroni and Pagliato ¹⁵ ^, Reference Hoelscher, Day and Kelder ¹⁸ ^, Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾. In most cases, the number of categories used for calculating kappa statistics to compare classification of nutrient data varied from two to five⁽ Reference Brenner and Kliebsch ³⁷ ^, Reference Sim and Wright ³⁸ ⁾. In the validation studies of dietary intake considered, quintiles were used in the calculation of kappa statistics⁽ Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾. The mean agreement % and the LOA %, sometimes regressed with the Bland–Altman plot⁽ Reference Altman ³⁹ ^, Reference Bland and Altman ⁴⁰ ⁾, were used in twelve studies⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Cullen, Watson and Zakeri ¹⁶ ^, Reference Hoelscher, Day and Kelder ¹⁸ ^– Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Vereecken and Maes ²⁷ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾. 
Since correct ranking ability is a desired outcome from an FFQ, nine studies ranked the subjects by using the same or adjacent tertile⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^, Reference Bertoli, Petroni and Pagliato ¹⁵ ⁾, quartile⁽ Reference Arajuo, Massae Yokoo and Alves Pereira ¹⁴ ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Slater, Philippi and Fisberg ²⁶ ⁾ or quintile⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ^, Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾ per cent method.
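The agreement statistics described above can be sketched as follows. This is a hedged illustration with hypothetical function names, not the procedure of any particular study: the limits of agreement are computed on absolute differences (mean difference ± 1·96 SD) rather than as percentages, subjects are assigned to quantile groups by rank order, and the kappa weights are assumed to be linear.

```python
import statistics


def limits_of_agreement(test, reference):
    """Bland-Altman mean difference and 95 % limits (mean +/- 1.96 SD)."""
    diffs = [t - r for t, r in zip(test, reference)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


def quantile_groups(values, k):
    """Assign each subject to one of k groups (0 = lowest intake) by rank order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for position, subject in enumerate(order):
        groups[subject] = position * k // n
    return groups


def same_or_adjacent_pct(test, reference, k):
    """Percentage of subjects classified into the same or an adjacent quantile."""
    g_test, g_ref = quantile_groups(test, k), quantile_groups(reference, k)
    hits = sum(1 for a, b in zip(g_test, g_ref) if abs(a - b) <= 1)
    return 100.0 * hits / len(test)


def weighted_kappa(groups_a, groups_b, k):
    """Linearly weighted kappa for two classifications into k ordered groups."""
    n = len(groups_a)
    row, col = [0] * k, [0] * k
    observed = 0.0
    for a, b in zip(groups_a, groups_b):
        row[a] += 1
        col[b] += 1
        observed += 1 - abs(a - b) / (k - 1)  # weight 1 on the diagonal, 0 at the extremes
    observed /= n
    expected = sum((1 - abs(i - j) / (k - 1)) * row[i] * col[j]
                   for i in range(k) for j in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)
```

Passing k = 3, 4 or 5 to `quantile_groups` corresponds to the tertile, quartile and quintile classifications mentioned in the text.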

The studies were then further analysed on the basis of the reference method used: the FR and the 24-HR were the main gold standards.

FFQ v. FR. The majority of FFQ were validated against an FR, estimated or weighed, covering 3 d or 7 d⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Bertoli, Petroni and Pagliato ¹⁵ ^, Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Papadopoulou, Barboukis and Dalkiranis ²² ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾. The FFQ in general tended to overestimate nutrient intakes in comparison with the FR, even though they reported a modest or good agreement. A correlation between two methods is generally considered good for a coefficient value >0·4⁽ Reference Cade, Burley and Warm ⁴¹ ⁾. On the basis of this cut-off, the selected studies showed a good correlation coefficient between the dietary intake method and the reference method for most food groups and nutrients, thus indicating that the FFQ can be used as a reliable instrument to estimate food and nutrient intakes of adolescents, rank them on a range of nutrient intakes and classify them into low, medium and high consumers. In some studies this did not hold for certain food groups and nutrients, which will be evaluated in a subsequent meta-analysis.

FFQ v. 24-HR. Eight studies validated an FFQ against a 24-HR⁽ Reference Cullen, Watson and Zakeri ¹⁶ ^– Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Rockett, Berkey and Colditz ²³ ^, Reference Slater, Philippi and Fisberg ²⁶ ^, Reference Vereecken and Maes ²⁷ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ⁾. The majority of 24-HR were repeated three or four times, in a period of 7 d, 2–4 months or 1 year, and included weekdays and weekend days. Almost all the selected FFQ could be used to classify subjects according to their food and nutrient intakes. Nutrient correlations between FFQ and 24-HR data that were de-attenuated and adjusted for energy intake tended to yield higher correlation coefficient values than the crude analysis.

Some authors found a low adjusted and de-attenuated correlation coefficient (<0·30) for certain food groups and nutrients. For example, the Block Kids Questionnaire had validity for some nutrients, but not for most food groups assessed⁽ Reference Cullen, Watson and Zakeri ¹⁶ ⁾.

To assess the reproducibility of the method, Cullen et al.⁽ Reference Cullen, Watson and Zakeri ¹⁶ ⁾ and Deschamps et al.⁽ Reference Deschamps, De Lauzon-Guillain and Lafay ¹⁷ ⁾ used the intra-class correlation coefficient (ICC) (Table 2). A high value of this coefficient indicates a low within-person variation. In the first study all ICC were >0·40, except for percentage of energy from protein and for servings of vegetables and fruit. In the second study, the values were higher for food items consumed daily such as milk or sugars and confectionery, and lower for rarely eaten foods such as inner organs.
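The link between the ICC and within-person variation can be made concrete with a minimal Python sketch. The text does not state which form of the ICC the two studies computed, so the one-way random-effects form is assumed here purely for illustration, and the function name is hypothetical.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC for n subjects with k repeated administrations.

    `measurements` is a list of per-subject lists, e.g. [[first, second], ...].
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need at least 2 subjects with the same number (>= 2) of repeats")
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(measurements, subject_means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

When the two administrations of a questionnaire give identical values for every subject, the within-subject mean square is zero and the coefficient equals 1; as within-person variation grows relative to between-person variation, the coefficient falls.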

In the other studies, the Pearson's or Spearman's correlation coefficient was mostly used for assessment of the reproducibility of the foods and nutrients.

Other statistics used to assess misclassification in the reproducibility studies were quintiles and weighted kappa⁽ Reference Hong, Dibley and Sibbritt ¹⁹ ⁾ and the Bland–Altman plot⁽ Reference Watson, Collins and Sibbritt ³¹ ⁾.

Results of the study quality assessment

The results from the study quality assessment of the seventeen retrieved articles on the validation and reproducibility of FFQ in adolescents are shown in Table 2. Out of the seventeen selected studies, all except three⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Rockett, Berkey and Colditz ²³ ^, Reference Vereecken and Maes ²⁷ ⁾ resulted in a high quality ranking according to the system proposed by Dennis et al.⁽ Reference Dennis, Snetselaar and Nothwehr ¹¹ ⁾. The issues that decreased the quality of the study, according to this quality score, were related mainly to the number of food items; a number of food items less than seventy is likely to reduce the quality of the nutrition information.

According to the score proposed by Serra-Majem et al.⁽ Reference Serra-Majem, Frost Andersen and Henríque Sánchez ¹² ⁾, the seventeen articles were ranked as follows: the study from Slater et al.⁽ Reference Slater, Philippi and Fisberg ²⁶ ⁾ was very good/excellent; twelve studies⁽ Reference Lietz, Barton and Longbottom ¹ ^, Reference Ambrosini, de Klerk and O'Sullivan ¹³ ^– Reference Hong, Dibley and Sibbritt ¹⁹ ^, Reference Rockett, Berkey and Colditz ²³ ^, Reference Vereecken and Maes ²⁷ ^, Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ^, Reference Watson, Collins and Sibbritt ³¹ ⁾ were good; two studies⁽ Reference Papadopoulou, Barboukis and Dalkiranis ²² ^, Reference Shatenstein, Amre and Jabbour ²⁴ ⁾ were acceptable/reasonable; and two studies⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ⁾ were poor. The quality assessment of the three studies that used methods other than the FFQ⁽ Reference Moore, Ells and McLure ²¹ ^, Reference Sjoberg and Hulthe ²⁵ ^, Reference Vereecken, Covents and Matthys ²⁸ ⁾ resulted in one study being poor⁽ Reference Moore, Ells and McLure ²¹ ⁾ and in two studies being acceptable/reasonable⁽ Reference Sjoberg and Hulthe ²⁵ ^, Reference Vereecken, Covents and Matthys ²⁸ ⁾. The items that affected the quality of the study, according to the score system from Serra-Majem et al., were mainly the statistics used to assess validity: using the mean comparison or the correlation coefficients alone is not enough to characterize a study; the studies that used correlation coefficients adjusted for energy or de-attenuated, or other statistics (such as the Bland–Altman method), in addition to the crude correlation coefficients, were ranked at a higher quality level. 
Data gathered by self-administration were rated as less valid and reliable⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ^, Reference Shatenstein, Amre and Jabbour ²⁴ ^, Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ⁾, because the scoring system assigned a higher score to interviewer administration. Heterogeneity for variables such as sex was also relevant, but did not consistently influence the final score. Seasonality and supplement use were never reported in the retrieved studies.
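The de-attenuated correlation coefficients mentioned above can be illustrated with a minimal sketch. It assumes the commonly used correction in which the observed correlation is multiplied by the square root of (1 + λ/n), where λ is the ratio of within-person to between-person variance in the reference method and n is the number of replicate days per subject. The retrieved studies do not necessarily use exactly this form, and the function name and input values below are hypothetical.

```python
import math


def deattenuate(r_observed, within_var, between_var, n_replicates):
    """Correct an observed validity correlation for day-to-day (within-person)
    variation in the reference method.

    Assumed form: r_corrected = r_observed * sqrt(1 + lambda / n), with
    lambda = within_var / between_var and n = replicate days per subject.
    """
    if between_var <= 0 or n_replicates < 1:
        raise ValueError("between_var must be positive and n_replicates >= 1")
    lam = within_var / between_var
    return r_observed * math.sqrt(1 + lam / n_replicates)


# Hypothetical values: observed r = 0.40 between an FFQ and a 3 d food record,
# with within-person variance three times the between-person variance.
r_corrected = deattenuate(0.40, within_var=3.0, between_var=1.0, n_replicates=3)
print(round(r_corrected, 3))
```

The correction can only increase the coefficient, so reporting both the crude and the de-attenuated value makes clear how much of the apparent lack of validity is due to random day-to-day variation in the reference method.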

Reviews

A total of twelve reviews were considered for the analysis⁽ Reference Rockett, Berkey and Colditz ² ^, Reference Thompson and Subar ⁴ ^, ⁹ ^, Reference Altman ³⁹ ^, Reference Biro, Hulshof and Ovesen ⁴² ^– Reference Probst and Tapsell ⁴⁹ ⁾. In the USA a new version of ASA24 for use with school-aged children⁽ Reference Bliss ⁴³ ⁾ was developed, consisting of a specialized software program adapted from the Automated Multiple Pass Method to enable the development of a computer-based self-administered 24-HR. Children 14–16 years of age are also likely to require a children's version but testing has not yet been conducted with this age group; however, the adult version may be appropriate for those 14–15 years of age or above, but this has not been thoroughly evaluated.

The review from Cade et al.⁽ Reference Cade, Thompson and Burley ⁴⁶ ⁾ was prepared to guide researchers about to embark on the development and/or use of an FFQ as a dietary assessment tool, and it provided guidelines for conducting a validation study on a new FFQ. In the review from Ortiz-Andrellucchi et al.⁽ Reference Ortiz-Andrellucchi, Henriquez-Sanchez and Sanchez-Villegas ⁴⁸ ⁾, 80 % of the reviewed studies used FFQ to assess micronutrient intakes, with wide variation in the number of food items (ten to 190 items). The FFQ was thus the dietary method most often used to assess micronutrient intakes in these groups, which makes it essential to consider methodological aspects such as the food composition databases used for analysis, portion-size assessment and the time period between the two dietary assessment methods.

Where interventions are longer and a large number of participants is involved, such as those surveys directed at schoolchildren, 24-HR and 3 d or 7 d FR are possible and can provide more accurate and detailed data⁽ Reference Contento, Randell and Basch ⁴⁷ ⁾.

Some reviews suggested the use of the 24-HR as the best method to estimate food consumption in adolescents. Specifically, the use of two non-consecutive 24-HR and a food list to assess the non-users for infrequently consumed foods was suggested by Biro et al.⁽ Reference Biro, Hulshof and Ovesen ⁴² ⁾ within the EFCOSUM (European Food Consumption Survey Method) Project.

Weighed FR provided the best estimates of energy intake for younger children aged 0·5 to 4 years, while the DH method provided better estimates for adolescents aged 16 years or more⁽ Reference Burrows, Martin and Collins ⁴⁵ ⁾.

Computer tailoring is important in nutrition research and is currently one of the most promising and innovative approaches⁽ Reference Brug, Oenema and Campbell ⁴⁴ ⁾. However, little is known to date and more research is needed about when, why, where and for whom computer-tailored nutrition education is effective. In the review from Probst and Tapsell⁽ Reference Probst and Tapsell ⁴⁹ ⁾ a wide range of programs and features for computerized diet assessment were identified, but they did not specify what age they referred to.

There are many measurement issues that may impact on reporting accuracy when assessing the dietary intakes of children and adolescents⁽ ⁹ ⁾. One of these is the portion-size estimation: for the quantification of portion sizes some papers suggested a picture book, including country-specific dishes, with additional household measures and other relevant measurements⁽ Reference Biro, Hulshof and Ovesen ⁴² ⁾.

Discussion

The present systematic literature review provides useful information on the most valid and reliable dietary assessment methods used worldwide in large-scale surveys on adolescents and suggests the most appropriate tool to use for the collection of dietary intake data.

In this review, fourteen developed and validated FFQ were identified. Semi-quantitative FFQ were demonstrated to be valid and reproducible instruments for estimating dietary intake in adolescent age at a large-scale level. FFQ have the advantages of ease of administration, ability to assess dietary intake over an extended period of time and low cost⁽ Reference Subar ⁵⁰ ⁾. However, probably because of misclassification, FFQ are not always able to detect weak associations⁽ Reference Schatzkin, Kipnis and Carroll ⁵¹ ⁾, are less specific and have greater measurement error⁽ Reference Subar ⁵⁰ ^, Reference Kipnis, Subar and Midthune ⁵² ⁾. The FFQ analysed in the current review differed in the way they were developed and showed large variations in design characteristics, such as the number of items or inclusion of portion-size questions, which could affect reported intakes according to Molag et al.⁽ Reference Molag, de Vries and Ocke ⁵³ ⁾. This leads to the need to further characterize or create new FFQ targeted to adolescents for a standardized data collection.

With regard to the use of a 24-HR for children over 10 years of age, the EFCOSUM Project recommends the use of two non-consecutive 24-HR, with the EPIC-SOFT program as the first choice to collect 24-HR in all European countries⁽ Reference Slimani and Valsta ⁵⁴ ⁾. However, additional developments and improvements are needed, and at the moment the EFCOVAL (European Food Consumption Validation) Project is trying to adapt and validate it according to the specific needs of possible future pan-European monitoring surveys. The 24-HR YANA-C is a useful instrument for collecting data on food and nutrient intakes in adolescents, but it takes too long to complete and is too complex for use in such a large target population. In the USA, too, the primary instrument used to collect dietary intake data in national surveys is a 24-HR: the ASA24, which has been developed and is also to be validated in school-age children.

Specific design and validation issues were highlighted in the present review. These issues should be taken into account when preparing tools for dietary data collection. The retrieved reviews gave indications about how to choose appropriate foods; what number of items to choose; how to manage the portion-size collection; the method of administration; the use of appropriate nutrient databases; the pre-testing process; the validation and reproducibility process; the statistical issues; and other issues such as the seasonality or the use of supplements. There are many factors that may affect the validity of a dietary questionnaire such as respondent characteristics, questionnaire design and quantification, adequacy of the reference data, quality control and data management⁽ Reference Serra-Majem, Frost Andersen and Henríque Sánchez ¹² ⁾.

One of the largest concerns about dietary surveys based on recall is their reliance on memory, which is subject to several errors; recall errors increase as a function of time and up to 30 % of food memory may be lost from the previous day⁽ Reference Fries, Green and Bowen ⁵⁵ ⁾.

The motivation, cognitive ability and literacy level of the participants are basic determinants for which instrument to select. Moreover, adolescents experience difficulty in reporting portion size. Food should be described in frequencies and quantities of units or portions within a certain time frame; this raises the issue of the portion-size assessment. Some food items may be forgotten, other food items may be remembered although not having been consumed within the given time frame. Some food items are not recognized because they are part of a dish (e.g. in pasta with legumes the olive oil is often ignored, as well as the condiment in the pizza). This may lead to overestimation or underestimation of intake. Substantial week-to-week, day-to-day and meal-to-meal variability in food and portion sizes consumed may require arithmetic computations to average usual consumption to fit into the FFQ response categories, and hence may be simplified when a long list of estimations needs to be done. The current findings suggest for example to apply a correction factor to decrease the reported intake of fibre, vitamin C, calcium and iron and to increase the percentage of energy from fat. In particular, under-reporting of energy can be a problem in dietary assessment studies; energy adjustment appears to minimize the bias generated by under-reporting with respect to particular nutrients and their association with various disease outcomes⁽ Reference Gnardellis, Boulou and Trichopoulou ⁵⁶ ⁾. Thus, it is important to include this value in each validation study that is associated with the study analysis.

The statistical analyses of validation data (e.g. energy adjustment, de-attenuation) are important issues to be considered. Since several factors may affect the measures, it is difficult to accurately summarize the correlation coefficient and the agreement for validity and reproducibility abstracted from published articles. The current review, therefore, should be considered a rough description of the validity and reproducibility of the identified FFQ, which have to be analysed in their entirety and by food group, nutrient, FFQ length and other characteristics in a further meta-analysis study. Correlation coefficients were used in all the selected studies, but this method alone is flawed because it does not measure the agreement between two methods, only the degree to which the methods are related⁽ Reference Altman ³⁹ ⁾. Correlation coefficients can be useful in conjunction with the Bland–Altman method, which assesses in graphical form the agreement between the methods across the range of intakes by plotting the mean of the two methods against the difference. The mean agreement indicates how well the FFQ and FR agree on average. The LOA method is used to determine agreement between absolute values from each method and provides an informative analysis of reliability, including information about the magnitude of errors between methods, the direction of bias between methods and whether or not bias is constant across levels of intake.
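The difference between correlation and agreement described above can be made concrete with a minimal sketch of the Bland–Altman calculation. The intake values are hypothetical, the function names are illustrative, and the 95 % limits of agreement are taken as the mean difference ± 1·96 standard deviations of the differences, which assumes approximately normally distributed differences.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two paired series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(test_method, reference_method):
    """Return the mean bias, the 95 % limits of agreement and the
    (mean, difference) pairs that would be plotted."""
    if len(test_method) != len(reference_method) or len(test_method) < 2:
        raise ValueError("need two paired series with at least two subjects")
    pairs = list(zip(test_method, reference_method))
    diffs = [a - b for a, b in pairs]
    means = [(a + b) / 2 for a, b in pairs]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),
    }


# Hypothetical energy intakes (kcal/d): the FFQ overestimates the food record
# by a constant 300 kcal/d for every subject.
food_record = [1800.0, 2000.0, 2200.0, 2400.0, 2600.0]
ffq = [value + 300.0 for value in food_record]

result = bland_altman(ffq, food_record)
r = pearson(ffq, food_record)
# The two methods are perfectly correlated, yet they disagree by 300 kcal/d.
print(r, result["bias"], result["loa_lower"], result["loa_upper"])
```

In this constructed case the correlation coefficient alone would suggest a flawless instrument, whereas the bias reveals a systematic overestimation. Plotting the `points` pairs shows whether the bias is constant across the range of intakes.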

One important objective is to reduce the costs of collection and processing of dietary intake information due to the amounts and complexity of the data usually involved⁽ Reference Thompson, Subar and Loria ⁵⁷ ⁾. Beyond new technologies, a recent approach used in large studies is the Internet-based FFQ. The questionnaires that used web-based methods were the FFQ from Matthys et al.⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ⁾, the 24-HR SNAP⁽ Reference Moore, Ells and McLure ²¹ ⁾, the 24-HR YANA-C⁽ Reference Vereecken, Covents and Matthys ²⁸ ^, Reference Vereecken, Covents and Sichert-Hellert ⁵⁸ ⁾, the Health Behaviour in School-aged Children (HBSC) FFQ⁽ Reference Vereecken and Maes ⁵⁹ ⁾ and the Healthy Lifestyle in Europe by Nutrition in Adolescence (HELENA) FFQ⁽ Reference Vereecken, De Bourdeaudhuij and Maes ²⁹ ⁾. Vereecken et al.⁽ Reference Vereecken and Maes ⁵⁹ ⁾ have investigated whether the computer format of the HBSC FFQ would affect the responses of the adolescents in comparison with the paper-and-pencil format; some differences were found between the female and male reporters. In another study⁽ Reference Vereecken, Covents and Sichert-Hellert ⁵⁸ ⁾ an adaptation of YANA-C for different country realities in Europe was described: the feasibility of self-administration by comparison with administration by an interviewer was investigated and it was concluded that after an adaptation, translation and standardization of YANA-C, it is possible to assess the dietary intake of adolescents by self-administration in a broad international context. The use of interviewers may be an advantage in some situations and allows for immediate checking by the interviewer of improbable or unlikely responses; against this is the need to standardize the training processes, the cost of employing interviewers and the influence of the interviewer's presence on increasing the likelihood of social desirability bias in the participant's responses. 
In the light of these considerations, when the quality of the studies is assessed, the assignment of a higher score to the studies that use interviewer-administered questionnaires could be revised. Studies such as those from Matthys⁽ Reference Matthys, Pynaert and De Keyzer ²⁰ ⁾, Shatenstein⁽ Reference Shatenstein, Amre and Jabbour ²⁴ ⁾ and Watanabe⁽ Reference Watanabe, Yamaoka and Yokotsuka ³⁰ ⁾, for example, would gain in quality. Self-administered computerized assessment could be considered a valid way of collecting data; it makes it possible for participants to register and assess their dietary intake at their own pace and convenience; the respondent immediately stores data and interviewers do not have to be present during the entire interview, which saves considerable time and decreases costs. Furthermore, computerized assessment tools can directly calculate nutrient intakes and energy expenditure, which makes it possible to give immediate feedback⁽ Reference Evers and Carol ⁶⁰ ⁾. In addition, adolescents might be more motivated to report their dietary intake with computer use⁽ Reference Vereecken, Covents and Matthys ²⁸ ⁾.

The first limitation of our review is that studies validating dietary intake instruments against biomarkers were not considered, as biomarkers often reflect status rather than intake and short-term rather than long-term intakes, and are invasive and expensive⁽ Reference Lampe and Rock ³ ⁾. Moreover, some foods and nutrients need particular attention when included in an FFQ, since relatively poor validity and reproducibility were observed in FFQ estimates for them; detailed information on these foods and nutrients is not given in the present review, as it is the purpose of a further meta-analysis. Another limitation is the restriction on the language of the articles, which could have excluded validated and reliable dietary methods used in other countries.

Conclusions

There is an ongoing need for the refinement of existing approaches, especially ones that can be used in large epidemiological studies. The analysed validation studies in adolescents justify advocating the FFQ method over the 24-HR and suggest the development of a new semi-quantitative FFQ that could fit the purposes of the ASSO Project. The design of the FFQ will be established in detail after a meta-analysis study on the validity and reproducibility of the identified FFQ, ranking by specific characteristics such as food group, nutrient or FFQ length. The ASSO-FFQ will be a new tool addressing the need for a valid, reproducible, user-friendly, fast, cost-effective, standardized method of accurately assessing nutrient intakes in adolescents.

Acknowledgements

Sources of funding: The work was performed within the Adolescents and Surveillance System for the Obesity prevention (ASSO) Project (code GR-2008-1140742, CUP I85J10000500001), a young researchers’ project funded by the Italian Ministry of Health. The Italian Ministry of Health had no role in the design, analysis or writing of this article. Conflicts of interest: The authors state there are no conflicts of interest. Ethics: Ethical approval was given by the ethical committee of the Azienda Ospedaliera Universitaria Policlinico Paolo Giaccone (approval code n.9/2011). Authors’ contributions: All authors contributed to the development of the review. G.T., C.M. and A.B. performed the search, screening and elaboration of concepts. E.A., M.d.P. and M.J. provided a valuable contribution to the whole manuscript.

References

1. Lietz, G, Barton, KL, Longbottom, PJ et al. (2002) Can the EPIC food-frequency questionnaire be used in adolescent populations? Public Health Nutr 5, 783–789.

2. Rockett, HRH, Berkey, CS & Colditz, GA (2003) Evaluation of dietary assessment instruments in adolescents. Curr Opin Clin Nutr Metab Care 6, 557–562.

3. Lampe, JW & Rock, CL (2008) Biomarkers and their use in nutrition intervention. In Nutrition in the Prevention and Treatment of Disease, 2nd ed., pp. 187–201 [AM Coulston and CJ Boushey, editors]. San Diego, CA: Academic Press.

4. Thompson, FE & Subar, AF (2008) Dietary assessment methodology. In Nutrition in the Prevention and Treatment of Disease, 2nd ed., pp. 5–46 [AM Coulston and CJ Boushey, editors]. San Diego, CA: Academic Press.

5. Ngo, J, Engelen, A, Molag, M et al. (2009) A review of the use of information and communication technologies for dietary assessment. Br J Nutr 101, Suppl. 2, S102–S112.

6. Poslusna, K, Ruprich, J, de Vries, JHM et al. (2009) Misreporting of energy and micronutrient intake estimated by food records and 24 hour recalls, control and adjustment methods in practice. Br J Nutr 101, Suppl. 2, S73–S85.

7. Gibson, RS (2005) Principles of Nutritional Assessment, 2nd ed. New York: Oxford University Press.

8. Kroeze, W, Werkman, A & Brug, J (2006) A systematic review of randomized trials on the effectiveness of computer-tailored education on physical activity and dietary behaviors. Ann Behav Med 31, 205–223.

9. Innovation of Dietary Assessment Methods for Epidemiological Studies and Public Health (2009) Dietary Assessment Methods: State of the Art Report. http://nugo.dife.de/twiki41/pub/IDAMES/IdamesResults/2009_WP4_Report.pdf (accessed February 2013).

10. Livingstone, MB, Robson, PJ & Wallace, JM (2004) Issues in dietary intake assessment of children and adolescents. Br J Nutr 92, Suppl. 2, S213–S222.

11. Dennis, LK, Snetselaar, LG, Nothwehr, FK et al. (2003) Developing a scoring method for evaluating dietary methodology in reviews of epidemiologic studies. J Am Diet Assoc 103, 483–487.

12. Serra-Majem, L, Frost Andersen, L, Henríque Sánchez, P et al. (2009) Evaluating the quality of dietary intake validation studies. Br J Nutr 102, Suppl. 1, S3–S9.

13. Ambrosini, GL, de Klerk, NH, O'Sullivan, TA et al. (2009) The reliability of a food frequency questionnaire for use among adolescents. Eur J Clin Nutr 63, 1251–1259.

14. Araujo, MC, Massae Yokoo, E & Alves Pereira, R (2010) Validation and calibration of a semiquantitative food frequency questionnaire designed for adolescents. J Am Diet Assoc 110, 1170–1177.

15. Bertoli, S, Petroni, ML, Pagliato, E et al. (2005) Validation of food frequency questionnaire for assessing dietary macronutrients and calcium intake in Italian children and adolescents. J Pediatr Gastroenterol Nutr 40, 555–560.

16. Cullen, KW, Watson, K & Zakeri, I (2008) Relative reliability and validity of the Block Kids Questionnaire among youth aged 10 to 17 years. J Am Diet Assoc 108, 862–866.

17. Deschamps, V, De Lauzon-Guillain, B, Lafay, L et al. (2009) Reproducibility and relative validity of a food-frequency questionnaire among French adults and adolescents. Eur J Clin Nutr 63, 282–291.

18. Hoelscher, D, Day, S, Kelder, SH et al. (2003) Reproducibility and validity of the secondary level School-Based Nutrition Monitoring student questionnaire. J Am Diet Assoc 103, 186–194.

19. Hong, TK, Dibley, MJ & Sibbritt, D (2010) Validity and reliability of an FFQ for use with adolescents in Ho Chi Minh City, Vietnam. Public Health Nutr 13, 368–375.

20. Matthys, C, Pynaert, I, De Keyzer, W et al. (2007) Validity and reproducibility of an adolescent web-based food frequency questionnaire. J Am Diet Assoc 107, 605–610.

21. Moore, HJ, Ells, LJ, McLure, SA et al. (2008) The development and evaluation of a novel computer program to assess previous-day dietary and physical activity behaviours in school children: the Synchronised Nutrition and Activity Program™ (SNAP™). Br J Nutr 99, 1266–1274.

22. Papadopoulou, SK, Barboukis, V, Dalkiranis, A et al. (2008) Validation of a questionnaire assessing food frequency and nutritional intake in Greek adolescents. Int J Food Sci Nutr 59, 148–154.

23. Rockett, HRH, Berkey, CS & Colditz, GA (2007) Comparison of a short food frequency questionnaire with the Youth/Adolescent Questionnaire in the Growing Up Today Study. Int J Pediatr Obes 2, 31–39.

24. Shatenstein, B, Amre, D, Jabbour, M et al. (2010) Examining the relative validity of an adult food frequency questionnaire in children and adolescents. J Pediatr Gastroenterol Nutr 51, 645–652.

25. Sjoberg, A & Hulthe, L (2004) Assessment of habitual meal pattern and intake of foods, energy and nutrients in Swedish adolescent girls: comparison of diet history with 7-day record. Eur J Clin Nutr 58, 1181–1189.

26. Slater, B, Philippi, ST, Fisberg, RM et al. (2003) Validation of a semi-quantitative adolescent food frequency questionnaire applied at a public school in Sao Paulo, Brazil. Eur J Clin Nutr 57, 629–635.

27. Vereecken, CA & Maes, L (2003) A Belgian study on the reliability and relative validity of the Health Behaviour in School-Aged Children food-frequency questionnaire. Public Health Nutr 6, 581–588.

28. Vereecken, CA, Covents, M, Matthys, C et al. (2005) Young adolescents’ nutrition assessment on computer (YANA-C). Eur J Clin Nutr 59, 658–667.

29. Vereecken, CA, De Bourdeaudhuij, I & Maes, L (2010) The HELENA online food frequency questionnaire: reproducibility and comparison with four 24-hour recalls in Belgian-Flemish adolescents. Eur J Clin Nutr 64, 541–548.

30. Watanabe, M, Yamaoka, K, Yokotsuka, M et al. (2010) Validity and reproducibility of the FFQ (FFQW82) for dietary assessment in female adolescents. Public Health Nutr 14, 297–305.

31. Watson, JF, Collins, CE, Sibbritt, DW et al. (2009) Reproducibility and comparative validity of a food frequency questionnaire for Australian children and adolescents. Int J Behav Nutr Phys Act 6, 62.

32. Willett, W (1998) Nutritional Epidemiology, 2nd ed. New York: Oxford University Press.

33. Fraser, GE, Lindsted, KD, Knutsen, SF et al. (1998) Validity of dietary recall over 20 years among California Seventh-day Adventists. Am J Epidemiol 148, 810–818.

34. Chinn, S (1990) The assessment of methods of measurement. Stat Med 9, 351–362.

35. Hebert, JR & Miller, DR (1991) The inappropriateness of conventional use of the correlation coefficient in assessing validity and reliability of dietary assessment methods. Eur J Epidemiol 7, 339–343.

36. Bland, JM & Altman, DG (1999) Measuring agreement in method comparison studies. Stat Methods Med Res 8, 135–160.

37. Brenner, H & Kliebsch, U (1996) Dependence of weighted kappa coefficients on the number of categories. Epidemiology 7, 199–202.

38. Sim, J & Wright, CC (2005) The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther 85, 257–268.

39. Altman, DG (1991) Practical Statistics for Medical Research. London: Chapman and Hall.

40. Bland, JM & Altman, DG (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1, 307–310.

41. Cade, JE, Burley, VJ, Warm, DL et al. (2004) Food frequency questionnaires: a review of their design, validation and utilization. Nutr Res Rev 17, 5–22.

42. Biro, G, Hulshof, KF, Ovesen, L et al. (2002) Selection of methodology to assess food intake. Eur J Clin Nutr 56, Suppl. 2, S25–S32.

43. Bliss, RM (2004) Researchers produce innovation in dietary recall. Agric Res 52, 10–12.

44. Brug, J, Oenema, A & Campbell, M (2003) Past, present, and future of computer-tailored nutrition education. Am J Clin Nutr 77, 1028–1034.

45. Burrows, TL, Martin, RJ & Collins, CE (2010) A systematic review of the validity of dietary assessment methods in children when compared with the method of doubly labeled water. J Am Diet Assoc 110, 1501–1510.

46. Cade, JE, Thompson, RL, Burley, V et al. (2002) Development, validation and utilisation of food-frequency questionnaires – a review. Public Health Nutr 5, 567–587.

47. Contento, IR, Randell, JS & Basch, CE (2002) Review and analysis of evaluation measures used in nutrition education intervention research. J Nutr Educ Behav 34, 2–25.

48. Ortiz-Andrellucchi, A, Henriquez-Sanchez, P, Sanchez-Villegas, A et al. (2009) Dietary assessment methods for micronutrient intake in infants, children and adolescents: a systematic review. Br J Nutr 102, Suppl. 1, S87–S117.

49. Probst, YC & Tapsell, LC (2005) Overview of computerized dietary assessment programs for research and practice in nutrition education. J Nutr Educ Behav 37, 20–26.

50. Subar, AF (2004) Developing dietary assessment tools. J Am Diet Assoc 104, 769–770.

51. Schatzkin, A, Kipnis, V, Carroll, RJ et al. (2003) A comparison of a food frequency questionnaire with a 24-hour recall for use in an epidemiological cohort study: results from the biomarker based Observing Protein and Energy Nutrition (OPEN) study. Int J Epidemiol 32, 1054–1062.

52. Kipnis, V, Subar, AF, Midthune, D et al. (2003) The structure of dietary measurement error. Results of the OPEN biomarker study. Am J Epidemiol 158, 14–21.

53. Molag, ML, de Vries, JHM, Ocke, MC et al. (2007) Design characteristics of food frequency questionnaires in relation to their validity. Am J Epidemiol 166, 1468–1478.

54. Slimani, N & Valsta, L (2002) Perspectives of using the EPIC-SOFT programme in the context of pan-European nutritional monitoring surveys: methodological and practical implications. Eur J Clin Nutr 56, Suppl. 2, S63–S74.

55. Fries, E, Green, P & Bowen, DJ (1995) What did I eat yesterday? Determinants of accuracy in 24-hour food memories. Appl Cogn Psychol 9, 143–155.

56. Gnardellis, C, Boulou, C & Trichopoulou, A (1998) Magnitude, determinants and impact of under-reporting of energy intake in a cohort study in Greece. Public Health Nutr 1, 131–137.

57. Thompson, FE, Subar, AF, Loria, CM et al. (2010) Need for technological innovation in dietary assessment. J Am Diet Assoc 110, 48–51.

58. Vereecken, CA, Covents, M, Sichert-Hellert, W et al. (2008) Development and evaluation of a self-administered computerized 24-h dietary recall method for adolescents in Europe. Int J Obes (Lond) 32, Suppl. 5, S26–S34.

59. Vereecken, CA & Maes, L (2006) Comparison of a computer-administered and paper-and-pencil administered questionnaire on health and lifestyle behaviors. J Adolesc Health 38, 426–432.

60. Evers, W & Carol, B (2007) An Internet-based assessment tool for food choices and physical activity behaviors. J Nutr Educ Behav 39, 105–106.

Fig. 1 Selection process flow of the original articles and reviews on the validation and/or reproducibility of dietary assessment methods in adolescents

Table 1 Overview of the twenty eligible original articles

Table 2 Characteristics of the seventeen FFQ validation studies

Table 3 Characteristics of the three validation studies of other questionnaires than the FFQ
