Dietary flavonoids are secondary plant metabolites present in fruits, vegetables, spices, herbs, nuts, legumes and their products. Recently, several systematic reviews with meta-analyses of cohort and/or case–control studies have supported the expected beneficial effects of flavonoids on chronic diseases such as CVD( Reference Wang, Ouyang and Liu 1 ), type 2 diabetes( Reference Babu, Liu and Gilbert 2 ) and some cancers( Reference Woo and Kim 3 , Reference Hui, Qi and Qianyong 4 ). For example, the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort study, conducted in several centres across Europe, has observed reduced risk of colorectal cancer( Reference Zamora-Ros, Not and Guino 5 ), oesophageal cancer( Reference Vermeulen, Zamora-Ros and Duell 6 ) and gastric adenocarcinoma( Reference Zamora-Ros, Agudo and Lujan-Barroso 7 ). Two other recent studies( Reference Jennings, Welch and Spector 8 , Reference Wedick, Pan and Cassidy 9 ) observed a lower risk of type 2 diabetes associated with higher consumption of anthocyanins. Although one study( Reference Cassidy, Rimm and O'Reilly 10 ) was not able to associate total flavonoid intakes with reduced risk of stroke, it did observe that intakes of flavanones, a subclass of flavonoids, were inversely associated with the risk of ischaemic stroke. Quercetin, a flavonol, was found to be associated with reduced risk of gastric cancer in a case–control study( Reference Ekstrom, Serafini and Nyren 11 ). Thus, different flavonoid compounds have displayed diverse bioactivities.
Food composition databases for dietary flavonoids have been found to be necessary tools to support these research studies. The Nutrient Data Laboratory (NDL) of the Agricultural Research Service at the US Department of Agriculture (USDA), developed a database for isoflavones, ‘USDA Database for the Isoflavone Content of Selected Foods’, in 1999( 12 ) and updated it in 2008, Release 2, (IDB 2.0)( Reference Bhagwat, Haytowitz and Holden 13 ). In 2003, the NDL developed a database for flavonoids, ‘USDA Database for the Flavonoid Content of Selected Foods’ for predominant dietary flavonoids in five subclasses (flavonols, flavones, flavanones, flavan-3-ols and anthocyanidins)( 14 ); it was updated in 2007( 15 ), 2011( Reference Bhagwat, Haytowitz and Holden 16 ), and most recently in 2013, Release 3.1 (FDB 3.1)( Reference Bhagwat, Haytowitz and Holden 17 ). Together, these two databases encompass all of the predominant dietary flavonoids except proanthocyanidins. The NDL is in the process of updating the USDA database for proanthocyanidins( 18 ). Flavonoid values in all these databases are reported as mg/100 g edible portion on fresh weight basis. In 2013, the NDL formulated a new Flavonoids Database for approximately 2900 commonly consumed foods, ‘USDA's Expanded Flavonoid Database for the Assessment of Dietary Intakes’ (FDB-EXP)( Reference Bhagwat, Haytowitz and Wasswa-Kintu 19 ), using analytical values from FDB 3.1 and IDB 2.0 as the foundation to calculate values for all the twenty-nine flavonoid compounds included in these two databases. Original analytical values in Flavonoid Release 3.1 and Isoflavone Release 2.0 for corresponding foods were retained in the newly constructed database. Values were calculated only for the foods and/or the flavonoid compounds that were found missing in these two databases. 
Unlike FDB 3.1 and IDB 2.0 which do not contain analytical values for all the twenty-six compounds and all the three isoflavones respectively for every food, FDB-EXP contains values for all the twenty-nine flavonoid compounds for every food, which are a combination of analytical and calculated values. It does not currently include proanthocyanidin values. NDL plans to add proanthocyanidin data to the expanded database after it is updated. The database comprises a subset of foods included in NDL's National Nutrient Database for Standard Reference 22 (SR22)( 20 ) released in 2009. These 2926 foods represent a snapshot of foods commonly consumed in the United States that was used to develop Food and Nutrient Database for Dietary Studies 4.1 (FNDDS 4.1)( 21 ), and was used to analyse data from What We Eat in America component of the National Health and Nutrition Examination Survey 2007–8( 22 ). Table 1 provides the number of foods, subclasses of flavonoids, and compounds in each subclass included in FDB 3.1, IDB 2.0 and the newly constructed FDB-EXP databases. The FDB-EXP is available on the NDL's website: http://www.ars.usda.gov/nutrientdata/flav.
IDB 2.0, USDA Database for the Isoflavone Content of Selected Foods, Release 2.0; FDB 3.1, USDA Database for the Flavonoid Content of Selected Foods, Release 3.1; FDB-EXP, USDA's Expanded Flavonoid Database for the Assessment of Dietary Intakes.
As of 2013, our ongoing literature searches revealed that these flavonoid databases, singly or together, have been used for more than seventy studies conducted in the USA, Europe, the UK and Australia to estimate flavonoid intakes, and to evaluate associations between flavonoid intakes and various chronic diseases. Although the databases were developed after comprehensive literature searches for analytical data, and the analysis of nationally representative samples of fruits, vegetables and nuts in the USA at the USDA's laboratories( Reference Harnly, Doherty and Beecher 23 , Reference Gu, Kelm and Hammerstone 24 ), the databases do not include every food (fruit, vegetable, herb or legume) expected to contain some flavonoid compounds. In addition, these databases, with the exception of the IDB 2.0 which contains some prepared foods, do not include multi-ingredient prepared foods/food products, although these foods may contain ingredients that contribute flavonoids to those dishes. Therefore, scientists who use these databases for epidemiological studies need to calculate flavonoid contents for food items unavailable in these databases and for the multi-ingredient foods, to avoid underestimation of intakes.
This article describes the process of constructing a new Flavonoids Database for approximately 2900 commonly consumed foods. Analytical values from FDB 3.1 and IDB 2.0 are used as the foundation to calculate values for all the twenty-nine flavonoid compounds included in these two databases. Also described are the challenges encountered during this process, and how the NDL addressed these challenges.
Materials and methods
The general approach to developing USDA's flavonoids databases is described in an earlier publication( Reference Holden, Bhagwat and Haytowitz 25 ). FDB 3.1 contains data for twenty-six flavonoid compounds in five subclasses of flavonoids (flavonols, flavan-3-ols, flavones, flavanones and anthocyanidins) in 306 foods, although not every food has values for all the twenty-six selected flavonoids. IDB 2.0 contains data for three prominent isoflavones (daidzein, genistein and glycitein) in 549 foods, although not every food has values for all three compounds. Together, these two databases contain data for twenty-nine predominant dietary flavonoids in six subclasses. The NDL has observed that scientists generally analyse only the expected predominant class of compounds in a particular food type; for example, flavanones in citrus fruits or anthocyanidins in coloured berries. The objective of the expansion project was to populate full flavonoid profiles for the twenty-nine flavonoid compounds for each food in a subset of 2926 foods in NDL's SR22 that was used to develop FNDDS 4.1, using analytical values from FDB 3.1 and IDB 2.0 as foundations. The process involved four steps: (1) assigning logical zero values, (2) matching analytical values, (3) calculating flavonoid values (moisture adjustment, using retention factors for cooked/processed foods, food yield factors and substituting values with similar foods) and (4) calculating values for multi-ingredient foods (Fig. 1). Each step is described in detail as follows.
Assigning logical ‘zero’ values
Standard reference (SR) contains twenty-five Food Groups (FG) covering a wide variety of foods consumed in the United States. Animals cannot synthesise flavonoids like plants but get them through foods that they consume( Reference Peterson and Dwyer 26 ). It may be possible that animals can accumulate flavonoids in their tissues through their feed. Based on the limited data in FDB 3.1 and IDB 2.0, and an extensive review of literature, flavonoids data for animal product food groups were not found except for a very small amount of isoflavones in eggs( Reference Horn-Ross, Barnes and Lee 27 ), and for ‘total isoflavones’ in some dairy, seafood and meat products( Reference Kuhnle, Dell'Aquila and Aspinall 28 ). The USDA databases accept values reported for individual compounds only, not if reported only as ‘total’. Therefore, a ‘0’ value was assigned to all the flavonoids for foods in animal product groups (beef, pork, poultry, finfish, shellfish, dairy, fats, sausages and luncheon meats), and multi-ingredient foods derived from these food groups, if other ingredients were not expected to contain flavonoids. Vegetable oils were also assumed to have no flavonoids since no data were available except for olive oils, which are extracted from fruits and which did have analytical values for flavones. More comprehensive data on these foods are needed to fully characterise their flavonoids content.
Only one or two subclasses of flavonoids are predominant in most food items; for example, flavanones are the major subclass in citrus fruits, which may also contain some flavonols. Therefore, ‘zero’ values were assigned to flavonoid compounds in the subclasses that were not expected to be present in a particular food or FG. Table 2 illustrates the general scheme of assuming logical zeroes in each FG.
z denotes assumed zero; zx denotes mostly assumed zero with some exception.
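The zero-assignment rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the food-group labels, the function name and the example compound-to-subclass mapping are hypothetical, and the per-group list of unexpected subclasses (the actual scheme is in Table 2) must be supplied by the caller.

```python
# Animal-product food groups named in the text; the label strings are assumptions.
ANIMAL_PRODUCT_GROUPS = {
    "beef", "pork", "poultry", "finfish", "shellfish",
    "dairy", "fats", "sausages and luncheon meats",
}

def assign_logical_zeros(food_group, compounds, unexpected_subclasses=frozenset()):
    """Return {compound: 0.0} for every compound assumed to be absent.

    compounds maps a compound name to its flavonoid subclass.
    unexpected_subclasses holds the subclasses not expected in this food
    group; it is a caller-supplied stand-in for the scheme in Table 2.
    """
    if food_group in ANIMAL_PRODUCT_GROUPS:
        # Animal-product groups receive zero for every compound.
        return {name: 0.0 for name in compounds}
    # Plant-based groups receive zero only for unexpected subclasses.
    return {
        name: 0.0
        for name, subclass in compounds.items()
        if subclass in unexpected_subclasses
    }

# Hypothetical four-compound excerpt of the twenty-nine-compound profile.
example_compounds = {
    "quercetin": "flavonols",
    "hesperetin": "flavanones",
    "cyanidin": "anthocyanidins",
    "genistein": "isoflavones",
}
beef_zeros = assign_logical_zeros("beef", example_compounds)
fruit_zeros = assign_logical_zeros("fruits", example_compounds, {"isoflavones"})
```

Compounds not returned by the function remain unfilled and are handled by the matching and calculation steps that follow.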
Matching analytical values
FG which are expected to contain one or more flavonoid subclasses included fruits/fruit juices, vegetables/vegetable products, spices/herbs, nuts/seeds and legumes. During the development of FDB 3.1 and IDB 2.0, foods were assigned Nutrient Data Bank numbers (NDB), unique five-digit numerical codes used in SR, if the food descriptions matched with those in SR. If exact matches were not available, temporary NDB numbers were assigned. The analytical flavonoid values for the exact matches for these NDB numbers in FDB 3.1 and IDB 2.0 with NDB numbers in FDB-EXP were retained. Food descriptions for the temporary NDB numbers were also reviewed to find matches for similar foods in the FDB-EXP.
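As a rough illustration of the matching step, the sketch below retains analytical values wherever an NDB number in the expanded food list has an exact match in the source data, and lists the remaining foods for later calculation. The function name, the data layout and the NDB codes shown are hypothetical placeholders, not actual SR codes.

```python
def match_analytical_values(expanded_ndb_numbers, analytical):
    """Split foods into those with retained analytical values and the rest.

    analytical maps an NDB number to {compound: mg per 100 g} taken from
    the source databases. Returns (matched, unmatched).
    """
    matched = {}
    unmatched = []
    for ndb in expanded_ndb_numbers:
        if ndb in analytical:
            # Exact NDB match: keep the analytical values unchanged.
            matched[ndb] = dict(analytical[ndb])
        else:
            # No exact match: values must be calculated in a later step.
            unmatched.append(ndb)
    return matched, unmatched

# Placeholder NDB codes; 0.134 is the raw-beet quercetin value quoted in the text.
analytical_values = {"90001": {"quercetin": 0.134}}
matched, unmatched = match_analytical_values(["90001", "90002"], analytical_values)
```

Foods carrying temporary NDB numbers would still need the manual review of food descriptions described above, which this sketch does not attempt.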
Calculating flavonoid values
After assigning available analytical values to those foods which were direct matches by food descriptions, various techniques were then employed to calculate values for the remaining compounds and/or foods according to the procedures described by Schakel et al. ( Reference Schakel, Buzzard and Gebhardt 29 ). The various procedures used to calculate values where analytical values were unavailable are described below.
Moisture adjustment
For plant-based foods that undergo moisture changes due to cooking, drying or dilution, values for full flavonoid profiles were calculated from another form of the same food or from a similar food. A moisture factor based on the change in total solids was applied (e.g. raw to cooked asparagus or fresh to dried basil). If the moisture content was not reported in the published article, moisture values from the SR were used for the same food or sometimes for similar food.
Retention factors
To account for changes in flavonoid contents after cooking/processing of the foods, it was necessary to determine retention factors. Literature searches done while developing the flavonoid databases had retrieved only fifteen studies that analysed vegetables in raw and cooked forms( Reference Andlauer, Stumpf and Furst 30 – Reference Price, Colquhoun and Barnes 44 ). Most of them analysed only flavonols; one analysed apigenin( Reference Ferracane, Pellegrini and Visconti 35 ) and one analysed naringenin in Brussels sprouts( Reference Pellegrini, Chiavaro and Gardana 41 ). Different cooking methods such as boiling, steaming, frying or microwaving were used in these studies. The average retention for flavonols, flavones and flavanones from these studies was approximately 86 % (58–132 %). The cooking methods were not taken into consideration when the average retention percentage was calculated. There were no reports on anthocyanidin retention. After consulting scientists at the Food Composition and Methods Development Laboratory of the Agricultural Research Service/USDA, and reviewing data from limited literature sources, retention factors of 85 % for flavonols, flavanols, flavanones and flavones and 50 % for anthocyanidins (because of their heat-labile characteristics) were established for the FDB-EXP. The application of dry heat (i.e. baking or air drying) was considered to have negligible effects on flavonoid losses (J Harnly, personal communication, 2012). Therefore, retention factors were not applied when a drying process was used, e.g. for fresh to dried fruits or fresh to dried herbs. If analytical values were available for cooked/processed foods, these values were retained in preference to those calculated by applying retention factors. Retention factors were not needed for isoflavones, as analytical values were available for most of the raw and cooked/processed foods that contained isoflavones.
True retention factors can be generated only by paired studies in which half of the food sample is analysed raw and the other half after cooking, and the weights of the raw and cooked samples are recorded. Values for cooked or canned plant-based foods without full flavonoid profiles were calculated from values for raw forms of the same food, using estimated retention factors to account for the loss of flavonoids during processing.
The following example illustrates how the moisture and retention factors were applied to calculate unavailable quercetin values for cooked beets from the analytical values for raw beets.
Formula: $N_t = (N_s \times S_t / S_s) \times R$, where $N_t$ = the nutrient (quercetin) content of the target item (cooked beets); $N_s$ = the nutrient (quercetin) content of the source item (raw beets = 0·134 mg/100 g); $S_s$ = the total solid content (total weight − moisture content) of the source item (raw beets = 100 − 87·58 = 12·42 g/100 g); $S_t$ = the total solid content (total weight − moisture content) of the target item (cooked beets = 100 − 87·06 = 12·94 g/100 g); $R$ = retention factor (estimated retention factor for quercetin 0·85).
Using the above formula, quercetin content of cooked beets is calculated thus:
$N_t = (0·134 \times 12·94 / 12·42) \times 0·85 = 0·12$ mg/100 g.
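The beet example can be reproduced with a short Python sketch. It assumes that the source value is scaled by the ratio of target to source total solids and then multiplied by the retention factor; this assumption reproduces the 0·12 mg/100 g reported for cooked beets. The function and dictionary names are illustrative only.

```python
# Retention factors by subclass, as stated in the text for FDB-EXP.
RETENTION_FACTORS = {
    "flavonols": 0.85,
    "flavanols": 0.85,
    "flavanones": 0.85,
    "flavones": 0.85,
    "anthocyanidins": 0.50,
}

def adjust_for_moisture_and_retention(source_value, source_moisture,
                                      target_moisture, retention=1.0):
    """Estimate a target-food value (mg/100 g) from a source-food value.

    Moisture contents are in g/100 g. Total solids are 100 minus moisture.
    Assumption: the value scales with total solids (target over source)
    and is then reduced by the retention factor.
    """
    source_solids = 100.0 - source_moisture
    target_solids = 100.0 - target_moisture
    return source_value * (target_solids / source_solids) * retention

# Quercetin (a flavonol) in cooked beets, from the raw-beet value in the text.
cooked_beets_quercetin = adjust_for_moisture_and_retention(
    source_value=0.134,      # raw beets, mg/100 g
    source_moisture=87.58,   # raw beets, g/100 g
    target_moisture=87.06,   # cooked beets, g/100 g
    retention=RETENTION_FACTORS["flavonols"],
)
print(round(cooked_beets_quercetin, 2))  # 0.12
```

Leaving `retention` at its default of 1.0 gives the moisture-only adjustment, which corresponds to the dried foods for which no retention factor was applied.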
Food yield factors
Food yield factors were also applied to account for food processing effects in instances where values were available only for a different form of the same food. For example, for canned foods, food yield factors ranging from 53 to 67 % (USDA, 1975)( 45 ) were applied to the raw form to adjust for the yield of solid food after draining the liquid, since the flavonoids were expected only in the fruit or vegetable (a yield factor of 67 % was applied to canned peaches, i.e. a can of peaches contained 67 % fruit and 33 % liquid). Retention factors were applied to compensate for processing losses. Although some flavonoids are leached into the canning liquid (water or syrup)( Reference Rickman, Barrett and Bruhn 46 ), no adjustments were made for these losses.
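A minimal sketch of the yield adjustment follows, assuming the value for the whole canned product is the raw value multiplied by the solid-food yield and by the retention factor. Whether any further moisture adjustment was combined with this step is not stated, so none is made here. The raw value in the example is a made-up placeholder, not a database value.

```python
def canned_value(raw_value, yield_factor, retention):
    """Estimate mg/100 g for a canned product (solids plus liquid).

    Assumption: flavonoids occur only in the solid fruit or vegetable,
    so the raw value is scaled by the solid yield and by retention.
    """
    return raw_value * yield_factor * retention

# Canned peaches: 67 % fruit (from the text), 85 % retention (from the text),
# and a hypothetical raw value of 1.0 mg/100 g.
canned_peach_example = canned_value(raw_value=1.0, yield_factor=0.67, retention=0.85)
```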
Substitution with similar foods
Values from a similar food were substituted for the food or compound considering botanical origins, or other similarities like colour and texture. For example, because analytical values for flavan-3-ols, flavanones, anthocyanidins and the flavonol myricetin were lacking for fresh mulberries, the corresponding values for fresh blackberries were used without any adjustment.
Market share
Other factors such as market share data were considered when estimating flavonoid values for some items. Market shares were considered only for wines and grapes because of the lack of specific NDB numbers distinguishing red and white wines, or red and green grapes. NDB no. 14084 in SR describes a table wine without specifying whether it is red or white. However, this item is not included in FDB 3.1, as data were available for both red and white wines, and as their flavonoid values, particularly those of anthocyanidins, would be quite different. Market share proportions of 47 % red wine, 40 % white wine and 13 % blush wine were reported by AC Nielsen supermarket data, 2009 and the US Department of Commerce, 2011( Reference Hodgen 47 ). However, owing to the lack of analytical flavonoid data for blush wines (only one source), a generic profile for table wine (NDB no. 14084) was created by using 50 % of red and 50 % of white wine values. In the development of FDB 3.1, some food items (e.g. red and green grapes) were assigned more specific provisional NDB numbers to differentiate them from each other on the basis of flavonoid content, although grapes have a single NDB no. (09132) in SR that does not distinguish red and green grapes. Owing to the lack of nationwide market share or consumption data on table grapes specified as red or green separately, a generic value for all grapes, corresponding to NDB no. 09132 in SR, was also developed by using both the red (50 %) and green (50 %) grape values.
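The 50:50 blending used for table wine and grapes amounts to a weighted average of two profiles. The sketch below shows one way to compute it; the function name and the compound values are hypothetical and serve only to show the arithmetic.

```python
def blend_profiles(profiles, weights):
    """Weighted average of flavonoid profiles (mg/100 g).

    profiles maps a food name to {compound: value}; weights maps the same
    food names to their share. A compound missing from a profile is
    treated as zero, which is an assumption of this sketch.
    """
    total_weight = sum(weights.values())
    compounds = set().union(*(p.keys() for p in profiles.values()))
    return {
        compound: sum(
            weights[food] * profiles[food].get(compound, 0.0) for food in profiles
        ) / total_weight
        for compound in sorted(compounds)
    }

# Placeholder values, not taken from FDB 3.1.
wine_profiles = {
    "red wine": {"malvidin": 10.0, "quercetin": 1.0},
    "white wine": {"malvidin": 0.0, "quercetin": 0.2},
}
table_wine = blend_profiles(wine_profiles, {"red wine": 0.5, "white wine": 0.5})
```

The same function would accept the reported 47:40:13 market shares if a blush wine profile were available.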
Generic profiles
A generic profile for twenty-nine flavonoid compounds was prepared for common leafy vegetables, using values for nine leafy vegetables from FDB 3.1. This profile was used to estimate values for a specific leafy vegetable, when flavonoid values were not available for any or some compounds. A generic profile for fruits was also developed using fifteen fruits, and used for less common fruits when similar fruits were not available for substitution of flavonoid values or missing values.
Other sources
Values for instant tea powders provided in this database were drawn from the unpublished figures provided by the Unilever Lipton Company in 2002. The weight for added flavouring and/or artificial sweetener like saccharin was disregarded when calculating flavonoid values for tea powders with flavours and/or with sugar substitutes, i.e. the same values were used for unsweetened/unflavoured and sweetened with saccharin/flavoured instant tea powders and prepared teas as well.
Values from other databases such as Phenol-Explorer, Release 2( Reference Neveu, Perez-Jimenez and Vos 48 ) were used for a limited number of foods/compounds not available in the FDB 3.1. For instance, kaempferol values for most nuts were obtained from Phenol-Explorer.
Calculating flavonoid values for multi-ingredient foods
Multi-ingredient foods with one or more ingredients of plant origin such as baby foods, soups, breakfast cereals, beverages and prepared meals/entrees may contain some flavonoids, depending on the amounts of plant-based ingredients. However, few multi-ingredient food items were included in FDB 3.1 because of the lack of analytical data. For multi-ingredient foods, formulations developed by NDL scientists( Reference Haytowitz, Lemar and Pehrsson 49 ) were used to estimate percentages of flavonoid-containing ingredients. These formulations use regression equations which take the ingredient lists on the labels of these products, plus the nutrient content of the ingredients from SR, and estimate the proportion of each ingredient in the food item. The formulations were developed by food group specialists in NDL who are familiar with the products; however, they were developed for other nutrients, were occasionally unsuitable for flavonoid calculations, and in those cases had to be modified. In general, a flavonoid-containing ingredient was used in calculations only if it contributed ≥ 5 % of the total by weight. The exceptions were cocoa powder (regular and alkalised) and soya protein isolates/soya flour because of their high contents of catechins and isoflavones respectively per unit weight. Multi-ingredient foods in which each plant-based ingredient contributed less than five percent were estimated to contain no flavonoids. Whenever orange juice was one of the ingredients ( ≥ 5 %), e.g. as in baby food juices or citrus juice drinks, we used values of the entry ‘orange juice, chilled, includes from concentrate (NDB no. 09209)’ to calculate flavanone values for these foods for consistency in estimation.
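The ingredient rule described above can be sketched as a weighted sum with a 5 % cut-off and a short list of exceptions. This is a simplified illustration: the ingredient names, weight fractions and compound values are hypothetical, and the sketch does not apply moisture, retention or yield adjustments to the ingredients.

```python
MIN_FRACTION = 0.05  # ingredients below 5 % by weight are ignored
# Exceptions named in the text; the exact label strings are assumptions.
ALWAYS_INCLUDED = {"cocoa powder", "soya protein isolate", "soya flour"}

def multi_ingredient_value(ingredients, profiles, compound):
    """Estimate one compound (mg/100 g) in a multi-ingredient food.

    ingredients maps an ingredient name to its weight fraction (0 to 1);
    profiles maps an ingredient name to {compound: mg per 100 g}.
    """
    total = 0.0
    for name, fraction in ingredients.items():
        if fraction < MIN_FRACTION and name not in ALWAYS_INCLUDED:
            continue  # minor ingredient: assumed to contribute nothing
        total += fraction * profiles.get(name, {}).get(compound, 0.0)
    return total

# Hypothetical citrus drink and chocolate cereal formulations.
profiles = {
    "orange juice": {"hesperetin": 12.0},
    "lemon juice": {"hesperetin": 14.0},
    "cocoa powder": {"epicatechin": 100.0},
}
drink = {"water": 0.88, "orange juice": 0.10, "lemon juice": 0.02}
cereal = {"grain": 0.98, "cocoa powder": 0.02}
drink_hesperetin = multi_ingredient_value(drink, profiles, "hesperetin")
cereal_epicatechin = multi_ingredient_value(cereal, profiles, "epicatechin")
```

In the drink example only the orange juice contributes, because the lemon juice falls below the 5 % cut-off; in the cereal example cocoa powder is counted despite being below the cut-off.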
Table 3 illustrates some of the techniques used to calculate flavonoid values.
Results and discussion
The FDB-EXP database was released on the NDL's website in September 2014. It is available as Microsoft® ACCESS files rather than PDF files; the files include descriptions of foods, food groups, flavonoids data, compound (nutrient) numbers, source codes and derivation codes. The 2926 foods in the FDB-EXP were distributed across twenty-five FG, and contained a total of 84 854 data points. All the flavonoid values from FDB 3.1 and IDB 2.0 for corresponding foods were retained in FDB-EXP. Values were calculated only for foods and/or compounds that were not available in these two databases. The values in FDB-EXP are reported as mg/100 g edible portion. Since only five plant-based FG and foods that contain plant foods as ingredients are expected to have compounds from two or three flavonoid subclasses, and are likely to contain only some flavonoid compounds from each of those subclasses, each food item may have analytical values for only three to five flavonoid compounds. The five food groups expected to have some flavonoids comprise approximately 780 foods out of 2926 total foods in the FDB-EXP. Therefore, 73 % of the data points (61 943 out of a total of 84 854 data points) received logical ‘zero’ values. Twenty-four percent of the values (20 365) in the database were calculated by the techniques described in the methods section, and three percent (2546) were matched with the analytical values that were drawn from FDB 3.1 and IDB 2.0. In SR, a single food item can have many different forms, and each form has a unique NDB number; for example, a specific fruit can be fresh, frozen, dried, canned in water or canned in syrup, and a vegetable can be boiled with or without draining, and with or without added salt. Therefore, the number of data points in these five flavonoid-containing FG is very large. As analytical data may exist for only one of the forms, usually fresh (raw), it was necessary to calculate values for other forms of the foods.
Therefore, it is reasonable to say that only a small percentage of data in the FDB-EXP are analytical.
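The breakdown of data points quoted above can be checked with simple arithmetic. The short Python sketch below uses only the counts given in the text; the variable names are illustrative.

```python
foods = 2926
compounds_per_food = 29
total_points = foods * compounds_per_food  # 84854 data points

# Counts reported in the text for each type of value.
counts = {"logical zero": 61943, "calculated": 20365, "analytical": 2546}

# The three categories account for every data point.
assert sum(counts.values()) == total_points

# Share of each category, rounded to the nearest whole percent.
shares = {kind: round(100 * n / total_points) for kind, n in counts.items()}
print(shares)  # {'logical zero': 73, 'calculated': 24, 'analytical': 3}
```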
The principal approach to expanding the FDB 3.1 and IDB 2.0 flavonoid databases involved calculating values where analytical data were unavailable, after assigning logical or assumed ‘zero’ values and matching analytical values. Applying moisture factors, retention factors and yield factors required many algorithms.
The EPIC studies( Reference Zamora-Ros, Knaze and Lujan-Barroso 50 , Reference Zamora-Ros, Knaze and Lujan-Barroso 51 ) used FDB Release 2.1 (2007) and Phenol-Explorer, Release 2( Reference Neveu, Perez-Jimenez and Vos 48 ) to estimate intakes of flavanols, flavones, flavanones and anthocyanidins using retention factors of 70, 35 and 25 % for fried, microwaved and boiled foods respectively for all flavonoid compounds. These factors were based on a single study( Reference Crozier, Lean and McDonald 32 ). The two approaches differ: the EPIC studies based retention factors on cooking methods, whereas the USDA based them on classes of flavonoid compounds. This difference will affect the values for cooked foods in the two databases and, consequently, the estimation of flavonoid intakes.
Calculating flavonoid values for multi-ingredient foods like meals/entrees was another challenge. Although formulations developed by the NDL scientists( Reference Haytowitz, Lemar and Pehrsson 49 ) were applied to estimate percentages of flavonoid-containing ingredients for these foods, additional algorithms were required. Soya flour and soya protein isolates are used in bakery products as dough conditioners( Reference Mulligan, Kuhnle and Lentjes 52 ); this use was also confirmed by personal communication with manufacturers in the USA. Therefore, it is not surprising that the authors of the EPIC-Norfolk cohort study for the UK( Reference Mulligan, Kuhnle and Lentjes 52 ) observed that bread and bread rolls contributed the highest percentages of isoflavones in the diet of both men and women. Analytical values for total isoflavones in doughnuts varied from 0·60 to 5·31 mg/100 g in IDB 2.0, depending on the kind of doughnut. Small isoflavone values were also reported in IDB 2.0 for eggs( Reference Horn-Ross, Barnes and Lee 27 ). The poultry industry uses soyabean meal as the main protein source in poultry feed, where it comprises 66 % of all the proteins used( Reference Waldroup and Smith 53 ). Use of soyabean meal in poultry feed was also confirmed by personal communication (R Angel, Department of Poultry Science, University of Maryland, USA, 2013). The unexpected isoflavone values in eggs may be due to the use of soyabean meal in the chicken feed( Reference Saitoh, Sato and Harada 54 ).
Most foods, except for some fresh fruits and vegetables, are consumed after processing (different cooking methods, canning, drying, etc.), but no retention factors were available at the time the database was developed, to calculate values for the flavonoid compounds post-processing. Therefore, the major difficulty encountered during the expansion of the databases was the estimation of retention factors. The NDL in collaboration with Food Composition and Methods Development Laboratory has planned a study to generate true retention factors for different compounds/classes of flavonoids, using different cooking methods. Eggs are a staple food used in many recipes and are thus highly consumed. Analyses of eggs, some meat products and multi-ingredient foods would enhance these databases.
Summary
Plant foods such as fruits, vegetables, legumes and a few grains are the main sources of flavonoids in the human diet. Flavonoids data were not available for animal foods except for isoflavone values for eggs. Only two or three subclasses of flavonoids are predominant in most foods, e.g. flavanones in citrus. Generally, researchers focus on analysing the expected predominant compounds only in the food they are analysing. Therefore, analytical values for the full profile of all the twenty-nine flavonoid compounds included in the FDB-EXP were not available for many foods. In order to properly estimate flavonoid intakes, it is necessary to create full profiles of flavonoid compounds for each food item. The FDB-EXP contains data for twenty-nine flavonoids in six classes for approximately 2900 food items from SR22 which provides the basis for the FNDDS 4.1.
The FDB-EXP will be a valuable tool for epidemiologists to estimate flavonoid intakes of a population of interest, and to study associations between intakes and various chronic diseases. The database contains flavonoid values for a large number of foods in many different forms that are commonly consumed in the United States. It can also be employed as an effective tool to directly address discrepancies that arise when different scientists use different techniques to calculate data for missing compounds or foods in their own databases. Studies using FDB-EXP will have uniformity in the estimation of flavonoid contents in foods. As more analytical data for flavonoids become available through published literature, the need to impute flavonoid values will diminish.
Limitations
The database of 84 854 flavonoid values was formulated on the basis of analytical values directly available for just three percent of the data. Many assumptions were made in order to calculate values for the rest of the data, drawing on the experience of the authors in database development and on consultations with experts in flavonoid analysis. The database contains twenty-nine selected prominent flavonoid compounds for 2926 selected foods, but not other flavonoid compounds that could possibly be present in those foods.
Acknowledgements
The present study was supported by the Office of Dietary Supplements (ODS), NIH, Agreement no. Y1CN5010. The salary of S. I. W.-K. was supported by the University of Maryland, College Park, Maryland (Agreement no. 58-1235-1-98). The ODS and the University of Maryland had no role in the design, analysis or writing of the present article.
The authors' contributions are as follows: S. A. B., D. B. H. and S. I. W.-K. designed and carried out the study, and developed the Expanded Flavonoid Database; D. B. H. developed and executed the algorithms to calculate flavonoid values; S. I. W.-K. sketched the figure; S. A. B. wrote the manuscript; D. B. H., S. I. W.-K. and P. R. P. contributed significant comments. All authors read and approved the final manuscript.
None of the authors has any conflict of interest.