While the first half of the 20th century was a notable period of discovery in nutritional sciences, with elucidation of the structures and roles of essential nutrients, the second half was dominated by the challenge of understanding how an increasingly affluent diet might be related to dramatic increases in CVD, cancer and, latterly, neurodegenerative disease. The methodology and approaches used to investigate diet–disease relationships differ significantly from those of the discovery-science era of the first part of the century. Understanding diet–disease relationships requires the exploration of evidence from human populations using epidemiological approaches, which were then only beginning to emerge(1). Studies such as comparisons of trends in disease prevalence in countries with differing dietary cultures (cross-cultural studies) and comparisons of the habitual diets of people who had developed specific chronic diseases with those of healthy matched control subjects (case-control studies) were just beginning to be reported. As the sub-specialisation of nutritional epidemiology progressed, many advanced countries invested funding to establish large-scale prospective population studies. These involve recruitment and follow-up of cohorts of healthy people to determine whether their habitual diets are related to subsequent risk of developing specific chronic diseases (i.e. prospective cohort studies (PCS)). Although PCS were considered the most rigorous type of observational study, their greater cost and lengthy duration meant there were relatively few of them in the 1960s. Their major limitation was considered to be their lack of causal certainty compared with randomised controlled trials (RCT) of morbidity and mortality, where cause (diet) and effect (diagnosis of disease or death) provide unambiguous outcomes(1).
More recently there have been criticisms of the use of RCT in studying diet–cancer relationships due to costs, poor compliance and lack of generalisability(Reference Giovannucci2).
Limitations in ascribing causality from observational studies are largely due to the possibility that observed associations between diet and disease may be confounded by factors other than diet, a weakness that has repeatedly dogged nutritional epidemiology. The members of the expert group which published the first policy recommendation for dietary prevention of CVD in the UK in 1974 were faced with limited evidence to support causality(3). Although the group agreed recommendations for reductions in population intakes of fat and saturated fats for prevention of CVD, the chair of this report commented that ‘Because of the complex nature of the evidence and because conflicting interpretations of it are possible, we have not always been able to reach agreed conclusions’(3).
In 1965, Bradford Hill(Reference Bradford-Hill4), a UK researcher involved in developing epidemiological methods in nutrition, had proposed a number of key criteria which he and others considered necessary to be fulfilled if a causal link between diet and disease was to be assigned (Table 1). Although the precise wording of individual criteria and the weighting placed on them have been modified over time, they have played an important part in supporting developments in the design, statistical analysis and synthesis of observational studies (notably PCS) throughout the late 20th and early 21st centuries.
* Established in 1965 by the English epidemiologist Sir Austin Bradford Hill; their exact wording, application and limitations continue to be debated, including their applicability in the 21st century. Modified from Williams et al.(Reference Williams, Ashwell and Prentice5).
Research underpinning dietary policy
As described in a position paper from the Academy of Nutrition Sciences(Reference Williams, Ashwell and Prentice5) major advances have been made in the design, statistical analysis and quality evaluation of population-based research. Systematic reviews of individual studies, including meta-analyses and pooled studies, have largely replaced narrative accounts of individual studies which can lead to bias due to selective use of well-known and well-cited studies. Greater statistical power from summation of data enables causal characteristics of the data such as effect size, dose-response, consistency, specificity and temporality to be addressed more fully (Table 1).
Fig. 1 (a and b) shows an example of this type from a meta-analysis of body weight gain in adulthood and risk of postmenopausal breast cancer published in the World Cancer Research Fund (WCRF) 3rd report(6). The analysis shows a high level of consistency between studies and a clear dose-response (Fig. 1b), with the individual studies also fully adjusted for confounding. This analysis for body weight and postmenopausal breast cancer in the WCRF report was graded as ‘convincing’ (the highest level). The overall report demonstrated that overweight and obesity were related to cancer at thirteen different sites, with overall levels of certainty rated either ‘probable’ or ‘convincing’.
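The gain in precision from pooling studies can be illustrated with a minimal sketch of fixed-effect, inverse-variance pooling of relative risks on the log scale. This is a standard textbook technique and not necessarily the model used in the WCRF report; the function name and the three study estimates below are hypothetical and are included only to show the arithmetic.

```python
import math

def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooling of relative risks.

    `studies` is a list of (rr, ci_low, ci_high) tuples with 95 % CIs.
    Pooling is done on the log scale; the SE of each log RR is
    recovered from the width of its confidence interval.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for rr, ci_low, ci_high in studies:
        log_rr = math.log(rr)
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
        weight = 1.0 / se ** 2          # more precise studies count for more
        total_weight += weight
        weighted_sum += weight * log_rr
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * pooled_se),
        math.exp(pooled_log + 1.96 * pooled_se),
    )

# Hypothetical study results (NOT taken from the WCRF report).
studies = [(1.05, 0.98, 1.13), (1.08, 1.01, 1.15), (1.04, 0.95, 1.14)]
pooled_rr, pooled_low, pooled_high = pool_fixed_effect(studies)
print(f"pooled RR {pooled_rr:.3f} (95% CI {pooled_low:.3f}, {pooled_high:.3f})")
```

In this sketch the pooled confidence interval is narrower than that of any single study, which is the sense in which summation of data increases statistical power. A random-effects model would be the usual alternative where between-study heterogeneity is suspected.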
The effect size in this breast cancer study appears small (6 % increase in risk) but this risk applies to a large number of women in the UK since, according to the Health Survey for England, the average body weight gain for UK women in the period 1993–2019 was estimated to be in the region of 4–6 kg. The survey reported that the prevalence of obesity among adult women was 16 % in 1993 and had increased to 29 % by 2019(7). In a recent study of two cohorts of young women separated by 16 years in birth date, the women born later (in 1989–95) were 4 kg heavier at age 21 years than the earlier cohort born in 1973–78. The projected trajectory of weight gain indicated that women born in the 1989–95 cohort would be 17 kg heavier at the age of 41 years than the 1973–78 cohort(Reference Brown, Flores and Keating8). The high population attributable risk of this common cancer, with a strong prediction that it will increase over the next 20–30 years due to continuing trends for weight gain in the UK, suggests that a better understanding of the mechanism(s) underlying this association is urgently required.
Despite the importance of PCS, criticisms have been levelled at this study type for reasons including: over-reliance on data from prosperous countries; reports of differing conclusions from meta-analyses based on identical data; and the long duration of follow-up needed to allow accrual of cases (in some cases >50 years), which increases the risk of changes in confounding factors such as habitual diet, food composition, smoking, alcohol and prescription drugs. RCT of diet and disease outcomes have also been criticised for use in diet–disease studies, especially cancer, due to costs, logistical difficulties of ensuring compliance with long-term diets and lack of generalisability of their outcomes to whole populations(Reference Giovannucci2).
Data from RCT on free-living subjects of intermediate markers (or phenotypes) for chronic disease (e.g. BP, cholesterol, platelet aggregation for CVD; glycated Hb and postprandial glucose for diabetes and metabolic syndrome)(Reference Sacks, Appel and Moore9) provide an important part of the evidence-base for dietary prevention of chronic diseases. Unlike RCT for disease outcome, which require years of follow up, these types of studies can be highly controlled over periods of months to a year and can be conducted in a wide range of subjects including at-risk and healthy individuals. By investigating potential intermediate phenotypes, these data can also contribute to the proposed two-step framework for studying mechanisms underlying diet and disease relationships described later.
Taking a wider view it is clear that no single type of study can provide the level of certainty required by governments and organisations when making policy recommendations for prevention of diet-related disease. Observational epidemiology (usually PCS), RCT of morbidity and mortality (where these are available) and RCT of intermediate risk markers for disease, all play an important part in the final synthesis of the evidence for causality in diet and disease relationships.
Mechanistic data in understanding diet–disease relationships
Although data from PCS provide the possibility of assessing causal criteria such as consistency, temporality, dose-response, lack of confounding, etc., they are not able to examine the mechanisms underlying observed associations. Most expert group reports include consideration of mechanistic findings and discuss these in relation to the other types of evidence used to assess causality. However, transparent, systematic selection and examination of the mechanistic literature for quality and relevance is not normally undertaken and there is currently no agreed framework for undertaking this type of analysis. This raises the possibility of bias from selective use of supportive studies and reveals a potential gap in the rigorous approach that applies to the greater part of expert group work.
To an extent the gap in the evidence-base is understandable given the variety of mechanistic hypotheses put forward and the vast number of ‘diet’ studies in the literature involving cell and animal models, which make any systematic examination a daunting prospect. Some reported associations from PCS provide insight into potential mechanisms, particularly where intermediate phenotype or genetic data are also available. For example, the reported associations between weight gain and risk of postmenopausal breast cancer have been proposed to be due to impact of adiposity on insulin resistance, decreased levels of steroid hormone-binding proteins and increased production of oestrogen by adipose tissue after the menopause, which, collectively, increase bioavailability of oestrogen to tissues. The oestrogen supply hypothesis is supported by observation of stronger impact of overweight on risk of breast cancer for women where an oestrogen and progesterone receptor-positive tumour is present than is the case for overweight women where the tumour is sex hormone receptor-negative(6). Greater adipose tissue mass may also be a marker of excess consumption of food and energy during adult life, leading to disruption of normal cellular energy metabolism or via provision of growth-promoting nutrients, e.g. specific fatty acids. It is equally possible that each of these mechanisms may help explain the obesity–breast cancer relationship.
Developing systematic methodologies for selecting and synthesising mechanistic studies which underlie diet–disease relationships
A more systematic approach to this part of the evidence-base is required, ensuring a level of rigour comparable to that used for other types of studies by most expert groups, with clear criteria for selection of studies and grading of the overall evidence according to quality, rigour and relevance. This should enable discrimination of those studies that can usefully address current diet and health policy questions from those that cannot. Although this may appear a relatively straightforward challenge, heterogeneity in the mechanistic models and design characteristics used makes the evaluation and synthesis of work in this area more demanding.
This issue has received little attention or discussion in the literature. The work described later by a multidisciplinary group at the University of Bristol and commissioned by the WCRF is the first to publish a potential framework for objective selection, assessment and integration of mechanistic studies for the purpose of strengthening the evidence-base underpinning policy in diet and health. The work focuses on diet–cancer relationships and is still at an early stage, with the methodology for the framework(Reference Lewis, Gardner and Higgins10) and a number of proof of principle papers published in collaboration with other researchers between 2017 and 2021(Reference Harrison, Lennon and Holly11–Reference James, Dimopoulou and Martin13). Challenges lie in selecting and assessing the quality of different types of models (cell, animal, human subjects) and synthesising the heterogeneous data into a visual format for evaluating the effect size and direction for particular mechanistic pathways. As noted by the authors, the ability to apply the model is highly dependent on the quality of the studies and the extent to which relevant data are available in the published report(Reference James, Dimopoulou and Martin13).
The framework proposed by Lewis et al.(Reference Lewis, Gardner and Higgins10) involves using a two-step process, with the first step a synthesis of data for associations of a dietary exposure with an intermediate phenotype (step 1 is exposure to intermediate phenotype (IP)), and the second step an assessment and synthesis of data associating the selected IP with the specific cancer site of interest (step 2 is IP to cancer). The authors proposed the use of existing methodologies for assessing the quality of human(Reference Higgins, Altman and Gøtzsche14) and animal(Reference Hooijmans, Rovers and de Vries15) studies, with the use of GRADE (Grading of Recommendations, Assessment, Development and Evaluation)(Reference Guyatt, Oxman and Vist16) for assessing the strength of the overall evidence. In their pilot and feasibility studies, Lewis et al.(Reference Lewis, Gardner and Higgins10) found the quality and reproducibility of the cell studies to be poor although they noted better authentication and quality control criteria for publication of cell studies have recently become a requirement for most journals.
This two-step approach involving intermediate phenotypes is already familiar in nutrition for studying diet–disease relationships. Due to the lack of RCT data that can directly link a dietary exposure with a disease outcome, the use of intermediate phenotypes is a standard approach, but in the context of nutrition they are usually termed surrogate or risk markers. For example, circulating cholesterol and BP are well-established IPs (or surrogate risk markers), with many examples of well-controlled dietary trials demonstrating effects of diet (exposure) on BP and on cholesterol(Reference Sacks, Appel and Moore9,Reference Williams, Francis-Knapper and Webb17), and of clinical trials of BP- and cholesterol-lowering drugs on CVD (outcome)(Reference Ettehad, Emdin and Kiran18,Reference Wang, Woodward and Huffman19). However, in the case of cancer, the putative IPs are many and varied, and agreement has not been reached as to which are most important in terms of acting along the causal pathway and for which there is also evidence of demonstrable effects of diet on the IP in question. Intermediate phenotypes for cancers may be present in the circulation, in immune cells or at the sites of the tumour and could include markers of genomic instability, DNA repair, mutations, cytokines, hormones, insulin-like growth factors (IGF) and others. To overcome this, the authors(Reference Lewis, Gardner and Higgins10) proposed a hypothesis-free approach to selecting putative IPs using a text mining methodology (text mining for mechanism prioritisation; www.temmpo.org.uk) which visualises multiple diet-IP and IP-cancer pathways, with results displayed using a Sankey plot (https://en.wikipedia.org/wiki/Sankey_diagram). IP prioritisation involves an algorithm which scores candidate IPs according to their relative dominance against all IPs studied.
Once a putative IP is chosen, systematic reviews can be undertaken for: (1) association of diet with the chosen IP, and (2) association of the IP with a specific cancer. In a proof of principle study(Reference Lewis, Gardner and Higgins10,Reference Harrison, Lennon and Holly11) using milk as the dietary exposure and IGF-1 as the IP, the investigators examined evidence for a role of the IGF pathway in the reported associations between milk and risk of prostate cancer (Fig. 2). They determined for this diet–cancer relationship that, for step 1, the heterogeneity of exposure data and of models (experimental and observational, human subjects and animals) would not allow a standard meta-analysis with forest plots to be used. They developed a novel graphical approach (an Albatross plot) which, like a meta-analysis, allows the strength and direction of the association to be displayed. Each study is plotted as a point, with its P value on the x-axis and its number of participants (sample size) on the y-axis. Data points clustering to the right side of the graph indicate a positive association, and points clustering to the left a negative association. Distribution across left, right and centre indicates no clear association. Contour lines, each corresponding to a specific β coefficient, can be drawn to indicate visually the magnitude of the association, but this will not provide as precise an effect estimate as would be possible with a forest plot.
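The geometry of such a plot can be sketched in a few lines of Python. This is an illustrative approximation, not the authors' implementation: it assumes that the standardised β can be treated like a correlation coefficient, uses a normal approximation for the P value, and places points on a signed −log10(P) x-axis so that positive associations fall to the right. The function names are hypothetical.

```python
import math

def two_sided_p(beta, n):
    """Approximate two-sided P for a standardised beta at sample size n.

    Assumption: beta behaves like a correlation coefficient, so
    t = beta * sqrt(n - 2) / sqrt(1 - beta^2); the t statistic is then
    referred to a normal distribution (reasonable for large n).
    """
    t = beta * math.sqrt(n - 2) / math.sqrt(1 - beta ** 2)
    return math.erfc(abs(t) / math.sqrt(2))

def plot_coordinates(p, n, direction):
    """Map one study to (x, y): signed -log10(P) against sample size.

    direction is +1 for a positive association and -1 for a negative one.
    """
    return direction * -math.log10(p), n

def contour(beta, sample_sizes):
    """Points along the contour line for one fixed effect size."""
    direction = 1 if beta >= 0 else -1
    return [plot_coordinates(two_sided_p(beta, n), n, direction)
            for n in sample_sizes]

# Contour for beta = 0.1 over a range of hypothetical sample sizes.
line = contour(0.1, [50, 100, 200, 400, 800, 1600])
for x, n in line:
    print(f"n = {n:5d}  x = {x:6.2f}")
```

Under these assumptions a study lying on the β = 0⋅1 contour moves further to the right as its sample size grows, which is why the same effect size traces a curve rather than a vertical line. A study can therefore be read against the nearest contour to gauge its approximate effect size.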
To estimate the effect size for the association between milk and IGF-1, the authors used an Albatross plot, which showed a positive association between milk/dairy and IGF-1 with a β coefficient of 0⋅1, i.e. a 0⋅1 standard deviation increase in IGF-1 for a 1 standard deviation increase in exposure (milk/dairy). Estimates of 1 standard deviation for milk consumption range from 200 to 350 ml/d. Insulin-like growth factor-binding protein (IGFBP)-3 also showed a positive association with milk consumption, with a β coefficient of 0⋅055, but no associations were shown for other IGFs (Fig. 2).
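As a rough guide to what a standardised β of this size means in practice, the arithmetic can be sketched as follows. The sketch assumes a linear relationship across the range of intake, and the 250 ml/d increment is a hypothetical figure chosen for illustration; only the β of 0⋅1 and the 200–350 ml/d range for 1 standard deviation come from the text above.

```python
def igf1_shift_in_sd(extra_milk_ml_per_day, milk_sd_ml_per_day, beta=0.1):
    """Expected change in IGF-1, in SD units, for a change in milk intake.

    Assumes linearity: the intake change is converted to SD units of
    exposure and multiplied by the standardised beta.
    """
    return beta * extra_milk_ml_per_day / milk_sd_ml_per_day

# Hypothetical increment of 250 ml/d, evaluated at both ends of the
# reported range for 1 SD of milk consumption (200-350 ml/d).
lower = igf1_shift_in_sd(250, 350)
upper = igf1_shift_in_sd(250, 200)
print(f"IGF-1 shift: {lower:.3f} to {upper:.3f} SD")
```

On these assumptions an additional 250 ml/d of milk corresponds to an IGF-1 increase of roughly 0⋅07 to 0⋅13 standard deviations, depending on which estimate of the exposure standard deviation is used.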
In this proof of principle study(Reference Lewis, Gardner and Higgins10,Reference Harrison, Lennon and Holly11) the authors showed there was moderate evidence for an effect of milk on two components of the IGF pathway (IGF-1 and IGFBP-3), but only IGF-1 showed a positive association with prostate cancer, whilst IGFBP-3 showed a negative association. The authors concluded there was moderate evidence for increased cancer risk with IGF-1. Although the negative association with IGFBP-3 could potentially attenuate the adverse impact of milk on prostate cancer, no firm conclusion was drawn by the authors(Reference Lewis, Gardner and Higgins10,Reference Harrison, Lennon and Holly11). This may be because the overall strength of the evidence was low, since most of the human studies were observational with high risk of bias according to GRADE(Reference Guyatt, Oxman and Vist16). There were a small number of animal studies in the analysis, but these scored low due to lack of RCT, limited experimental information, limited information on consistency and high potential for publication bias.
Independent of the Bristol group, two teams from the Netherlands and Germany separately tested the two-step framework to assess evidence for associations between body fatness and breast cancer(Reference Ertaylan, Le Cornet and van Roekel20). Their analysis provides useful independent reflection on the feasibility and utility of the framework, and recommendations for its future use, as outlined in Table 2. They raised questions regarding the use of GRADE for mechanistic studies due to the emphasis of this approach on RCT. Also considered were issues such as the high workload and the overall utility of some analyses, including the large numbers of cell studies identified from the open search, of which limited use was subsequently made in the overall assessment process.
IP, intermediate phenotype; GRADE, grading of recommendations, assessment, development and evaluation; SYRCLE, systematic review centre for laboratory animal experimentation; RCT, randomised controlled trial.
Conclusions
The lack of a transparent, systematic framework for the assessment, synthesis and grading of mechanistic studies for the purposes of advancing policy recommendations in the area of diet and chronic disease is a significant gap in the current evidence-base. A framework approach, similar to that used for assessing human population studies, has been developed by a group of multidisciplinary scientists working in the field of diet and cancer. Their work proposes novel approaches to the considerable challenges of assessing the quality of the heterogeneous models used in this area and of analysing them visually and statistically. Proof of principle studies have been published, with further collaborative work ongoing. More studies are needed to determine whether this approach can be applied to other types of chronic disease known to be influenced by diet. There is a need for more appropriate methods for scoring study quality and the overall strength of the mechanistic evidence-base for diet, which may need to differ from those used for studying observational associations of diet and disease.
The interrogative and critical approach of expert groups concerning the quality and utility of observational data over the past 50 years has led to significant advances in this type of research. There is similar potential for mechanistic research in diet and health, but its current use in policy is limited by the lack of quality and relevance of much of the published data. It is acknowledged that research into potential dietary mechanisms is, in itself, as worthwhile as research in other advanced areas of biology. However, where research funding has been awarded on the basis of policy relevance, studies should be designed to support biological plausibility in human subjects, which requires that the studies use levels of dietary exposure and address mechanisms which are relevant to human subjects.
Acknowledgements
Thanks to Professor Richard Martin, senior author in the Bristol/WCRF studies, for providing further information on published studies.
Financial Support
None.
Conflict of Interest
C. M. W. has chaired and co-chaired the WCRF Grant Panel for the past 5 years. However, she has not been involved as an assessor or recipient of grant funding for any of the WCRF-funded work outlined in this paper.
Authorship
The paper is based on an invited presentation given by C. M. W. at the Cork Nutrition Society Meeting. C. M. W. is the sole author responsible for drafting and finalising all aspects of the manuscript.