Introduction
To enhance in vivo capacity for animal feed evaluation, the use of the in vitro gas production (GP) technique has increasingly spread over the last decades. Following on from the manual systems of, e.g. Menke and Steingass (Reference Menke and Steingass1988) and Theodorou et al. (Reference Theodorou, Williams, Dhanoa, McAllan and France1994), the capacity of the in vitro GP system of Theodorou et al. (Reference Theodorou, Williams, Dhanoa, McAllan and France1994) was enhanced by a semi-automation modification (Mauricio et al., Reference Mauricio, Mould, Dhanoa, Owen, Channa and Theodorou1999). Further progress was made with the arrival of commercial systems such as the ANKOM equipment (ANKOM, 2015). Moreover, in addition to inoculum derived from rumen fluid, further research has shown that inoculum (microbial and fungal) from non-ruminants and also from faecal matter are viable options (El-Meadaway et al., Reference El-Meadaway, Mir, Mir, Zaman and Yanke1998; Mauricio et al., Reference Mauricio, Owen, Mould, Givens, Theodorou, France, Davies and Dhanoa2001; Nielsen et al., Reference Nielsen, Zhu, Dhanoa, Trinci and Theodorou2002; Lopez et al., Reference López, Dhanoa, Dijkstra, Bannink, Kebreab and France2007). The spectrum of use of the in vitro GP technique now includes characterization of human diets and diet components using inocula from pig (Williams et al., Reference Williams, Mikkelsen, le Paih and Gidley2011) and human faeces (Lu et al., Reference Lu, Flanagan, Mikkelsen, Williams and Gidley2022). The effects of substrate and inoculum treatments on in vitro GP fermentation profiles have been studied with respect to model parameters and functions of those parameters (France et al., Reference France, Dhanoa, Theodorou, Lister, Davies and Isac1993, Reference France, Dijkstra, Dhanoa, López and Bannink2000; Dhanoa et al., Reference Dhanoa, López, Dijkstra, Davies, Sanderson, Williams, Sileshi and France2000). In parallel, analysis of variance (ANOVA) of GP profile data at each time point has also been undertaken. However, ANOVA without accounting for ante-dependence associations will only give gross treatment effects rather than the net effects at those time points in the GP profiles.
Box (Reference Box1950), citing Wilks (Reference Wilks1946), illustrated a compound symmetry test of the hypothesis that all variances (diagonal values) and all covariances (off-diagonal values) are respectively equal (matrix V 0) v. the alternative hypothesis that all variances (diagonal values a, b, c) and all covariances (off-diagonal values d, e, f) are (respectively) not equal (matrix V 1). Illustrating for a case of three-point profile-variates:
Box (Reference Box1954a, Reference Box1954b) subsequently defined an index of sphericity (ɛ Box) to check for this symmetry, obtained following application of the formula (from Abdi, Reference Abdi and Salkind2010):
Here ζij are entries (row i, column j) in the population variance–covariance matrix and α is the number of profile samples.
Following the work of Box, Greenhouse and Geisser (Reference Greenhouse and Geisser1959) proposed a notation for V 1 by assuming that each individual profile is a random vector sampled from a p-variate normal distribution with an arbitrary variance–covariance matrix (Δ as on left side below) with further assumption that the p-variables have the same metric which is necessary if one considers that group profiles have the same shape (Δ on the right side below).
In the case of growth curve and repeated measures profiles, correlations among the near-time observations tend to be larger compared to observations that are a greater time apart. That means the profile data variance–covariance matrix is likely to be similar to V 1 (or Δ on left) rather than V 0. This being the case, Greenhouse and Geisser (Reference Greenhouse and Geisser1959) showed that F ratios involving the time factor variances will become biased and will point to lower F probability values. To mitigate the problem, they proposed a (multiplicative) epsilon (ɛ) correction for degrees of freedoms prior to entering F distribution tables to read F probability values.
Greenhouse and Geisser (Reference Greenhouse and Geisser1959) further discussed how the distribution of F values is affected by various elements of the variance–covariance matrix and hence developed their epsilon correction to reduce degrees of freedoms before calculating F probability values:
In this equation, $\sigma _{t \bullet }^{}$are the elements of the matrix Δ, $\bar{\sigma }_{tt}$ is the mean of the diagonal terms, $\bar{\sigma }_{t \bullet }^{}$is the mean of the tth row (or tth column), $\bar{\sigma }_{ {\bullet}{\bullet} }^{}$ is the grand mean, and p is the number of profile samples. For any p-time point profile, the lower bound of epsilon will be $\varepsilon = {1 / {( p-1) }}$ and it is independent of variance–covariance matrix values (the authors suggest its use when profile time-points are greater than the design units net of profile samples; design units = number of profile samples × number of replicates). However, actual epsilon value based on profile sample data needs to be calculated (using Eqn (1) above and the method described in Abdi, Reference Abdi and Salkind2010) if net design units or subjects are greater than the profile time points (Greenhouse and Geisser, Reference Greenhouse and Geisser1959).
When the epsilon estimate is close to unity, the Greenhouse–Geisser correction tends to be conservative because of their ɛ value being an underestimate whilst the correction of Huynh and Feldt (Reference Huynh and Feldt1976) tends to overestimate ɛ (Abdi, Reference Abdi and Salkind2010). Therefore, recommended practice is to use the Greenhouse–Geiser correction if ɛ < 0.75 but use the Huynh–Feldt correction if ɛ > 0.75. The formula for the Huynh–Feldt correction (from Abdi, Reference Abdi and Salkind2010) is
where $\hat{\varepsilon }$ is the estimate for ɛ; s the number of subjects; and α the number of profile samples.
The objective of our work was to illustrate the use of a modified ANOVA procedure (modANOVA) using the Genstat procedure AREPMEASURES (Payne, Reference Payne2022), which produces an ANOVA for repeated measurements. The procedure modANOVA incorporates the refinements proposed by both Box (Reference Box1950) and Greenhouse and Geisser (Reference Greenhouse and Geisser1959), which are discussed above. Various aspects of relevant methods are simply described but more mathematical details can be found in the cited references. Along with modANOVA, other approaches such as ante-dependence analysis, multivariate ANOVA or residual maximum likelihood parameter estimation in mixed models, provide suitable alternatives for the analysis of repeated measures structured data for the purpose of treatment comparisons. Raw GP profile data are repeated measures in nature, and these alternatives can be applied for elucidating the effects of substrate, additives, treatments or inocula on rumen fermentation kinetics. It should be stated that modelling the GP profile trends of design unit curves remains a valuable tool for the enhancement of experimental information and scientific knowledge in general.
Materials and methods
Handling of GP data profiles resembles, as for growth curves, repeated measurements and sometimes multivariate analysis. There are choices that can be made for the comparison of substrate and inoculum treatment effects. Some of the options available are as follows. These are implemented herein using the statistical software Genstat (VSN International, 2022). Note that the R Project software and SAS, among others, also provide these options.
Option 1
The GP curve for each design unit over the entire incubation period (usually until the upper asymptote is quasi-reached) is modelled with the use of some appropriate function and thereafter ANOVA can be used to compare treatments on the basis of fitted model parameters and measures derived from them. For this purpose, the specially designed model of France et al. (Reference France, Dhanoa, Theodorou, Lister, Davies and Isac1993) or classic growth functions can be used. For the analysis of non-standard GP profiles, recent publications by Powell et al. (Reference Powell, Dhanoa, Garber, Murray, López, Ellis and France2020) and Dhanoa et al. (Reference Dhanoa, López, Powell, Sanderson, Ellis, Murray, Garber, Williams and France2021) cover analysis of monophasic, multiphasic, hybrid-phasic and numerical modelling. Model fitting and ANOVA can be combined by the use of grouped nonlinear regression (Heitjan, Reference Heitjan1989), also called parallel curve analysis or simply regression-ANOVA.
Option 2
Using raw GP data, modANOVA based on the methodologies of Box (Reference Box1950) and Greenhouse–Geisser (Reference Greenhouse and Geisser1959) for dealing with the effects of unequal variances, covariances and correlations among repeated profile measurements, is a suitable option unlike the use of ordinary ANOVA without accounting for correlations among repeated profile measurements taken for all the experimental units ( = sum of replicates for all profile samples = subjects). The modANOVA layout will be similar to that of split-plot design ANOVA (sometimes called split-plot-in-time) but there are important differences with respect to the subplot part F-ratios (see above). The Genstat procedure AREPMEASURES (Payne, Reference Payne2022) implements the above modifications and produces an analysis of variance for repeated measurements and growth profiles as is the case with in vitro cumulative GP data.
Option 3
Another methodology for ANOVA-like analysis is ante-dependence analysis (Gabriel, Reference Gabriel1962; Kenward, Reference Kenward1987), which also deals with auto and serial correlation in the inter-profile curve trend. Correlation order-based tests of treatments can point to any emerging or short-term and overall effects. Once the ante-dependence order is determined, e.g. using the Genstat procedure ANTORDER (Ridout and Payne, Reference Ridout and Payne2022), this analysis is essentially equivalent to standard ANOVA for a given time point with (for example) previous one or two time points as covariates. More detailed analysis is provided by using Genstat procedure ANTTEST (Payne and Ridout, Reference Payne and Ridout2022). Both the procedures ANTORDER and ANTTEST allow actual times of profile samples to be used, thus obviating the need for data to be recorded at equal time points.
Option 4
Using recorded raw data
For the analysis of one or more dependent variables as affected by multiple predictors, multivariate ANOVA (MANOVA; Chatfield and Collins, Reference Chatfield and Collins1986) is a standard method. Growth or repeated measurement profiles can be analysed with the use of MANOVA (Payne and Arnold, Reference Payne and Arnold2022) and it does not require the assumption of sphericity. However, if the time points of the profiles are greater than the main plot residual degrees of freedom (DF = N – g; where N = subjects or design units and g = profile samples) then MANOVA cannot be used. This situation is quite common with GP, growth and repeated measurement profile data. However, ante-dependence analysis (Option 3 above) and mixed model analysis (Option 5 below) can still be used. Furthermore, as an alternative, Option 4b via the use of the inter-data-unit-based distance matrix is also available.
Using the data-based distance matrix
Analysis of multivariate distance methodology devised by Gower and Krzanowski (Reference Gower and Krzanowski1999) is implemented in Genstat via the procedure MVAOD (Payne and White, Reference Payne and White2021). This option provides a breakdown of the sums of squared distances between the design units, similar to that provided for sums of squares in an analysis of variance. The squared distances between the units in a symmetric matrix are either input or calculated from the profile data. The methodology implemented in MVAOD can be regarded as providing an alternative to MANOVA for units whose attributes are not all continuous variables. This method permits profile samples greater than the design residual degrees of freedom. For treatment comparison, a permutation-based probability calculation for treatment effects is available.
Option 5
Residual maximum likelihood (REML), an algorithm for variance parameter estimation in linear mixed models (Patterson and Thompson, Reference Patterson and Thompson1971), can also handle repeated measurement (hence GP) profile data. In the statistical software Genstat, mixed model features for analysing repeated measurements using REML include: (i) time points variate, (ii) treatments formula, (iii) additional random terms formula, (iv) dealing with equally spaced or irregular time-points, (v) model for correlation within subject across time, (vi) allowance for heterogeneity across time, and (vii) additional uniform correlation within subject (for details see VSN International, 2022).
Data set 1
Gliricidia trees provide a browse plant and source of fodder for ruminants in the tropics. The data set reported by Lister et al. (Reference Lister, Dhanoa, Stewart and Gill2000) embraced 25 Gliricidia provenances originating from nine countries (one provenance unidentified) but were grown at a single site in Honduras (El Zamorano at 14° 1′ N, 87° 2′ W). Full details are given in Lister et al. (Reference Lister, Dhanoa, Stewart and Gill2000) where near-infrared spectra-based analysis is described. Samples from these trees were incubated in vitro for up to 140 h using ovine rumen fluid as the inoculum, and cumulative fermentation GP was recorded at 3, 6, 9, 15, 19, 24, 27, 33, 38, 44, 51, 59, 69, 82, 96, 120 and 140 h (17 time points, see observed values in Fig. 1). France et al. (Reference France, Dhanoa, Theodorou, Lister, Davies and Isac1993) used some of the GP curves to test the fit of the Mitscherlich equation to GP data over the incubation time span, so that effects of country of origin on the model parameters (asymptotic GP or half-life from fractional fermentation rate) could be assessed. This data set is used to demonstrate modANOVA using AREPMEASURES (Option 2). For this option, the number of time points must be less than the degrees of freedom for the experimental error applicable to whole unit comparisons (in this particular case equal to the total number of provenances minus the number of different countries of origin, i.e. 25–10 = 15). Thus, only 14 incubation times could be used, and GP values recorded at 69, 120 and 140 h were excluded from the original data set to run the repeated measures ANOVA (Option 2).
Data set 2
These data were generated in three experiments reported by Mauricio et al. (Reference Mauricio, Tomich, Filho, Reis, Goncalves, Borges, Dhanoa, Sandoval-Castro, Hovell, Torres Acosta and Ayala-Burgos2006) using a semi-automatic in vitro GP technique (Mauricio et al., Reference Mauricio, Mould, Dhanoa, Owen, Channa and Theodorou1999). In each experiment, the inoculum used was bovine rumen fluid and 15 GP profiles were obtained by recording cumulative GP at 2, 4, 6, 8, 10, 12, 16, 19, 24, 30, 36, 48, 60, 72 and 96 h of incubation. In Experiment 1, which used five replicates and four treatments (cutting age: 56, 84, 112 and 140 d), the substrate was Andropogon gayanus hay. The data (Data Set 2a) are used for MANOVA and distance-MANOVA (Option 4a and 4b, respectively). Experiment 2 examined four tropical grasses: elephant grass (Pennisetum purpureum), sugar cane (Saccharum officinarum) and two hybrids of sorghum with Sudan grass (Sorghum bicolor × Sorghum sudanense) using three replicates. These data (Data Set 2b) are used for ante-dependence analysis with Genstat procedure ANTTEST (Option 3). Experiment 3 examined silages of sunflower (Helianthus annuus) cut when 100% of the grains were mature. Four genotypes were considered (Rumbossol 91, Victoria 627, Victoria 807 and Mycogen 93338) and three replicates per genotype were used. These data (Data Set 2c) are used for mixed model analysis using the REML algorithm (Option 5).
Results
Modelling the trends of design units generated by repeated measurements adds to the information contained in the raw data in terms of kinetics, model parameters and meaningful functions of those parameters (France et al., Reference France, Dhanoa, Theodorou, Lister, Davies and Isac1993; Powell et al., Reference Powell, Dhanoa, Garber, Murray, López, Ellis and France2020; Dhanoa et al., Reference Dhanoa, López, Powell, Sanderson, Ellis, Murray, Garber, Williams and France2021). This is illustrated in Fig. 1 for Data Set 1, with the lines showing the fitted curves when the diminishing returns equation of Mitscherlich (Reference Mitscherlich1909) is fitted. Results obtained with grouped nonlinear regression (Option 1) are detailed in Table 1. The accumulated analysis of variance from grouped regression shows that the function (Mitscherlich) used to fit the data accounted for most of the variability (93.2%). The inclusion of the effect of y-intercept increased this percentage to 95.4%, whereas the effects of x-axis curve position and curve shape only increased this latter value to a marginal extent. Option 1 allows us to evaluate the suitability of the model to fit the data over incubation time and to determine the effects of the main source of variation among curves (in this case country of origin of the provenances) on the parameters defining the intercepts, position and shape of the GP curves. Other options (2–5) can be used for the analysis of GP profile data, and those outlined in this work should serve that purpose.
df, degrees of freedom; F value, variance ratio; P value, F ratio probability; SE = standard error of observations.
Response variable: GP = cumulative gas production for 25 Gliricidia provenances originating in 10 different countries (locations).
Explanatory variable: t = incubation time (h).
Fitted equation: GP = A + B × Ct ≡ A + B × e–kt,.
where A = upper asymptote; B is a parameter related to the position of the curve in the x-axis; C = e–k (constrained to C < 1), k = fractional rate (parameter determining the curve shape).
Accumulated analysis of variance (from grouped regression).
* Change: ‘+’ indicates addition of individual factor intercepts and origin slopes to overall Mitscherlich equation fit.
*1 model fit = overall Mitscherlich equation fit, i.e. effect of incubation time on GP using the equation.
GP = A + B × Rt, so that the adjusted R 2 for this factor is the percentage of variance explained by the model.
*2 A (origin) = y-intercepts for each origin, i.e. effect of country of origin on the parameter A.
*3 B (origin) = x-axis curve position, i.e. effect of country of origin on the parameter B.
*4 C (origin) = separate nonlinear curves or curve shape, i.e. effect of country of origin of the parameter C.
A modified analysis of variance (modANOVA; Option 2) is illustrated in Table 2 using Data Set 1 (limited to 14 time points). The variance–covariance matrix for these data showed variances within each time increasing from 1.87 (at 3 h) to 105.4 (at 38 h) then decreasing to 31.1 at 96 h of incubation; with a mean value of 54.5. The average covariance and correlation between incubation times were 39.2 and 0.719, respectively. However, there was a large variation in covariance, minimum (1.4) with values at 3 h and maximum (up to 100.8) with values at 33 or 38 h of incubation. Box's test (both χ 2 and F test) indicates that compound symmetry of the variance–covariance matrix cannot be assumed (P < 0.001). Thus, data can be analysed by repeated measures ANOVA without assuming sphericity, using the Greenhouse–Geisser method. The derived Greenhouse–Geisser correction factor for the data (ɛ = 0.124). The modified Box–Greenhouse–Geisser analysis of variance with the adjustment accounting for the value of epsilon is given in Table 2. The analysis showed no significant effect of the country of origin of the Gliricidia provenance on average GP (P = 0.102), which was confirmed in a multiple comparisons of means using the Student–Newman–Keuls test (Thomas, Reference Thomas1973). Although the Gliricidia forage tree provenances were from different countries, samples for in vitro GP were collected from trees grown at a single site. The site's soil attributes and the local environment appear to have had a defining effect on the nutritional quality of these trees. The effect of incubation time is large because these GP profiles were recorded as cumulative GP increased steadily as incubation time lengthened. If interest is in the net values at the time-points then the first difference of the trends of the profile curve may be used instead, as suggested by Box (Reference Box1950).
df, degrees of freedom; F value, variance ratio; P value, F ratio probability.
a The analysis considers the 25 Gliricidia provenances as the whole units or subjects, and the 14 incubation times as the subunits within the whole units.
b The df in the within subject stratum were multiplied by the correction factor (Greenhouse–Geisser ɛ = 0.1241) before calculating F probabilities (if no epsilon correction is applied the P value would be 0.00014428).
Ante-dependence analysis (Option 3) is demonstrated in Table 3 using Data Set 2b. Application of the procedures ANTORDER and ANTEST brings out details about the serial- or auto-correlation structure–property of the data set profiles. First, the extent of serial correlation between zero-, first- and second-order (or higher) is determined (first part of Table 3). After pairwise testing of serial order, pooled order evidence is garnered for the overall order (order 2 in this case; the second part of Table 3) for ante-dependence analysis using ANTTEST. Assuming order 2, the third part of Table 3 details treatment effects (3 columns after the time column) at each data profile step. These three columns give cumulative treatment effects up to each relevant time point. In this case, treatment effect significance is established from 6 h of incubation (incubation time number 3 in the sequence) onwards.
Assessment of the order of ante-dependence for repeated measures data (GENSTAR ANTORDER procedure)
Comparison of ante-dependence structures with maximum order (4).
From the above two tests an ante-dependence order 2 is appropriate for ANTTEST procedure.
Overall tests of treatment terms assuming an order 2 of ante-dependence structure (GENSTAT ANTTEST procedure).
Tests ‘at each time (*1)’ and ‘test up to each time (*2)’ illustrate emerging effects of the treatments.
Overall test using data from all the times: statistic 164.289, df 48, P < 0.001.
Multivariate analysis of variance and covariance (MANOVA; Option 4a) using recorded raw GP data (Data Set 2a) is illustrated in Table 4. The procedure analyses the data variates by ANOVA first as y-variates, and then as covariates in order to obtain the sums of squares and products (SSP) matrices. The SSP matrices are then adjusted for the covariates (e.g. using matrix manipulation in Genstat's CALCULATE directive; VSN International, 2022), and latent root and vector decompositions are done before the test statistics are calculated (using CALCULATE). MANOVA calculates Wilks lambda (with approximate F test) and the Pillai–Bartlett trace among other test statistics (Chatfield and Collins, Reference Chatfield and Collins1986). Wilks lambda test demonstrates the strength of the relationship between the dependent variable (MANOVA allows more than one dependent or y-variable) and the predictor variables (the lower the value of lambda the better the explained variation). The actual lambda value indicates the proportion of total variation not explained by the MANOVA model; this is rather different from an ANOVA F test which tests the explained part of total variance. The lambda test provides an overall test of the significance of treatment effects. Rao's F probability provides an expansion of Wilks's criterion (Rao, Reference Rao1951). If the probability is ⩽0.05 then significant differences among treatment means are indicated. The profile sample times need to be equal for MANOVA. If data are recorded at unequal time points, then it will be necessary to calculate interpolated data at equal time points (Dhanoa et al., Reference Dhanoa, López, Powell, Sanderson, Ellis, Murray, Garber, Williams and France2021).
The GP recorded at each incubation time constituted one of the y-variables included in this MANOVA (for a total of 15 y-variables).
* Permutation probabilities based on 999 random permutations.
Multivariate analysis of design inter-unit distances (MVAOD; Option 4b) using Data Set 2a is demonstrated in Table 5. Data variates may not all be continuous and distributional requirements may not be fully complied with for standard MANOVA (Option 4a). Additionally, MVAOD provides an alternative when the size of profile samples is greater than net design units. In MANOVA, actual data values are used but the alternative methodology of MVAOD uses design inter-unit distances across all profile samples generating a dissimilarity matrix. Instead of the sum of squares of actual data values as in simple ANOVA, in this analysis, the sum of squared distances is used. The total squared distance between the units is partitioned into components that can be explained by each of the terms (e.g. treatment factors) in the design-based model. The advantage of this distance-based analysis is that it can handle various types of data. The units may often be described by a collection of attributes ranging from continuous measurements to categorical variables, such as the presence or absence of a particular feature. Output from MVAOD looks similar to the one-way ANOVA output but there is no F test available, instead treatment significance probability is calculated from 999 random permutations. The distance matrix from this analysis can be used for further multivariate analyses, e.g. by converting the distances to similarities and then using them as input for a principal coordinates analysis (see PCO directive of Genstat; VSN International, 2022). Summary distances among treatment means shown at the end of Table 5 can be used to look for apparent differences (in this case distances between treatment four and other treatments are relatively large).
* Probabilities determined from 999 random permutations.
Distances among treatment levels.
a These distances are relatively large.
Residual maximum likelihood-based mixed model analysis (REML; Option 5) using Data Set 2c is illustrated in Table 6. Any of the available options for the analysis of linear mixed models can be used when applying the Genstat REML directive, including the one for repeated measurements which is relevant for the analysis of in vitro cumulative GP profile data. In GP data the replicated design units are the substrate (replicated) samples and are treated as the random effects along with treatments as the fixed effects. Times of the profile readings can be unequal but need to be one of the REML model inputs along with many other model modifying options (for details see VSN International, 2022). In Table 6, information and options of the REML model are listed before the listing of tests for fixed effects. These tests include the Wald statistic (Ward and Ahlquist, Reference Ward and Ahlquist2018) along with the F statistic and F probability. The fixed effects part also includes information about deleting some terms from the REML model. Here REML suggests dropping the time and treatment interaction term will make a significant change. Like ANOVA and modANOVA (Option 2) superscripts can be calculated for treatment differences (Bonferroni multiple comparisons are used herein; Hsu, Reference Hsu1996). The Akaike information criterion ( = 725.31; Snipes and Taylor, Reference Snipes and Taylor2014) and Schwarz (Bayesian) information criterion ( = 940.30; Verbyla, Reference Verbyla2019) are measures of REML model goodness of fit, calculated in Genstat using the VAIC procedure (Payne and Cave, Reference Payne and Cave2022). These coefficients are a simple way of ranking the models.
a Denominator degrees of freedom for approximate F-tests are calculated using algebraic derivatives ignoring fixed/boundary/singular variance parameters.
REML variance components analysis.
Response variate: gas production (GP).
Fixed model: constant + genotype + incubation time + (genotype × time).
Random model: (genotype × replicates × time) – used as residual term.
Number of units: with three replicates per genotype (4 × 3 × 16) = 192.
Bonferroni test
Akaike information coefficient 725.31.
Schwarz Bayes information coefficient 940.30.
Multiple comparisons between Genotype means.
Discussion
In vivo digestibility and in situ nylon bag methods (e.g., Blaxter et al., Reference Blaxter, Graham and Wainman1956; Ørskov and McDonald, Reference Ørskov and McDonald1979) provided means of assessing animal feed quality but due to the limited capacity of these procedures, laboratory methods were sought to find in vitro quality measurements. Menke and Steingass (Reference Menke and Steingass1988) developed the initial in vitro GP technique to speed up forage matter quality testing using rumen liquor as the inoculum. Semi-automated and automated methods of measuring cumulative GP subsequently followed (e.g., Mauricio et al., Reference Mauricio, Mould, Dhanoa, Owen, Channa and Theodorou1999; ANKOM, 2015). Non-invasive use of many animal species (including humans) led to the use of faecal matter-based inocula. Given the ever-increasing use of in vitro GP to evaluate a variety of plant products, food processing by-products, animal feeds and human diets, and such a variety of methods for generating cumulative GP profiles, rapid analysis of GP profile data and sufficient statistical analysis options are needed. Options outlined in this work should serve that purpose. There is always the application of substrate and inoculum treatments that need statistical evaluation for the detection of any significant effects. In addition to modelling the profile-related time dimension of subject curves and the analysis of model parameters and their functions, direct analyses of raw profile data under five methodology options are presented in this paper that can be applied even when individual design unit trends may have different shapes.
Results from the five options, as implemented in the statistical software Genstat, show flexibility of methodology choice and ease of application. It is worth re-iterating that the statistical analyses proposed herein can be implemented in any other statistical software or programming language commonly used (e.g., R Project, SAS, SPSS). The choice of analytical approach (Options 2–5 delineated herein) and successful application depend on the experimental design of the GP study, profile samples and replication of any GP experiment. Some profiles hold valuable kinetic information, so the modelling option (i.e. Option 1) is useful either as the preferred option or as a supplemental one for obtaining further information. Modelling can also identify multiphasic features of the response profile that allow split-curve information to be generated.
In addition to recording the amount of gas produced, the status of substrate degradation can provide further useful information. By recording remaining substrate dry matter (DM) at various stages of fermentation and the final DM remaining, in vitro GP can be a suitable surrogate for in vivo digestibility (Lowman et al., Reference Lowman, Theodorou, Hyslop, Dhanoa and Cuddeford1999), in situ substrate degradation in the rumen (Sileshi et al., Reference Sileshi, Owen, Dhanoa and Theodorou1996) or in vivo extent of feed degradation in the rumen (France et al., Reference France, Dijkstra, Dhanoa, López and Bannink2000). The study of substrate degradation kinetics in the rumen proper is complicated by the particle fractions lost with digesta passage, and an advantage of GP is that substrate can only disappear via microbial degradation (Dhanoa et al., Reference Dhanoa, López, Dijkstra, Davies, Sanderson, Williams, Sileshi and France2000). France et al. (Reference France, Dhanoa, Theodorou, Lister, Davies and Isac1993) linked GP information to that obtained from the rumen proper, proposing a formula for calculating ruminal extent of substrate degradation while accounting for relevant passage rate and hence the losses. Another facet of the GP technique is fluid from the incubation bottles can be used to determine the volatile fatty acid composition of the test feed and hence the C 2:C 3 ratio (Morvay et al., Reference Morvay, Bannink, France, Kebreab and Dijkstra2011). In GP studies, sufficient replicates are required to derive robust and representative GP profiles and reliable fermentation kinetics that can be extrapolated to the rumen itself. In addition, replication is required to analyse the effects of substrate and inoculum on fermentation kinetics, or for the comparison among experimental treatments applied in each study.
The statistical analysis methodologies identified and illustrated in this communication are for ANOVA-like comparison of substrate and inoculum treatments in any study-design generated GP profile. For a single-phase GP curve, grouped regression is straightforward to implement and to make treatment comparisons in terms of trend shape and position of the GP curve. However, this will become more complicated if GP curves consist of multiphasic subsections, which appear more likely to arise when using voided dung rather than rumen fluid as the inoculum. For modelling identifiable bi-phasic and tri-phasic GP curves, Dhanoa et al. (Reference Dhanoa, López, Powell, Sanderson, Ellis, Murray, Garber, Williams and France2021) used single- and double-node connected functions (either a doubled-up same function or hybrid pairs). Modified ANOVA for repeated measures splits variance between and within plots and considers the structure of the variance–covariance matrix to assess the statistical significance of treatments and time effects. Ante-dependence analysis is useful to detect the critical incubation times at which differences between treatments become relevant. Multivariate ANOVA (i.e. MANOVA) integrates the information gathered at all incubation times to analyse the comparison among treatments. Too many profile samples over the plateau part of GP curves may not add much to curve information, likewise over the lag phase, but can make MANOVA use difficult. So, some profile samples may be best left out from the selected analysis. If possible, GP profile readings should be taken at equal time intervals (it is a requirement for Options 2–4a described above), otherwise interpolation is needed to obtain equal time intervals from the trends of each design unit as described in Dhanoa et al. (Reference Dhanoa, López, Powell, Sanderson, Ellis, Murray, Garber, Williams and France2021). The MANOVA option, like modANOVA, requires that the number of profile time points should be less than the number of design units (the other options do not have this requirement). Option 4b (MVAOD) is an alternative for a multivariate analysis when time intervals are unequal. Finally, repeated measures designs can be analysed as mixed models using REML methodology, a useful tool to discriminate the effects of experimental treatments from that attributed to the incubation time. The choice of one or another option, or the implementation of a variety of them for the analysis of a given data set, will depend on the objectives of each particular study.
The profile data analysis options described in this communication lend themselves to wider applications. In the in vitro GP method, a second profile set can be obtained by analysing bottle contents, at the end of incubation, for volatile fatty acids (e.g. acetate and propionate). Profile data responses are common in the agricultural, medical, environmental and other applied sciences. In animal science, for example, sets of diet proteins, fatty acids, minerals, amino acids, etc. should be analysed in totality because of their inter-correlations (multicollinearity). MANOVA and mixed model options are applicable. If comparing monthly climate data at two or more geo-sites using a number of years' data, then the options modANOVA, MANOVA, MVAOD and REML may help. The chemical composition of dairy cow milk at various stages of lactation could be handled using the options described herein. Diurnal patterns of water-soluble carbohydrates in crops are a feature much studied by plant scientists. For designed experiments to test selected treatments, many covariates in addition to the response variate may be recorded, so ANOVA (with or without covariates) tends to be used. Instead of covariates, the response variate might be observed multiple times leading to repeated measurement profile data and options described here will be needed.
Conclusion
The options described herein allow analysis of raw GP data in addition to application of the curve fitting approach. Results from the ante-dependence methodology are indicative of further insights into one's data that should lead to better understanding and improved inference from the relevant experiment. This option, in particular, can be used in conjunction with the other options to extract additional information from the experimental data. The options described are suited to wider application in the applied biological sciences generally.
Author contributions
Conceptualization: MSD, SL and JF; methodology: MSD, RS, SJL, RMM, SL and JF; formal analysis: MSD, RS, SL, JLE, CDP and JF; writing – original draft preparation: MSD and JF; writing – review and editing: MSD, RS, SJL, RMM, SL, JLE, CDP and JF. All authors have read and agreed to the published version of the manuscript.
Funding statement
This work was funded in part by the Canada Research Chairs program, grant number 045867 (Natural Sciences and Engineering Research Council of Canada, Ottawa).
Competing interests
None.
Ethical standards
Not applicable.