Matching Data to Models

Jamie D. Riggs; Trent L. Lalonde

doi:10.1017/9781316544778.011

The Decision Process of Modeling

Given a data set or a design for collecting data, the data analyst's task is to match the data to an appropriate model. The choice depends on several factors, including the goals of the study, the properties of the data collected, and the nature of the conclusions the analyst would like to draw. In most cases several models can be deemed appropriate for a single data set, and the analyst must select one or more of them to address the goals of the study. The analyst therefore cannot think in terms of “right” or “wrong” models, but must instead balance the relative strengths and weaknesses of the available modeling choices. The analyst must also consider the availability of computing resources, the interpretability of results, and the analyst's own expertise.

Very generally, the data analyst must consider the specific goals or questions the study needs to address, including whether the interest lies in evaluating model effects, in making predictions from the model, or in both. The analyst must also consider the nature of the predictors, including whether they are continuous or categorical, whether interactions between predictors should be considered, whether any predictors should be treated as a source of random variation, whether any predictors are time-dependent, and so on. Perhaps most relevant to the discussions in this handbook, the data analyst must consider the nature of the data collected, including whether the outcome of interest is continuous, skewed, categorical, longitudinal, or otherwise. Figure 10.1 shows some very general properties of the response of interest that the data analyst must determine when matching data to an appropriate model. First, the analyst must determine the type of data representing the outcome of interest, which is represented by the top node, “outcome variable.” Exploratory data analyses corroborate the choice among the three options for the outcome variable given in the second tier of nodes.
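As a rough illustration of the kind of exploratory check described above, the following Python sketch summarizes an outcome variable before a model is chosen. It is a minimal sketch under stated assumptions, not the procedure from the chapter: the function name describe_outcome, the max_categories limit, and the skew_cutoff threshold are all hypothetical choices made for this example. The labels it returns are not the three options shown in Figure 10.1.

```python
from statistics import mean, pstdev


def describe_outcome(values, max_categories=10, skew_cutoff=1.0):
    """Rough exploratory summary of an outcome variable.

    Hypothetical helper for illustration only; the thresholds are
    arbitrary assumptions, not recommendations from the chapter.
    """
    distinct = set(values)
    # Booleans are excluded because Python treats them as integers.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    # Non-numeric data, or numeric data with few distinct values,
    # is treated as categorical in this sketch.
    if not numeric or len(distinct) <= max_categories:
        return {"kind": "categorical", "levels": len(distinct)}

    m = mean(values)
    s = pstdev(values)
    # Moment-based skewness: the mean cubed standardized deviation.
    skew = mean(((v - m) / s) ** 3 for v in values) if s > 0 else 0.0
    kind = "skewed continuous" if abs(skew) > skew_cutoff else "continuous"
    return {"kind": kind, "skewness": skew}


if __name__ == "__main__":
    print(describe_outcome(["low", "high", "low", "medium"]))
    print(describe_outcome(list(range(100))))
    print(describe_outcome([2 ** k for k in range(20)]))
```

A summary like this can only suggest a direction. Whether the outcome is longitudinal, for example, depends on how the data were collected and cannot be read from the values alone.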
