Book contents
- Frontmatter
- Contents
- List of Figures
- List of Tables
- List of Boxes
- Acknowledgments
- 1 Introduction
- PART I R AND BASIC STATISTICS
- 2 Introduction to R
- 3 Looking at Data – Numerical Summaries
- 4 Looking at Data – Tables
- 5 Looking at Data – Graphs
- 6 Transformations
- 7 Missing Values
- 8 Confidence Intervals and Hypothesis Testing
- 9 Relating Variables
- PART II MULTIVARIATE METHODS
- PART III ARCHAEOLOGICAL APPROACHES TO DATA
- References
- Index
4 - Looking at Data – Tables
from PART I - R AND BASIC STATISTICS
Published online by Cambridge University Press: 22 July 2017
Summary
A table is simply a two-dimensional presentation of data or a summary of the data. We use tables to inspect the original data for errors or problems such as missing entries. We used tables to present condensed summaries of data values in Chapter 3 (e.g., numSummary()). Those summaries involved computing summary statistics by a categorical variable to see how the groups differed from one another. We can also use tables to see how categorical variables covary.
Nominal or categorical data play a large role in archaeological research. At the regional level, sites are the categories and we are interested in the number of different types of artifacts (also a category) found in each site. The same applies at the site level where the artifact categories are distributed across excavation units. Within sites, different kinds of features are present and features contain different types of artifacts. At the artifact level, some properties of artifacts are represented by categories. Because of this, the same data are often represented in different ways for different purposes. That is not a problem unless the statistical procedures we are using expect a format different from the one we are currently using. In Chapter 3, we created tables of descriptive statistics. In this chapter we are concerned with tables in which the cell entries consist of counts of objects.
R distinguishes between tables and data frames, and some functions work with one but not the other. The columns of a data frame can hold different types of data (e.g., character strings, factors, numbers), but a table in R holds numeric data only, typically counts. In fact, a two-way R table is a kind of matrix. Before constructing tables, we will briefly describe how R encodes categorical data using factors.
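The distinction between data frames and tables can be sketched with a small example. The `artifacts` data frame and its `site` and `type` columns are hypothetical, invented here for illustration only; the functions used are all from base R.

```r
# Hypothetical data: six artifacts recorded by site and artifact type.
artifacts <- data.frame(
  site = c("A", "A", "B", "B", "B", "C"),
  type = c("flake", "core", "flake", "flake", "core", "flake")
)

# Cross-tabulate the two categorical variables into a table of counts.
counts <- table(artifacts$site, artifacts$type)

class(artifacts)    # a data frame: columns may differ in type
class(counts)       # a table: every cell is a count
is.matrix(counts)   # a two-way table has two dimensions, like a matrix
counts

# Convert the table to a data frame with one row per cell
# (columns Var1, Var2, and Freq).
as.data.frame(counts)
```

Converting with `as.data.frame()` is one way to move between the two representations when a function expects a data frame rather than a table.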
FACTORS IN R
Factors are a way of storing categorical information in R. If you have coded a variable into a set of categories, you can store the information as either a character vector or a factor. A factor stores each category as an integer code, and the category labels are stored as its levels. In versions of R before 4.0.0, importing your data into a data frame automatically converts character vectors into factors unless you use the argument stringsAsFactors=FALSE. From R 4.0.0 onward the default is reversed, so character vectors are kept as they are unless you request stringsAsFactors=TRUE.
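As a minimal sketch of how a factor is stored, the following uses a hypothetical `material` vector (the category names are invented for illustration) and base R functions only.

```r
# Hypothetical artifact categories stored first as a character vector.
material <- c("chert", "obsidian", "chert", "quartzite", "chert")

# Convert to a factor; by default the levels are the sorted unique values.
material_f <- factor(material)

levels(material_f)      # the category labels
as.integer(material_f)  # the integer codes that index into the levels
table(material_f)       # counts of each category

# Setting stringsAsFactors explicitly gives the same behavior
# regardless of which R version is in use.
df <- data.frame(material = material, stringsAsFactors = FALSE)
is.character(df$material)
```

Stating `stringsAsFactors` explicitly, rather than relying on the default, keeps a script's results the same across R versions.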
- Type: Chapter
- Quantitative Methods in Archaeology Using R, pp. 65-84
- Publisher: Cambridge University Press
- Print publication year: 2017