Creates a one- or two-way table of summary statistics for a quantitative variable.
Usage
sumTable(formula, ...)
# S3 method for class 'formula'
sumTable(formula, data = NULL, FUN = mean, digits = getOption("digits"), ...)
Arguments
- formula
A formula with a quantitative variable on the left-hand-side and one or two factor variables on the right-hand-side. See details.
- ...
Other arguments to pass through to FUN.
- data
An optional data frame that contains the variables in formula.
- FUN
A scalar function that identifies the summary statistics. Applied to the quantitative variable for all data subsets identified by the combination of the factor(s). Defaults to mean.
- digits
A single numeric that indicates the number of digits to be used for the result.
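Because this page describes sumTable as largely a wrapper to tapply(), the interplay between FUN and ... can be sketched with base R alone. The data frame and the helper n_valid below are hypothetical; n_valid is only a stand-in for a count of non-missing values and is not the validn function used in the Examples.

```r
# Hypothetical data: one quantitative variable and one factor
d <- data.frame(g1 = factor(rep(c("a", "b"), each = 3)),
                dat = c(1, 2, NA, 4, 5, 6))

# FUN should return a single value for each subset of the data
n_valid <- function(x) sum(!is.na(x))   # hypothetical helper
counts <- tapply(d$dat, d$g1, n_valid)

# Arguments given after FUN are handed on to FUN, here na.rm = TRUE for mean()
means <- tapply(d$dat, d$g1, mean, na.rm = TRUE)
```

Under this assumption, an argument such as na.rm=TRUE supplied to sumTable would reach FUN in the same way, as the last example on this page suggests.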
Value
A one-way array of values if only one factor variable is supplied on the right-hand-side of formula. A two-way matrix of values if two factor variables are supplied on the right-hand-side of formula. These are the same classes of objects returned by tapply.
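Since the returned objects are stated to match those of tapply, the two shapes can be inspected with base R directly. This is a small sketch with made-up data, not output from sumTable itself.

```r
# Hypothetical data with two grouping factors
d <- data.frame(g1 = factor(rep(c("a", "b", "c"), each = 4)),
                g2 = factor(rep(c("A", "B"), times = 6)),
                dat = 1:12)

one_way <- tapply(d$dat, d$g1, mean)              # one factor
two_way <- tapply(d$dat, list(d$g1, d$g2), mean)  # two factors

dim(one_way)        # a single dimension, one entry per level of g1
dim(two_way)        # rows are levels of g1, columns are levels of g2
two_way["b", "A"]   # cells are indexed by level names
```

Either result can be indexed by level name, which is convenient when a single cell is needed for further computation.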
Details
The formula must be of the form quantitative~factor or quantitative~factor*factor2 where quantitative is the quantitative variable to construct the summaries for and factor and factor2 are factor variables that contain the levels for which separate summaries should be constructed. If the variables on the right-hand-side are not factors, then they will be coerced to be factors and a warning will be issued.
This function is largely a wrapper to tapply(), but only works for one quantitative variable on the left-hand-side and one or two factor variables on the right-hand-side. Consider using tapply for situations with more factors on the right-hand-side.
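The two points above can be sketched in base R: converting the grouping variables to factors explicitly (which, by the description above, should make the coercion warning unnecessary), and calling tapply directly when three factors are involved. The data frame and the third variable g3 are hypothetical.

```r
# Hypothetical data with three character grouping variables
d <- data.frame(g1 = rep(c("a", "b"), each = 4),
                g2 = rep(c("A", "B"), times = 4),
                g3 = rep(c("x", "x", "y", "y"), times = 2),
                dat = 1:8,
                stringsAsFactors = FALSE)

# Convert the grouping variables to factors before summarizing
grp <- c("g1", "g2", "g3")
d[grp] <- lapply(d[grp], factor)

# Three factors are beyond the formula forms described above, so use tapply
three_way <- tapply(d$dat, list(d$g1, d$g2, d$g3), mean)
dim(three_way)   # one dimension per factor
```

The result is a three-way array, with one dimension per factor, rather than the one-way array or two-way matrix described under Value.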
Author
Derek H. Ogle, DerekOgle51@gmail.com
Examples
## The same examples as in the old aggregate.table in gdata package
## but data in data.frame to illustrate formula notation
d <- data.frame(g1=sample(letters[1:5], 1000, replace=TRUE),
                g2=sample(LETTERS[1:3], 1000, replace=TRUE),
                dat=rnorm(1000))
sumTable(dat~g1*g2,data=d,FUN=length) # get sample size
#> Warning: First RHS variable was converted to a factor.
#> Warning: Second RHS variable was converted to a factor.
#> A B C
#> a 73 70 60
#> b 59 68 63
#> c 61 83 75
#> d 75 57 50
#> e 57 79 70
sumTable(dat~g1*g2,data=d,FUN=validn) # get sample size (better way)
#> Warning: First RHS variable was converted to a factor.
#> Warning: Second RHS variable was converted to a factor.
#> A B C
#> a 73 70 60
#> b 59 68 63
#> c 61 83 75
#> d 75 57 50
#> e 57 79 70
sumTable(dat~g1*g2,data=d,FUN=mean) # get mean
#> Warning: First RHS variable was converted to a factor.
#> Warning: Second RHS variable was converted to a factor.
#> A B C
#> a -0.0742281 0.0071300 0.1220617
#> b -0.0890093 -0.2673082 -0.1332384
#> c 0.2214661 0.0308539 -0.0028557
#> d -0.0024910 -0.1756349 -0.0269173
#> e -0.1744024 0.0166841 -0.0942160
sumTable(dat~g1*g2,data=d,FUN=sd) # get sd
#> Warning: First RHS variable was converted to a factor.
#> Warning: Second RHS variable was converted to a factor.
#> A B C
#> a 0.8540582 0.9696002 1.0705232
#> b 1.0772834 1.0319364 0.9669174
#> c 1.0477833 1.0044162 1.0192641
#> d 0.9593495 1.0464473 0.9969988
#> e 1.0720302 0.9515544 1.0513543
sumTable(dat~g1*g2,data=d,FUN=sd,digits=1) # show digits= argument
#> Warning: First RHS variable was converted to a factor.
#> Warning: Second RHS variable was converted to a factor.
#> A B C
#> a 0.9 1 1.1
#> b 1.1 1 1.0
#> c 1.0 1 1.0
#> d 1.0 1 1.0
#> e 1.1 1 1.1
## Also demonstrate use in the 1-way example -- but see Summarize()
sumTable(dat~g1,data=d,FUN=validn)
#> Warning: RHS variable was converted to a factor.
#> a b c d e
#> 203 190 219 182 206
sumTable(dat~g1,data=d,FUN=mean)
#> Warning: RHS variable was converted to a factor.
#> a b c d e
#> 0.0118431 -0.1674870 0.0724024 -0.0634279 -0.0738739
## Example with a missing value (compare to above)
d$dat[1] <- NA
sumTable(dat~g1,data=d,FUN=validn) # note use of validn
#> Warning: RHS variable was converted to a factor.
#> a b c d e
#> 203 190 219 181 206
sumTable(dat~g1,data=d,FUN=mean,na.rm=TRUE)
#> Warning: RHS variable was converted to a factor.
#> a b c d e
#> 0.0118431 -0.1674870 0.0724024 -0.0663266 -0.0738739