aggregate-methods {S4Vectors}    R Documentation

Compute summary statistics of subsets of vector-like objects

Description

The S4Vectors package defines aggregate methods for Vector, Rle, and List objects.

Usage

## S4 method for signature 'Vector'
aggregate(x, by, FUN, start=NULL, end=NULL, width=NULL,
          frequency=NULL, delta=NULL, ..., simplify=TRUE)

## S4 method for signature 'Rle'
aggregate(x, by, FUN, start=NULL, end=NULL, width=NULL,
          frequency=NULL, delta=NULL, ..., simplify=TRUE)

## S4 method for signature 'List'
aggregate(x, by, FUN, start=NULL, end=NULL, width=NULL,
          frequency=NULL, delta=NULL, ..., simplify=TRUE)

Arguments

x

A Vector, Rle, or List object.

by

An object with start, end, and width methods.

If x is a List object, the by parameter can be an IntegerRangesList object to aggregate within the list elements rather than across them. When by is an IntegerRangesList object, the output is a SimpleAtomicList object if possible, or a SimpleList object otherwise.

FUN

The function, found via match.fun, to be applied to each subset of x.

start, end, width

The start, end, and width of the subsets. If by is missing, then two of the three must be supplied and have the same length.

frequency, delta

Optional arguments that specify the sampling frequency and increment within the subsets (in the same fashion as window from the stats package).
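
As a point of comparison, the following minimal sketch shows how thinning works in stats::window itself. It only illustrates the window behaviour referred to above; note that window names its increment argument deltat, and the exact way aggregate maps frequency and delta onto each subset is not shown here.

```r
# stats is attached by default in an R session.
series <- ts(1:10)                      # observations at times 1, 2, ..., 10

# Keep every second observation between time 2 and time 8.
thinned <- window(series, start = 2, end = 8, deltat = 2)

as.numeric(thinned)                     # observations at times 2, 4, 6, 8
```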

...

Optional arguments to FUN.

simplify

A logical value specifying whether the result should be simplified to a vector or matrix if possible.

Details

Subsets of x can be specified either via the by argument or via the start, end, width, frequency, and delta arguments.

For example, if start and end are specified, then:

  aggregate(x, FUN=FUN, start=start, end=end, ..., simplify=simplify)

is equivalent to:

  sapply(seq_along(start),
         function(i) FUN(x[start[i]:end[i]], ...), simplify=simplify)

(replace x[start[i]:end[i]] with 2D-style subsetting x[start[i]:end[i], ] if x is a DataFrame object).
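
The equivalence can be checked on a small Rle object. This is a minimal sketch assuming the S4Vectors package is installed; the start and end values are arbitrary and chosen only for illustration.

```r
library(S4Vectors)

# Run values 10..2 with run lengths 1..9, 45 elements in total.
x <- Rle(10:2, 1:9)

# Three arbitrary subsets: x[1:3], x[4:10], and x[11:45].
start <- c(1L, 4L, 11L)
end <- c(3L, 10L, 45L)

via_aggregate <- aggregate(x, FUN = mean, start = start, end = end)
via_sapply <- sapply(seq_along(start),
                     function(i) mean(x[start[i]:end[i]]))

# The two results are expected to agree.
all.equal(via_aggregate, via_sapply)
```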

When x is a Vector derivative such as a DataFrame, FUN can also be omitted and replaced by named summary expressions supplied via .... In that case, each expression is evaluated once per group in an environment where the grouped columns of x are available as variables. Symbols not found there are resolved in the calling environment. The returned object contains one row per group, a grouping column identifying each group, and one column for each named summary expression.
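
The evaluation rule for named summary expressions can be sketched in base R. The helper summarize_by_group below is hypothetical and is not part of S4Vectors. It only mimics the lookup order described above (group columns first, then the calling environment) on an ordinary data frame, and it assumes each expression returns a single number.

```r
# Hypothetical helper, not the S4Vectors implementation.
summarize_by_group <- function(df, groups, ...) {
  exprs <- eval(substitute(alist(...)))  # capture the expressions unevaluated
  caller <- parent.frame()               # where symbols such as 'offset' live
  pieces <- split(df, groups)            # one data frame per group

  # Evaluate each expression once per group: columns of the group are
  # looked up first, anything else is resolved in the calling environment.
  summaries <- lapply(exprs, function(e) {
    vapply(pieces,
           function(piece) as.numeric(eval(e, piece, caller)),
           numeric(1))
  })

  data.frame(grouping = names(pieces), summaries, row.names = NULL)
}

df <- data.frame(g = c("a", "a", "b", "b"), x = 1:4, y = 11:14)
offset <- 100
res <- summarize_by_group(df, df$g, sx = sum(x), shifted = mean(y) + offset)
res
```

Here sx only uses a column of df, while shifted also uses offset, which is not a column and is therefore found in the calling environment.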

Value

The type of the returned object depends on which aggregate method is dispatched and on how the summaries are specified.

For the Vector method, when FUN is supplied, the result is obtained by applying FUN to each subset of x. If simplify is TRUE, it is simplified to a vector or matrix when possible.

When FUN is omitted and named summary expressions are supplied via ..., the result is a DataFrame with one row per group, a grouping column, and one column for each named summary expression.

See Also

Examples

x <- Rle(10:2, 1:9)
aggregate(x, x > 4, mean)
aggregate(x, FUN=mean, start=1:26, width=20)

## Note that aggregate() works on a DataFrame object the same way it
## works on an ordinary data frame:
aggregate(DataFrame(state.x77), list(Region=state.region), mean)
aggregate(weight ~ feed, data=DataFrame(chickwts), mean)

df <- DataFrame(g=c("a", "a", "b", "b"), x=1:4, y=11:14)
offset <- 100
aggregate(df, df[["g"]], sx=sum(x), shifted=mean(y) + offset)

library(IRanges)
by <- IRanges(start=1:26, width=20, names=LETTERS)
aggregate(x, by, is.unsorted)

[Package S4Vectors version 0.49.3 Index]