makeIRangesFromDataFrame {IRanges}    R Documentation

Make an IRanges object from a data.frame or DataFrame

Description

makeIRangesFromDataFrame takes a data-frame-like object as input and tries to automatically find the columns that describe integer ranges. If successful, it returns the ranges in an IRanges object.

The function is also the workhorse behind the coercion method from data.frame (or DataFrame) to IRanges.

Usage

makeIRangesFromDataFrame(df, keep.extra.columns=FALSE,
        start.field="start",
        end.field=c("end", "stop"),
        starts.in.df.are.0based=FALSE,
        na.rm=FALSE)

Arguments

df

A data.frame or DataFrame object. If not, then the function first tries to turn df into a data frame with as.data.frame(df).

keep.extra.columns

TRUE or FALSE (the default). If TRUE, then the columns in df that are not used to form the integer ranges of the returned IRanges are returned as metadata columns on the object. Otherwise, they are ignored.

Note that if df has a width column, then makeIRangesFromDataFrame will always ignore it.
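As an illustration of the two paragraphs above, here is a minimal sketch, assuming the IRanges package is installed. The data frame and its score column are hypothetical. Based on the description above, score should come back as a metadata column and width should be dropped.

```r
library(IRanges)

# Hypothetical input: 'width' and 'score' are extra columns that are
# not needed to define the ranges.
df <- data.frame(start=c(1L, 11L, 21L),
                 end=c(5L, 15L, 25L),
                 width=c(5L, 5L, 5L),
                 score=c(0.1, 0.2, 0.3))

ir <- makeIRangesFromDataFrame(df, keep.extra.columns=TRUE)

# List the metadata columns that were carried over.
kept <- colnames(mcols(ir))
kept
```

The widths of the returned ranges are still available through width(ir), since they are derived from the start and end values.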

start.field

A character vector of recognized names for the column in df that contains the start values of the integer ranges. Only the first name in start.field that is found in colnames(df) is used. If none is found, then an error is raised.

end.field

A character vector of recognized names for the column in df that contains the end values of the integer ranges. Only the first name in end.field that is found in colnames(df) is used. If none is found, then an error is raised.
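When the input uses column names that the defaults do not recognize, start.field and end.field can be set explicitly. The sketch below assumes the IRanges package is installed; the column names begin and finish are hypothetical and chosen only for illustration.

```r
library(IRanges)

# Hypothetical input whose range columns are not named "start"/"end".
df <- data.frame(begin=c(1L, 11L),
                 finish=c(5L, 15L))

# Tell the function which columns hold the start and end values.
ir <- makeIRangesFromDataFrame(df,
                               start.field="begin",
                               end.field="finish")
ir
```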

starts.in.df.are.0based

TRUE or FALSE (the default). If TRUE, then the start values of the integer ranges in df are considered to be 0-based and are converted to 1-based in the returned IRanges object. This feature is intended to make it more convenient to handle input that contains data obtained from resources using the "0-based start" convention. A notable example of such a resource is the UCSC Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables).
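A short sketch of the conversion, assuming the IRanges package is installed. The coordinates are made up and stand in for values exported from a resource that uses the "0-based start" convention.

```r
library(IRanges)

# Hypothetical 0-based starts, as found in BED-like exports.
df <- data.frame(start=c(0L, 10L, 20L),
                 end=c(5L, 15L, 25L))

ir <- makeIRangesFromDataFrame(df, starts.in.df.are.0based=TRUE)

# Starts are shifted by +1 to become 1-based; ends are left unchanged.
start(ir)
end(ir)
```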

na.rm

TRUE or FALSE (the default).

If TRUE, then rows in df with a missing start or end value (i.e. the value is NA) are ignored. Otherwise, they trigger an error.
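The following sketch, assuming the IRanges package is installed, contrasts the two settings on a hypothetical data frame with one missing start value. The default call is wrapped in tryCatch() so the script keeps running regardless of the outcome.

```r
library(IRanges)

# Hypothetical input: the second row has a missing start value.
df <- data.frame(start=c(1L, NA, 21L),
                 end=c(5L, 15L, 25L))

# With na.rm=TRUE the incomplete row is dropped.
ir <- makeIRangesFromDataFrame(df, na.rm=TRUE)
length(ir)

# With the default (na.rm=FALSE) the same input is expected to fail,
# so capture the condition instead of stopping.
res <- tryCatch(makeIRangesFromDataFrame(df),
                error=function(e) conditionMessage(e))
res
```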

Value

An IRanges object.

If na.rm is set to FALSE (the default), then the returned object is guaranteed to have one element per row in the input. However, if na.rm is set to TRUE, then the length of the returned object can be less than nrow(df).

If df has non-automatic row names (i.e. rownames(df) is not NULL and is not seq_len(nrow(df))), then they will be used to set names on the returned IRanges object.
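To illustrate how row names are propagated, here is a minimal sketch, assuming the IRanges package is installed. The row names exonA, exonB, and exonC are hypothetical.

```r
library(IRanges)

# Hypothetical input with explicit (non-automatic) row names.
df1 <- data.frame(start=c(1L, 11L, 21L),
                  end=c(5L, 15L, 25L),
                  row.names=c("exonA", "exonB", "exonC"))
ir1 <- makeIRangesFromDataFrame(df1)
names(ir1)

# Same ranges with automatic row names: no names are set on the result.
df2 <- data.frame(start=c(1L, 11L, 21L),
                  end=c(5L, 15L, 25L))
ir2 <- makeIRangesFromDataFrame(df2)
names(ir2)
```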

Note

Coercing a data.frame or DataFrame df to IRanges (with as(df, "IRanges")), or calling IRanges(df), are both equivalent to calling makeIRangesFromDataFrame(df, keep.extra.columns=TRUE).

Author(s)

H. Pagès

See Also

Examples

df <- data.frame(ID=letters[1:5],
                 locus_stART=11:15,
                 locus_STop=12:16,
                 score=1:5)

## makeIRangesFromDataFrame() tries hard to figure out what columns
## to use for the start and end values of the ranges.
makeIRangesFromDataFrame(df)

makeIRangesFromDataFrame(df, keep.extra.columns=TRUE)

as(df, "IRanges")  # equivalent to the above

[Package IRanges version 2.44.0 Index]