Random Planted Forest
Usage
rpf(x, ...)
# S3 method for class 'data.frame'
rpf(
x,
y,
max_interaction = 1,
ntrees = 50,
splits = 30,
split_try = 10,
t_try = 0.4,
deterministic = FALSE,
nthreads = 1,
purify = FALSE,
cv = FALSE,
loss = "L2",
delta = 0,
epsilon = 0.1,
...
)
# S3 method for class 'matrix'
rpf(
x,
y,
max_interaction = 1,
ntrees = 50,
splits = 30,
split_try = 10,
t_try = 0.4,
deterministic = FALSE,
nthreads = 1,
purify = FALSE,
cv = FALSE,
loss = "L2",
delta = 0,
epsilon = 0.1,
...
)
# S3 method for class 'formula'
rpf(
formula,
data,
max_interaction = 1,
ntrees = 50,
splits = 30,
split_try = 10,
t_try = 0.4,
deterministic = FALSE,
nthreads = 1,
purify = FALSE,
cv = FALSE,
loss = "L2",
delta = 0,
epsilon = 0.1,
...
)
# S3 method for class 'recipe'
rpf(
x,
data,
max_interaction = 1,
ntrees = 50,
splits = 30,
split_try = 10,
t_try = 0.4,
deterministic = FALSE,
nthreads = 1,
purify = FALSE,
cv = FALSE,
loss = "L2",
delta = 0,
epsilon = 0.1,
...
)
Arguments
- x, data
Feature matrix, or data.frame, or recipe.
- ...
(Unused).
- y
Target vector for use with x. The class of y (either numeric or factor) determines whether regression or classification will be performed.
- max_interaction
[1]: Maximum level of interaction, determining the maximum number of split dimensions for a tree. The default 1 corresponds to main effects only. If 0, the number of columns in x is used, i.e. for 10 predictors, this is equivalent to setting max_interaction = 10.
- ntrees
[50]: Number of trees generated per family.
- splits
[30]: Number of splits performed for each tree family.
- split_try
[10]: Number of split points to be considered when choosing a split candidate.
- t_try
[0.4]: A value in (0,1] specifying the proportion of viable split candidates in each round.
- deterministic
[FALSE]: Whether the approach is deterministic or random.
- nthreads
[1L]: Number of threads used for computation, defaulting to serial execution.
- purify
[FALSE]: Whether the forest should be purified. Set to TRUE to ensure that components extracted with predict_components() are valid. Purification can also be done after fitting with purify().
- cv
[FALSE]: Determines whether cross-validation is performed.
- loss
["L2"]: For regression, only "L2" is supported. For classification, "L1", "logit" and "exponential" are also available. "exponential" yields similar results to "logit" while being significantly faster.
- delta
[0]: Only used if loss is "logit" or "exponential". Proportion of class membership is truncated to be smaller than 1 - delta when calculating the loss to determine the optimal split.
- epsilon
[0.1]: Only used if loss is "logit" or "exponential". Proportion of class membership is truncated to be smaller than 1 - epsilon when calculating the fit in a leaf.
- formula
Formula specification, e.g. y ~ x1 + x2.
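As a rough illustration of how several of the arguments above combine, the following sketch fits regression forests with non-default settings. It assumes the randomPlantedForest package is installed and attached; the specific values (30 trees, 20 splits, three predictors from mtcars) are arbitrary choices for illustration, not recommendations.

```r
library(randomPlantedForest)

# Allow up to two-way interactions and purify the forest at fit time.
# Argument values are illustrative assumptions only.
rpfit_2way <- rpf(
  mpg ~ cyl + wt + hp,
  data = mtcars,
  max_interaction = 2,
  ntrees = 30,
  splits = 20,
  purify = TRUE
)

# max_interaction = 0 uses the number of predictor columns,
# so with three predictors this corresponds to max_interaction = 3.
rpfit_full <- rpf(
  mpg ~ cyl + wt + hp,
  data = mtcars,
  max_interaction = 0
)
```

Fitting is randomized unless deterministic = TRUE is set, so repeated runs of this sketch may give slightly different forests.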
Examples
# Regression with x and y
rpfit <- rpf(x = mtcars[, c("cyl", "wt")], y = mtcars$mpg)
# Regression with formula
rpfit <- rpf(mpg ~ cyl + wt, data = mtcars)
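The examples above cover regression only. Since the class of y decides between regression and classification, a classification fit could look like the sketch below. It assumes the randomPlantedForest package is installed, and that the fitted object supports predict() with a new_data argument; the choice of mtcars$am as target and "logit" as loss is made purely for illustration.

```r
library(randomPlantedForest)

# Binary classification: y must be a factor.
# In mtcars, am is coded 0 = automatic, 1 = manual.
x <- mtcars[, c("cyl", "wt", "hp")]
y <- factor(mtcars$am, labels = c("automatic", "manual"))

# "logit" is one of the losses listed for classification
rpfit_clf <- rpf(x = x, y = y, loss = "logit")

# Assumption: predict() accepts new_data and returns one row per observation
preds <- predict(rpfit_clf, new_data = x)
head(preds)
```

Swapping loss = "logit" for loss = "exponential" should, per the argument description, give similar results at lower computational cost.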