Random Planted Forest
Usage
rpf(x, ...)
# S3 method for class 'data.frame'
rpf(
  x,
  y,
  max_interaction = 1,
  ntrees = 50,
  splits = 30,
  split_try = 10,
  t_try = 0.4,
  deterministic = FALSE,
  nthreads = 1,
  purify = FALSE,
  cv = FALSE,
  loss = "L2",
  delta = 0,
  epsilon = 0.1,
  ...
)
# S3 method for class 'matrix'
rpf(
  x,
  y,
  max_interaction = 1,
  ntrees = 50,
  splits = 30,
  split_try = 10,
  t_try = 0.4,
  deterministic = FALSE,
  nthreads = 1,
  purify = FALSE,
  cv = FALSE,
  loss = "L2",
  delta = 0,
  epsilon = 0.1,
  ...
)
# S3 method for class 'formula'
rpf(
  formula,
  data,
  max_interaction = 1,
  ntrees = 50,
  splits = 30,
  split_try = 10,
  t_try = 0.4,
  deterministic = FALSE,
  nthreads = 1,
  purify = FALSE,
  cv = FALSE,
  loss = "L2",
  delta = 0,
  epsilon = 0.1,
  ...
)
# S3 method for class 'recipe'
rpf(
  x,
  data,
  max_interaction = 1,
  ntrees = 50,
  splits = 30,
  split_try = 10,
  t_try = 0.4,
  deterministic = FALSE,
  nthreads = 1,
  purify = FALSE,
  cv = FALSE,
  loss = "L2",
  delta = 0,
  epsilon = 0.1,
  ...
)
Arguments
- x, data
- Feature matrix, or data.frame, or recipe.
- ...
- (Unused). 
- y
- Target vector for use with x. The class of y (either numeric or factor) determines if regression or classification will be performed.
- max_interaction
- [1]: Maximum level of interaction, which determines the maximum number of split dimensions for a tree. The default 1 corresponds to main effects only. If 0, the number of columns in x is used, i.e. for 10 predictors, this is equivalent to setting max_interaction = 10.
- ntrees
- [50]: Number of trees generated per family.
- splits
- [30]: Number of splits performed for each tree family.
- split_try
- [10]: Number of split points to be considered when choosing a split candidate.
- t_try
- [0.4]: A value in (0,1] specifying the proportion of viable split-candidates in each round.
- deterministic
- [FALSE]: Whether the approach is deterministic or random.
- nthreads
- [1L]: Number of threads used for computation, defaulting to serial execution.
- purify
- [FALSE]: Whether the forest should be purified. Set to TRUE to ensure that components extracted with predict_components() are valid. Purification can also be performed after fitting with purify().
- cv
- [FALSE]: Determines if cross validation is performed.
- loss
- ["L2"]: For regression, only "L2" is supported. For classification, "L1", "logit" and "exponential" are also available. "exponential" yields results similar to "logit" while being significantly faster.
- delta
- [0]: Only used if loss is "logit" or "exponential". Proportion of class membership is truncated to be smaller than 1 - delta when calculating the loss to determine the optimal split.
- epsilon
- [0.1]: Only used if loss is "logit" or "exponential". Proportion of class membership is truncated to be smaller than 1 - epsilon when calculating the fit in a leaf.
- formula
- Formula specification, e.g. y ~ x1 + x2. 
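The truncation described for delta and epsilon can be illustrated in base R. This is only a sketch of the upper bound as worded above, not the package's internal code: truncate_upper() is a hypothetical helper, and whether proportions are also bounded from below is not stated here. A plausible motivation, assumed rather than documented, is that a class proportion of exactly 1 has an infinite logit.

```r
# Illustrative only: upper truncation of a class-membership proportion,
# following the wording of the delta / epsilon descriptions.
# truncate_upper() is a hypothetical helper, not part of the package.
truncate_upper <- function(p, bound) pmin(p, 1 - bound)

p <- c(0.2, 0.95, 1)

# Without truncation, a proportion of exactly 1 has an infinite logit.
qlogis(p)

# With a bound of 0.1 (the default for epsilon), proportions are capped
# at 0.9, so every logit stays finite.
p_trunc <- truncate_upper(p, bound = 0.1)
qlogis(p_trunc)
```

With the default delta = 0, the same helper leaves the proportions unchanged, which matches the description of delta as an optional truncation for the split criterion.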
Examples
# Regression with x and y
rpfit <- rpf(x = mtcars[, c("cyl", "wt")], y = mtcars$mpg)
# Regression with formula
rpfit <- rpf(mpg ~ cyl + wt, data = mtcars)
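The examples above cover regression only. The following sketch shows how the documented arguments might be combined for classification and for purification. It assumes the package is installed under the name randomPlantedForest, and the call signature used for predict_components() (fitted model first, then a new_data argument) is an assumption rather than something stated on this page.

```r
# Guarded so the snippet does nothing if the package is not installed.
# The package name is an assumption.
if (requireNamespace("randomPlantedForest", quietly = TRUE)) {
  library(randomPlantedForest)

  x <- mtcars[, c("cyl", "wt", "hp")]

  # Classification: a factor target switches rpf() to classification
  y_class <- factor(mtcars$am)
  rpfit_class <- rpf(x = x, y = y_class, loss = "logit", delta = 0, epsilon = 0.1)

  # Regression with pairwise interactions, purified during fitting
  rpfit_pure <- rpf(mpg ~ cyl + wt, data = mtcars, max_interaction = 2, purify = TRUE)

  # Alternatively, fit first and purify afterwards
  rpfit <- rpf(mpg ~ cyl + wt, data = mtcars, max_interaction = 2)
  purify(rpfit)

  # Assumed call signature: fitted model, then the data to decompose
  components <- predict_components(rpfit, new_data = mtcars[, c("cyl", "wt")])
}
```

Setting max_interaction = 2 here is a design choice for illustration: with the default of 1 the fit contains main effects only, so there are no interaction components to extract.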