
Build a decision tree model based on a targeter analysis

Usage

tartree(
  data,
  tar_object = NULL,
  tarsum_object = NULL,
  target = NULL,
  decision_tree_sample = 0.8,
  seed = 42,
  predict_prob_cutpoint = 0.5,
  predict_prob_cutpoint_quantile = 0.5,
  rpart.control = list(minsplit = 20, minbucket = 8, cp = 0.01, maxcompete = 4,
    maxsurrogate = 5, usesurrogate = 2, xval = 10, surrogatestyle = 0, maxdepth = 3L),
  ...
)

Arguments

data

data.frame or data.table

tar_object

targeter object

tarsum_object

targeter summary object

target

character, target column name. Default: NULL; if tar_object is provided, the target is taken from it.

decision_tree_sample

numeric, proportion of data to be used for training. Must be between 0 (excluded) and 1 (not recommended). Default: 0.8

seed

integer, seed for random number generation - default: 42

predict_prob_cutpoint

probability cutpoint to be used for the binary decision. Default: 0.5

predict_prob_cutpoint_quantile

quantile of the predicted probabilities to be used for an additional prediction. Default: 0.5. Can be used to see what happens if we want to create a group containing x% of the records.

rpart.control

list, control parameters for rpart function

...

other parameters to be passed to targeter
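
To illustrate the difference between predict_prob_cutpoint and predict_prob_cutpoint_quantile, here is a minimal base R sketch. It is not targeter's internal code: the probabilities are simulated, the variable names are hypothetical, and the assumption that records at or above the quantile form the selected group is ours.

```r
# Simulated predicted probabilities (hypothetical, not produced by targeter).
set.seed(42)
prob <- runif(1000)

# Fixed cutpoint: flag every record whose probability reaches 0.5.
pred_fixed <- prob >= 0.5

# Quantile-based cutpoint: derive the threshold from the distribution of
# probabilities. Assumption: records at or above the quantile are selected,
# so the 0.8 quantile selects roughly the top 20% of records.
cut_q <- quantile(prob, probs = 0.8)
pred_quantile <- prob >= cut_q

mean(pred_fixed)     # share flagged depends on the probability distribution
mean(pred_quantile)  # share flagged is fixed by the chosen quantile
```

A fixed cutpoint leaves the size of the flagged group to the model, whereas a quantile fixes the group size and lets the threshold vary.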

Value

a targeter decision tree model, on top of which a report can be generated with the report function (the Examples use tar_report)

Details

tartree builds a decision tree model based on a targeter analysis. It is recommended to provide a pre-computed targeter object and targeter summary object; if they are missing, the function computes them. The targeter object defines the target column and the target type. The targeter summary object defines the variables to be used in the decision tree model. The function splits the data into training and validation sets, builds the decision tree model, and returns it. The decision tree model is an rpart object with additional attributes: tar_object, tarsum_object, and target.
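
The split-then-fit workflow described above can be sketched with rpart alone. This is an illustration under stated assumptions, not tartree's actual implementation: it uses the kyphosis dataset shipped with rpart instead of a targeter analysis, a simple random split, and hypothetical variable names. The control values mirror the defaults listed under Usage.

```r
library(rpart)

# Example data shipped with rpart (stand-in for the user's data).
data(kyphosis, package = "rpart")

# Reproducible 80/20 split, mirroring seed = 42 and decision_tree_sample = 0.8.
set.seed(42)
n <- nrow(kyphosis)
train_idx <- sample(n, size = floor(0.8 * n))
train <- kyphosis[train_idx, ]
valid <- kyphosis[-train_idx, ]

# Control parameters matching the defaults shown in the Usage section.
ctrl <- rpart.control(
  minsplit = 20, minbucket = 8, cp = 0.01, maxcompete = 4,
  maxsurrogate = 5, usesurrogate = 2, xval = 10,
  surrogatestyle = 0, maxdepth = 3
)

# Fit a classification tree on the training set only.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = train, method = "class", control = ctrl)

# Score the validation set and apply a 0.5 probability cutpoint.
prob <- predict(fit, newdata = valid, type = "prob")[, "present"]
pred <- prob >= 0.5
```

Keeping maxdepth small, as in the defaults, yields a shallow tree that is easier to read in a report.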

Examples

if (FALSE) { # \dontrun{
data(adult)
tar_object <- targeter(adult, target = "ABOVE50K")
tarsum_object <- summary(tar_object)
tar_tree <- tartree(adult, tar_object = tar_object, tarsum_object = tarsum_object)
plot(tar_tree)
tar_report(tar_tree)
} # }