Analyzes the relationship between each explanatory variable and a target variable by computing summary statistics, Weight of Evidence (WOE), Information Value (IV), and other metrics.
Arguments
- data
data.frame or data.table - Data to analyze
- target
character - Name of the target variable to explain
- description_data
character - Description of the dataset (optional)
- target_type
character - Type of target: "autoguess" (default), "binary", "categorical", or "numeric"
- target_reference_level
any - Reference level for binary/categorical targets (if NULL, will be inferred)
- description_target
character - Description of the target variable (optional)
- analysis_name
character - Name for the analysis (optional)
- select_vars
character vector - Variables to include (if NULL, all columns are considered)
- exclude_vars
character vector - Variables to exclude from analysis
- nbins
integer - Number of bins for numeric variables (default: 12)
- binning_method
character - Method for binning: "quantile" (default), "clustering", or "smart"
- naming_conventions
logical - Whether to enforce naming conventions (default: FALSE)
- useNA
character - How to handle NAs: "ifany" (default) or "no"
- verbose
logical - Whether to print detailed progress information (default: FALSE)
- dec
integer - Number of decimals for numeric display (default: 2)
- order_label
character - Method for ordering labels in output (default: "auto")
- cont_target_trim
numeric - Trimming factor for continuous targets, as a proportion between 0 and 1 (default: 0.01)
- bxp_factor
numeric - Factor for boxplot whiskers calculation (default: 1.5)
- num_as_categorical_nval
integer - Threshold on the number of distinct values used to decide whether a numeric variable is treated as categorical (default: 5)
- autoguess_nrows
integer - Number of rows used for variable type detection (default: 1000; 0 means all rows)
- woe_alternate_version
character - When to use alternate WOE definition: "if_continuous" (default) or "always"
- woe_shift
numeric - Shift applied in the WOE calculation to avoid undefined or infinite values when a class accounts for 0% or 100% of a bin (default: 0.01)
- woe_post_cluster
logical - Whether to cluster WOE values (default: FALSE)
- woe_post_cluster_n
integer - Number of clusters for WOE clustering (default: 6)
- smart_quantile_by
numeric - Quantile step for smart binning (default: 0.01)
- by_nvars
integer - Number of variables to process in each batch (default: 200)
- debug
logical - Whether to print debug information (default: FALSE)
- ...
Additional parameters passed to targeter_internal
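To make the roles of nbins, binning_method = "quantile", and woe_shift more concrete, here is a minimal base R sketch of quantile binning followed by a textbook WOE and IV calculation for a binary target. It is not the package's implementation: the helper name woe_iv_sketch is hypothetical, the toy data is simulated, and adding the shift to the per-bin counts is an assumption about how such a shift might be applied. Sign conventions for WOE also vary between sources; this sketch uses log(event share / non-event share).

```r
# Hypothetical helper (not part of targeter): quantile binning followed by
# textbook WOE / IV for a binary target coded 0/1.
woe_iv_sketch <- function(x, y, nbins = 12, shift = 0.01) {
  # Quantile cut points; unique() drops duplicated breaks caused by ties.
  probs <- seq(0, 1, length.out = nbins + 1)
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE))
  bins <- cut(x, breaks = breaks, include.lowest = TRUE)

  # Per-bin counts of events (y == 1) and non-events (y == 0).
  # table() keeps empty levels, so a bin without events yields 0, not NA.
  n_event <- as.vector(table(bins[y == 1]))
  n_nonevent <- as.vector(table(bins[y == 0]))

  # Assumption: the shift is added to every count before normalising,
  # so no share is exactly 0 and the logarithm below stays finite.
  p_event <- (n_event + shift) / sum(n_event + shift)
  p_nonevent <- (n_nonevent + shift) / sum(n_nonevent + shift)

  woe <- log(p_event / p_nonevent)
  iv <- sum((p_event - p_nonevent) * woe)

  list(
    table = data.frame(bin = levels(bins), n_event, n_nonevent, woe),
    iv = iv
  )
}

# Simulated toy data: higher x values are more likely to be events.
set.seed(1)
x <- c(rnorm(500, mean = 40, sd = 10), rnorm(500, mean = 50, sd = 10))
y <- rep(c(0, 1), each = 500)

res <- woe_iv_sketch(x, y, nbins = 4)
res$table  # one row per quantile bin
res$iv     # single Information Value for the variable
```

Without the shift, a bin that contains only events or only non-events would produce a share of exactly 0 and therefore an infinite WOE; a small positive shift keeps every value finite at the cost of a slight bias.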
Examples
targeter(adult, target = "ABOVE50K")
#>
#> INFO:target ABOVE50K detected as type: binary
#> INFO:binary target contains number, automatic chosen level: 1; override using `target_reference_level`
#>
#> Target profiling object with following properties:
#> Target: ABOVE50K of type: binary (target level:1 )
#> Run on data: adult the: 2025-03-29
#> 12 profiles available (AGE, FNLWGT, EDUCATIONNUM, HOURSPERWEEK, WORKCLASS...)
#> You can access each crossing using slot $profiles[[__variable__]]. Then on it use `plot` or `summary`
#> You can also directly invoke a global `summary` function on this object.
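As the INFO message above suggests, the automatically chosen target level can be overridden with target_reference_level. The following sketch sets it explicitly, restricts the analysis to two of the listed variables, and then inspects one profile through the $profiles slot. It uses only arguments documented on this page, and it assumes the targeter package is installed and makes the adult dataset available once attached; the call is skipped otherwise.

```r
# Records whether the example actually ran (the package may not be installed).
ran <- FALSE

if (requireNamespace("targeter", quietly = TRUE)) {
  library(targeter)

  # Assumption: `adult` is available after attaching the package,
  # as in the example above.
  tar <- targeter(
    adult,
    target = "ABOVE50K",
    target_type = "binary",          # skip automatic type detection
    target_reference_level = 1,      # set the target level explicitly
    select_vars = c("AGE", "HOURSPERWEEK"),
    nbins = 10,
    binning_method = "quantile"
  )

  # Each crossing is stored under its variable name in $profiles.
  print(summary(tar$profiles[["AGE"]]))
  ran <- TRUE
}
```

Setting target_type and target_reference_level explicitly makes the run independent of the autoguess heuristics, which can be useful when results need to be reproducible across datasets.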