Train models using different combinations of predictor variables based upon missing data patterns.
Source: R/medley.R
medley_train.Rd
Usage
medley_train(
  data,
  formula,
  method = glm,
  var_sets = get_variable_sets(
    data = data,
    formula = formula,
    min_set_size = min_set_size
  ),
  min_set_size = 0.1,
  exclusive_membership = TRUE,
  ...
)
# S3 method for class 'medley'
summary(object, ...)
# S3 method for class 'medley'
print(x, ...)
# S3 method for class 'medley'
predict(object, newdata, ...)
Arguments
- data
data.frame used to estimate the models.
- formula
a formula with all possible predictor variables to be considered.
- method
the function used to train the models (e.g. glm, randomForest).
- var_sets
a list of formulas to use for the predictive models.
- min_set_size
the minimum set size, expressed as a proportion (e.g. the default `0.1`), required to include a variable set as a model.
- exclusive_membership
whether an observation should be used only in the model for which the most predictor variables are available. If `FALSE`, observations may be used in training more than one model.
- ...
other parameters passed to the `predict()` function.
- object
the results from `medley_train`.
- x
the results of `medley_train`.
- newdata
(optional) a new data.frame to get predictions for.
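The relationship between missing data patterns, `var_sets`, and `min_set_size` can be illustrated with a small base R sketch. This is not the package's `get_variable_sets()` implementation; the data frame `toy`, the threshold value, and the name `sketch_var_sets` are hypothetical, and the sketch assumes `min_set_size` is compared against the share of rows in each pattern.

```r
# Toy data: x2 is missing in row 1, x3 is missing in rows 7 to 10.
set.seed(1)
toy <- data.frame(
  y  = rnorm(10),
  x1 = rnorm(10),
  x2 = rnorm(10),
  x3 = rnorm(10)
)
toy$x2[1] <- NA
toy$x3[7:10] <- NA

predictors <- c("x1", "x2", "x3")
min_set_size <- 0.2  # hypothetical threshold, as a proportion of rows

# One string per row naming the predictors observed in that row.
# (Assumes every row has at least one observed predictor.)
available <- apply(!is.na(toy[, predictors]), 1, function(obs) {
  paste(predictors[obs], collapse = " + ")
})

# Share of rows that fall into each missing data pattern.
pattern_share <- prop.table(table(available))

# Keep only the patterns that cover at least min_set_size of the rows.
kept <- names(pattern_share)[pattern_share >= min_set_size]

# Turn each kept pattern into a model formula.
sketch_var_sets <- lapply(kept, function(rhs) as.formula(paste("y ~", rhs)))
print(sketch_var_sets)
```

In this toy data the pattern with only `x1` and `x3` observed covers a single row, so it falls below the threshold and no formula is built for it.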
Value
`medley_train()` returns an object with the following elements:
- n_models
the number of models trained.
- formulas
the list of formulas used to train the models.
- models
list of objects returned from the training method.
- data
the data.frame used to train the models.
- model_observations
a data.frame that specifies which observations are used for which model(s).
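The effect of `exclusive_membership` on which observations train which model can be sketched in base R. This is an illustration of the idea only, not the package's code; the objects `toy`, `sets`, `shared`, and `exclusive` are hypothetical names, and the resulting logical matrices are merely analogous to `model_observations`.

```r
# Toy predictors for six observations; x2 is missing in every second row.
toy <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(1, NA, 3, NA, 5, NA)
)

# Two hypothetical variable sets, ordered from most to fewest predictors.
sets <- list(full = c("x1", "x2"), reduced = "x1")

# An observation is eligible for a model when all of that model's
# predictors are observed.
eligible <- sapply(sets, function(vars) {
  complete.cases(toy[, vars, drop = FALSE])
})

# Analogue of exclusive_membership = FALSE: use every eligible model.
shared <- eligible

# Analogue of exclusive_membership = TRUE: keep only the first eligible
# model per row, i.e. the one with the most predictors.
first_match <- apply(eligible, 1, function(row) which(row)[1])
exclusive <- eligible & (col(eligible) == first_match)

print(colSums(shared))
print(colSums(exclusive))
```

With shared membership the reduced model is trained on all six rows; with exclusive membership it is trained only on the three rows where `x2` is missing.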
`predict()` returns a vector of predictions.
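A minimal usage sketch follows, built only from the signatures shown under Usage. It assumes the package providing `medley_train()` is installed under the name `medley` (inferred from the source file name, not confirmed here). The synthetic data frame `train` and its columns are hypothetical, and the call is skipped when the package is not available.

```r
# Synthetic data: x3 is missing in 80 of 200 rows; other columns are complete.
set.seed(2112)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
train$y <- 1 + train$x1 - 0.5 * train$x2 + 0.25 * train$x3 + rnorm(n)
train$x3[sample(n, size = 80)] <- NA

# Assumption: the package is installed as "medley". Otherwise this is skipped.
if (requireNamespace("medley", quietly = TRUE)) {
  fit <- medley::medley_train(
    data = train,
    formula = y ~ x1 + x2 + x3,
    method = glm
  )
  print(fit)

  # newdata is documented as optional; it is supplied explicitly here.
  preds <- predict(fit, newdata = train)
  str(preds)
}
```

Restricting missingness to one predictor keeps the example to two missing data patterns, each well above the default `min_set_size` of 0.1.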