Train models using different combinations of predictor variables based upon missing data patterns.

Usage

medley_train(
  data,
  formula,
  method = glm,
  var_sets = get_variable_sets(data = data, formula = formula,
    min_set_size = min_set_size),
  min_set_size = 0.1,
  exclusive_membership = TRUE,
  ...
)

# S3 method for class 'medley'
summary(object, ...)

# S3 method for class 'medley'
print(x, ...)

# S3 method for class 'medley'
predict(object, newdata, ...)

Arguments

data

data.frame used to estimate the models.

formula

a formula with all possible predictor variables to be considered.

method

the function used to train the models (e.g. glm, randomForest).

var_sets

a list of formulas to use for the predictive models.

min_set_size

the minimum set size, as a percentage, required for a variable set to be included as a model.
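
As a rough illustration of what a minimum set size means, the base R sketch below groups rows by which predictors are observed and keeps only the patterns that are common enough. It is not the package's implementation: the `toy` data, the pattern labels, the 0.25 threshold, and the reading of `min_set_size` as a share of rows are all assumptions made for the example.

```r
# Toy data with missing values in two predictors (illustrative only).
toy <- data.frame(
  y  = c(1, 0, 1, 1, 0, 1, 0, 0, 1, 0),
  x1 = c(2.1, NA, 3.3, 1.8, NA, 2.7, 3.0, 2.5, 2.2, 1.9),
  x2 = c(5, 6, NA, 4, 7, 5, NA, 6, 5, NA)
)
predictors <- c("x1", "x2")

# Label each row by the predictors it has observed, e.g. "x1 + x2" or "x2".
observed <- !is.na(toy[, predictors])
pattern <- apply(observed, 1, function(row) {
  paste(predictors[row], collapse = " + ")
})

# Share of rows that fall into each missing-data pattern.
pattern_share <- prop.table(table(pattern))

# Keep only the patterns whose share meets the (assumed) threshold.
min_set_size <- 0.25
kept <- names(pattern_share)[pattern_share >= min_set_size]
print(pattern_share)
print(kept)
```

In this toy data the pattern with only `x2` observed covers two of ten rows, so it falls below the threshold and would not get its own model.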

exclusive_membership

whether an observation should be used only in the model for which the most predictor variables are available. If `FALSE`, observations may be used in training more than one model.
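
The difference between the two membership modes can be sketched in base R. This is only an illustration under stated assumptions: the `var_sets` list of predictor names, the `toy` data, and the tie-breaking rule (first largest set wins) are hypothetical and are not taken from the package internals.

```r
# Hypothetical candidate models, each defined by the predictors it needs.
var_sets <- list(full = c("x1", "x2"), x1_only = "x1", x2_only = "x2")

toy <- data.frame(
  x1 = c(2.1, NA, 3.3, 1.8),
  x2 = c(5, 6, NA, 4)
)

# eligible[i, m] is TRUE when row i has every predictor that model m needs.
eligible <- sapply(var_sets, function(vars) {
  complete.cases(toy[, vars, drop = FALSE])
})

# Non-exclusive membership: a row trains every model it is eligible for.
shared <- eligible

# Exclusive membership: a row trains only the eligible model that uses
# the most predictors (ties resolved by taking the first such model).
set_size <- lengths(var_sets)
best <- apply(eligible, 1, function(row) {
  if (!any(row)) return(NA_character_)
  names(which.max(set_size * row))
})
exclusive <- eligible & outer(best, names(var_sets), "==")

print(rowSums(shared))     # number of models each row trains when shared
print(rowSums(exclusive))  # number of models each row trains when exclusive
```

With exclusive membership each complete row contributes to a single model, so the models are trained on disjoint sets of observations. With shared membership, fully observed rows also feed the smaller models.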

...

other parameters passed to the `predict()` function.

object

the results from `medley_train`.

x

the results of `medley_train`.

newdata

(optional) a new data.frame to get predictions for.

Value

`medley_train()` returns an object with the following elements:

n_models

the number of models trained.

formulas

the list of formulas used to train the models.

models

list of objects returned from the training method.

data

the data.frame used to train the models.

model_observations

a data.frame that specifies which observations are used for which model(s).

`predict()` returns a vector of predictions.