Information about datasets that can be used in the predictive models is stored in Debian Control Files (dcf). This format is similar to the YAML metadata used in RMarkdown.
Usage
read_ml_datasets(
dir = c(paste0(find.package("mldash"), "/datasets")),
cache_dir = dir,
pattern = "*.dcf",
use_cache = TRUE,
check_for_missing_packages = interactive()
)
Arguments
- dir
directory containing the dcf files for the datasets.
- pattern
optional regular expression that is used when finding files to read in. It defaults to all dcf files in the dir, but could be a single filename to test a metadata file.
- use_cache
whether to read data from the cache if available. If FALSE, then the data will be retrieved from the data function parameter.
- check_for_missing_packages
if TRUE you will be prompted to install missing packages.
- cache_dir
directory where rds data files will be stored.
Value
a data frame with the following fields:
- id
The filename of the dataset.
- title*
The name of the dataset from the dcf file.
- type*
Whether this is for a regression or classification model.
- description
Description of the dataset.
- source
The source of the dataset.
- reference
Reference for the dataset (APA format preferred).
- model*
The model formula used for the predictive model.
- note
Any additional information.
* denotes required fields.
Details
- name*
The name of the dataset.
- type*
Whether this is for a regression, classification, timeseries, or spatial model.
- description
Description of the dataset.
- source
The source of the dataset.
- reference
Reference for the dataset (APA format preferred).
- data*
An R function that returns a data.frame.
- model*
The formula used for the predictive model.
- note
Any additional information.
* denotes required fields.
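
As a rough illustration of the fields listed above, the sketch below writes a small dcf file and parses it with base R's read.dcf(). It only shows the file format and is not the package's own parsing code. The file name, the field values, and the use of the built-in mtcars data are all hypothetical.

```r
# Hypothetical metadata for one dataset; every value here is illustrative only.
dcf_lines <- c(
  "name: mtcars",
  "type: regression",
  "description: Motor Trend car road tests.",
  "source: Built-in R dataset.",
  "data: function() { mtcars }",
  "model: mpg ~ wt + hp",
  "note: Example entry for illustration."
)

# Write the metadata to a temporary dcf file.
dcf_file <- file.path(tempdir(), "mtcars.dcf")
writeLines(dcf_lines, dcf_file)

# read.dcf() returns a character matrix with one row per record
# and one column per field.
meta <- as.data.frame(read.dcf(dcf_file), stringsAsFactors = FALSE)

# Check that the fields marked with * are present.
required <- c("name", "type", "data", "model")
missing_fields <- setdiff(required, names(meta))

# Turn the text fields into R objects (assumption: the data field holds
# the source text of a function, and the model field a formula).
get_data <- eval(parse(text = meta$data))
model_formula <- as.formula(meta$model)
dataset <- get_data()
```

Here missing_fields is empty when all required fields are supplied, and dataset holds the data.frame returned by the data function.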