Transfer Function Model for ts_inter objects
This is the main model-fitting function in the mbtransfer package. Given an object of class ts_inter (see ts_from_dfs()), it fits a collection of linear gradient boosting-based transfer function models. The output is an object of class mbtransfer_model. Each component boosting model is contained in the @parameters slot, a list whose j^th element is the model for the j^th taxon (row) within each ts's values matrix.
Usage
mbtransfer(
  ts_inter,
  P = 1,
  Q = 1,
  nrounds = 500,
  early_stopping_rounds = 5,
  verbose = 0,
  lambda = 0.1,
  alpha = 0.01,
  eta = 0.1,
  interactions = "search",
  nthread = -1,
  ...
)
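The defaults shown above can be overridden directly in the call. A minimal sketch, assuming the package is attached and using the bundled sim_ts data that also appears in the Examples section below:

library(mbtransfer)
data(sim_ts)

# Two lags of community and intervention history, fewer boosting rounds,
# and a smaller learning rate than the defaults.
fit <- mbtransfer(sim_ts, P = 2, Q = 2, nrounds = 200, eta = 0.05)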
Arguments
- ts_inter
An object of class ts_inter containing the time-varying microbiome community, environmental intervention, and static host feature data. The columns of each element's values matrix are expected to be consecutive timepoints in a changing community. ts_from_dfs() is a helper for creating these objects from data.frames with the relevant information.
- P
The number of historical community composition timepoints to consider when making predictions. (The sketch after this list illustrates how P and Q define the lagged predictors.)
- Q
The number of historical intervention timepoints to consider when making predictions.
- nrounds
The maximum number of rounds for each taxon's gradient boosting model. Smaller values will lead to faster training, but at the risk of poorer fits. Defaults to 500.
- early_stopping_rounds
If the loss function does not improve after this many rounds, then the model is assumed to have converged and training is stopped. Defaults to 5.
- verbose
Should information about each gradient boosting model's performance be printed? Allowable values are 2 (all information), 1 (some information), and 0 (no information, default).
- lambda
The l2-regularization value in the linear gradient boosting model. Defaults to 1e-1.
- alpha
The l1-regularization value in the linear gradient boosting model. Defaults to 1e-2. This relatively small value generally leads to less sparse fits, which preserves useful variation for potential downstream mirror statistic calculations.
- eta
The learning rate. Defaults to 0.1. This is slower than the xgboost default (0.3) but has been found to improve stability when training on taxa with a wide range of abundances.
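To make the roles of P and Q concrete, the sketch below assembles the lagged predictors for a single timepoint by hand. This is an illustration only, not the package's internal feature construction; y (taxa x timepoints) and x (interventions x timepoints) are hypothetical stand-ins for one ts element's community and intervention matrices.

# Hypothetical illustration: the predictors for timepoint t are the P
# previous community columns and the Q previous intervention columns.
lagged_predictors <- function(y, x, t, P = 1, Q = 1) {
  c(
    as.vector(y[, (t - P):(t - 1), drop = FALSE]),  # past community states
    as.vector(x[, (t - Q):(t - 1), drop = FALSE])   # past interventions
  )
}

y <- matrix(rpois(5 * 4, lambda = 10), nrow = 5)       # 5 taxa, 4 timepoints
x <- matrix(rbinom(4, size = 1, prob = 0.5), nrow = 1) # 1 intervention
lagged_predictors(y, x, t = 3, P = 2, Q = 1)           # 5 * 2 + 1 * 1 = 11 values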
Examples
data(sim_ts)
fit <- mbtransfer(sim_ts)
fit@parameters[[1]]
#> ##### xgb.Booster
#> call:
#> xgb.train(params = params, data = dtrain, nrounds = nrounds,
#> evals = evals, verbose = verbose, print_every_n = print_every_n,
#> early_stopping_rounds = early_stopping_rounds, maximize = maximize,
#> save_period = save_period, save_name = save_name, xgb_model = xgb_model,
#> callbacks = callbacks, booster = "gblinear", alpha = ..2,
#> lambda = ..3, eta = ..4, nthread = ..5)
#> # of features: 194
#> # of rounds: 500
#> xgb.attributes:
#> best_iteration, best_score
#> callbacks:
#> early_stop, evaluation_log
#> evaluation_log:
#> iter train_rmse
#> <int> <num>
#> 1 4.693780
#> 2 4.659525
#> ---
#> 499 4.284943
#> 500 4.284925
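Each element of @parameters is an ordinary xgb.Booster fit with the gblinear base learner (visible in the printed call above), so standard xgboost tooling applies. A hedged follow-up, assuming xgboost is attached and continuing from the fit above:

library(xgboost)

# One booster per taxon; for a gblinear booster, xgb.importance() reports
# the fitted coefficients as feature weights.
length(fit@parameters)
xgb.importance(model = fit@parameters[[1]])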