Title: | Reluctant Generalized Additive Models |
---|---|
Description: | A method for fitting the entire regularization path of the reluctant generalized additive model (RGAM) for linear regression, logistic, Poisson and Cox regression models. See Tay, J. K., and Tibshirani, R., (2019) <arXiv:1912.01808> for details. |
Authors: | Kenneth Tay, Robert Tibshirani |
Maintainer: | Kenneth Tay <[email protected]> |
License: | GPL-3 |
Version: | 1.1 |
Built: | 2025-03-05 03:47:21 UTC |
Source: | https://github.com/kjytay/relgam |
Does k
-fold cross-validation for rgam
.
cv.rgam( x, y, lambda = NULL, family = c("gaussian", "binomial", "poisson", "cox"), offset = NULL, init_nz, gamma, nfolds = 10, foldid = NULL, keep = FALSE, parallel = FALSE, verbose = TRUE, ... )
cv.rgam( x, y, lambda = NULL, family = c("gaussian", "binomial", "poisson", "cox"), offset = NULL, init_nz, gamma, nfolds = 10, foldid = NULL, keep = FALSE, parallel = FALSE, verbose = TRUE, ... )
x |
Input matrix, of dimension |
y |
Response |
lambda |
A user-supplied |
family |
Response type. Either |
offset |
Offset vector as in |
init_nz |
A vector specifying which features we must include when computing the non-linear features. Default is to construct non-linear features for all given features. |
gamma |
Scale factor for non-linear features (vs. original features),
to be between 0 and 1. Default is 0.8 if |
nfolds |
Number of folds for CV (default is 10). Although |
foldid |
An optional vector of values between 1 and |
keep |
If |
parallel |
If TRUE, use parallel foreach to fit each fold. Must
register parallel before hand, such as doMC or others. Note that this also
passes |
verbose |
Print information as model is being fit? Default is
|
... |
Other arguments that can be passed to |
The function runs rgam
nfolds+1 times; the first to get the lambda
sequence, and then the remainder to compute the fit with each of the folds
omitted. The error is accumulated, and the average error and standard
deviation over the folds is computed.
Note that cv.rgam
only does cross-validation for lambda but not for
the degrees of freedom hyperparameter.
An object of class "cv.rgam"
.
glmfit |
A fitted |
lambda |
The values of |
nzero_feat |
The number of non-zero features for the model |
nzero_lin |
The number of non-zero linear components for the model
|
nzero_nonlin |
The number of non-zero non-linear components for the
model |
fit.preval |
If |
cvm |
The mean cross-validated error: a vector of length
|
cvse |
Estimate of standard error of |
cvlo |
Lower curve = |
cvup |
Upper curve = |
lambda.min |
The value of |
lambda.1se |
The largest value of |
foldid |
If |
name |
Name of error measurement used for CV. |
call |
The call that produced this object. |
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) # specify number of folds cvfit <- cv.rgam(x, y, nfolds = 5)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) # specify number of folds cvfit <- cv.rgam(x, y, nfolds = 5)
Returns the additive component of the RGAM model for a given feature at given data points, i.e. f_j(X_j).
getf(object, x, j, index)
getf(object, x, j, index)
object |
Fitted |
x |
Data for which we want the additive component. If |
j |
The index of the original feature whose additive component we want. |
index |
Index of lambda value for which plotting is desired. Default is
the last lambda value in |
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # get the additive component for the feature 6, x as matrix f6 <- getf(fit, x, 6) # last value of lambda plot(x[, 6], f6) f6 <- getf(fit, x, 6, index = 20) # f1 at 20th value of lambda plot(x[, 6], f6) # get the additive component for the feature 6, x as vector new_x6 <- seq(-1, 1, length.out = 30) new_f6 <- getf(fit, new_x6, 6) # last value of lambda plot(new_x6, new_f6)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # get the additive component for the feature 6, x as matrix f6 <- getf(fit, x, 6) # last value of lambda plot(x[, 6], f6) f6 <- getf(fit, x, 6, index = 20) # f1 at 20th value of lambda plot(x[, 6], f6) # get the additive component for the feature 6, x as vector new_x6 <- seq(-1, 1, length.out = 30) new_f6 <- getf(fit, new_x6, 6) # last value of lambda plot(new_x6, new_f6)
Internal function for making non-linear features.
makef(x, r, df = 4, tol = 0.01, removeLin = T)
makef(x, r, df = 4, tol = 0.01, removeLin = T)
x |
Input vector of length |
r |
Vector of residuals. |
df |
Degrees of freedom for the fit. Default is 4. |
tol |
A tolerance for same-ness or uniqueness of the x values. To be
passed to the |
removeLin |
If |
A list:
f |
Non-linear feature associated with |
nl_predictor |
A function which, when given new data |
Given a vector of true outcomes and a vector of predictions, returns a list containing performance measures.
myroc(ytest, rit, N = 100)
myroc(ytest, rit, N = 100)
ytest |
True test outcome: vector of 0s and 1s. |
rit |
Predictions for the true outcome. Should be vector of continuous variables between 0 and 1. |
N |
Number of breakpoints where we evaluate the performance measures. Default is 100. |
We currently evaluate the performance measures at 100 quantiles of the
predicted values; this can be adjusted via the N
option.
A list of performance measures and intermediate computations.
sens |
Vector of sensitivity values. |
spec |
Vector of specificity values. |
ppv |
Vector of PPV values. |
npv |
Vector of NPV values |
area |
Area under ROC curve (AUC). |
se |
Standard error for AUC. |
cutp |
Cut points at which the performance measures were computed. |
cutp.max |
Cut point which maximizes (sens + spec) / 2. |
Plots the cross-validation curve produced by a cv.rgam
object, along
with upper and lower standard deviation curves, as a function of the lambda
values used. The plot also shows the number of non-zero features picked for
each value of lambda.
## S3 method for class 'cv.rgam' plot(x, sign.lambda = 1, ...)
## S3 method for class 'cv.rgam' plot(x, sign.lambda = 1, ...)
x |
Fitted " |
sign.lambda |
Either plot against |
... |
Other graphical parameters to plot. |
A plot is produced and nothing is returned.
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) plot(cvfit)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) plot(cvfit)
Produces plots of the estimated functions for specified variables at a given value of lambda.
## S3 method for class 'rgam' plot( x, newx, index, which = NULL, rugplot = TRUE, grid_length = 100, names, ... )
## S3 method for class 'rgam' plot( x, newx, index, which = NULL, rugplot = TRUE, grid_length = 100, names, ... )
x |
Fitted |
newx |
Matrix of values of each predictor at which to plot. |
index |
Index of lambda value for which plotting is desired. Default is
the last lambda value in |
which |
Which features to plot. Default is the first 4 or |
rugplot |
If |
grid_length |
The number of points to evaluate the estimated function at. Default is 100. |
names |
Vector of variable names of features in |
... |
Optional graphical parameters to plot. |
A plot of the specified fitted functions is produced. Nothing is returned.
set.seed(1) n <- 100; p <- 12 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 3), rep(0, 9)), ncol = 1) y <- x %*% beta + x[, 4]^2 + rnorm(n) fit <- rgam(x, y) # default: print functions for first 4 variables opar <- par(mfrow = c(2, 2)) plot(fit, newx = x, index = 20) par(opar) # print for variables 5 to 8 opar <- par(mfrow = c(2, 2)) plot(fit, newx = x, index = 20, which = 5:8) par(opar)
set.seed(1) n <- 100; p <- 12 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 3), rep(0, 9)), ncol = 1) y <- x %*% beta + x[, 4]^2 + rnorm(n) fit <- rgam(x, y) # default: print functions for first 4 variables opar <- par(mfrow = c(2, 2)) plot(fit, newx = x, index = 20) par(opar) # print for variables 5 to 8 opar <- par(mfrow = c(2, 2)) plot(fit, newx = x, index = 20, which = 5:8) par(opar)
This function returns the predictions for a new data matrix from a
cross-validated rgam
model by using the stored "glmfit
"
object and the optimal value chosen for lambda
.
## S3 method for class 'cv.rgam' predict(object, xnew, s = c("lambda.1se", "lambda.min"), ...)
## S3 method for class 'cv.rgam' predict(object, xnew, s = c("lambda.1se", "lambda.min"), ...)
object |
Fitted " |
xnew |
Matrix of new values for |
s |
Value of the penalty parameter |
... |
Other arguments to be passed to |
This function makes it easier to use the results of cross-validation to make a prediction.
Predictions which the cross-validated model makes for xnew
at
the optimal value of lambda
. Note that the default is the "lambda.1se"
for lambda, to make this function consistent with cv.glmnet
in the
glmnet
package.
The output depends on the ...
argument which is passed on to the predict
method for rgam
objects.
cv.rgam
and predict.rgam
.
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) # predictions at the lambda.1se value predict(cvfit, xnew = x[1:5, ]) # predictions at the lambda.min value predict(cvfit, xnew = x[1:5, ], s = "lambda.min") # predictions at specific lambda value predict(cvfit, xnew = x[1:5, ], s = 0.1) # probability predictions for binomial family bin_y <- ifelse(y > 0, 1, 0) cvfit2 <- cv.rgam(x, bin_y, family = "binomial") predict(cvfit2, xnew = x[1:5, ], type = "response", s = "lambda.min")
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) # predictions at the lambda.1se value predict(cvfit, xnew = x[1:5, ]) # predictions at the lambda.min value predict(cvfit, xnew = x[1:5, ], s = "lambda.min") # predictions at specific lambda value predict(cvfit, xnew = x[1:5, ], s = 0.1) # probability predictions for binomial family bin_y <- ifelse(y > 0, 1, 0) cvfit2 <- cv.rgam(x, bin_y, family = "binomial") predict(cvfit2, xnew = x[1:5, ], type = "response", s = "lambda.min")
This function returns the predictions from a "rgam
" object
for a new data matrix.
## S3 method for class 'rgam' predict(object, xnew, ...)
## S3 method for class 'rgam' predict(object, xnew, ...)
object |
Fitted " |
xnew |
Matrix of new values for |
... |
Any other arguments to be passed to |
Predictions of which the model object
makes at
xnew
. The type of predictions depends on whether a type
argument
is passed. By default it givs the linear predictors for the regression model.
If an offset is used in the fit, then one must be supplied via the
newoffset
option.
rgam
.
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # predict for full lambda path predict(fit, xnew = x[1:5, ]) # predict for specific lambda values predict(fit, xnew = x[1:5, ], s = 0.1) # predictions for binomial family bin_y <- ifelse(y > 0, 1, 0) fit2 <- rgam(x, bin_y, family = "binomial") # linear predictors predict(fit2, xnew = x[1:5, ], s = 0.05) # probabilities predict(fit2, xnew = x[1:5, ], type = "response", s = 0.05)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # predict for full lambda path predict(fit, xnew = x[1:5, ]) # predict for specific lambda values predict(fit, xnew = x[1:5, ], s = 0.1) # predictions for binomial family bin_y <- ifelse(y > 0, 1, 0) fit2 <- rgam(x, bin_y, family = "binomial") # linear predictors predict(fit2, xnew = x[1:5, ], s = 0.05) # probabilities predict(fit2, xnew = x[1:5, ], type = "response", s = 0.05)
Print a summary of the results of cross-validation for a RGAM model.
## S3 method for class 'cv.rgam' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'cv.rgam' print(x, digits = max(3, getOption("digits") - 3), ...)
x |
Fitted |
digits |
Significant digits in printout. |
... |
Additional print arguments. |
The call that produced the object x is printed, followed by some information
on the performance for lambda.min
and lambda.1se
.
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) print(cvfit)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) cvfit <- cv.rgam(x, y) print(cvfit)
Print a summary of the rgam path at each step along the path.
## S3 method for class 'rgam' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'rgam' print(x, digits = max(3, getOption("digits") - 3), ...)
x |
Fitted |
digits |
Significant digits in printout. |
... |
Additional print arguments. |
The call that produced the object x is printed, followed by a five-column matrix with columns NonZero, Lin, NonLin, columns say how many nonzero, linear and nonlinear terms there are. the percent deviance explained (relative to the null deviance).
The matrix above is silently returned.
rgam
.
set.seed(1) n <- 100; p <- 12 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 3), rep(0, 9)), ncol = 1) y <- x %*% beta + x[, 4]^2 + rnorm(n) fit <- rgam(x, y) print(fit)
set.seed(1) n <- 100; p <- 12 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 3), rep(0, 9)), ncol = 1) y <- x %*% beta + x[, 4]^2 + rnorm(n) fit <- rgam(x, y) print(fit)
Fits a reluctant generalized additive model (RGAM) for an entire regularization
path indexed by the parameter lambda
. Fits linear, logistic, Poisson
and Cox regression models. RGAM is a three-step algorithm: Step 1 fits the
lasso and computes residuals, Step 2 constructs the non-linear features, and
Step 3 fits a lasso of the response on both the linear and non-linear features.
rgam( x, y, lambda = NULL, lambda.min.ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04), standardize = TRUE, nl_maker = relgam:::makef, family = c("gaussian", "binomial", "poisson", "cox"), offset = NULL, init_nz, nfolds = 5, foldid = NULL, gamma, parallel = FALSE, verbose = TRUE, ... )
rgam( x, y, lambda = NULL, lambda.min.ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04), standardize = TRUE, nl_maker = relgam:::makef, family = c("gaussian", "binomial", "poisson", "cox"), offset = NULL, init_nz, nfolds = 5, foldid = NULL, gamma, parallel = FALSE, verbose = TRUE, ... )
x |
Input matrix, of dimension |
y |
Response variable. Quantitative for |
lambda |
A user-supplied |
lambda.min.ratio |
Smallest value for lambda as a fraction of the largest lambda value. The default depends on the sample size nobs relative to the number of variables nvars. If nobs > nvars, the default is 0.0001, close to zero. If nobs < nvars, the default is 0.01. |
standardize |
If |
nl_maker |
This is a function that takes in one feature |
family |
Response type. Either |
offset |
A vector of length nobs. Useful for the "poisson" family (e.g. log of exposure time), or for refining a model by starting at a current fit. Default is NULL. If supplied, then values must also be supplied to the predict function. |
init_nz |
A vector specifying which features we must include when computing the non-linear features. Default is to construct non-linear features for all given features. |
nfolds |
Number of folds for CV in Step 1 (default is 5). Although
|
foldid |
An optional vector of values between 1 and |
gamma |
Scale factor for non-linear features (vs. original features), to
be between 0 and 1. Default is 0.8 if |
parallel |
If TRUE, the |
verbose |
If |
... |
Any additional arguments to be the non-linear fitter in Step 2. |
If there are variables which the user definitely wants to compute non-linear
versions for in Step 2 of the algorithm, they can be specified as a vector for
the init_nz
option. The algorithm will compute non-linear versions for
these features as well as the features suggested by Step 1 of the algorithm.
If standardize = TRUE
, the standard deviation of the linear and
non-linear features would be 1 and gamma respectively. If
standardize = FALSE
, linear features will remain on their original
scale while non-linear features would have standard deviation gamma times
the mean standard deviation of the linear features.
For family="gaussian"
, rgam
standardizes y
to have unit
variance (using 1/n
rather than 1/(n-1)
formula).
An object of class "rgam"
.
full_glmfit |
The glmnet object resulting from Step 3: fitting a |
nl_predictor |
List of functions used to get the non-linear features for new data. For internal use only. |
init_nz |
Column indices for the features which we allow to have non-linear relationship with the response. |
step1_nz |
Indices of features which CV in Step 1 chose. |
mxf |
Means of the features (both linear and non-linear). |
sxf |
Scale factors of the features (both linear and non-linear). |
feat |
Column indices of the non-zero features for each value of
|
linfeat |
Column indices of the non-zero linear components for each value of
|
nonlinfeat |
Column indices of the non-zero non-linear components for each value
of |
nzero_feat |
The number of non-zero features for each value of
|
nzero_lin |
The number of non-zero linear components for each value of
|
nzero_nonlin |
The number of non-zero non-linear components for each value
of |
lambda |
The actual sequence of |
p |
The number of features in the original data matrix. |
family |
Response type. |
call |
The call that produced this object. |
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # construct non-linear features for only those selected by Step 1 fit <- rgam(x, y, init_nz = c()) # specify scale factor gamma and degrees of freedom fit <- rgam(x, y, gamma = 1, df = 6) # binomial family bin_y <- ifelse(y > 0, 1, 0) fit2 <- rgam(x, bin_y, family = "binomial") # Poisson family poi_y <- rpois(n, exp(x %*% beta)) fit3 <- rgam(x, poi_y, family = "poisson") # Poisson with offset offset <- rnorm(n) fit3 <- rgam(x, poi_y, family = "poisson", offset = offset)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) # construct non-linear features for only those selected by Step 1 fit <- rgam(x, y, init_nz = c()) # specify scale factor gamma and degrees of freedom fit <- rgam(x, y, gamma = 1, df = 6) # binomial family bin_y <- ifelse(y > 0, 1, 0) fit2 <- rgam(x, bin_y, family = "binomial") # Poisson family poi_y <- rpois(n, exp(x %*% beta)) fit3 <- rgam(x, poi_y, family = "poisson") # Poisson with offset offset <- rnorm(n) fit3 <- rgam(x, poi_y, family = "poisson", offset = offset)
Makes a two-panel plot of the rgam object showing coefficient paths.
## S3 method for class 'rgam' summary(object, label = FALSE, index = NULL, which = NULL, ...)
## S3 method for class 'rgam' summary(object, label = FALSE, index = NULL, which = NULL, ...)
object |
Fitted |
label |
If |
index |
The indices of the lambda hyperparameter which we want the plot for. The default is to plot for the entire lambda path. |
which |
Which values to plot. Default is all variables. |
... |
Additional arguments to summary. |
A two panel plot is produced, that summarizes the linear components and the nonlinear components, as a function of lambda. For the linear components, it is the coefficient for each variable. For the nonlinear components, it is the coefficient of the non-linear variable. Nothing is returned.
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) opar <- par(mfrow = c(1, 2)) summary(fit) par(opar) # with labels, just variables 1 to 5 opar <- par(mfrow = c(1, 2)) summary(fit, label = TRUE, which = 1:5) par(opar) # as above, but just the first 30 values of lambda opar <- par(mfrow = c(1, 2)) summary(fit, label = TRUE, which = 1:5, index = 1:30) par(opar)
set.seed(1) n <- 100; p <- 20 x <- matrix(rnorm(n * p), n, p) beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1) y <- x %*% beta + rnorm(n) fit <- rgam(x, y) opar <- par(mfrow = c(1, 2)) summary(fit) par(opar) # with labels, just variables 1 to 5 opar <- par(mfrow = c(1, 2)) summary(fit, label = TRUE, which = 1:5) par(opar) # as above, but just the first 30 values of lambda opar <- par(mfrow = c(1, 2)) summary(fit, label = TRUE, which = 1:5, index = 1:30) par(opar)