% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gbt.train.R
\name{gbt.train}
\alias{gbt.train}
\title{aGTBoost Training.}
\usage{
gbt.train(y, x, learning_rate = 0.01, loss_function = "mse",
  nrounds = 50000, verbose = 0, gsub_compare,
  algorithm = "global_subset", previous_pred = NULL, weights = NULL,
  force_continued_learning = FALSE, ...)
}
\arguments{
\item{y}{response vector for training. Must correspond to the design matrix \code{x}.}

\item{x}{design matrix for training. Must be of type \code{matrix}.}

\item{learning_rate}{controls the learning rate: the contribution of each tree is scaled by a factor \code{0 < learning_rate < 1} when it is added to the current approximation. A lower \code{learning_rate} requires more boosting iterations: the model becomes more robust to overfitting but slower to compute. Default: 0.01}

\item{loss_function}{specify the learning objective (loss function). Only pre-specified loss functions are currently supported.
\itemize{
\item \code{mse} regression with squared error loss (Default).
\item \code{logloss} logistic regression for binary classification, output score before logistic transformation.
\item \code{poisson} Poisson regression for count data using a log-link, output score on the log scale, before the exponential transformation.
\item \code{gamma::neginv} gamma regression using the canonical negative inverse link. Scaling independent of y.
\item \code{gamma::log} gamma regression using the log-link. Constant information parametrisation. 
\item \code{negbinom} Negative binomial regression for count data with overdispersion. Log-link.
\item \code{count::auto} Chooses automatically between Poisson and negative binomial regression.
}}

\item{nrounds}{upper bound on the number of boosting iterations, used as a safeguard. Default: 50000}

\item{verbose}{Enable boosting tracing information at i-th iteration? Default: \code{0}.}

\item{gsub_compare}{Deprecated. Boolean: Global-subset comparisons. \code{FALSE} means standard GTB, \code{TRUE} compares subset splits with global splits (the next root split). Default: \code{TRUE}.}

\item{algorithm}{specify the algorithm used for gradient tree boosting.
\itemize{
\item \code{vanilla} ordinary gradient tree boosting. Trees are optimized as if they were the last tree.
\item \code{global_subset} changes the function to target the maximal reduction in generalization loss for individual data points.
}}

\item{previous_pred}{prediction vector for training: boosting is performed on top of predictions from another model. Default: \code{NULL}.}

\item{weights}{weights vector for scaling contributions of individual observations. Default \code{NULL} (the unit vector).}

\item{force_continued_learning}{Boolean: \code{FALSE} (default) stops at the information stopping criterion, \code{TRUE} stops after \code{nrounds} iterations.}

\item{...}{additional parameters passed.
\itemize{
\item if \code{loss_function} is \code{negbinom}, \code{dispersion} must be provided in \code{...}
}}
}
\value{
An object of class \code{ENSEMBLE} with some or all of the following elements:
\itemize{
  \item \code{handle} a handle (pointer) to the \pkg{agtboost} model in memory.
  \item \code{initialPred} a field containing the initial prediction of the ensemble.
  \item \code{set_param} function for changing the parameters of the ensemble.
  \item \code{train} function for re-training the ensemble (or training it from scratch) directly on vector \code{y} and design matrix \code{x}.
  \item \code{predict} function for predicting observations given a design matrix.
  \item \code{predict2} function as \code{predict}, but takes an additional parameter giving the maximum number of boosting iterations to use.
  \item \code{estimate_generalization_loss} function for calculating the (approximate) optimism of the ensemble.
  \item \code{get_num_trees} function returning the number of trees in the ensemble.
}
}
\description{
\code{gbt.train} is an interface for training an \pkg{agtboost} model.
}
\details{
Training function for an \pkg{agtboost} model.

\code{gbt.train} learns trees with adaptive complexity given by an information criterion,
until the same (but scaled) information criterion tells the algorithm to stop. The data used
for training at each boosting iteration stem from a second-order Taylor expansion of the loss
function, evaluated at the predictions given by the ensemble at the previous boosting iteration.
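
As a sketch in generic notation (the symbols below are illustrative and are not taken from the package source), write \eqn{l} for the loss and \eqn{\hat{f}^{(k-1)}} for the ensemble after \eqn{k-1} iterations. The tree \eqn{f_k} added at iteration \eqn{k} is then fitted to the approximation
\deqn{l(y_i, \hat{f}^{(k-1)}(x_i) + f_k(x_i)) \approx l(y_i, \hat{f}^{(k-1)}(x_i)) + g_i f_k(x_i) + \frac{1}{2} h_i f_k(x_i)^2,}
where \eqn{g_i} and \eqn{h_i} are the first and second derivatives of the loss with respect to the prediction, evaluated at \eqn{\hat{f}^{(k-1)}(x_i)}.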
}
\examples{
## A simple gbt.train example with linear regression:
x <- runif(500, 0, 4)
y <- rnorm(500, x, 1)
x.test <- runif(500, 0, 4)
y.test <- rnorm(500, x.test, 1)

mod <- gbt.train(y, as.matrix(x))
y.pred <- predict( mod, as.matrix( x.test ) )

plot(x.test, y.test)
points(x.test, y.pred, col="red")
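
## A sketch of a larger learning rate, reusing the simulated data above
## (the value 0.1 is illustrative only, not a recommendation). By the
## description of learning_rate, fewer boosting iterations are expected:
mod.fast <- gbt.train(y, as.matrix(x), learning_rate = 0.1)
mod.fast$get_num_trees()  # number of trees kept by the stopping criterion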


}
\references{
Berent Ånund Strømnes Lunde, Tore Selland Kleppe and Hans Julius Skaug,
"An Information Criterion for Automatic Gradient Tree Boosting", 2020, 
\url{https://arxiv.org/abs/2008.05926}
}
\seealso{
\code{\link{predict.Rcpp_ENSEMBLE}}
}
