% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cqcheck.R
\name{cqcheck}
\alias{cqcheck}
\title{Visually checking a fitted quantile model}
\usage{
cqcheck(obj, v, X = NULL, y = NULL, nbin = c(10, 10), bound = NULL,
  lev = 0.05, scatter = FALSE, ...)
}
\arguments{
\item{obj}{the output of a \code{qgam} call.}

\item{v}{if a 1D plot is required, \code{v} should be either a single character string or a numeric vector. In the first case
\code{v} should be the name of one of the variables in the dataframe \code{X}. In the second case, the length
of \code{v} should be equal to the number of rows of \code{X}. If a 2D plot is required, \code{v} should be 
either a character vector of length two or a matrix with two columns.}

\item{X}{a dataframe containing the data used to obtain the conditional quantiles. By default it is \code{NULL}, in which
case predictions are made using the model frame stored in \code{obj$model}.}

\item{y}{vector of responses. Its i-th entry corresponds to the i-th row of \code{X}. By default it is \code{NULL}, in which
case it is internally set to \code{obj$y}.}

\item{nbin}{a vector of integers of length one (1D case) or two (2D case) indicating the number of bins to be used
in each direction. Used only if \code{bound} is \code{NULL}.}

\item{bound}{in the 1D case it is a numeric vector whose increasing entries represent the bounds of each bin.
In the 2D case a list of two vectors should be provided. \code{NULL} by default.}

\item{lev}{the significance level used in the plots; this determines the width of the confidence 
intervals. Default is 0.05.}

\item{scatter}{if TRUE a scatterplot is added (using the \code{points} function). FALSE by default.}

\item{...}{extra graphical parameters to be passed to \code{plot()}.}
}
\value{
Simply produces a plot.
}
\description{
Given an additive quantile model, fitted using \code{qgam}, \code{cqcheck} provides some plots
             that allow one to check what proportion of responses, \code{y}, fall below the fitted quantile.
}
\details{
Having fitted an additive model for, say, quantile \code{qu=0.4}, one would expect that about 40\% of the 
         responses fall below the fitted quantile. This function allows one to visually compare the empirical proportion
         of responses (\code{qu_hat}) falling below the fit with its theoretical value (\code{qu}). In particular, 
         the responses are binned, with the bins being constructed along one or two variables (given by argument
         \code{v}). Let \code{qu_hat[i]} be the proportion of responses below the fitted quantile in the i-th bin.
         This should be approximately equal to \code{qu}, for every i. In the 1D case, when \code{v} is a single
         character string or a numeric vector, \code{cqcheck} provides a plot where the horizontal line is \code{qu}, 
         the dots correspond to \code{qu_hat[i]} and the grey lines are confidence intervals for \code{qu}. The
         confidence intervals are based on \code{qbinom(lev/2, siz, qu)}, where \code{siz} is the number of responses
         in the bin. If the dots fall outside them, then \code{qu_hat[i]} might be deviating too much from \code{qu}.
         In the 2D case, when \code{v} is a character vector of length two or a matrix with two columns, we plot a
         grid of bins. The responses are divided between the bins as before, but now we do not plot the confidence
         intervals. Instead we report the empirical proportions \code{qu_hat[i]} for the non-empty bins, and we colour
         the bins in red if \code{qu_hat[i]<qu} and in green otherwise. If \code{qu_hat[i]} falls outside the
         confidence intervals we put an * next to the numeric \code{qu_hat[i]} and we use more intense colours.
}
\examples{
#######
# Bivariate additive model y~1+x+x^2+z+x*z/2+e, e~N(0, 1)
#######
\dontrun{
library(qgam)
set.seed(15560)
n <- 500
x <- rnorm(n, 0, 1); z <- rnorm(n)
X <- cbind(1, x, x^2, z, x*z)
beta <- c(0, 1, 1, 1, 0.5)
y <- drop(X \%*\% beta) + rnorm(n) 
dataf <- data.frame(cbind(y, x, z))
names(dataf) <- c("y", "x", "z")

#### Fit a constant model for median
qu <- 0.5
fit <- qgam(y~1, qu = qu, data = dataf)

# Look at what happens along x: clearly there is a non-linear pattern here
cqcheck(obj = fit, v = c("x"), X = dataf, y = y) 

#### Add a smooth for x
fit <- qgam(y~s(x), qu = qu, err = 0.05, data = dataf)
cqcheck(obj = fit, v = c("x"), X = dataf, y = y) # Better!

# Let us look across x and z. As we move along z (x2 in the plot) 
# the colour changes from green to red
cqcheck(obj = fit, v = c("x", "z"), X = dataf, y = y, nbin = c(5, 5))

# The effect looks pretty linear
cqcheck(obj = fit, v = c("z"), X = dataf, y = y, nbin = c(10))

#### Let us add a linear effect for z 
fit <- qgam(y~s(x)+z, qu = qu, data = dataf)

# Looks better!
cqcheck(obj = fit, v = c("z"))

# Let us look across x and z again: green prevails on the top-left to bottom-right
# diagonal, while the other diagonal is mainly red.
cqcheck(obj = fit, v = c("x", "z"), nbin = c(5, 5))

### Maybe adding an interaction would help?
fit <- qgam(y~s(x)+z+I(x*z), qu = qu, data = dataf)

# It does! The real model is: y ~ 1 + x + x^2 + z + x*z/2 + e, e ~ N(0, 1)
cqcheck(obj = fit, v = c("x", "z"), nbin = c(5, 5))
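
# A rough manual version of the 1D check described in Details, offered as a
# sketch rather than as the internal code of cqcheck. Assumptions: equal-width
# bins along x (via cut()), a two-sided binomial interval, and every bin
# being non-empty. The names below, bins, lower and upper are illustrative.
lev <- 0.05
below <- dataf$y < as.numeric(predict(fit, newdata = dataf)) # below the fitted median?
bins <- cut(dataf$x, breaks = 10)               # bin the responses along x
siz <- as.numeric(table(bins))                  # number of responses in each bin
qu_hat <- as.numeric(tapply(below, bins, mean)) # proportion below the fit, per bin
lower <- qbinom(lev/2, siz, qu) / siz           # binomial bounds, on the proportion scale
upper <- qbinom(1 - lev/2, siz, qu) / siz
round(cbind(siz, qu_hat, lower, upper), 2)      # qu_hat outside [lower, upper] is suspicious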
}

}
\author{
Matteo Fasiolo <matteo.fasiolo@gmail.com>.
}
