---
title: "8 State-space models"
author: "Jan-Ole Fischer"
output: rmarkdown::html_vignette
# output: pdf_document
vignette: >
  %\VignetteIndexEntry{State space models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: refs.bib
link-citations: yes
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  # fig.path = "img/",
  fig.align = "center",
  fig.dim = c(8, 6),
  out.width = "85%"
)
```

This vignette has not yet been updated to work with RTMB.

> Before diving into this vignette, we recommend reading the vignette [**Introduction to LaMa**](https://janolefi.github.io/LaMa/articles/Intro_to_LaMa.html).

This vignette shows how to fit state-space models, which can be interpreted as a generalisation of hidden Markov models (HMMs) to continuous state spaces. Several approaches to fitting such models exist, but @langrock2011some showed that very **general state-space models** can be fitted via approximate maximum likelihood estimation when the continuous state space is **finely discretised**. This is equivalent to numerical integration over the state process using midpoint quadrature. Here, we showcase this approach for a basic **stochastic volatility model** which, despite its simplicity, captures most of the stylised facts of financial time series.

In this model, the unobserved market log-volatility $g_t$ is described by an AR(1) process:
$$
g_t = \phi g_{t-1} + \sigma \eta_t, \qquad \eta_t \sim N(0,1),
$$
with autoregression parameter $|\phi| < 1$ and dispersion parameter $\sigma > 0$. Share returns $y_t$ can then be modelled as
$$
y_t = \beta \epsilon_t \exp(g_t / 2),
$$
where $\epsilon_t \sim N(0,1)$ and $\beta > 0$ is the baseline standard deviation of the returns (attained when $g_t$ is at its equilibrium value of zero), which implies
$$
y_t \mid g_t \sim N(0, (\beta e^{g_t / 2})^2).
$$

### Simulating data from the stochastic volatility model

We start by simulating data from the model specified above:

```{r, setup}
# loading the package
library(LaMa)
```

```{r data}
beta = 2 # baseline standard deviation
phi = 0.95 # AR parameter of the log-volatility process
sigma = 0.5 # variability of the log-volatility process

n = 1000
set.seed(123)

g = rep(NA, n)
g[1] = rnorm(1, 0, sigma / sqrt(1-phi^2)) # stationary distribution of AR(1) process
for(t in 2:n){
  # sampling next state based on previous state and AR(1) equation
  g[t] = rnorm(1, phi*g[t-1], sigma)
}

# sampling zero-mean observations with standard deviation given by latent process
y = rnorm(n, 0, beta * exp(g/2)) # share returns

oldpar = par(mar = c(5,4,3,4.5)+0.1)
plot(y, type = "l", bty = "n", ylim = c(-40,20), yaxt = "n")
# true underlying standard deviation beta * exp(g/2), shifted down by 40 for display
lines(beta*exp(g/2) - 40, col = "deepskyblue", lwd = 2)
axis(side=2, at = seq(-20,20,by=5), labels = seq(-20,20,by=5))
axis(side=4, at = seq(0,20,by=10)-40, labels = seq(0,20,by=10))
mtext("standard deviation", side=4, line=3, at = -30)
par(oldpar)
```

### Writing the negative log-likelihood function

To calculate the likelihood, we need to integrate over the state space. We approximate this high-dimensional integral using midpoint quadrature, which ultimately amounts to a fine discretisation of the continuous state space into `m` intervals with boundaries `b`, width `h` and midpoints `bstar` [@langrock2011some]. The probability of moving from interval $i$ to interval $j$ is then approximated by `h` times the AR(1) transition density evaluated at the corresponding midpoints. Thus, the likelihood below corresponds to a basic HMM likelihood with a large number of states.

```{r mllk}
nll = function(par, y, bm, m){
  # transforming the unconstrained parameters to their natural scale
  phi = plogis(par[1]) # AR parameter, constrained to (0,1)
  sigma = exp(par[2])
  beta = exp(par[3])
  b = seq(-bm, bm, length = m+1) # interval boundaries for midpoint quadrature
  h = b[2] - b[1] # interval width
  bstar = (b[-1] + b[-(m+1)]) / 2 # interval midpoints
  # approximating t.p.m. resulting from midpoint quadrature
  Gamma = sapply(bstar, dnorm, mean = phi * bstar, sd = sigma) * h
  delta = h * dnorm(bstar, 0, sigma / sqrt(1-phi^2)) # stationary distribution
  # approximating state-dependent density based on midpoints
  allprobs = t(sapply(y, dnorm, mean = 0, sd = beta * exp(bstar/2)))
  # forward algorithm
  -forward(delta, Gamma, allprobs)
}
```

### Fitting an SSM to the data

```{r model, warning=FALSE, cache = TRUE}
par = c(qlogis(0.95), log(0.3), log(1))
bm = 5 # relevant range of the underlying log-volatility g: (-5,5)
m = 100 # number of approximating states

system.time(
  mod <- nlm(nll, par, y = y, bm = bm, m = m)
)
```

### Results

```{r results}
## parameter estimates
(phi = plogis(mod$estimate[1]))
(sigma = exp(mod$estimate[2]))
(beta = exp(mod$estimate[3]))

## decoding states
b = seq(-bm, bm, length = m+1) # interval boundaries for midpoint quadrature
h = b[2]-b[1] # interval width
bstar = (b[-1] + b[-(m+1)])/2 # interval midpoints
Gamma = sapply(bstar, dnorm, mean = phi*bstar, sd = sigma) * h
delta = h * dnorm(bstar, 0, sigma/sqrt(1-phi^2)) # stationary distribution
# approximating state-dependent density based on midpoints
allprobs = t(sapply(y, dnorm, mean = 0, sd = beta * exp(bstar/2)))

# actual decoding
probs = stateprobs(delta, Gamma, allprobs) # local/ soft decoding
states = viterbi(delta, Gamma, allprobs) # global/ hard decoding

oldpar = par(mar = c(5,4,3,4.5)+0.1)
plot(y, type = "l", bty = "n", ylim = c(-50,20), yaxt = "n")
# when there are so many states it is not too sensible to only plot the most probable state,
# as its probability might still be very small. Generally, we are approximating continuous
# distributions, thus it makes sense to plot the entire conditional distribution.
maxprobs = apply(probs, 1, max)
for(t in 1:nrow(probs)){
  # two-digit alpha suffix for the colour, proportional to the state probability
  colend = round((probs[t,]/(maxprobs[t]*5))*100)
  colend[which(colend<10)] = paste0("0", colend[which(colend<10)])
  points(rep(t, m), bstar*4-35, col = paste0("#FFA200",colend), pch = 20)
}
# we can add the viterbi decoded volatility levels as a "mean"
lines(bstar[states]*4-35)

axis(side=2, at = seq(-20,20,by=5), labels = seq(-20,20,by=5))
axis(side=4, at = seq(-5,5, by = 5)*4-35, labels = seq(-5,5, by = 5))
mtext("g", side=4, line=3, at = -30)
par(oldpar)
```

## References