--- title: "Breakpoint Selection with Spike-and-Slab Priors" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Breakpoint Selection with Spike-and-Slab Priors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = NOT_CRAN) ``` ## When to use spike-and-slab When fitting multi-breakpoint models, a central question is: *how many change-points does the data actually support?* `smoothbp_ss()` addresses this with a **Kuo & Mallick (1998) spike-and-slab** prior on each slope-change coefficient $\delta_k$. Each $\delta_k$ is paired with a binary inclusion indicator $\gamma_k$: - **$\gamma_k = 0$ (spike)**: $\delta_k$ is structurally zeroed — that breakpoint has no slope change and does not contribute to the mean. - **$\gamma_k = 1$ (slab)**: $\delta_k$ follows a diffuse normal prior and participates in the model. The posterior mean of each $\gamma_k$ is the **posterior inclusion probability (PIP)** — the Bayesian probability that change-point $k$ is real. A PIP near 1 means strong evidence for that breakpoint; a PIP near 0 means the data are consistent with no structural break there. > **Tip**: If you already know where a breakpoint occurred (e.g., a policy change or an intervention), you don't need to estimate its location. Use the `fixed()` helper to test the intervention effect — see `vignette("intervention-analysis")`. --- ## Example 1: Identifying the number of real breakpoints ### Simulate data with one true breakpoint ```{r sim-one-bp} library(smoothbp) library(posterior) library(ggplot2) library(dplyr) library(tidyr) set.seed(42) n <- 200 tau <- seq(0, 10, length.out = n) # True model: one breakpoint at omega = 5, delta = -1.0 om_true <- 5 rho_true <- 4 b0_true <- 2 b1_true <- 0.5 delta_true <- -1.0 di <- tau - om_true si <- plogis(di * rho_true) mu <- b0_true + b1_true * di + delta_true * di * si y <- mu + rnorm(n, sd = 0.2) dat <- data.frame(y = y, tau = tau) ``` ### Fit a 3-breakpoint SS model We let the model have 3 candidate breakpoints but expect only one to receive a high PIP: ```{r fit-ss-one} fit_ss <- smoothbp_ss( formula = y ~ tau, b0 = ~ 1, b1 = ~ 1, deltas = list(~ 1, ~ 1, ~ 1), omega = list(~ 1, ~ 1, ~ 1), rho = list(~ 1, ~ 1, ~ 1), data = dat, spike = prior_spike_slab(pi = 0.1, learn_pi = TRUE), priors = smoothbp_priors( omega = space_omega_priors(K = 3, tau_min = 0, tau_max = 10) ), chains = 2, iter = 2000, warmup = 1000, seed = 42 ) ``` ### Inspect posterior inclusion probabilities ```{r pip-one} pip(fit_ss) #> gamma_b1_(Intercept) gamma_delta1_(Intercept) gamma_delta2_(Intercept) gamma_delta3_(Intercept) #> 1.00 0.06 0.98 0.04 # Full posterior summary (just the gammas) summarise_draws(fit_ss$draws) |> filter(grepl("^gamma_delta", variable)) ``` The second candidate breakpoint has a PIP near 1, matching the data-generating truth (omega = 5 maps to the second candidate position). The other two breakpoints are effectively excluded. --- ## Example 2: Regularization prevents overfitting We demonstrate that fitting a 3-BP model to 1-BP data does **not** overfit when using spike-and-slab priors. ### Visualise posterior inclusion probabilities ```{r pip-plot} # The plot method automatically computes and plots the PIPs along with their 95% HDI plot(pip(fit_ss)) ``` --- ## Example 3: Visualising model fit The spike-and-slab model allows us to perform model selection and estimation simultaneously. We can overlay the posterior mean trajectory and its 95% credible interval on the true data-generating function: ```{r fit-plot} # Get posterior predictions (mean and intervals) pred <- fitted(fit_ss) plot_df <- dat %>% mutate( mu_true = mu, # 'mu' is from the simulation step mu_fit = pred$fitted_mean, lo = pred$fitted_Q2.5, hi = pred$fitted_Q97.5 ) ggplot(plot_df, aes(x = tau)) + geom_point(aes(y = y), alpha = 0.3, size = 1) + geom_ribbon(aes(ymin = lo, ymax = hi), fill = "#2c7bb6", alpha = 0.2) + geom_line(aes(y = mu_true, colour = "Truth"), linewidth = 1) + geom_line(aes(y = mu_fit, colour = "smoothbp_ss"), linewidth = 0.8, linetype = "dashed") + scale_colour_manual(values = c("Truth" = "black", "smoothbp_ss" = "#d7191c")) + labs( title = "Model fit: Truth vs Posterior Predictions", subtitle = "The SS model correctly ignores the 2 redundant breakpoints and recovers the trajectory.", x = "Time (tau)", y = "Response (y)", colour = NULL ) + theme_minimal() ``` --- ## Example 4: Learning the inclusion probability When you have many candidate breakpoints, setting a fixed prior $\pi$ can be influential. Setting `learn_pi = TRUE` places a Beta hyperprior on a shared $\pi$ and lets the data inform the overall sparsity level. ```{r fit-learn-pi} fit_lpi <- smoothbp_ss( formula = y ~ tau, b0 = ~ 1, b1 = ~ 1, deltas = rep(list(~ 1), 5), # 5 candidate breakpoints omega = rep(list(~ 1), 5), rho = rep(list(~ 1), 5), data = dat, spike = prior_spike_slab( learn_pi = TRUE, a = 1, # Beta(1, 4) prior on pi: mean = 0.2 b = 4 ), chains = 2, iter = 3000, warmup = 1500, seed = 42 ) # Posterior for pi (sparsity level) pi_draws <- as.numeric(as_draws_matrix(fit_lpi$draws)[, "pi"]) quantile(pi_draws, c(0.025, 0.5, 0.975)) #> 2.5% 50% 97.5% #> 0.04 0.20 0.47 # Contracted toward sparsity: only ~1/5 breakpoints are real ``` --- ## Practical guidance ### Choosing the slab width The slab standard deviation controls the range of plausible non-zero slope changes. Match the slab width to your expected effect scale: | Expected slope change | Suggested slab | |-----------------------|----------------| | Small (< 0.5) | `prior_normal(0, 0.5)` | | Moderate (0.5–2) | `prior_normal(0, 1)` | | Large / uncertain | `prior_normal(0, 2)` | ### Interpreting PIPs | PIP | Interpretation | |-----|----------------| | > 0.95 | Strong evidence for that breakpoint | | 0.75–0.95 | Moderate evidence | | 0.25–0.75 | Inconclusive | | < 0.25 | Little evidence — breakpoint not needed | For formal model selection, the **median probability model** includes all breakpoints with PIP > 0.5. ### Diagnosing the sampler The gamma indicators should switch between 0 and 1 frequently during sampling. Check with: ```{r diag-ss} # Gamma trace plots — look for healthy switching trace_plot(fit_ss, pars = grep("^gamma_delta", posterior::variables(fit_ss$draws), value = TRUE)) ``` If a gamma is stuck at 1 across all iterations, consider: - Widening the slab (the prior may be too narrow to allow exclusion). - Checking the prior on omega (a badly-placed omega prior can pin the breakpoint in an implausible region). If a gamma is stuck at 0, the breakpoint is genuinely not supported by the data — this is the desired behaviour. --- ## Comparing `smoothbp_ss` against `smoothbp` with LOO Spike-and-slab and LOO are complementary model-selection tools: ```{r ss-vs-loo} # Fit explicit 0-BP and 1-BP models for context fit0 <- smoothbp(y ~ tau, deltas = list(), omega = list(), rho = list(), data = dat, chains = 2, iter = 2000, warmup = 1000) fit1 <- smoothbp(y ~ tau, deltas = list(~ 1), omega = list(~ 1), rho = list(~ 1), data = dat, chains = 2, iter = 2000, warmup = 1000) # LOO comparison: fixed-dimension vs spike-and-slab loo::loo_compare(loo(fit1), loo(fit_ss)) #> elpd_diff se_diff #> fit1 0.0 0.0 #> fit_ss -0.2 0.4 # Both models perform identically # PIP from the SS model agrees: pip(fit_ss)["gamma_delta2_(Intercept)",] #> 0.98 ``` Both approaches correctly identify that one breakpoint is present. --- ## References - Kuo, L. and Mallick, B. (1998). Variable selection for regression models. *Sankhya B*, 60(1):65--81. - Barbieri, M. M. and Berger, J. O. (2004). Optimal predictive model selection. *Annals of Statistics*, 32(3):870--897.