Skip to contents

Generates replicated outcome samples \(y^{rep}\) from a matrix of posterior draws under a specified likelihood family. When a design matrix X is supplied the linear predictor \(\eta = X \beta\) is computed first and the appropriate inverse-link is applied.

Usage

simulate_ppc(
  posterior_draws,
  X = NULL,
  family = "gaussian",
  sigma_posterior = NULL,
  n_trials = 1L
)

Arguments

posterior_draws

A numeric matrix of dimension \(S \times P\), where \(S\) is the number of posterior iterations and \(P\) the number of parameters. When X is NULL, \(P\) is interpreted as the number of observations \(n\) and each column is the marginal posterior mean for one observation. When X is supplied, \(P\) must equal ncol(X).

X

Optional numeric design matrix of dimension \(n \times P\). If provided, the linear predictor \(X \beta\) is computed for each posterior draw. Defaults to NULL.

family

Character string specifying the likelihood family. Currently supported: "gaussian" and "binomial". Defaults to "gaussian".

sigma_posterior

Optional numeric vector of length \(S\) containing posterior draws of the residual standard deviation \(\sigma\). Only used when family = "gaussian". If NULL, the empirical standard deviation of the linear predictor across observations is used as a fallback.

n_trials

Integer. Number of binomial trials per observation. Only used when family = "binomial". Defaults to 1 (Bernoulli).

Value

A numeric matrix of dimension \(S \times n\) containing one replicated data set per posterior draw.

Details

For family = "gaussian", each replicated observation is drawn as $$y^{rep}_{si} \sim \mathcal{N}(\mu_{si},\, \sigma_s^2),$$ where \(\mu_{si}\) is the fitted mean for draw \(s\) and observation \(i\).

For family = "binomial", the linear predictor is passed through the logistic function \(p = 1 / (1 + e^{-\eta})\) and $$y^{rep}_{si} \sim \mathrm{Binomial}(n\_trials,\, p_{si}).$$

Examples

set.seed(42)
S <- 200   # posterior draws
n <- 50    # observations
draws <- matrix(rnorm(S * n, mean = 2), nrow = S, ncol = n)
y_rep  <- simulate_ppc(draws, family = "gaussian")
dim(y_rep)  # 200 x 50
#> [1] 200  50