Slice Sampling - Stepout MCMC

class pints.SliceStepoutMCMC(x0, sigma0=None)[source]

Implements Slice Sampling with Stepout, as described in [1].

This is a univariate method, which is applied in a Slice-Sampling-within-Gibbs framework to allow MCMC sampling from multivariate models.

Generates samples by sampling uniformly from the volume underneath the posterior (f). It does so by introducing an auxiliary variable (y) and by definying a Markov chain.

If the distribution is univariate, sampling follows:

  1. Calculate the PDF (\(f(x0)\)) of the current sample (\(x0\)).

  2. Draw a real value (\(y\)) uniformly from :math`(0, f(x0))`, defining a horizontal ‘slice’ \(S = {x: y < f (x)}\). Note that \(x0\) is always within \(S\).

  3. Find an interval (\(I = (L, R)\)) around \(x0\) that contains all, or much, of the slice.

  4. Draw a new point (\(x1\)) from the part of the slice within this interval.

If the distribution is multivariate, we apply the univariate algorithm to each variable in turn, where the other variables are set at their current values.

This implementation uses the “Stepout” method to estimate the interval \(I = (L, R)\), as described in [1] Fig. 3. pp.715 and consists of the following steps:

  1. \(U \sim uniform(0, 1)\)

  2. \(L = x_0 - wU\)

  3. \(R = L + w\)

  4. \(V \sim uniform(0, 1)\)

  5. \(J = floor(mV)\)

  6. \(K = (m - 1) - J\)

  7. while \(J > 0\) and \(y < f(L), L = L - w, J = J - 1\)

  8. while \(K > 0\) and \(y < f(R), R = R + w, K = K - 1\)

Intuitively, the interval I is estimated by expanding the initial interval by a width w in each direction until both edges fall outside the slice, or until a pre-determined limit is reached. The parameters m (an integer, which determines the limit of slice size) and w (the estimate of typical slice width) are hyperparameters.

To sample from the interval \(I = (L, R)\), such that the sample x satisfies \(y < f(x)\), we use the “Shrinkage” procedure, which reduces the size of the interval after rejecting a trial point, as defined in [1] Fig. 5. pp.716. This algorithm consists of the following steps:

  1. \(\bar{L} = L\) and \(\bar{R} = R\)

  2. Repeat:
    1. \(U \sim uniform(0, 1)\)

    2. \(x_1 = \bar{L} + U (\bar{R} - \bar{L})\)

    3. if \(y < f(x_1)\) accept \(x_1\) and exit loop, else: if \(x_1 < x_0\), \(\bar{L} = x_1\) else \(\bar{R} = x_1\)

Intuitively, we uniformly sample a trial point from the interval I, and subsequently shrink the interval each time a trial point is rejected.

The following implementation includes the possibility of carrying out “overrelaxed” slice sampling steps, as described in [1] pp. 726. Overrelaxed steps increase sampling efficiency in highly correlated unimodal distributions by suppressing the random walk behaviour of single-variable slice sampling: each variable is still updated in turn, but rather than drawing a new value for a variable from its conditional distribution independently of the current value, the new value is instead chosen to be on the opposite side of the mode from the current value. The interval I is still calculated via Stepout, and the edges l,r are used to estimate the slice endpoints via bisection. To obtain a full sampling scheme, overrelaxed updates are alternated with normal Stepout updates. To obtain the full benefits of overrelaxation, [1] suggests to set almost every update to being overrelaxed and to set the limit m for finding I to infinity. The algorithm consists of the following steps:

  1. \(\bar{L} = L, \bar{R} = R, \bar{w} = w, \bar{a} = a\)

  2. while \(R - L < 1.1 * w\):
    1. \(M = (\bar{L} + \bar{R})/ 2\)

    2. if \(\bar{a} = 0 ` or :math:`y < f(M)\), exit loop

    3. if \(x_0 > M\), \(\bar{L} = M\) else, \(\bar{R} = M\)

    4. \(\bar{a} = \bar{a} - 1\)

    5. \(\bar{w} = \bar{w} / 2\)

  3. \(\hat{L} = \bar{L}, \hat{R} = \bar{R}\)

  4. while \(\bar{a} > 0\):
    1. \(\bar{a} = \bar{a} - 1\)

    2. \(\bar{w} = \bar{w} \ 2\)

    3. if \(y >= f(\hat{L} + \bar{w})\), then \(\hat{L} = \hat{L} + \bar{w}\)

    4. if \(y >= f(\hat{R} - \bar{w})\), then \(\hat{R} = \hat{R} - \bar{W}\)

  5. \(x_1 = \hat{L} + \hat{R} - x_0\)

  6. if \(x_1 < \bar{L}\) or \(x_1 >= \bar{R}\) or \(y >= f(x_1)\), then \(x_1 = x_0\)

The probability of pursuing an overrelaxed step and the number of bisection iterations are hyperparameters.

To avoid floating-point underflow, we implement the suggestion advanced in [1] pp.712. We use the log pdf of the un-normalised posterior (\(g(x) = log(f(x))\)) instead of \(f(x)\). In doing so, we use an auxiliary variable \(z = log(y) = g(x0) - \epsilon\), where \(\epsilon \sim \text{exp}(1)\) and define the slice as \(S = {x : z < g(x)}\).

Extends SingleChainMCMC.

References

ask()[source]

See SingleChainMCMC.ask().

bisection_steps()[source]

Returns integer limit overrelaxation endpoint accuracy to 2^(-bisection steps) * width.

current_slice_height()[source]

Returns current height value used to define the current slice.

expansion_steps()[source]

Returns integer used for limiting interval expansion.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is returned.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this methods needs sensitivities to be passed in to tell along with the evaluated logpdf.

prob_overrelaxed()[source]

Returns probability of carrying out an overrelaxed step.

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_bisection_steps(a)[source]

Set integer for limiting the bisection process in overrelaxed steps.

set_expansion_steps(m)[source]

Set integer for limiting the interval expansion.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [width, expansion steps, prob_overrelaxed, bisection steps]. See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is returned.

set_prob_overrelaxed(prob)[source]

Set the probability of a step being overrelaxed.

set_width(w)[source]

Sets the width for generating the interval.

This can either be a single number or an array with the same number of elements as the number of variables to update.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

width()[source]

Returns the width used for generating the interval.