Welcome to the pints documentation

Pints is hosted on GitHub, where you can find downloads and installation instructions.

Detailed examples can also be found there.

This page provides the API (developer) documentation for pints.

Contents

Boundaries

Simple boundaries for an optimisation can be created using RectangularBoundaries. More complex types can be made using LogPDFBoundaries or a custom implementation of the Boundaries interface.
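
Example (a minimal sketch; the bounds and points are illustrative):

import pints

# Rectangular boundaries on a two-dimensional parameter space
boundaries = pints.RectangularBoundaries([0, 0], [10, 20])

print(boundaries.n_parameters())   # 2
print(boundaries.check([5, 5]))    # True: inside the bounds
print(boundaries.check([5, 25]))   # False: second parameter too large

# Draw three starting points from within the boundaries
x0 = boundaries.sample(3)          # NumPy array of shape (3, 2)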

Overview:

class pints.Boundaries[source]

Abstract class representing boundaries on a parameter space.

check(parameters)[source]

Returns True if and only if the given point in parameter space is within the boundaries.

Parameters:parameters – A point in parameter space
n_parameters()[source]

Returns the dimension of the parameter space these boundaries are defined on.

sample(n=1)[source]

Returns n random samples from within the boundaries, for example to use as starting points for an optimisation.

The returned value is a NumPy array with shape (n, d) where n is the requested number of samples, and d is the dimension of the parameter space these boundaries are defined on.

Note that implementing sample() is optional, so some boundary types may not support it.

Parameters:n (int) – The number of points to sample
class pints.LogPDFBoundaries(log_pdf, threshold=-inf)[source]

Uses a pints.LogPDF (e.g. a LogPrior) as boundaries, accepting log-pdf values above a given threshold as within bounds.

For a pints.LogPrior based on pints.Boundaries, see pints.UniformLogPrior.

Extends pints.Boundaries.

Parameters:
  • log_pdf – A pints.LogPDF to use.
  • threshold – A threshold to determine whether a given log-prior value counts as within bounds. Anything above the threshold counts as within bounds.
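
Example (a sketch; it assumes pints.UniformLogPrior accepts lower and upper bounds, and uses the default threshold of -inf):

import pints

# A uniform prior: points outside [0, 10] x [0, 20] evaluate to -inf
log_prior = pints.UniformLogPrior([0, 0], [10, 20])

# Anything evaluating above the threshold counts as within bounds
boundaries = pints.LogPDFBoundaries(log_prior)

print(boundaries.check([5, 5]))    # True
print(boundaries.check([5, 25]))   # False
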
check(parameters)[source]

See pints.Boundaries.check().

n_parameters()[source]

See pints.Boundaries.n_parameters().

sample(n=1)[source]

See pints.Boundaries.sample().

Note: This method is implemented only when the underlying pints.LogPDF is a pints.LogPrior that supports sampling.

class pints.RectangularBoundaries(lower, upper)[source]

Represents a set of lower and upper boundaries for model parameters.

A point x is considered within the boundaries if (and only if) lower <= x < upper.

Extends pints.Boundaries.

Parameters:
  • lower – A 1d array of lower boundaries.
  • upper – The corresponding upper boundaries.
check(parameters)[source]

See pints.Boundaries.check().

lower()[source]

Returns the lower boundaries for all parameters (as a read-only NumPy array).

n_parameters()[source]

See pints.Boundaries.n_parameters().

range()[source]

Returns the size of the parameter space (i.e. upper - lower).

sample(n=1)[source]

See pints.Boundaries.sample().

upper()[source]

Returns the upper boundaries for all parameters (as a read-only NumPy array).

Core classes and methods

Pints provides the SingleOutputProblem and MultiOutputProblem classes to formulate inverse problems based on time-series data and a ForwardModel.

Overview:

pints.version(formatted=False)[source]

Returns the version number, as a 3-part tuple of integers (major, minor, revision). If formatted=True, it returns the version as a formatted string instead (for example “Pints 1.0.0”).

class pints.TunableMethod[source]

Defines an interface for a numerical method with a given number of hyper-parameters.

Each optimiser or sampler method implemented in pints has a number of parameters which alters its behaviour, which can be called “hyper-parameters”. The optimiser/sampler method will provide member functions to set each of these hyper-parameters individually. In contrast, this interface provides a generic way to set the hyper-parameters, which allows the user to, for example, use an optimiser to tune the hyper-parameters of the method.

Note that set_hyper_parameters() takes an array of parameters, which might be of the same type (e.g. a NumPy array). So derived classes should not raise any errors if individual hyper parameters are set using the wrong type (e.g. float rather than int), but should instead implicitly convert the argument to the correct type.

n_hyper_parameters()[source]

Returns the number of hyper-parameters for this method (see TunableMethod).

set_hyper_parameters(x)[source]

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
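
As an illustrative sketch (it assumes the pints.XNES optimiser, whose only hyper-parameter is its population size; other methods expose different hyper-parameters):

import pints

# Create an optimiser and inspect its generic hyper-parameter interface
method = pints.XNES([1, 1])
print(method.n_hyper_parameters())   # 1 for XNES (the population size)

# Set all hyper-parameters at once, e.g. from an outer tuning loop
method.set_hyper_parameters([20])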

Forward model

class pints.ForwardModel[source]

Defines an interface for user-supplied forward models.

Classes extending ForwardModel can implement the required methods directly in Python or interface with other languages (for example via Python wrappers around C code).

n_outputs()[source]

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

Returns the dimension of the parameter space.

simulate(parameters, times)[source]

Runs a forward simulation with the given parameters and returns a time-series with data points corresponding to the given times.

Returns a sequence of length n_times (for single output problems) or a NumPy array of shape (n_times, n_outputs) (for multi-output problems), representing the values of the model at the given times.

Parameters:
  • parameters – An ordered sequence of parameter values.
  • times – The times at which to evaluate. Must be an ordered sequence, without duplicates, and without negative values. All simulations are started at time 0, regardless of whether this value appears in times.
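
As a minimal sketch, a user-supplied model for a straight line y(t) = a * t + b (a hypothetical example, not part of pints) could be implemented as:

import numpy as np
import pints

class LineModel(pints.ForwardModel):
    # Hypothetical forward model: y(t) = a * t + b

    def n_parameters(self):
        # Two parameters: gradient a and offset b
        return 2

    def simulate(self, parameters, times):
        a, b = parameters
        return a * np.asarray(times) + b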

Forward model with sensitivities

class pints.ForwardModelS1[source]

Defines an interface for user-supplied forward models which can calculate the first-order derivative of the simulated values with respect to the parameters.

Extends pints.ForwardModel.

n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()

Returns the dimension of the parameter space.

simulate(parameters, times)

Runs a forward simulation with the given parameters and returns a time-series with data points corresponding to the given times.

Returns a sequence of length n_times (for single output problems) or a NumPy array of shape (n_times, n_outputs) (for multi-output problems), representing the values of the model at the given times.

Parameters:
  • parameters – An ordered sequence of parameter values.
  • times – The times at which to evaluate. Must be an ordered sequence, without duplicates, and without negative values. All simulations are started at time 0, regardless of whether this value appears in times.
simulateS1(parameters, times)[source]

Runs a forward simulation with the given parameters and returns a time-series with data points corresponding to the given times, along with the sensitivities of the forward simulation with respect to the parameters.

Parameters:
  • parameters – An ordered list of parameter values.
  • times – The times at which to evaluate. Must be an ordered sequence, without duplicates, and without negative values. All simulations are started at time 0, regardless of whether this value appears in times.
Returns:

  • y – The simulated values, as a sequence of n_times values, or a NumPy array of shape (n_times, n_outputs).
  • y’ – The corresponding derivatives, as a NumPy array of shape (n_times, n_parameters) or an array of shape (n_times, n_outputs, n_parameters).

Problems

class pints.SingleOutputProblem(model, times, values)[source]

Represents an inference problem where a model is fit to a single time series, such as measured from a system with a single output.

Parameters:
  • model – A model or model wrapper extending ForwardModel.
  • times – A sequence of points in time. Must be non-negative and increasing.
  • values – A sequence of scalar output values, measured at the times in times.
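
Example (a sketch using the toy logistic model distributed with pints; any ForwardModel would do, and the parameter values and noise level are illustrative):

import numpy as np
import pints
import pints.toy

# Create a model and generate noisy synthetic data
model = pints.toy.LogisticModel()
times = np.linspace(0, 100, 50)
true_parameters = [0.1, 50]
values = model.simulate(true_parameters, times)
values += np.random.normal(0, 2, values.shape)

# Wrap model and data in a problem, then define an error measure on it
problem = pints.SingleOutputProblem(model, times, values)
error = pints.SumOfSquaresError(problem)
print(error(true_parameters))
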
evaluate(parameters)[source]

Runs a simulation using the given parameters, returning the simulated values as a NumPy array of shape (n_times,).

evaluateS1(parameters)[source]

Runs a simulation with first-order sensitivity calculation, returning the simulated values and derivatives.

The returned data is a tuple of NumPy arrays (y, y'), where y has shape (n_times,) and y' has shape (n_times, n_parameters).

This method only works for problems with a model that implements the ForwardModelS1 interface.

n_outputs()[source]

Returns the number of outputs for this problem (always 1).

n_parameters()[source]

Returns the dimension (the number of parameters) of this problem.

n_times()[source]

Returns the number of sampling points, i.e. the length of the vectors returned by times() and values().

times()[source]

Returns this problem’s times.

The returned value is a read-only NumPy array of shape (n_times, ), where n_times is the number of time points.

values()[source]

Returns this problem’s values.

The returned value is a read-only NumPy array of shape (n_times, ), where n_times is the number of time points.

class pints.MultiOutputProblem(model, times, values)[source]

Represents an inference problem where a model is fit to a multi-valued time series, such as measured from a system with multiple outputs.

Parameters:
  • model – A model or model wrapper extending ForwardModel.
  • times – A sequence of points in time. Must be non-negative and non-decreasing.
  • values – A sequence of multi-valued measurements. Must have shape (n_times, n_outputs), where n_times is the number of points in times and n_outputs is the number of outputs in the model.
evaluate(parameters)[source]

Runs a simulation using the given parameters, returning the simulated values.

The returned data is a NumPy array with shape (n_times, n_outputs).

evaluateS1(parameters)[source]

Runs a simulation with first-order sensitivity calculation, returning the simulated values and derivatives.

The returned data is a tuple of NumPy arrays (y, y'), where y has shape (n_times, n_outputs), while y' has shape (n_times, n_outputs, n_parameters).

This method only works for problems whose model implements the ForwardModelS1 interface.

n_outputs()[source]

Returns the number of outputs for this problem.

n_parameters()[source]

Returns the dimension (the number of parameters) of this problem.

n_times()[source]

Returns the number of sampling points, i.e. the length of the vectors returned by times() and values().

times()[source]

Returns this problem’s times.

The returned value is a read-only NumPy array of shape (n_times,), where n_times is the number of time points.

values()[source]

Returns this problem’s values.

The returned value is a read-only NumPy array of shape (n_times, n_outputs), where n_times is the number of time points and n_outputs is the number of outputs.

Diagnosing MCMC results

Pints provides a number of functions to diagnose MCMC progress and convergence.

Overview:

pints.rhat(chains, warm_up=0.0)[source]

Returns the convergence measure \(\hat{R}\) for the approximate posterior according to [1].

\(\hat{R}\) diagnoses convergence by checking the mixing and stationarity of \(m\) chains (at least two, \(m\geq 2\)). To diminish the influence of starting values, the first portion of each chain can be excluded from the computation. Subsequently, the truncated chains are split in half, resulting in a total of \(m'=2m\) chains of length \(n'=(1-\text{warm\_up})n/2\). The mean within-chain variance \(W\) and the between-chain variance \(B\) of the resulting chains are then computed. Based on these variances, an estimator of the marginal posterior variance is constructed:

\[\widehat{\text{var}}^+ = \frac{n'-1}{n'}W + \frac{1}{n'}B.\]

The estimator overestimates the variance of the marginal posterior if the chains are not well mixed and stationary, but is unbiased if the original chains equal the target distribution. At the same time, the mean within variance \(W\) underestimates the marginal posterior variance for finite \(n\), but converges to the true variance for \(n\rightarrow \infty\). By comparing \(\widehat{\text{var}}^+\) and \(W\) the mixing and stationarity of the chains can be quantified

\[\hat{R} = \sqrt{\frac{\widehat{\text{var}}^+}{W}}.\]

For well mixed and stationary chains \(\hat{R}\) will be close to one.

The mean within \(W\) and mean between \(B\) variance of the \(m'=2m\) chains of length \(n'=(1-\text{warm_up})n/2\) are defined as

\[W = \frac{1}{m'}\sum _{j=1}^{m'}s_j^2\quad \text{where}\quad s_j^2=\frac{1}{n'-1}\sum _{i=1}^{n'}(\psi _{ij} - \bar{\psi} _j)^2,\]
\[B = \frac{n'}{m'-1}\sum _{j=1}^{m'}(\bar{\psi} _j - \bar{\psi})^2.\]

Here, \(\psi _{ij}\) is the ith sample of the jth chain, \(\bar{\psi} _j=\sum _{i=1}^{n'}\psi _{ij}/n'\) is the within-chain mean of the parameter \(\psi\), and \(\bar{\psi } = \sum _{j=1}^{m'}\bar{\psi} _{j}/m'\) is the between-chain mean of the within-chain means.

References

[1] “Bayesian data analysis”, ch. 11.4 ‘Inference and assessing convergence’, 3rd edition, Gelman et al., 2014.
Parameters:
  • chains (np.ndarray of shape (m, n) or (m, n, p)) – A numpy array with \(n\) samples for each of \(m\) chains. Optionally the \(\hat{R}\) for \(p\) parameters can be computed by passing a numpy array with \(m\) chains of length \(n\) for \(p\) parameters.
  • warm_up (float) – First portion of each chain that will not be used for the computation of \(\hat{R}\).
Returns:

rhat – \(\hat{R}\) of the posteriors for each parameter.

Return type:

float or np.ndarray of shape (p,)
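
For example (a minimal sketch, using independent Gaussian draws in place of real MCMC output):

import numpy as np
import pints

# Three chains of 1000 samples each, for a single parameter
chains = np.random.normal(0, 1, size=(3, 1000))

# Discard the first 20% of each chain before computing R-hat
r = pints.rhat(chains, warm_up=0.2)
print(r)   # close to 1 for well mixed, stationary chains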

pints.rhat_all_params(chains)[source]

Deprecated alias of rhat().

pints.effective_sample_size(samples)[source]

Calculates effective sample size (ESS) for a list of n-dimensional samples.

Parameters:samples – A 2d array of shape (n_samples, n_parameters).
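
For example (a sketch, with random draws standing in for real MCMC samples):

import numpy as np
import pints

# 1000 samples from a two-parameter posterior (sketch data)
samples = np.random.normal(0, 1, size=(1000, 2))

ess = pints.effective_sample_size(samples)
print(ess)   # one effective sample size per parameter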

Diagnostic plots

For users who have Matplotlib installed, Pints offers a number of diagnostic plots that can be used to quickly check obtained results.

Plotting functions:

Diagnosing MCMC results:

Functions

pints.plot.function(f, x, lower=None, upper=None, evaluations=20)[source]

Creates 1d plots of a LogPDF or an ErrorMeasure around a point x (i.e. a 1-dimensional plot in each direction).

Returns a matplotlib figure object and axes handle.

Parameters:
  • f – A pints.LogPDF or pints.ErrorMeasure to plot.
  • x – A point in the function’s input space.
  • lower – Optional lower bounds for each parameter, used to specify the lower bounds of the plot.
  • upper – Optional upper bounds for each parameter, used to specify the upper bounds of the plot.
  • evaluations – The number of evaluations to use in each plot.
pints.plot.function_between_points(f, point_1, point_2, padding=0.25, evaluations=20)[source]

Creates and returns a plot of a function between two points in parameter space.

Returns a matplotlib figure object and axes handle.

Parameters:
  • f – A pints.LogPDF or pints.ErrorMeasure to plot.
  • point_1 – The first point in parameter space. The method will find a line from point_1 to point_2 and plot f at several points along it.
  • point_2 – The second point.
  • padding – Specifies the amount of padding around the line segment [point_1, point_2] that will be shown in the plot.
  • evaluations – The number of evaluations along the line in parameter space.
pints.plot.surface(points, values, boundaries=None, markers='+', figsize=None)[source]

Takes irregularly spaced points and function evaluations in a two-dimensional parameter space and creates a coloured surface plot using a Voronoi diagram.

Returns a matplotlib figure object and axes handle.

Parameters:
  • points – A list of (two-dimensional) points in parameter space.
  • values – The values (e.g. error measure evaluations) corresponding to these points.
  • boundaries – An optional pints.RectangularBoundaries object to restrict the area shown. If set to None boundaries will be determined from the given points.
  • markers – An optional string indicating the matplotlib markers to use to plot the points. Set to None to hide.
  • figsize – An optional tuple (width, height) that will be passed to matplotlib when creating the figure. If set to None matplotlib will use its default figure size.

MCMC Diagnostics

pints.plot.autocorrelation(samples, max_lags=100, parameter_names=None)[source]

Creates and returns an autocorrelation plot for a given Markov chain or list of samples.

Returns a matplotlib figure object and axes handle.

Parameters:
  • samples – A list of samples, with shape (n_samples, n_parameters), where n_samples is the number of samples in the list and n_parameters is the number of parameters.
  • max_lags – The maximum autocorrelation lag to plot.
  • parameter_names – A list of parameter names, which will be displayed in the legend of the autocorrelation subplots. If no names are provided, the parameters are enumerated.
pints.plot.histogram(samples, kde=False, n_percentiles=None, parameter_names=None, ref_parameters=None)[source]

Takes one or more Markov chains or lists of samples as input and creates and returns a plot showing histograms for each chain or list of samples.

Returns a matplotlib figure object and axes handle.

Parameters:
  • samples – A list of lists of samples, with shape (n_lists, n_samples, n_parameters), where n_lists is the number of lists of samples, n_samples is the number of samples in one list and n_parameters is the number of parameters.
  • kde – Set to True to include kernel-density estimation for the histograms.
  • n_percentiles – Shows only the middle n-th percentiles of the distribution. Default shows all samples in samples.
  • parameter_names – A list of parameter names, which will be displayed on the x-axis of the histogram subplots. If no names are provided, the parameters are enumerated.
  • ref_parameters – A set of parameters for reference in the plot. For example, if true values of parameters are known, they can be passed in for plotting.
pints.plot.pairwise(samples, kde=False, heatmap=False, opacity=None, n_percentiles=None, parameter_names=None, ref_parameters=None)[source]

Takes a Markov chain or list of samples and creates a set of pairwise scatterplots for all parameters (p1 versus p2, p1 versus p3, p2 versus p3, etc.).

The returned plot is in a ‘matrix’ form, with histograms of each individual parameter on the diagonal, and scatter plots of parameters i and j on each entry (i, j) below the diagonal.

Returns a matplotlib figure object and axes handle.

Parameters:
  • samples – A list of samples, with shape (n_samples, n_parameters), where n_samples is the number of samples in the list and n_parameters is the number of parameters.
  • kde – Set to True to use kernel-density estimation for the histograms and scatter plots. Cannot use together with heatmap.
  • heatmap – Set to True to plot heatmap for the pairwise plots. Cannot be used together with kde.
  • opacity – This value can be used to manually set the opacity of the points in the scatter plots (when kde=False and heatmap=False only).
  • n_percentiles – Shows only the middle n-th percentiles of the distribution. Default shows all samples in samples.
  • parameter_names – A list of parameter names, which will be displayed on the x-axis of the trace subplots. If no names are provided, the parameters are enumerated.
  • ref_parameters – A set of parameters for reference in the plot. For example, if true values of parameters are known, they can be passed in for plotting.
pints.plot.series(samples, problem, ref_parameters=None, thinning=None)[source]

Creates and returns a plot of predicted time series for a given list of samples and a single-output or multi-output problem.

Because this method runs simulations, it can take a considerable time to run.

Returns a matplotlib figure object and axes handle.

Parameters:
  • samples – A list of samples, with shape (n_samples, n_parameters), where n_samples is the number of samples in the list and n_parameters is the number of parameters.
  • problem – A pints.SingleOutputProblem or pints.MultiOutputProblem with n_parameters equal to or greater than the n_parameters of the samples. Any extra parameters present in the chain but not accepted by the SingleOutputProblem or MultiOutputProblem (for example parameters added by a noise model) will be ignored.
  • ref_parameters – A set of parameters for reference in the plot. For example, if true values of parameters are known, they can be passed in for plotting.
  • thinning – An integer exceeding zero. If specified, only every n-th sample (with n = thinning) in the samples will be used. If left at the default value None, a value will be chosen so that 200 to 400 predictions are shown.
pints.plot.trace(samples, n_percentiles=None, parameter_names=None, ref_parameters=None)[source]

Takes one or more Markov chains or lists of samples as input and creates and returns a plot showing histograms and traces for each chain or list of samples.

Returns a matplotlib figure object and axes handle.

Parameters:
  • samples – A list of lists of samples, with shape (n_lists, n_samples, n_parameters), where n_lists is the number of lists of samples, n_samples is the number of samples in one list and n_parameters is the number of parameters.
  • n_percentiles – Shows only the middle n-th percentiles of the distribution. Default shows all samples in samples.
  • parameter_names – A list of parameter names, which will be displayed on the x-axis of the trace subplots. If no names are provided, the parameters are enumerated.
  • ref_parameters – A set of parameters for reference in the plot. For example, if true values of parameters are known, they can be passed in for plotting.
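
Example (a minimal sketch, using random draws in place of real MCMC chains; assumes Matplotlib is installed):

import numpy as np
import matplotlib.pyplot as plt
import pints.plot

# Sketch data: three chains of 1000 samples for two parameters
chains = np.random.normal(0, 1, size=(3, 1000, 2))

fig, axes = pints.plot.trace(chains, parameter_names=['a', 'b'])
fig, axes = pints.plot.autocorrelation(chains[0], max_lags=50)
plt.show()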

Error measures

Error measures are callable objects that return some scalar representing the error between a model and an experiment.

Example:

error = pints.SumOfSquaresError(problem)
x = [1, 2, 3]
fx = error(x)

Overview:

class pints.ErrorMeasure[source]

Abstract base class for objects that calculate some scalar measure of goodness-of-fit (for a model and a data set), such that a smaller value means a better fit.

ErrorMeasures are callable objects: If e is an instance of an ErrorMeasure class you can calculate the error by calling e(p) where p is a point in parameter space.

evaluateS1(x)[source]

Evaluates this error measure, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (e, e') where e is a scalar value and e' is a sequence of length n_parameters.

This is an optional method that is not always implemented.

n_parameters()[source]

Returns the dimension of the parameter space this measure is defined over.

class pints.MeanSquaredError(problem, weights=None)[source]

Calculates the mean squared error:

\[f = \sum _i^n \frac{(y_i - x_i)^2}{n},\]

where \(y\) is the data, \(x\) the model output and \(n\) is the total number of data points.

Extends ProblemErrorMeasure.

Parameters:
  • problem – A pints.SingleOutputProblem or pints.MultiOutputProblem.
  • weights – An optional sequence of (float) weights, exactly one per problem output. If given, the error in each individual output will be multiplied by the corresponding weight. If no weights are specified all outputs will be weighted equally.
evaluateS1(x)[source]

See ErrorMeasure.evaluateS1().

n_parameters()

See ErrorMeasure.n_parameters().

class pints.NormalisedRootMeanSquaredError(problem)[source]

Calculates a normalised root mean squared error:

\[f = \frac{1}{C}\sqrt{\frac{\sum _i^n (y_i - x_i) ^ 2}{n}},\]

where \(C\) is the normalising constant, \(y\) is the data, \(x\) the model output and \(n\) is the total number of data points. The normalising constant is given by

\[C = \sqrt{\frac{\sum _i^n y_i^2}{n}}.\]

This error measure is similar to the (unnormalised) RootMeanSquaredError.

Extends ProblemErrorMeasure.

Parameters:problem – A pints.SingleOutputProblem.
evaluateS1(x)

Evaluates this error measure, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (e, e') where e is a scalar value and e' is a sequence of length n_parameters.

This is an optional method that is not always implemented.

n_parameters()

See ErrorMeasure.n_parameters().

class pints.ProbabilityBasedError(log_pdf)[source]

Changes the sign of a LogPDF to use it as an error. Minimising this error will maximise the probability.

Extends ErrorMeasure.

Parameters:log_pdf (pints.LogPDF) – The LogPDF to base this error on.
evaluateS1(x)[source]

See ErrorMeasure.evaluateS1().

This method only works if the underlying LogPDF implements the optional method LogPDF.evaluateS1()!

n_parameters()[source]

See ErrorMeasure.n_parameters().

class pints.ProblemErrorMeasure(problem=None)[source]

Abstract base class for ErrorMeasures defined for single or multi-output problems.

evaluateS1(x)

Evaluates this error measure, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (e, e') where e is a scalar value and e' is a sequence of length n_parameters.

This is an optional method that is not always implemented.

n_parameters()[source]

See ErrorMeasure.n_parameters().

class pints.RootMeanSquaredError(problem)[source]

Calculates a root mean squared error:

\[f = \sqrt{\frac{\sum _i^n (y_i - x_i) ^ 2}{n}},\]

where \(y\) is the data, \(x\) the model output and \(n\) is the total number of data points.

Extends ProblemErrorMeasure.

Parameters:problem – A pints.SingleOutputProblem.
evaluateS1(x)

Evaluates this error measure, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (e, e') where e is a scalar value and e' is a sequence of length n_parameters.

This is an optional method that is not always implemented.

n_parameters()

See ErrorMeasure.n_parameters().

class pints.SumOfErrors(error_measures, weights=None)[source]

Calculates a sum of ErrorMeasure objects, all defined on the same parameter space:

\[f = \sum _i f_i,\]

where \(f_i\) are the individual error measures.

Extends ErrorMeasure.

Parameters:
  • error_measures – A sequence of error measures.
  • weights – An optional sequence of (float) weights, exactly one per error measure. If given, each individual error will be multiplied by the corresponding weight. If no weights are given, all errors will be weighted equally.

Examples

errors = [
    pints.MeanSquaredError(problem1),
    pints.MeanSquaredError(problem2),
]

# Equally weighted
e1 = pints.SumOfErrors(errors)

# Different weights:
weights = [
    1.0,
    2.7,
]
e2 = pints.SumOfErrors(errors, weights)
evaluateS1(x)[source]

See ErrorMeasure.evaluateS1().

This method only works if all the underlying ErrorMeasure objects implement the optional method ErrorMeasure.evaluateS1()!

n_parameters()[source]

See ErrorMeasure.n_parameters().

class pints.SumOfSquaresError(problem, weights=None)[source]

Calculates a sum of squares error:

\[f = \sum _i^n (y_i - x_i) ^ 2,\]

where \(y\) is the data, \(x\) the model output and \(n\) is the total number of data points.

Extends ErrorMeasure.

Parameters:problem – A pints.SingleOutputProblem or pints.MultiOutputProblem.
evaluateS1(x)[source]

See ErrorMeasure.evaluateS1().

n_parameters()

See ErrorMeasure.n_parameters().

Function evaluation

The Evaluator classes provide an abstraction layer that makes it easier to implement sequential and/or parallel evaluation of functions.

Example:

f = pints.SumOfSquaresError(problem)
e = pints.ParallelEvaluator(f)
x = [[1, 2],
     [3, 4],
     [5, 6],
     [7, 8],
    ]
fx = e.evaluate(x)

Overview:

pints.evaluate(f, x, parallel=False, args=None)[source]

Evaluates the function f on every value present in x and returns a sequence of evaluations f(x[i]).

It is possible for the evaluation of f to involve the generation of random numbers (using numpy). In this case, the results from calling evaluate can be made reproducible by first seeding numpy’s generator with a fixed number. However, a call with parallel=True will use a different (but consistent) sequence of random numbers than a call with parallel=False.

Parameters:
  • f (callable) – The function to evaluate, called as f(x[i], *args).
  • x – A list of values to evaluate f with
  • parallel (boolean) – Run in parallel or not. If set to True, the evaluations will happen in parallel using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.
  • args (sequence) – Optional extra arguments to pass into f.
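
For example (a minimal sketch; the function and inputs are illustrative):

import pints

def f(x):
    return x ** 2

# Sequential evaluation: returns [1, 4, 9]
print(pints.evaluate(f, [1, 2, 3]))

# Parallel evaluation, using all detected CPU cores
# (on Windows, run parallel code inside an if __name__ == '__main__': block)
print(pints.evaluate(f, [1, 2, 3], parallel=True))
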
class pints.Evaluator(function, args=None)[source]

Abstract base class for classes that take a function (or callable object) f(x) and evaluate it for a list of input values x.

This interface is shared by a parallel and a sequential implementation, allowing easy switching between parallel or sequential implementations of the same algorithm.

It is possible for the evaluation of f to involve the generation of random numbers (using numpy). In this case, the results from calling evaluate can be made reproducible by first seeding numpy’s generator with a fixed number. However, different Evaluator implementations may use a different random sequence. In other words, each Evaluator can be made to return consistent results, but the results returned by different Evaluators may vary.

Parameters:
  • function (callable) – A function or other callable object f that takes a value x and returns an evaluation f(x).
  • args (sequence) – An optional sequence of extra arguments to f. If args is specified, f will be called as f(x, *args).
evaluate(positions)[source]

Evaluate the function for every value in the sequence positions.

Returns a list with the returned evaluations.

class pints.ParallelEvaluator(function, n_workers=None, max_tasks_per_worker=500, n_numpy_threads=1, args=None)[source]

Evaluates a single-valued function object for any set of input values given, using all available cores.

Shares an interface with the SequentialEvaluator, allowing parallelism to be switched on and off with minimal hassle. Parallelism takes a little time to be set up, so as a general rule of thumb it’s only useful if the total run-time is at least ten seconds (anno 2015).

By default, the number of processes (“workers”) used to evaluate the function is set equal to the number of CPU cores reported by python’s multiprocessing module. To override the number of workers used, set n_workers to some integer greater than 0.

There are two important caveats for using multiprocessing to evaluate functions:

  1. Processes don’t share memory. This means the function to be evaluated will be duplicated (via pickling) for each process (see Avoid shared state for details).
  2. On windows systems your code should be within an if __name__ == '__main__': block (see Windows for details).

The evaluator will keep its subprocesses alive and running until it is tidied up by garbage collection.

Note that while this class uses multiprocessing, it is not thread/process safe itself: It should not be used by more than a single thread/process at a time.

Extends Evaluator.

Parameters:
  • function – The function to evaluate
  • n_workers – The number of worker processes to use. If left at the default value n_workers=None the number of workers will equal the number of CPU cores in the machine this is run on. In many cases this will provide good performance.
  • max_tasks_per_worker – Python garbage collection does not seem to be optimized for multi-process function evaluation. In many cases, some time can be saved by refreshing the worker processes after every max_tasks_per_worker evaluations. This number can be tweaked for best performance on a given task / system.
  • n_numpy_threads – Numpy and other scientific libraries may make use of threading in C or C++ based BLAS libraries, which can interfere with PINTS multiprocessing and cause slower execution. To prevent this, the number of threads to use will be limited to 1 by default, using the threadpoolctl module. To use the current numpy default instead, set n_numpy_threads to None, to use the BLAS/OpenMP etc. defaults, set n_numpy_threads to 0, or to use a specific number of threads pass in any integer greater than 1.
  • args – An optional sequence of extra arguments to f. If args is specified, f will be called as f(x, *args).
static cpu_count()[source]

Uses the multiprocessing module to guess the number of available cores.

For machines with simultaneous multithreading (“hyperthreading”) this will return the number of virtual cores.

evaluate(positions)

Evaluate the function for every value in the sequence positions.

Returns a list with the returned evaluations.

class pints.SequentialEvaluator(function, args=None)[source]

Evaluates a function (or callable object) for a list of input values, and returns a list containing the calculated function evaluations.

Runs sequentially, but shares an interface with the ParallelEvaluator, allowing parallelism to be switched on/off.

Extends Evaluator.

Parameters:
  • function (callable) – The function to evaluate.
  • args (sequence) – An optional tuple containing extra arguments to f. If args is specified, f will be called as f(x, *args).
evaluate(positions)

Evaluate the function for every value in the sequence positions.

Returns a list with the returned evaluations.

I/O Helper classes

pints.io.load_samples(filename, n=None)[source]

Loads samples from the given filename and returns a 2d NumPy array containing them.

If the optional argument n is given, the method assumes there are n files, with names based on filename such that e.g. test.csv would become test_0.csv, test_1.csv, …, test_n.csv. In this case a list of 2d NumPy arrays is returned.

Assumes the first line in each file is a header.

See also save_samples().

pints.io.save_samples(filename, *sample_lists)[source]

Stores one or multiple lists of samples at the path given by filename.

If one list of samples is given, the filename is used as is. If multiple lists are given, the filenames are updated to include _0, _1, _2, etc.

For example, save_samples('test.csv', samples) will store information from samples in test.csv. Using save_samples('test.csv', samples_0, samples_1) will store the samples from samples_0 to test_0.csv and samples_1 to test_1.csv.

See also: load_samples().
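
Example (a minimal sketch; the sample arrays are illustrative):

import numpy as np
import pints.io

# Two lists of samples, each with shape (n_samples, n_parameters)
chain_0 = np.random.normal(0, 1, size=(1000, 2))
chain_1 = np.random.normal(0, 1, size=(1000, 2))

# Stores the samples in test_0.csv and test_1.csv
pints.io.save_samples('test.csv', chain_0, chain_1)

# Loads both files again, returning a list of two 2d NumPy arrays
chains = pints.io.load_samples('test.csv', n=2)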

Log-likelihoods

The classes below all implement the ProblemLogLikelihood interface, and can calculate a log-likelihood based on some time-series Problem and an assumed noise model.

Example:

logpdf = pints.GaussianLogLikelihood(problem)
x = [1, 2, 3]
fx = logpdf(x)

Overview:

class pints.AR1LogLikelihood(problem)[source]

Calculates a log-likelihood assuming AR(1) (autoregressive order 1) errors.

In this error model, the ith error term \(\epsilon_i = x_i - f_i(\theta)\) is assumed to obey the following relationship.

\[\epsilon_i = \rho \epsilon_{i-1} + \nu_i\]

where \(\nu_i\) is IID Gaussian white noise with variance \(\sigma^2 (1-\rho^2)\). Therefore, this likelihood is appropriate when error terms are autocorrelated, and the parameter \(\rho\) determines the level of autocorrelation.

This model is parameterised as such because it leads to a simple marginal distribution \(\epsilon_i \sim N(0, \sigma)\).

This class treats the error at the first time point (i=1) as fixed, which simplifies the calculations. For sufficiently long time-series, this conditioning on the first observation has at most a small effect on the likelihood. Further details as well as the alternative unconditional likelihood are available in [1] , chapter 5.2.

Noting that

\[\nu_i = \epsilon_i - \rho \epsilon_{i-1} \sim N(0, \sigma^2 (1-\rho^2))\]

we thus calculate the likelihood as the product of normal likelihoods from \(i=2,...,N\), for a time series with N time points.

\[L(\theta, \sigma, \rho|\boldsymbol{x}) = -\frac{N-1}{2} \log(2\pi) - (N-1) \log(\sigma') - \frac{1}{2\sigma'^2} \sum_{i=2}^N (\epsilon_i - \rho \epsilon_{i-1})^2\]

for \(\sigma' = \sigma \sqrt{1-\rho^2}\).

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem two parameters are added (rho, sigma), for a multi-output problem 2 * n_outputs parameters are added.

References

[1] Hamilton, James D. Time series analysis. Vol. 2. New Jersey: Princeton, 1994.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.ARMA11LogLikelihood(problem)[source]

Calculates a log-likelihood assuming ARMA(1,1) errors.

The ARMA(1,1) model has 1 autoregressive term and 1 moving average term. It assumes that the errors \(\epsilon_i = x_i - f_i(\theta)\) obey

\[\epsilon_i = \rho \epsilon_{i-1} + \nu_i + \phi \nu_{i-1}\]

where \(\nu_i\) is IID Gaussian white noise with standard deviation \(\sigma'\).

\[\sigma' = \sigma \sqrt{\frac{1 - \rho^2}{1 + 2 \phi \rho + \phi^2}}\]

This model is parameterised as such because it leads to a simple marginal distribution \(\epsilon_i \sim N(0, \sigma)\).

Due to the complexity of the exact ARMA(1,1) likelihood, this class calculates a likelihood conditioned on initial values. This topic is discussed further in [2] , chapter 5.6. Thus, for a time series defined at points \(i=1,...,N\), summation begins at \(i=3\), and the conditional log-likelihood is

\[L(\theta, \sigma, \rho, \phi|\boldsymbol{x}) = -\frac{N-2}{2} \log(2\pi) - (N-2) \log(\sigma') - \frac{1}{2\sigma'^2} \sum_{i=3}^N (\nu_i)^2\]

where the values of \(\nu_i\) are calculated from the observations according to

\[\nu_i = \epsilon_i - \rho \epsilon_{i-1} - \phi (\epsilon_{i-1} - \rho \epsilon_{i-2})\]

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem three parameters are added (rho, phi, sigma), for a multi-output problem 3 * n_outputs parameters are added.

References

[2] Hamilton, James D. Time series analysis. Vol. 2. New Jersey: Princeton, 1994.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.CauchyLogLikelihood(problem)[source]

Calculates a log-likelihood assuming independent Cauchy-distributed noise at each time point, and adds one parameter: the scale (sigma).

For a noise characterised by sigma, the log-likelihood is of the form:

\[\log{L(\theta, \sigma)} = -N\log \pi - N\log \sigma -\sum_{i=1}^N\log\left(1 + \left(\frac{x_i - f(\theta)}{\sigma}\right)^2\right)\]

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem one parameter is added (sigma, the scale); for a multi-output problem n_outputs parameters are added.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.ConstantAndMultiplicativeGaussianLogLikelihood(problem)[source]

Calculates the log-likelihood assuming a mixed error model of a Gaussian base-level noise and a Gaussian heteroscedastic noise.

For a time series model \(f(t| \theta)\) with parameters \(\theta\) , the ConstantAndMultiplicativeGaussianLogLikelihood assumes that the model predictions \(X\) are Gaussian distributed according to

\[X(t| \theta , \sigma _{\text{base}}, \sigma _{\text{rel}}) = f(t| \theta) + (\sigma _{\text{base}} + \sigma _{\text{rel}} f(t| \theta)^\eta ) \, \epsilon ,\]

where \(\epsilon\) is an i.i.d. standard Gaussian random variable

\[\epsilon \sim \mathcal{N}(0, 1).\]

For each output in the problem, this likelihood introduces three new scalar parameters: a base-level scale \(\sigma _{\text{base}}\); an exponential power \(\eta\); and a scale relative to the model output \(\sigma _{\text{rel}}\).

The resulting log-likelihood of a constant and multiplicative Gaussian error model is

\[\log L(\theta, \sigma _{\text{base}}, \eta , \sigma _{\text{rel}} | X^{\text{obs}}) = -\frac{n_t}{2} \log 2 \pi -\sum_{i=1}^{n_t}\log \sigma _{\text{tot}, i} - \sum_{i=1}^{n_t} \frac{(X^{\text{obs}}_i - f(t_i| \theta))^2} {2\sigma ^2_{\text{tot}, i}},\]

where \(n_t\) is the number of measured time points in the time series, \(X^{\text{obs}}_i\) is the observation at time point \(t_i\), and \(\sigma _{\text{tot}, i}=\sigma _{\text{base}} +\sigma _{\text{rel}} f(t_i| \theta)^\eta\) is the total standard deviation of the error at time \(t_i\).

For a system with \(n_o\) outputs, this becomes

\[\log L(\theta, \sigma _{\text{base}}, \eta , \sigma _{\text{rel}} | X^{\text{obs}}) = -\frac{n_tn_o}{2} \log 2 \pi -\sum_{j=1}^{n_o}\sum_{i=1}^{n_t}\log \sigma _{\text{tot}, ij} - \sum_{j=1}^{n_o}\sum_{i=1}^{n_t} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {2\sigma ^2_{\text{tot}, ij}},\]

where \(n_o\) is the number of outputs of the model, \(X^{\text{obs}}_{ij}\) is the observation at time point \(t_i\) of output \(j\), and \(\sigma _{\text{tot}, ij}=\sigma _{\text{base}, j} + \sigma _{\text{rel}, j}f_j(t_i| \theta)^{\eta _j}\) is the total standard deviation of the error at time \(t_i\) of output \(j\).

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem three parameters are added (\(\sigma _{\text{base}}\), \(\eta\), \(\sigma _{\text{rel}}\)), for a multi-output problem \(3n_o\) parameters are added (\(\sigma _{\text{base},1},\ldots , \sigma _{\text{base},n_o}, \eta _1,\ldots , \eta _{n_o}, \sigma _{\text{rel},1}, \ldots , \sigma _{\text{rel},n_o})\).
evaluateS1(parameters)[source]

See LogPDF.evaluateS1().

The partial derivatives of the log-likelihood w.r.t. the model parameters are

\[\begin{split}\frac{\partial \log L}{\partial \theta _k} =& -\sum_{i,j}\sigma _{\text{rel},j}\eta _j\frac{ f_j(t_i| \theta)^{\eta _j-1}} {\sigma _{\text{tot}, ij}} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} + \sum_{i,j} \frac{X^{\text{obs}}_{ij} - f_j(t_i| \theta)} {\sigma ^2_{\text{tot}, ij}} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} \\ &+\sum_{i,j}\sigma _{\text{rel},j}\eta _j \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j-1} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} \\ \frac{\partial \log L}{\partial \sigma _{\text{base}, j}} =& -\sum ^{n_t}_{i=1}\frac{1}{\sigma _{\text{tot}, ij}} +\sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}} \\ \frac{\partial \log L}{\partial \eta _j} =& -\sigma _{\text{rel},j}\eta _j\sum ^{n_t}_{i=1} \frac{f_j(t_i| \theta)^{\eta _j}\log f_j(t_i| \theta)} {\sigma _{\text{tot}, ij}} + \sigma _{\text{rel},j}\eta _j \sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j} \log f_j(t_i| \theta) \\ \frac{\partial \log L}{\partial \sigma _{\text{rel},j}} =& -\sum ^{n_t}_{i=1} \frac{f_j(t_i| \theta)^{\eta _j}}{\sigma _{\text{tot}, ij}} + \sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j},\end{split}\]

where \(i\) sums over the measurement time points and \(j\) over the outputs of the model.

n_parameters()

See LogPDF.n_parameters().

class pints.GaussianIntegratedUniformLogLikelihood(problem, lower, upper)[source]

Calculates a log-likelihood assuming independent Gaussian-distributed noise at each time point where \(\sigma\sim U(a,b)\) has been integrated out of the joint posterior of \(p(\theta,\sigma|X)\),

\[\begin{split}\begin{align} p(\theta|X) &= \int_{0}^{\infty} p(\theta, \sigma|X) \mathrm{d}\sigma\\ &\propto \int_{0}^{\infty} p(X|\theta, \sigma) p(\theta, \sigma) \mathrm{d}\sigma,\end{align}\end{split}\]

Note that this is exactly the same statistical model as pints.GaussianLogLikelihood with a uniform prior on \(\sigma\).

A possible advantage of this log-likelihood compared with using a pints.GaussianLogLikelihood is that it has one fewer parameter (\(\sigma\)), which may speed up convergence to the posterior distribution, especially for multi-output problems, which will have n_outputs fewer parameter dimensions.

The log-likelihood is given in terms of the sum of squared errors:

\[SSE = \sum_{i=1}^n (f_i(\theta) - y_i)^2\]

and is given up to a normalisation constant by:

\[\begin{split}\begin{align} \text{log} L = & - n / 2 \text{log}(\pi) \\ & - \text{log}(2 (b - a) \sqrt{2}) \\ & + (1 / 2 - n / 2) \text{log}(SSE) \\ & + \text{log}\left[\Gamma((n - 1) / 2, \frac{SSE}{2 b^2}) - \Gamma((n - 1) / 2, \frac{SSE}{2 a^2}) \right] \end{align}\end{split}\]

where \(\Gamma(u,v)\) is the upper incomplete gamma function as defined here: https://en.wikipedia.org/wiki/Incomplete_gamma_function

This log-likelihood is inherently a Bayesian method since it assumes a uniform prior on \(\sigma\sim U(a,b)\). However using this likelihood in optimisation routines should yield the same estimates as the full pints.GaussianLogLikelihood.

Extends ProblemLogLikelihood.

Parameters:
  • problem – A SingleOutputProblem or MultiOutputProblem.
  • lower – The lower limit on the uniform prior on sigma. Must be non-negative.
  • upper – The upper limit on the uniform prior on sigma.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.GaussianKnownSigmaLogLikelihood(problem, sigma)[source]

Calculates a log-likelihood assuming independent Gaussian noise at each time point, using a known value for the standard deviation (sigma) of that noise:

\[\log{L(\theta | \sigma,\boldsymbol{x})} = -\frac{N}{2}\log{2\pi} -N\log{\sigma} -\frac{1}{2\sigma^2}\sum_{i=1}^N{(x_i - f_i(\theta))^2}\]

Extends ProblemLogLikelihood.

Parameters:
  • problem – A SingleOutputProblem or MultiOutputProblem.
  • sigma – The standard deviation(s) of the noise. Can be a single value or a sequence of sigmas, one for each output. Must be greater than zero.
evaluateS1(x)[source]

See LogPDF.evaluateS1().

n_parameters()

See LogPDF.n_parameters().

class pints.GaussianLogLikelihood(problem)[source]

Calculates a log-likelihood assuming independent Gaussian noise at each time point, and adds a parameter representing the standard deviation (sigma) of the noise on each output.

For a noise level of sigma, the likelihood becomes:

\[L(\theta, \sigma|\boldsymbol{x}) = p(\boldsymbol{x} | \theta, \sigma) = \prod_{j=1}^{n_t} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left( -\frac{(x_j - f_j(\theta))^2}{2\sigma^2}\right)\]

leading to a log likelihood of:

\[\log{L(\theta, \sigma|\boldsymbol{x})} = -\frac{n_t}{2} \log{2\pi} -n_t \log{\sigma} -\frac{1}{2\sigma^2}\sum_{j=1}^{n_t}{(x_j - f_j(\theta))^2}\]

where n_t is the number of time points in the series, x_j is the sampled data at time j and f_j is the simulated data at time j.

For a system with n_o outputs, this becomes

\[\log{L(\theta, \sigma|\boldsymbol{x})} = -\frac{n_t n_o}{2}\log{2\pi} -\sum_{i=1}^{n_o}{ {n_t}\log{\sigma_i} } -\sum_{i=1}^{n_o}{\left[ \frac{1}{2\sigma_i^2}\sum_{j=1}^{n_t}{(x_j - f_j(\theta))^2} \right]}\]

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem a single parameter is added, for a multi-output problem n_outputs parameters are added.
evaluateS1(x)[source]

See LogPDF.evaluateS1().

n_parameters()

See LogPDF.n_parameters().

class pints.KnownNoiseLogLikelihood(problem, sigma)[source]

Deprecated alias of GaussianKnownSigmaLogLikelihood.

evaluateS1(x)

See LogPDF.evaluateS1().

n_parameters()

See LogPDF.n_parameters().

class pints.MultiplicativeGaussianLogLikelihood(problem)[source]

Calculates the log-likelihood for a time-series model assuming a heteroscedastic Gaussian error of the model predictions \(f(t, \theta )\).

This likelihood introduces two new scalar parameters for each dimension of the model output: an exponential power \(\eta\) and a scale \(\sigma\).

A heteroscedastic Gaussian noise model assumes that the observable \(X\) is Gaussian distributed around the model predictions \(f(t, \theta )\) with a standard deviation that scales with \(f(t, \theta )\)

\[X(t) = f(t, \theta) + \sigma f(t, \theta)^\eta v(t)\]

where \(v(t)\) is a standard i.i.d. Gaussian random variable

\[v(t) \sim \mathcal{N}(0, 1).\]

This model leads to a log likelihood of the model parameters of

\[\log{L(\theta, \eta , \sigma | X^{\text{obs}})} = -\frac{n_t}{2} \log{2 \pi} -\sum_{i=1}^{n_t}{\log{f(t_i, \theta)^\eta \sigma}} -\frac{1}{2}\sum_{i=1}^{n_t}\left( \frac{X^{\text{obs}}_{i} - f(t_i, \theta)} {f(t_i, \theta)^\eta \sigma}\right) ^2,\]

where \(n_t\) is the number of time points in the series, and \(X^{\text{obs}}_{i}\) the measurement at time \(t_i\).

For a system with \(n_o\) outputs, this becomes

\[\log{L(\theta, \eta , \sigma | X^{\text{obs}})} = -\frac{n_t n_o}{2} \log{2 \pi} -\sum ^{n_o}_{j=1}\sum_{i=1}^{n_t}{\log{f_j(t_i, \theta)^\eta \sigma _j}} -\frac{1}{2}\sum ^{n_o}_{j=1}\sum_{i=1}^{n_t}\left( \frac{X^{\text{obs}}_{ij} - f_j(t_i, \theta)} {f_j(t_i, \theta)^\eta \sigma _j}\right) ^2,\]

where \(n_o\) is the number of outputs of the model, and \(X^{\text{obs}}_{ij}\) the measurement of output \(j\) at time point \(t_i\).

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem two parameters are added (\(\eta\), \(\sigma\)), for a multi-output problem 2 times \(n_o\) parameters are added.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.ScaledLogLikelihood(log_likelihood)[source]

Calculates a log-likelihood based on a (conditional) ProblemLogLikelihood divided by the number of time samples.

The returned value will be (1 / n) * log_likelihood(x|problem), where n is the number of time samples multiplied by the number of outputs.

This log-likelihood operates on both single and multi-output problems.

Extends ProblemLogLikelihood.

Parameters:log_likelihood – A ProblemLogLikelihood to scale.
evaluateS1(x)[source]

See LogPDF.evaluateS1().

This method only works if the underlying LogPDF object implements the optional method LogPDF.evaluateS1()!

n_parameters()

See LogPDF.n_parameters().

class pints.StudentTLogLikelihood(problem)[source]

Calculates a log-likelihood assuming independent Student-t-distributed noise at each time point, and adds two parameters: one representing the degrees of freedom (nu), the other representing the scale (sigma).

For a noise characterised by nu and sigma, the log likelihood is of the form:

\[\log{L(\theta, \nu, \sigma|\boldsymbol{x})} = N\frac{\nu}{2}\log(\nu) - N\log(\sigma) - N\log B(\nu/2, 1/2) -\frac{1+\nu}{2}\sum_{i=1}^N\log\left(\nu + \left(\frac{x_i - f(\theta)}{\sigma}\right)^2\right)\]

where B(.,.) is a beta function.

Extends ProblemLogLikelihood.

Parameters:problem – A SingleOutputProblem or MultiOutputProblem. For a single-output problem two parameters are added (nu, sigma), where nu is the degrees of freedom and sigma is scale, for a multi-output problem 2 * n_outputs parameters are added.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

See LogPDF.n_parameters().

class pints.UnknownNoiseLogLikelihood(problem)[source]

Deprecated alias of GaussianLogLikelihood.

evaluateS1(x)

See LogPDF.evaluateS1().

n_parameters()

See LogPDF.n_parameters().

Log-PDFs

LogPDFs are callable objects that represent distributions, including likelihoods and Bayesian priors and posteriors. They are unnormalised, i.e. their area does not necessarily integrate to 1, and for efficiency reasons we always work with the logarithm, e.g. a log-likelihood instead of a likelihood.

Example:

p = pints.GaussianLogPrior(mean=0, variance=1)
x = p(0.1)

Overview:

class pints.LogPDF[source]

Represents the natural logarithm of a (not necessarily normalised) probability density function (PDF).

All LogPDF types are callable: when called with a vector argument p they return some value log(f(p)) where f(p) is an unnormalised PDF. The size of the argument p is given by n_parameters().

evaluateS1(x)[source]

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()[source]

Returns the dimension of the space this LogPDF is defined over.

class pints.LogPrior[source]

Represents the natural logarithm log(f(theta)) of a known probability density function f(theta).

Priors are usually normalised (i.e. the integral of f(theta) over all points theta in parameter space is 1), but this is not a strict requirement.

Extends LogPDF.

cdf(x)[source]

Returns the cumulative density function at point(s) x.

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_from_unit_cube(u)[source]

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)[source]

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

icdf(p)[source]

Returns the inverse cumulative density function at cumulative probability/probabilities p.

p should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

mean()[source]

Returns the analytical value of the expectation of a random variable distributed according to this LogPDF.

n_parameters()

Returns the dimension of the space this LogPDF is defined over.

sample(n=1)[source]

Returns n random samples from the underlying prior distribution.

The returned value is a NumPy array with shape (n, d) where n is the requested number of samples, and d is the dimension of the prior.

class pints.LogPosterior(log_likelihood, log_prior)[source]

Represents the sum of a LogPDF and a LogPrior defined on the same parameter space.

As an optimisation, if the LogPrior evaluates as -inf for a particular point in parameter space, the corresponding LogPDF will not be evaluated.

Extends LogPDF.

Parameters:
  • log_likelihood – A LogPDF, defined on the same parameter space.
  • log_prior – A LogPrior, representing prior knowledge of the parameter space.
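
For illustration, a minimal sketch (any LogPDF can stand in for the likelihood here; in practice this is usually a ProblemLogLikelihood):

log_prior = pints.UniformLogPrior([0], [10])
log_likelihood = pints.GaussianLogPrior(5, 1)   # any LogPDF will do
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

log_posterior([3])    # log_likelihood([3]) + log_prior([3])
log_posterior([12])   # -inf: the prior is -inf, so the likelihood is not evaluated
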
evaluateS1(x)[source]

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

This method only works if both the underlying LogPDF and the LogPrior implement the optional method LogPDF.evaluateS1().

log_likelihood()[source]

Returns the LogLikelihood used by this posterior.

log_prior()[source]

Returns the LogPrior used by this posterior.

n_parameters()[source]

See LogPDF.n_parameters().

class pints.PooledLogPDF(log_pdfs, pooled)[source]

Combines \(m\) LogPDFs, each with \(n\) parameters, into a single LogPDF where \(k\) parameters are “pooled” (i.e. have the same value for each LogPDF), so that the resulting combined LogPDF has \(m (n - k) + k\) independent parameters.

This is useful e.g. for modelling the time-series of multiple individuals (each individual defines a separate LogPDF), where some parameters are expected to be the same across individuals (for example, the noise parameter across different individuals within the same experiment).

For two LogPDFs \(L _1\) and \(L _2\) with four parameters \((\psi ^{(1)}_1, \psi ^{(1)}_2, \psi ^{(1)}_3, \psi ^{(1)}_4)\) and \((\psi ^{(2)}_1, \psi ^{(2)}_2, \psi ^{(2)}_3, \psi ^{(2)}_4)\) respectively, a pooling of the second and third parameter \(\psi _2 := \psi ^{(1)}_2 = \psi ^{(2)}_2\), \(\psi _3 := \psi ^{(1)}_3 = \psi ^{(2)}_3\) results in a pooled log-pdf of the form

\[L(\psi ^{(1)}_1, \psi ^{(1)}_4, \psi ^{(2)}_1, \psi ^{(2)}_4, \psi _2, \psi _3 | D_1, D_2) = L _1(\psi ^{(1)}_1, \psi _2, \psi _3, \psi ^{(1)}_4 | D_1) + L _2(\psi ^{(2)}_1, \psi _2, \psi _3, \psi ^{(2)}_4 | D_2),\]

where \(D_i\) is the measured time-series of individual \(i\). As \(k=2\) parameters were pooled across the log-likelihoods, the pooled log-likelihood has six parameters in the following order: \((\psi ^{(1)}_1, \psi ^{(1)}_4, \psi ^{(2)}_1, \psi ^{(2)}_4, \psi _2, \psi _3)\).

Note that the input parameters of a PooledLogPDF are not just a simple concatenation of the parameters of the individual LogPDFs. The pooled parameters are only listed once and are moved to the end of the parameter list. This avoids inputting the value of the pooled parameters at multiple positions. Otherwise, the order of the parameters is determined firstly by the order of the likelihoods and then by the order of the parameters of those likelihoods.

Extends LogPDF.

Parameters:
  • log_pdfs – A sequence of LogPDF objects.
  • pooled – A sequence of booleans indicating which parameters across the likelihoods are pooled (True) or remain unpooled (False).

Example

pooled_log_likelihood = pints.PooledLogPDF(
    log_pdfs=[
        pints.GaussianLogLikelihood(problem1),
        pints.GaussianLogLikelihood(problem2)],
    pooled=[False, True])
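
In this example, pooled=[False, True] implies each GaussianLogLikelihood has two parameters (here, one model parameter and one noise parameter), so the pooled log-pdf has three parameters, with the pooled one moved to the end:

pooled_log_likelihood([0.5, 0.7, 0.1])  # order: [psi1_1, psi2_1, psi_2 (pooled)]
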
evaluateS1(parameters)[source]

See LogPDF.evaluateS1().

The partial derivatives of the pooled log-likelihood with respect to unpooled parameters equal the partial derivatives of the corresponding individual log-likelihood.

\[\frac{\partial L}{\partial \psi} = \frac{\partial L_i}{\partial \psi},\]

where \(L\) is the pooled log-likelihood, \(\psi\) an unpooled parameter and \(L _i\) the individual log-likelihood that depends on \(\psi\).

For a pooled parameter \(\theta\), the partial derivative of the pooled log-likelihood equals the sum of the partial derivatives of all individual log-likelihoods

\[\frac{\partial L}{\partial \theta} = \sum _{i=1}^n\frac{\partial L_i}{\partial \theta}.\]

Here \(n\) is the number of individual log-likelihoods.

This method only works if all the underlying LogPDF objects implement the optional method LogPDF.evaluateS1().

n_parameters()[source]

See LogPDF.n_parameters().

class pints.ProblemLogLikelihood(problem)[source]

Represents a log-likelihood on a problem’s parameter space, used to indicate the likelihood of an observed (fixed) time-series given a particular parameter set (variable).

Extends LogPDF.

Parameters:problem – The time-series problem this log-likelihood is defined for.
evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()[source]

See LogPDF.n_parameters().

class pints.SumOfIndependentLogPDFs(log_likelihoods)[source]

Calculates a sum of LogPDF objects, all defined on the same parameter space.

This is useful for e.g. Bayesian inference using a single model evaluated on two independent data sets D and E. In this case,

\[\begin{split}f(\theta|D,E) &= \frac{f(D, E|\theta)f(\theta)}{f(D, E)} \\ &= \frac{f(D|\theta)f(E|\theta)f(\theta)}{f(D, E)}\end{split}\]

Extends LogPDF.

Parameters:log_likelihoods – A sequence of LogPDF objects.

Example

log_likelihood = pints.SumOfIndependentLogPDFs([
    pints.GaussianLogLikelihood(problem1),
    pints.GaussianLogLikelihood(problem2),
])
evaluateS1(x)[source]

See LogPDF.evaluateS1().

This method only works if all the underlying LogPDF objects implement the optional method LogPDF.evaluateS1().

n_parameters()[source]

See LogPDF.n_parameters().

Log-priors

A number of LogPriors are provided for use in e.g. Bayesian inference.

Example:

p = pints.GaussianLogPrior(mean=0, sd=1)
x = p(0.1)
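
Priors can also be sampled from and differentiated, for example (a minimal sketch continuing the example above):

samples = p.sample(100)       # a (100, 1) array of draws from the prior
L, dL = p.evaluateS1([0.1])   # log-pdf value and its derivative at 0.1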

Overview:

class pints.BetaLogPrior(a, b)[source]

Defines a beta (log) prior with given shape parameters a and b, with pdf

\[f(x|a,b) = \frac{x^{a-1} (1-x)^{b-1}}{\mathrm{B}(a,b)}\]

where \(\mathrm{B}\) is the Beta function. A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\frac{a}{a+b}.\]

For example, to create a prior with shape parameters a=5 and b=1, use:

p = pints.BetaLogPrior(5, 1)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.CauchyLogPrior(location, scale)[source]

Defines a 1-d Cauchy (log) prior with a given location, and scale, with pdf

\[f(x|\text{location}, \text{scale}) = \frac{1}{\pi\;\text{scale} \left[1 + \left(\frac{x-\text{location}}{\text{scale}}\right)^2 \right]}.\]

A random variable distributed according to this pdf has undefined expectation.

For example, to create a prior centered around 0 and a scale of 5, use:

p = pints.CauchyLogPrior(0, 5)

Extends LogPrior.

Parameters:
  • location – The center of the distribution.
  • scale – The scale of the distribution.
cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.ComposedLogPrior(*priors)[source]

N-dimensional LogPrior composed of one or more other \(N_i\)-dimensional LogPriors, such that \(\sum_i N_i = N\). The evaluation of the composed log-prior assumes the input log-priors are all independent of each other.

For example, a composed log prior

p = pints.ComposedLogPrior(log_prior1, log_prior2, log_prior3),

where log_prior1, log_prior2, and log_prior3 have dimensions 1, 2, and 1 respectively, will have dimension 4.

The dimensionality of the individual priors does not have to be the same, i.e. \(N_i\neq N_j\) is allowed.

The input parameters of the ComposedLogPrior have to be ordered in the same way as the individual priors. In the above example the prior may be evaluated by p(x), where:

x = [parameter1_log_prior1, parameter1_log_prior2, parameter2_log_prior2, parameter1_log_prior3].
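
For example, a sketch of the composition above (assuming concrete 1-d and 2-d priors; any LogPriors with these dimensions would do):

import numpy as np
import pints

log_prior1 = pints.GaussianLogPrior(0, 1)                        # 1 parameter
log_prior2 = pints.MultivariateGaussianLogPrior(
    np.zeros(2), np.eye(2))                                      # 2 parameters
log_prior3 = pints.ExponentialLogPrior(0.5)                      # 1 parameter

p = pints.ComposedLogPrior(log_prior1, log_prior2, log_prior3)   # 4 parameters
x = p([0.1, -0.2, 0.3, 1.5])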

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

This method only works if the underlying LogPrior classes all implement the optional method LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

This method only works if the underlying LogPrior classes all implement the optional method LogPDF.evaluateS1().

icdf(x)[source]

See LogPrior.icdf().

This method only works if the underlying LogPrior classes all implement the optional method LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.ExponentialLogPrior(rate)[source]

Defines an exponential (log) prior with given rate parameter rate with pdf

\[f(x|\text{rate}) = \text{rate} \; e^{-\text{rate}\;x}.\]

A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\frac{1}{\text{rate}}.\]

For example, to create a prior with rate=0.5 use:

p = pints.ExponentialLogPrior(0.5)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.GammaLogPrior(a, b)[source]

Defines a gamma (log) prior with given shape parameter a and rate parameter b, with pdf

\[f(x|a,b)=\frac{b^a x^{a-1} e^{-bx}}{\mathrm{\Gamma}(a)}.\]

where \(\Gamma\) is the Gamma function. A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\frac{a}{b}.\]

For example, to create a prior with shape parameters a=5 and b=1, use:

p = pints.GammaLogPrior(5, 1)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.GaussianLogPrior(mean, sd)[source]

Defines a 1-d Gaussian (log) prior with a given mean and standard deviation sd, with pdf

\[f(x|\text{mean},\text{sd}) = \frac{1}{\text{sd}\sqrt{2\pi}} \exp\left(-\frac{(x-\text{mean})^2}{2\;\text{sd}^2}\right).\]

A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\text{mean}.\]

For example, to create a prior with mean of 0 and a standard deviation of 1, use:

p = pints.GaussianLogPrior(0, 1)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.HalfCauchyLogPrior(location, scale)[source]

Defines a 1-d half-Cauchy (log) prior with a given location and scale. This is a Cauchy distribution that has been truncated to lie in between \((0,\infty)\), with pdf

\[\begin{split}f(x|\text{location},\text{scale})=\begin{cases}\frac{1}{\pi\; \text{scale}\left(\frac{1}{\pi}\arctan\left(\frac{\text{location}} {\text{scale} }\right)+\frac{1}{2}\right)\left(\frac{(x-\text{location} )^2}{\text{scale}^2}+1\right)},&x>0\\0,&\text{otherwise.}\end{cases}\end{split}\]

A random variable distributed according to this pdf has undefined expectation.

For example, to create a prior centered around 0 and a scale of 5, use:

p = pints.HalfCauchyLogPrior(0, 5)

Extends LogPrior.

Parameters:
  • location – The center of the distribution.
  • scale – The scale of the distribution.
cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.InverseGammaLogPrior(a, b)[source]

Defines an inverse gamma (log) prior with given shape parameter a and scale parameter b, with pdf

\[\begin{split}f(x|a,b)=\begin{cases}\frac{b^a}{\Gamma(a)}x^{-a-1}\exp \left(-\frac{b}{x}\right),&x>0\\0,&\text{otherwise.}\end{cases}\end{split}\]

where \(\Gamma\) is the Gamma function. A random variable \(X\) distributed according to this pdf has expectation

\[\begin{split}\mathrm{E}(X)=\begin{cases}\frac{b}{a-1},&a>1\\ \text{undefined},&\text{otherwise.}\end{cases}\end{split}\]

For example, to create a prior with shape parameter a=5 and scale parameter b=1, use:

p = pints.InverseGammaLogPrior(5, 1)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.LogNormalLogPrior(log_mean, scale)[source]

Defines a log-normal (log) prior with a given log_mean and scale scale. The log_mean parameter of a log-normal distribution is the mean of a normal distribution whose random samples, when exponentiated, yield samples from a log-normal distribution. This log-normal distribution has pdf

\[f(x|\text{log_mean},\text{scale}) = \frac{1}{x\;\text{scale} \sqrt{2\pi}}\exp\left(-\frac{(\log x-\text{log_mean})^2}{2\; \text{scale}^2}\right).\]

A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\exp\left(\text{log_mean}+\frac{\text{scale}^2}{2} \right).\]

For example, to create a prior with log_mean of 0 and a scale of 1, use:

p = pints.LogNormalLogPrior(0, 1)

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.MultivariateGaussianLogPrior(mean, cov)[source]

Defines a multivariate Gaussian (log) prior with a given mean and covariance matrix cov, with pdf

\[f(x|\text{mean},\text{cov}) = \frac{1}{(2\pi)^{d/2}| \text{cov}|^{1/2}} \exp\left(-\frac{1}{2}(x-\text{mean})' \text{cov}^{-1}(x-\text{mean})\right).\]

A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\text{mean}.\]

For example, to create a prior with zero mean and identity covariance, use:

p = pints.MultivariateGaussianLogPrior(
        np.array([0, 0]), np.array([[1, 0],[0, 1]]))

Extends LogPrior.

cdf(x)

Returns the cumulative density function at point(s) x.

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_from_unit_cube(u)[source]

Converts a sample u uniformly drawn from the unit cube into one drawn from the prior space, using MultivariateGaussianLogPrior.pseudo_icdf().

convert_to_unit_cube(x)[source]

Converts a sample from the prior x to be drawn uniformly from the unit cube using MultivariateGaussianLogPrior.pseudo_cdf().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)

Returns the inverse cumulative density function at cumulative probability/probabilities p.

p should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

pseudo_cdf(xs)[source]

Calculates a pseudo-cdf for a multivariate Gaussian as described in Feroz et al. (2009) (“MultiNest…”). In this approach, a multivariate Gaussian is factorised:

\[\pi(\theta_1,\theta_2,...,\theta_d) = \pi_1(\theta_1) \pi_2(\theta_2|\theta_1)... \pi_d(\theta_d|\theta_1, \theta_2,...,\theta_{d-1})\]

The cdfs we report are then the values for each individual conditional. For example, for the second component, we calculate:

\[u_2 = \int_{-\infty}^{\theta_2} \pi_2(\theta_2|\theta_1)d\theta_2\]

We thus return a vector of cdfs (u_1, u_2, …, u_d). Note that this function is mainly intended to facilitate MultiNest sampling, since the distribution of (u_1, u_2, …, u_d) is uniform within the unit cube.

pseudo_icdf(ps)[source]

Calculates a pseudo-icdf for a multivariate Gaussian as described in Feroz et al. (2009) (“MultiNest…”). In this approach, a multivariate Gaussian is factorised:

\[\pi(\theta_1,\theta_2,...,\theta_d) = \pi_1(\theta_1) \pi_2(\theta_2|\theta_1)... \pi_d(\theta_d|\theta_1, \theta_2,...,\theta_{d-1})\]

The icdfs we report are then the values for each individual conditional. For example, for the second component, we calculate the theta_2 value that satisfies:

\[u_2 = \int_{-\infty}^{\theta_2} \pi_2(\theta_2|\theta_1)d\theta_2\]

We thus return a vector of icdfs (theta_1, theta_2, …, theta_d). Note that this function is mainly intended to facilitate MultiNest sampling, since the distribution of (u_1, u_2, …, u_d) is uniform within the unit cube.

sample(n=1)[source]

See LogPrior.sample().

class pints.NormalLogPrior(mean, standard_deviation)[source]

Deprecated alias of GaussianLogPrior.

cdf(x)

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)

See LogPDF.evaluateS1().

icdf(p)

See LogPrior.icdf().

mean()

See LogPrior.mean().

n_parameters()

See LogPrior.n_parameters().

sample(n=1)

See LogPrior.sample().

class pints.StudentTLogPrior(location, df, scale)[source]

Defines a 1-d Student-t (log) prior with a given location, degrees of freedom df, and scale with pdf

\[f(x|\text{location},\text{scale},\text{df})=\frac{\left(\frac{ \text{df}}{\text{df}+\frac{(x-\text{location})^2}{\text{scale}^2}} \right)^{\frac{\text{df}+1}{2}}}{\sqrt{\text{df}}\;\text{scale} \;\mathrm{B}\left(\frac{\text{df} }{2},\frac{1}{2}\right)}.\]

where \(\mathrm{B}\) is the Beta function. A random variable \(X\) distributed according to this pdf has expectation

\[\begin{split}\mathrm{E}(X)=\begin{cases}\text{location},&\text{df}>1\\\ \text{undefined},&\text{otherwise.}\end{cases}\end{split}\]

For example, to create a prior centered around 0 with 3 degrees of freedom and a scale of 1, use:

p = pints.StudentTLogPrior(0, 3, 1)

Extends LogPrior.

Parameters:
  • location – The center of the distribution.
  • df (int) – The number of degrees of freedom of the distribution.
  • scale – The scale of the distribution.
cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.TruncatedGaussianLogPrior(mean, sd, a, b)[source]

Defines a truncated Gaussian log prior.

This distribution is also known as the truncated Normal distribution.

The truncated Gaussian distribution is similar to the Gaussian distribution, but constrained to lie between two values.

The parameters are the mean mean and standard deviation sd, as in the Gaussian distribution, as well as a lower bound a and an upper bound b.

The pdf of the truncated Gaussian distribution is given by

\[f(x|\mu, \sigma, a, b) = \frac{1}{\sigma\sqrt{2\pi}} \exp \left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \frac{1} {\Phi((b-\mu) / \sigma) - \Phi((a-\mu) / \sigma)}\]

for \(x \in [a, b]\), where \(\mu\) indicates the mean and \(\sigma\) indicates the standard deviation, and \(\Phi\) is the standard normal CDF.

For example, to create a prior with mean of 0 and a standard deviation of 1, bounded above at 3 and below at -2, use:

p = pints.TruncatedGaussianLogPrior(0, 1, -2, 3)

For a Gaussian distribution truncated on only one side, numpy.inf or -numpy.inf can be used for the unbounded side.
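
For instance (a minimal sketch), a standard Gaussian truncated to non-negative values:

import numpy as np
p = pints.TruncatedGaussianLogPrior(0, 1, 0, np.inf)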

Extends LogPrior.

cdf(x)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPDF.evaluateS1().

icdf(p)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

class pints.UniformLogPrior(lower_or_boundaries, upper=None)[source]

Defines a uniform prior over a given range.

The range includes the lower, but not the upper boundaries, so that any point x with a non-zero prior must have lower <= x < upper.

In 1D this has pdf

\[\begin{split}f(x|\text{lower},\text{upper})=\begin{cases}0,&\text{if }x\not\in [\text{lower},\text{upper})\\\frac{1}{\text{upper}-\text{lower}} ,&\text{if }x\in[\text{lower},\text{upper})\end{cases}.\end{split}\]

A random variable \(X\) distributed according to this pdf has expectation

\[\mathrm{E}(X)=\frac{1}{2}(\text{lower}+\text{upper}).\]

For example, to create a prior with \(x\in[0,4]\), \(y\in[1,5]\), and \(z\in[2,6]\) use either:

p = pints.UniformLogPrior([0, 1, 2], [4, 5, 6])

or:

p = pints.UniformLogPrior(pints.RectangularBoundaries([0, 1, 2], [4, 5, 6]))

Extends LogPrior.

cdf(xs)[source]

See LogPrior.cdf().

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(x)[source]

See LogPrior.evaluateS1().

icdf(ps)[source]

See LogPrior.icdf().

mean()[source]

See LogPrior.mean().

n_parameters()[source]

See LogPrior.n_parameters().

sample(n=1)[source]

See LogPrior.sample().

MCMC Samplers

Pints provides a number of MCMC methods, all implementing the MCMCSampler interface, that can be used to sample from an unknown PDF (usually a Bayesian posterior).

Running an MCMC routine

pints.mcmc_sample(log_pdf, chains, x0, sigma0=None, transformation=None, method=None)[source]

Sample from a pints.LogPDF using a Markov Chain Monte Carlo (MCMC) method.

Parameters:
  • log_pdf (pints.LogPDF) – A LogPDF function that evaluates points in the parameter space.
  • chains (int) – The number of MCMC chains to generate.
  • x0 – A sequence of starting points. Can be a list of lists, a 2-dimensional array, or any other structure such that x0[i] is the starting point for chain i.
  • sigma0 – An optional initial covariance matrix, i.e., a guess of the covariance in logpdf around the points in x0 (the same sigma0 is used for each point in x0). Can be specified as a (d, d) matrix (where d is the dimension of the parameter space) or as a (d, ) vector, in which case diag(sigma0) will be used.
  • transformation (pints.Transformation) – An optional pints.Transformation to allow the sampler to work in a transformed parameter space. If used, points shown or returned to the user will first be detransformed back to the original space.
  • method (class) – The class of MCMCSampler to use. If no method is specified, HaarioBardenetACMC is used.
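
A minimal usage sketch (assuming log_posterior is a pints.LogPDF defined elsewhere, e.g. a LogPosterior over two parameters):

x0 = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]           # one starting point per chain
chains = pints.mcmc_sample(log_posterior, 3, x0)
# chains has shape (n_chains, n_iterations, n_parameters)
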
class pints.MCMCController(log_pdf, chains, x0, sigma0=None, transformation=None, method=None)[source]

Samples from a pints.LogPDF using a Markov Chain Monte Carlo (MCMC) method.

The method to use (either a SingleChainMCMC class or a MultiChainMCMC class) is specified at runtime. For example:

mcmc = pints.MCMCController(
    log_pdf, 3, x0, method=pints.HaarioBardenetACMC)

Properties related to the number of iterations, parallelisation, and logging can be set directly on the MCMCController object, e.g.:

mcmc.set_max_iterations(1000)

Sampler specific properties must be set on the internal samplers themselves, e.g.:

for sampler in mcmc.samplers():
    sampler.set_target_acceptance_rate(0.2)

Finally, to run an MCMC routine, call:

chains = mcmc.run()

By default, an MCMCController run will write regular progress updates to screen. This can be disabled using set_log_to_screen(). To write a similar progress log to a file, use set_log_to_file(). To store the chains and/or evaluations generated by run() to a file, use set_chain_filename() and set_log_pdf_filename().
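
For example, a sketch that disables screen output and instead stores chains and evaluations to disk (the file names here are arbitrary):

mcmc.set_log_to_screen(False)
mcmc.set_chain_filename('chains.csv')
mcmc.set_log_pdf_filename('evals.csv')
chains = mcmc.run()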

Parameters:
  • log_pdf (pints.LogPDF) – A LogPDF function that evaluates points in the parameter space.
  • chains (int) – The number of MCMC chains to generate.
  • x0 – A sequence of starting points. Can be a list of lists, a 2-dimensional array, or any other structure such that x0[i] is the starting point for chain i.
  • sigma0 – An optional initial covariance matrix, i.e., a guess of the covariance in logpdf around the points in x0 (the same sigma0 is used for each point in x0). Can be specified as a (d, d) matrix (where d is the dimension of the parameter space) or as a (d, ) vector, in which case diag(sigma0) will be used.
  • transformation (pints.Transformation) – An optional pints.Transformation to allow the sampler to work in a transformed parameter space. If used, points shown or returned to the user will first be detransformed back to the original space.
  • method (class) – The class of MCMCSampler to use. If no method is specified, HaarioBardenetACMC is used.
chains()[source]

Returns the chains generated by run().

The returned array has shape (n_chains, n_iterations, n_parameters).

If the controller has not run yet, or if chain storage to memory is disabled, this method will return None.

initial_phase_iterations()[source]

For methods that require an initial phase (e.g. an adaptation-free phase for the adaptive covariance MCMC method), this returns the number of iterations that the initial phase will take.

For methods that do not require an initial phase, a NotImplementedError is raised.

log_pdfs()[source]

Returns the LogPDF evaluations generated by run().

If a LogPosterior was used, the returned array will have shape (n_chains, n_iterations, 3), and for each sample the LogPDF, LogLikelihood, and LogPrior will be stored. For all other cases, only the full LogPDF evaluations are returned, in an array of shape (n_chains, n_iterations).

If the controller has not run yet, or if storage of evaluations to memory is disabled (default), this method will return None.

max_iterations()[source]

Returns the maximum iterations if this stopping criterion is set, or None if it is not. See set_max_iterations().

method_needs_initial_phase()[source]

Returns True if this sampler has been created with a method that has an initial phase (see MCMCSampler.needs_initial_phase()).

n_evaluations()[source]

Returns the number of evaluations performed during the last run, or None if the controller hasn’t run yet.

parallel()[source]

Returns the number of parallel worker processes this routine will be run on, or False if parallelisation is disabled.

run()[source]

Runs the MCMC sampler(s) and returns the result.

By default, this method returns an array of shape (n_chains, n_iterations, n_parameters). If storing chains to memory has been disabled with set_chain_storage(), then None is returned instead.

sampler()[source]

Returns the underlying MultiChainMCMC object, or raises an error if SingleChainMCMC objects are being used.

See also: samplers().

samplers()[source]

Returns a list containing the underlying sampler objects.

If a SingleChainMCMC method was selected, this will be a list containing as many SingleChainMCMC objects as the number of chains. If a MultiChainMCMC method was selected, this will be a list containing a single MultiChainMCMC instance.

set_chain_filename(chain_file)[source]

Write chains to disk as they are generated.

If a chain_file is specified, a CSV file will be created for each chain, to which samples will be written as they are accepted. To disable logging of chains, set chain_file=None.

Filenames for each chain file will be derived from chain_file, e.g. if chain_file='chain.csv' and there are 2 chains, then the files chain_0.csv and chain_1.csv will be created. Each CSV file will start with a header (e.g. "p0","p1","p2",...) and contain a sample on each subsequent line.

set_chain_storage(store_in_memory=True)[source]

Store chains in memory as they are generated.

By default, all generated chains are stored in memory as they are generated, and returned by run(). This method allows this behaviour to be disabled, which can be useful for very large chains which are already stored to disk (see set_chain_filename()).

set_initial_phase_iterations(iterations=200)[source]

For methods that require an initial phase (e.g. an adaptation-free phase for the adaptive covariance MCMC method), this sets the number of iterations that the initial phase will take.

For methods that do not require an initial phase, a NotImplementedError is raised.

set_log_interval(iters=20, warm_up=3)[source]

Changes the frequency with which messages are logged.

Parameters:
  • iters (int) – A log message will be shown every iters iterations.
  • warm_up (int) – A log message will be shown every iteration, for the first warm_up iterations.
set_log_pdf_filename(log_pdf_file)[source]

Write LogPDF evaluations to disk as they are generated.

If a log_pdf_file is specified, a CSV file will be created for each chain, to which LogPDF evaluations will be written for every accepted sample. To disable this feature, set log_pdf_file=None. If the LogPDF being evaluated is a LogPosterior, the individual likelihood and prior will also be stored.

Filenames for each evaluation file will be derived from log_pdf_file, e.g. if log_pdf_file='evals.csv' and there are 2 chains, then the files evals_0.csv and evals_1.csv will be created. Each CSV file will start with a header (e.g. "logposterior","loglikelihood","logprior") and contain the evaluations for the i-th accepted sample on the i-th subsequent line.

set_log_pdf_storage(store_in_memory=False)[source]

Store LogPDF evaluations in memory as they are generated.

By default, evaluations of the LogPDF are not stored. This method can be used to enable storage of the evaluations for the accepted samples. After running, evaluations can be obtained using evaluations().

set_log_to_file(filename=None, csv=False)[source]

Enables progress logging to file when a filename is passed in, disables it if filename is False or None.

The argument csv can be set to True to write the file in comma separated value (CSV) format. By default, the file contents will be similar to the output on screen.

set_log_to_screen(enabled)[source]

Enables or disables progress logging to screen.

set_max_iterations(iterations=10000)[source]

Adds a stopping criterion, allowing the routine to halt after the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_iterations(None).

set_parallel(parallel=False)[source]

Enables/disables parallel evaluation.

If parallel=True, the method will run using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.

time()[source]

Returns the time needed for the last run, in seconds, or None if the controller hasn’t run yet.

class pints.MCMCSampling(log_pdf, chains, x0, sigma0=None, transformation=None, method=None)[source]

Deprecated alias for MCMCController.

chains()

Returns the chains generated by run().

The returned array has shape (n_chains, n_iterations, n_parameters).

If the controller has not run yet, or if chain storage to memory is disabled, this method will return None.

initial_phase_iterations()

For methods that require an initial phase (e.g. an adaptation-free phase for the adaptive covariance MCMC method), this returns the number of iterations that the initial phase will take.

For methods that do not require an initial phase, a NotImplementedError is raised.

log_pdfs()

Returns the LogPDF evaluations generated by run().

If a LogPosterior was used, the returned array will have shape (n_chains, n_iterations, 3), and for each sample the LogPDF, LogLikelihood, and LogPrior will be stored. For all other cases, only the full LogPDF evaluations are returned, in an array of shape (n_chains, n_iterations).

If the controller has not run yet, or if storage of evaluations to memory is disabled (default), this method will return None.

max_iterations()

Returns the maximum iterations if this stopping criterion is set, or None if it is not. See set_max_iterations().

method_needs_initial_phase()

Returns True if this sampler has been created with a method that has an initial phase (see MCMCSampler.needs_initial_phase()).

n_evaluations()

Returns the number of evaluations performed during the last run, or None if the controller hasn’t run yet.

parallel()

Returns the number of parallel worker processes this routine will be run on, or False if parallelisation is disabled.

run()

Runs the MCMC sampler(s) and returns the result.

By default, this method returns an array of shape (n_chains, n_iterations, n_parameters). If storing chains to memory has been disabled with set_chain_storage(), then None is returned instead.

sampler()

Returns the underlying MultiChainMCMC object, or raises an error if SingleChainMCMC objects are being used.

See also: samplers().

samplers()

Returns a list containing the underlying sampler objects.

If a SingleChainMCMC method was selected, this will be a list containing as many SingleChainMCMC objects as the number of chains. If a MultiChainMCMC method was selected, this will be a list containing a single MultiChainMCMC instance.

set_chain_filename(chain_file)

Write chains to disk as they are generated.

If a chain_file is specified, a CSV file will be created for each chain, to which samples will be written as they are accepted. To disable logging of chains, set chain_file=None.

Filenames for each chain file will be derived from chain_file, e.g. if chain_file='chain.csv' and there are 2 chains, then the files chain_0.csv and chain_1.csv will be created. Each CSV file will start with a header (e.g. "p0","p1","p2",...) and contain a sample on each subsequent line.

set_chain_storage(store_in_memory=True)

Store chains in memory as they are generated.

By default, all generated chains are stored in memory as they are generated, and returned by run(). This method allows this behaviour to be disabled, which can be useful for very large chains which are already stored to disk (see set_chain_filename()).

set_initial_phase_iterations(iterations=200)

For methods that require an initial phase (e.g. an adaptation-free phase for the adaptive covariance MCMC method), this sets the number of iterations that the initial phase will take.

For methods that do not require an initial phase, a NotImplementedError is raised.

set_log_interval(iters=20, warm_up=3)

Changes the frequency with which messages are logged.

Parameters:
  • iters (int) – A log message will be shown every iters iterations.
  • warm_up (int) – A log message will be shown every iteration, for the first warm_up iterations.
set_log_pdf_filename(log_pdf_file)

Write LogPDF evaluations to disk as they are generated.

If a log_pdf_file is specified, a CSV file will be created for each chain, to which LogPDF evaluations will be written for every accepted sample. To disable this feature, set log_pdf_file=None. If the LogPDF being evaluated is a LogPosterior, the individual likelihood and prior will also be stored.

Filenames for each evaluation file will be derived from log_pdf_file, e.g. if log_pdf_file='evals.csv' and there are 2 chains, then the files evals_0.csv and evals_1.csv will be created. Each CSV file will start with a header (e.g. "logposterior","loglikelihood","logprior") and contain the evaluations for the i-th accepted sample on the i-th subsequent line.

set_log_pdf_storage(store_in_memory=False)

Store LogPDF evaluations in memory as they are generated.

By default, evaluations of the LogPDF are not stored. This method can be used to enable storage of the evaluations for the accepted samples. After running, evaluations can be obtained using evaluations().

set_log_to_file(filename=None, csv=False)

Enables progress logging to file when a filename is passed in, disables it if filename is False or None.

The argument csv can be set to True to write the file in comma separated value (CSV) format. By default, the file contents will be similar to the output on screen.

set_log_to_screen(enabled)

Enables or disables progress logging to screen.

set_max_iterations(iterations=10000)

Adds a stopping criterion, allowing the routine to halt after the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_iterations(None).

set_parallel(parallel=False)

Enables/disables parallel evaluation.

If parallel=True, the method will run using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.

time()

Returns the time needed for the last run, in seconds, or None if the controller hasn’t run yet.

MCMC Sampler base classes

class pints.MCMCSampler[source]

Abstract base class for (single or multi-chain) MCMC methods.

All MCMC samplers implement the pints.Loggable and pints.TunableMethod interfaces.

in_initial_phase()[source]

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods, a NotImplementedError is raised.

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()[source]

Returns this method’s full name.

needs_initial_phase()[source]

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
set_initial_phase(in_initial_phase)[source]

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods, a NotImplementedError is raised.

class pints.SingleChainMCMC(x0, sigma0=None)[source]

Abstract base class for MCMC methods that generate a single Markov chain, via an ask-and-tell interface.

Extends MCMCSampler.

Parameters:
  • x0 – A starting point in the parameter space.
  • sigma0 – An optional (initial) covariance matrix, i.e., a guess of the covariance of the distribution to estimate, around x0.
ask()[source]

Returns a parameter vector to evaluate the LogPDF for.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods, a NotImplementedError is raised.

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()

Returns this method’s full name.

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)[source]

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods, a NotImplementedError is raised.

tell(fx)[source]

Performs an iteration of the MCMC algorithm, using the pints.LogPDF evaluation fx of the point x specified by ask.

For methods that require sensitivities (see MCMCSampler.needs_sensitivities()), fx should be a tuple (log_pdf, sensitivities), containing the values returned by pints.LogPDF.evaluateS1().

After a successful call, tell() returns a tuple (x, fx, accepted), where x contains the current position of the chain, fx contains the corresponding evaluation, and accepted is a boolean indicating whether the last evaluated sample was added to the chain.

Some methods may require multiple ask-tell calls per iteration. These methods can return None to indicate an iteration is still in progress.
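
A minimal ask-and-tell sketch (assuming log_pdf is a pints.LogPDF and x0 a suitable starting point; HaarioBardenetACMC is used here only as an example SingleChainMCMC method):

sampler = pints.HaarioBardenetACMC(x0)
chain = []
for i in range(1000):
    x = sampler.ask()            # point to evaluate
    fx = log_pdf(x)              # user evaluates the log-pdf
    reply = sampler.tell(fx)     # sampler advances (or returns None)
    if reply is not None:
        x_current, fx_current, accepted = reply
        chain.append(x_current)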

class pints.MultiChainMCMC(chains, x0, sigma0=None)[source]

Abstract base class for MCMC methods that generate multiple Markov chains, via an ask-and-tell interface.

Extends MCMCSampler.

Parameters:
  • chains (int) – The number of MCMC chains to generate.
  • x0 – A sequence of starting points. Can be a list of lists, a 2-dimensional array, or any other structure such that x0[i] is the starting point for chain i.
  • sigma0 – An optional initial covariance matrix, i.e., a guess of the covariance in logpdf around the points in x0 (the same sigma0 is used for each point in x0). Can be specified as a (d, d) matrix (where d is the dimension of the parameter space) or as a (d, ) vector, in which case diag(sigma0) will be used.
ask()[source]

Returns a sequence of parameter vectors to evaluate a LogPDF for.

current_log_pdfs()[source]

Returns the log pdf values of the current points (i.e. of the most recent points returned by tell()).

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods, a NotImplementedError is raised.

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()

Returns this method’s full name.

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods, a NotImplementedError is raised.

tell(fxs)[source]

Performs an iteration of the MCMC algorithm, using the pints.LogPDF evaluations fxs of the points xs specified by ask.

For methods that require sensitivities (see MCMCSampler.needs_sensitivities()), each entry in fxs should be a tuple (log_pdf, sensitivities), containing the values returned by pints.LogPDF.evaluateS1().

After a successful call, tell() returns a tuple (xs, fxs, accepted), where xs contains the current positions of the chains, fxs contains the corresponding evaluations, and accepted is an array of booleans indicating, for each chain, whether the last evaluated sample was added to the chain.

Some methods may require multiple ask-tell calls per iteration. These methods can return None to indicate an iteration is still in progress.
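
A minimal multi-chain ask-and-tell sketch (assuming log_pdf is a pints.LogPDF and x0 is a list of three starting points; DifferentialEvolutionMCMC is used here only as an example MultiChainMCMC method):

sampler = pints.DifferentialEvolutionMCMC(3, x0)
for i in range(1000):
    xs = sampler.ask()                   # one point per chain
    fxs = [log_pdf(x) for x in xs]       # user evaluates the log-pdf for each
    reply = sampler.tell(fxs)            # sampler advances (or returns None)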

Adaptive Covariance MC

class pints.AdaptiveCovarianceMC(x0, sigma0=None)[source]

Base class for single chain MCMC methods that globally adapt a proposal covariance matrix when running, in order to control the acceptance rate.

Each subclass should provide a method _generate_proposal() that will be called by ask().

Adaptation is implemented with three methods, which are called in sequence, at the end of every tell(): _adapt_mu(), _adapt_sigma(), and _adapt_internal(). A basic implementation is provided for each, which extending methods can choose to override.

Extends SingleChainMCMC.

acceptance_rate()[source]

Returns the current (measured) acceptance rate.

ask()[source]

See SingleChainMCMC.ask().

eta()[source]

Returns eta, which controls the rate of adaptation decay, adaptations**(-eta), where eta > 0 ensures asymptotic ergodicity.

in_initial_phase()[source]

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()

Returns this method’s full name.

needs_initial_phase()[source]

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)[source]

See pints.SingleChainMCMC.replace().

set_eta(eta)[source]

Updates eta, which controls the rate of adaptation decay, adaptations**(-eta), where eta > 0 ensures asymptotic ergodicity.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [eta].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)[source]

See pints.MCMCSampler.set_initial_phase().

set_target_acceptance_rate(rate=0.234)[source]

Sets the target acceptance rate.

target_acceptance_rate()[source]

Returns the target acceptance rate.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

Differential Evolution MCMC

class pints.DifferentialEvolutionMCMC(chains, x0, sigma0=None)[source]

Uses differential evolution MCMC as described in [1] to sample from the posterior.

In each step of the algorithm n chains are evolved using the evolution equation:

x_proposed = x[i,r] + gamma * (x[i,r1] - x[i,r2]) + epsilon

where r1 and r2 are random chain indices chosen (without replacement) from the n available chains, which must not equal r or each other, i indicates the current time step, and epsilon ~ N(0, b) in d dimensions, where d is the dimensionality of the parameter vector.

If p(x_proposed) / p(x[i,r]) > u ~ U(0, 1), then x[i+1,r] = x_proposed; otherwise, x[i+1,r] = x[i,r].

Extends MultiChainMCMC.

Note

This sampler requires a number of chains \(n \ge 3\), and recommends \(n \ge 1.5 d\).

References

[1] “A Markov Chain Monte Carlo version of the genetic algorithm Differential Evolution: easy Bayesian computing for real parameter spaces”. Cajo J. F. Ter Braak (2006) Statistics and Computing. https://doi.org/10.1007/s11222-006-8769-1
ask()[source]

See pints.MultiChainMCMC.ask().

current_log_pdfs()

Returns the log pdf values of the current points (i.e. of the most recent points returned by tell()).

gamma()[source]

Returns the coefficient gamma used in updating the position of each chain.

gamma_switch_rate()[source]

Returns the number of steps between iterations where gamma is set to 1 (then reset immediately afterwards).

gaussian_error()[source]

Returns whether a Gaussian versus uniform error process is used.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods, a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

relative_scaling()[source]

Returns whether an error process whose standard deviation scales relatively is used (False indicates absolute scale).

scale_coefficient()[source]

Returns the scale coefficient b of the error process used in updating the position of each chain.

set_gamma(gamma)[source]

Sets the coefficient gamma used in updating the position of each chain.

set_gamma_switch_rate(gamma_switch_rate)[source]

Sets the number of steps between iterations where gamma is set to 1 (then reset immediately afterwards).

set_gaussian_error(gaussian_error)[source]

If True, sets the error process to be a Gaussian error, N(0, b*); if False, uses a uniform error U(-b*, b*), where b* = b if absolute scaling is used, and b* = mu * b if relative scaling is used instead.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [gamma, gaussian_scale_coefficient, gamma_switch_rate, gaussian_error, relative_scaling].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods, a NotImplementedError is raised.

set_relative_scaling(relative_scaling)[source]

Sets whether to use an error process whose standard deviation scales relatively (scale = self._mu * self._b) or absolutely (scale = self._b in all dimensions).

set_scale_coefficient(b)[source]

Sets the scale coefficient b of the error process used in updating the position of each chain.

tell(proposed_log_pdfs)[source]

See pints.MultiChainMCMC.tell().

Dram ACMC

class pints.DramACMC(x0, sigma0=None)[source]

DRAM (Delayed Rejection Adaptive Covariance) MCMC, as described in [1].

In this method, rejections do not necessarily lead an iteration to end. Instead, if a rejection occurs, another point is proposed although typically from a narrower (i.e. more conservative) proposal kernel than was used for the first proposal.

In this approach, in each iteration, the following steps return the next state of the Markov chain (assuming the current state is theta_0 and that there are 2 proposal kernels):

theta_1 ~ N(theta_0, lambda * scale_1 * sigma)
alpha_1(theta_0, theta_1) = min(1, p(theta_1|X) / p(theta_0|X))
u_1 ~ uniform(0, 1)
if alpha_1(theta_0, theta_1) > u_1:
    return theta_1
theta_2 ~ N(theta_0, lambda * scale_2 * sigma)
alpha_2(theta_0, theta_1, theta_2) =
    min(1, p(theta_2|X) (1 - alpha_1(theta_2, theta_1)) /
           (p(theta_0|X) (1 - alpha_1(theta_0, theta_1))))
u_2 ~ uniform(0, 1)
if alpha_2(theta_0, theta_1, theta_2) > u_2:
    return theta_2
else:
    return theta_0

Our implementation also allows more than 2 proposal kernels to be used. This means that k accept-reject steps are taken. In each step (i), the probability that a proposal theta_i is accepted is:

alpha_i(theta_0, theta_1, ..., theta_i) = min(1, p(theta_i|X) /
                                          p(theta_0|X) * n_i / d_i)

where:

n_i = (1 - alpha_1(theta_i, theta_i-1)) *
      (1 - alpha_2(theta_i, theta_i-1, theta_i-2)) *
       ...
      (1 - alpha_i-1(theta_i, theta_i-1, ..., theta_0))
d_i = (1 - alpha_1(theta_0, theta_1)) *
      (1 - alpha_2(theta_0, theta_1, theta_2)) *
      ...
      (1 - alpha_i-1(theta_0, theta_1, ..., theta_i-1))

If k proposals have been rejected, the initial point theta_0 is returned.

At the end of each iteration, a ‘base’ proposal kernel is adapted:

mu = (1 - gamma) mu + gamma theta
sigma = (1 - gamma) sigma + gamma (theta - mu)(theta - mu)^t
log_lambda = log_lambda + gamma (accepted - target_acceptance_rate)

where gamma = adaptations^-eta, theta is the current state of the Markov chain and accepted is a binary indicator for whether any of the series of proposals was accepted. The kernels for all proposals are then adapted as [scale_1, scale_2, ..., scale_k] * sigma, where the scale factors are set using set_sigma_scale.
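
The adaptation equations above can be written out directly; the following standalone NumPy sketch (not pints' internal code) performs one such update. The value of eta is an example only (pints exposes it via set_eta()), and 0.234 is the default target acceptance rate documented below.

import numpy as np

def adapt_base_kernel(mu, sigma, log_lambda, theta, accepted, adaptations,
                      eta=0.6, target_rate=0.234):
    # One adaptation of the 'base' proposal kernel, following the equations
    # above. 'accepted' is the binary indicator and 'adaptations' the number
    # of adaptations performed so far.
    gamma = adaptations ** -eta
    mu = (1 - gamma) * mu + gamma * theta
    diff = (theta - mu).reshape(-1, 1)
    sigma = (1 - gamma) * sigma + gamma * (diff @ diff.T)
    log_lambda = log_lambda + gamma * (accepted - target_rate)
    return mu, sigma, log_lambda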

Extends: GlobalAdaptiveCovarianceMC

References

[1](1, 2) “DRAM: Efficient adaptive MCMC”. H Haario, M Laine, A Mira, E Saksman, (2006) Statistics and Computing https://doi.org/10.1007/s11222-006-9438-0
acceptance_rate()

Returns the current (measured) acceptance rate.

ask()

See SingleChainMCMC.ask().

eta()

Returns eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

in_initial_phase()

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

n_kernels()[source]

Returns number of proposal kernels.

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

See pints.SingleChainMCMC.replace().

set_eta(eta)

Updates eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [eta, n_kernels, upper_scale].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)

See pints.MCMCSampler.set_initial_phase().

set_n_kernels(n_kernels)[source]

Sets number of proposal kernels.

set_sigma_scale()[source]

Set the scale of initial covariance matrix multipliers for each of the kernels: [0,...,upper] where the gradations are uniform on the log10 scale meaning the proposal covariance matrices are: [10^upper,..., 1] * sigma.

set_target_acceptance_rate(rate=0.234)

Sets the target acceptance rate.

set_upper_scale(upper_scale)[source]

Set the upper scale of initial covariance matrix multipliers for each of the kernels: [0,...,upper] where the gradations are uniform on the log10 scale meaning the proposal covariance matrices are: [10^upper,..., 1] * sigma.

sigma_scale()[source]

Returns scale factors used to multiply a base covariance matrix, resulting in proposal matrices for each accept-reject step.

target_acceptance_rate()

Returns the target acceptance rate.

tell(fx)[source]

If this is the first proposal, it is accepted with the ordinary Metropolis probability; later proposals are accepted with the probability determined in [1].

upper_scale()[source]

Returns upper scale limit (see pints.DramACMC.set_upper_scale()).

DreamMCMC

class pints.DreamMCMC(chains, x0, sigma0=None)[source]

Uses differential evolution adaptive Metropolis (DREAM) MCMC, as described in [1], to sample from the posterior.

In each step of the algorithm N chains are evolved using the following steps:

  1. Select proposal:

    x_proposed = x[i,r] + (1 + e) * gamma(delta, d, p_g) *
                 sum_j=1^delta (x[i,r1[j]] - x[i,r2[j]])
                 + epsilon
    

where r1[j] and r2[j] are random chain indices chosen (without replacement) from the N available chains, which must not equal each other or the current chain index r (i indexes the current time step); delta ~ uniform_discrete(1,D) determines the number of terms to include in the summation:

e ~ U(-b*, b*) in d dimensions;
gamma(delta, d, p_g) =
  if p_g < u1 ~ U(0,1):
    2.38 / sqrt(2 * delta * d)
  else:
    1

epsilon ~ N(0,b) in d dimensions (where d is the dimensionality of the parameter vector).

2. Modify random subsets of the proposal according to a crossover probability CR:

for j in 1:N:
  if 1 - CR > u2 ~ U(0,1):
    x_proposed[j] = x[j]
  else:
    x_proposed[j] = x_proposed[j] from 1

If p(x_proposed) / p(x[i,r]) > u ~ U(0,1), then x[i+1,r] = x_proposed; otherwise, x[i+1,r] = x[i,r].

Here b > 0, b* > 0, 1 >= p_g >= 0, 1 >= CR >= 0.
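
For illustration, step 1 above as a standalone NumPy function (crossover, step 2, is omitted). This is a sketch, not pints' internal code: the default values for b, b_star and p_g are purely illustrative, and delta_max is capped so that 2 * delta distinct other chains can always be drawn.

import numpy as np

def dream_proposal(x, r, b=0.01, b_star=0.01, p_g=0.2, rng=None):
    # One DREAM proposal for chain r, given the current positions x of all
    # chains, shape (n_chains, d).
    rng = rng or np.random.default_rng()
    n_chains, d = x.shape
    delta_max = max(1, (n_chains - 2) // 2)          # cap so 2*delta other chains exist
    delta = rng.integers(1, delta_max + 1)           # delta ~ uniform_discrete(1, D)
    others = [k for k in range(n_chains) if k != r]
    idx = rng.choice(others, size=2 * delta, replace=False)
    r1, r2 = idx[:delta], idx[delta:]
    e = rng.uniform(-b_star, b_star, size=d)         # e ~ U(-b*, b*)
    gamma = 2.38 / np.sqrt(2 * delta * d) if rng.uniform() > p_g else 1.0
    epsilon = rng.normal(0, b, size=d)               # epsilon ~ N(0, b)
    return x[r] + (1 + e) * gamma * np.sum(x[r1] - x[r2], axis=0) + epsilon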

Extends MultiChainMCMC.

References

[1]“Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling”, 2009, Vrugt et al., International Journal of Nonlinear Sciences and Numerical Simulation. https://doi.org/10.1515/IJNSNS.2009.10.3.273
CR()[source]

Returns the probability of crossover occurring if constant crossover mode is enabled (see set_CR()).

ask()[source]

See pints.MultiChainMCMC.ask().

b()[source]

Returns the Gaussian scale coefficient used in updating the position of each chain.

b_star()[source]

Returns b*, which determines the weight given to other chains’ positions in determining new positions (see set_b_star()).

constant_crossover()[source]

Returns True if constant crossover mode is enabled.

current_log_pdfs()

Returns the log pdf values of the current points (i.e. of the most recent points returned by tell()).

delta_max()[source]

Returns the maximum number of other chains’ positions to use to determine the next sampler position (see set_delta_max()).

in_initial_phase()[source]

See pints.MCMCSampler.in_initial_phase().

nCR()[source]

Returns the size of the discrete crossover probability distribution (only used if constant crossover mode is disabled), see set_nCR().

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()[source]

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

p_g()[source]

Returns p_g. See set_p_g().

set_CR(CR)[source]

Sets the probability of crossover occurring if constant crossover mode is enabled. CR is a probability and so must be in the range [0, 1].

set_b(b)[source]

Sets the Gaussian scale coefficient used in updating the position of each chain (must be non-negative).

set_b_star(b_star)[source]

Sets b*, which determines the weight given to other chains’ positions in determining new positions (must be non-negative).

set_constant_crossover(enabled)[source]

Enables/disables constant-crossover mode (must be bool).

set_delta_max(delta_max)[source]

Sets the maximum number of other chains’ positions to use to determine the next sampler position. delta_max must be in the range [1, nchains - 2].

set_hyper_parameters(x)[source]

The hyper-parameter vector is [b, b_star, p_g, delta_max, initial_phase, constant_crossover, CR, nCR].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)[source]

See pints.MCMCSampler.needs_initial_phase().

set_nCR(nCR)[source]

Sets the size of the discrete crossover probability distribution (only used if constant crossover mode is disabled). nCR must be greater than or equal to 2.

set_p_g(p_g)[source]

Sets p_g which is the probability of choosing a higher gamma versus regular (a higher gamma means that other chains are given more weight). p_g must be in the range [0, 1].

tell(proposed_log_pdfs)[source]

See pints.MultiChainMCMC.tell().

Dual Averaging

Dual averaging is not a sampling method, but a method of adaptively tuning the Hamiltonian Monte Carlo (HMC) step size and mass matrix for the particular log-posterior being sampled. Pints’ NUTS sampler uses dual averaging, but we have defined the dual averaging method separately so that in the future it can be used in HMC and other HMC-derived samplers.

class pints.DualAveragingAdaption(num_warmup_steps, target_accept_prob, init_epsilon, init_inv_mass_matrix)[source]

Dual Averaging method to adaptively tune the step size and mass matrix of a Hamiltonian Monte Carlo (HMC) routine (as used e.g. in NUTS).

Implements a Dual Averaging scheme to adapt the step size epsilon, as per [1] (section 3.2.1 and algorithm 6), and estimates the inverse mass matrix using the sample covariance of the accepted parameter, as suggested in [2]. The mass matrix can either be given as a fully dense matrix represented as a 2D ndarray, or a diagonal matrix represented as a 1D ndarray.

During iteration m of adaption, the parameter epsilon is updated using the following scheme:

\[\begin{split}\bar{H} &= (1 - 1/(m + t_0)) \bar{H} + 1/(m + t_0)(\delta_t - \delta)\\ \text{log} \epsilon &= \mu - \sqrt{m}/\gamma \bar{H}\end{split}\]

where \(\delta_t\) is the target acceptance probability set by the user and \(\delta\) is the acceptance probability reported by the algorithm (i.e. the value provided as an argument to the step() method).

The adaption is done using the same windowing method employed by Stan, which is done over three or more windows:

  • initial window: epsilon is adapted using dual averaging (no adaption of the mass matrix).
  • base window: epsilon continues to be adapted using dual averaging; this adaption completes at the end of this window. The inverse mass matrix is adapted at the end of the window by taking the sample covariance of all parameter points within this window.
  • terminal window: epsilon is adapted using dual averaging, holding the mass matrix constant, and completes at the end of the window.

If the number of warmup steps requested by the user is greater than the sum of these three windows, then additional base windows are added, each with a size double that of the previous window.
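
For illustration, the dual averaging update of epsilon given above as a standalone NumPy function (not pints' internal code). The constants mu, t_0 and gamma are fixed parameters of the scheme; the defaults shown follow the values suggested in [1].

import numpy as np

def dual_averaging_step(h_bar, m, delta_t, delta, mu, t0=10.0, gamma=0.05):
    # One dual averaging update of the step size: h_bar is the running
    # average of (delta_t - delta), m the adaption iteration (1, 2, ...),
    # delta_t the target and delta the reported acceptance probability.
    h_bar = (1 - 1 / (m + t0)) * h_bar + (delta_t - delta) / (m + t0)
    log_epsilon = mu - np.sqrt(m) / gamma * h_bar
    return h_bar, np.exp(log_epsilon)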

References

[1]Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.
[2]Betancourt, M. (2018). A Conceptual Introduction to Hamiltonian Monte Carlo. https://arxiv.org/abs/1701.02434.
Parameters:
  • num_warmup_steps – The number of warm-up (adaption) steps over which to run the adaption
  • target_accept_prob – The target acceptance probability \(\delta_t\) used by the dual averaging scheme
  • init_epsilon – An initial guess for the step size epsilon
  • init_inv_mass_matrix – An initial guess for the inverse adapted mass matrix
adapt_epsilon(accept_prob)[source]

Perform a single step of the dual averaging scheme.

add_parameter_sample(sample)[source]

Store the parameter samples to calculate a sample covariance matrix later on.

calculate_sample_variance()[source]

Return the sample covariance of all the stored samples.

final_epsilon()[source]

Perform the final step of the dual averaging scheme.

get_epsilon()[source]

Returns the step size epsilon.

get_mass_matrix()[source]

Return the mass matrix.

init_adapt_epsilon(epsilon)[source]

Start a new dual averaging adaption for epsilon.

init_sample_covariance(size)[source]

Start a new adaption window for the inverse mass matrix.

set_inv_mass_matrix(inv_mass_matrix)[source]

Sets the inverse mass matrix. The mass matrix itself is recalculated whenever the inverse mass matrix is set.

step(x, accept_prob)[source]

Perform a single step of the adaption.

Parameters:
  • x (ndarray) – The next accepted MCMC parameter point.
  • accept_prob (float) – The acceptance probability of the last NUTS/HMC MCMC step.
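
A minimal usage sketch of this class, assuming pints and NumPy are available. The problem dimension and tuning values below are illustrative only, and the warm-up loop itself (normally driven by an HMC/NUTS routine) is indicated in comments using only the methods documented above.

import numpy as np
import pints

d = 2                                      # assumed problem dimension
adaption = pints.DualAveragingAdaption(
    num_warmup_steps=500,                  # illustrative warm-up length
    target_accept_prob=0.8,                # illustrative target
    init_epsilon=0.1,
    init_inv_mass_matrix=np.ones(d),       # diagonal inverse mass matrix (1D ndarray)
)

# Inside the warm-up loop of an HMC/NUTS routine, call once per step:
#     adaption.step(x, accept_prob)
# where x is the accepted parameter point and accept_prob the acceptance
# probability of that step. Afterwards the tuned values can be read back:
#     epsilon = adaption.get_epsilon()
#     mass_matrix = adaption.get_mass_matrix()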

EmceeHammerMCMC

class pints.EmceeHammerMCMC(chains, x0, sigma0=None)[source]

Uses the differential evolution algorithm “emcee: the MCMC hammer”, described in Algorithm 2 in [1].

For k in 1:N:

  • Draw a walker X_j at random from the “complementary ensemble” (the group of chains not including k) without replacement.
  • Sample z ~ g(z), (see below).
  • Set Y = X_j(t) + z[X_k(t) - X_j(t)].
  • Set q = z^{d - 1} p(Y) / p(X_k(t)).
  • Sample r ~ U(0, 1).
  • If r <= q, set X_k(t + 1) equal to Y; otherwise, set X_k(t + 1) = X_k(t).

Here, N is the number of chains (or walkers), d is the dimensionality of the space, and g(z) is proportional to 1 / sqrt(z) if z is in [1 / a, a] or to 0, otherwise (where a is a parameter with default value 2).
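
For illustration, one stretch move written as a standalone NumPy function following the steps above (not pints' internal code). Here log_pdf stands for the user's log target density (any callable), and z is drawn from g(z) via its inverse cumulative distribution function, a standard construction not spelled out in the text.

import numpy as np

def stretch_move(x, k, log_pdf, a=2.0, rng=None):
    # One stretch move for walker k. x holds the current walker positions,
    # shape (n_walkers, d); a defaults to 2 as in the text.
    rng = rng or np.random.default_rng()
    n, d = x.shape
    others = [i for i in range(n) if i != k]
    j = rng.choice(others)                       # walker from the complementary ensemble
    z = ((a - 1) * rng.uniform() + 1) ** 2 / a   # z ~ g(z), supported on [1/a, a]
    y = x[j] + z * (x[k] - x[j])                 # proposal Y
    log_q = (d - 1) * np.log(z) + log_pdf(y) - log_pdf(x[k])
    if np.log(rng.uniform()) <= log_q:           # accept if r <= q
        return y
    return x[k]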

References

[1]“emcee: The MCMC Hammer”, Daniel Foreman-Mackey, David W. Hogg, Dustin Lang, Jonathan Goodman, 2013, arXiv, https://arxiv.org/pdf/1202.3665.pdf
ask()[source]

See pints.MultiChainMCMC.ask().

current_log_pdfs()

Returns the log pdf values of the current points (i.e. of the most recent points returned by tell()).

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

scale()[source]

Returns the scale coefficient a used in updating the position of the chains.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [scale].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_scale(scale)[source]

Sets the scale coefficient a used in updating the position of the chains.

tell(fx)[source]

See pints.MultiChainMCMC.tell().

Haario ACMC

class pints.HaarioACMC(x0, sigma0=None)[source]

Adaptive Metropolis MCMC, which is algorithm 4 in [1] and is described in the text in [2].

This algorithm differs from HaarioBardenetACMC only through its use of alpha in the updating of log_lambda (rather than a binary accept/reject).

Initialise:

mu
Sigma
adaptation_count = 0
log lambda = 0

In each adaptive iteration (t):

adaptation_count = adaptation_count + 1
gamma = (adaptation_count)^-eta
theta* ~ N(theta_t, lambda * Sigma)
alpha = min(1, p(theta*|data) / p(theta_t|data))
u ~ uniform(0, 1)
if alpha > u:
    theta_(t+1) = theta*
    accepted = 1
else:
    theta_(t+1) = theta_t
    accepted = 0

mu = (1 - gamma) mu + gamma theta_(t+1)
Sigma = (1 - gamma) Sigma + gamma (theta_(t+1) - mu)(theta_(t+1) - mu)^t
log lambda = log lambda + gamma (alpha - self._target_acceptance)
gamma = adaptation_count^-eta
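
A minimal ask-and-tell sketch for this sampler; in practice the loop would typically be handled by pints' MCMCController. The toy target, starting point and warm-up length below are illustrative only; only the constructor and methods documented here are used.

import numpy as np
import pints

def log_pdf(theta):
    # toy target: standard bivariate normal log-density (illustrative only)
    return -0.5 * float(np.dot(theta, theta))

x0 = np.array([2.0, 2.0])                 # illustrative starting point
sampler = pints.HaarioACMC(x0)
sampler.set_initial_phase(True)           # adaptation-free warm-up
for i in range(2000):
    if i == 200:                          # illustrative warm-up length
        sampler.set_initial_phase(False)
    x = sampler.ask()                     # point to evaluate
    sampler.tell(log_pdf(x))              # report the evaluated log-pdf back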

Extends AdaptiveCovarianceMC.

References

[1]“A tutorial on adaptive MCMC”. Christophe Andrieu and Johannes Thoms (2008) Statistics and Computing, 18: 343-373. https://doi.org/10.1007/s11222-008-9110-y
[2]“An adaptive Metropolis algorithm”. Heikki Haario, Eero Saksman and Johanna Tamminen (2001) Bernoulli. https://doi.org/10.2307/3318737
acceptance_rate()

Returns the current (measured) acceptance rate.

ask()

See SingleChainMCMC.ask().

eta()

Returns eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

in_initial_phase()

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

See pints.SingleChainMCMC.replace().

set_eta(eta)

Updates eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

set_hyper_parameters(x)

The hyper-parameter vector is [eta].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)

See pints.MCMCSampler.set_initial_phase().

set_target_acceptance_rate(rate=0.234)

Sets the target acceptance rate.

target_acceptance_rate()

Returns the target acceptance rate.

tell(fx)

See pints.SingleChainMCMC.tell().

Haario Bardenet ACMC

class pints.HaarioBardenetACMC(x0, sigma0=None)[source]

Adaptive Metropolis MCMC, which is the algorithm given in the supplementary materials of [1], which in turn is based on [2].

Initialise:

mu
Sigma
adaptation_count = 0
log lambda = 0

In each adaptive iteration (t):

adaptation_count = adaptation_count + 1
gamma = (adaptation_count)^-eta
theta* ~ N(theta_t, lambda * Sigma)
alpha = min(1, p(theta*|data) / p(theta_t|data))
u ~ uniform(0, 1)
if alpha > u:
    theta_(t+1) = theta*
    accepted = 1
else:
    theta_(t+1) = theta_t
    accepted = 0

alpha = accepted

mu = (1 - gamma) mu + gamma theta_(t+1)
Sigma = (1 - gamma) Sigma + gamma (theta_(t+1) - mu)(theta_(t+1) - mu)^t
log lambda = log lambda + gamma (alpha - self._target_acceptance)
gamma = adaptation_count^-eta

Extends AdaptiveCovarianceMC.

References

[1]Johnstone, Chang, Bardenet, de Boer, Gavaghan, Pathmanathan, Clayton, Mirams (2015) “Uncertainty and variability in models of the cardiac action potential: Can we build trustworthy models?” Journal of Molecular and Cellular Cardiology. https://doi.org/10.1016/j.yjmcc.2015.11.018
[2]Haario, Saksman, Tamminen (2001) “An adaptive Metropolis algorithm” Bernoulli. https://doi.org/10.2307/3318737
acceptance_rate()

Returns the current (measured) acceptance rate.

ask()

See SingleChainMCMC.ask().

eta()

Returns eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

in_initial_phase()

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

See pints.SingleChainMCMC.replace().

set_eta(eta)

Updates eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

set_hyper_parameters(x)

The hyper-parameter vector is [eta].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)

See pints.MCMCSampler.set_initial_phase().

set_target_acceptance_rate(rate=0.234)

Sets the target acceptance rate.

target_acceptance_rate()

Returns the target acceptance rate.

tell(fx)

See pints.SingleChainMCMC.tell().

class pints.AdaptiveCovarianceMCMC(x0, sigma0=None)[source]

Deprecated alias of pints.HaarioBardenetACMC.

acceptance_rate()

Returns the current (measured) acceptance rate.

ask()

See SingleChainMCMC.ask().

eta()

Returns eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

in_initial_phase()

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()

See pints.MCMCSampler.name().

needs_initial_phase()

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

See pints.SingleChainMCMC.replace().

set_eta(eta)

Updates eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

set_hyper_parameters(x)

The hyper-parameter vector is [eta].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)

See pints.MCMCSampler.set_initial_phase().

set_target_acceptance_rate(rate=0.234)

Sets the target acceptance rate.

target_acceptance_rate()

Returns the target acceptance rate.

tell(fx)

See pints.SingleChainMCMC.tell().

Hamiltonian MCMC

class pints.HamiltonianMCMC(x0, sigma0=None)[source]

Implements Hamiltonian Monte Carlo as described in [1].

Uses a physical analogy of a particle moving across a landscape under Hamiltonian dynamics to aid efficient exploration of parameter space. Introduces an auxiliary variable – the momentum (p_i) of a particle moving in dimension i of negative log posterior space – which supplements the position (q_i) of the particle in parameter space. The particle’s motion is dictated by solutions to Hamilton’s equations,

\[\begin{split}dq_i/dt &= \partial H/\partial p_i\\ dp_i/dt &= - \partial H/\partial q_i.\end{split}\]

The Hamiltonian is given by,

\[\begin{split}H(q,p) &= U(q) + KE(p)\\ &= -log(p(q|X)p(q)) + \Sigma_{i=1}^{d} p_i^2/2m_i,\end{split}\]

where d is the dimensionality of the model and m_i is the ‘mass’ given to each particle (often chosen to be 1 by default).

To numerically integrate Hamilton’s equations, it is essential to use a symplectic discretisation routine, of which the most typical approach is the leapfrog method,

\[\begin{split}p_i(t + \epsilon/2) &= p_i(t) - (\epsilon/2) d U(q_i(t))/dq_i\\ q_i(t + \epsilon) &= q_i(t) + \epsilon p_i(t + \epsilon/2) / m_i\\ p_i(t + \epsilon) &= p_i(t + \epsilon/2) - (\epsilon/2) d U(q_i(t + \epsilon))/dq_i\end{split}\]

In particular, the algorithm we implement follows eqs. (4.14)-(4.16) in [1], since we allow a different epsilon for each dimension.
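
For illustration, the leapfrog discretisation above as a standalone NumPy function (not pints' internal code); grad_u stands for a user-supplied gradient of the potential \(U(q)\) and is not part of the pints API.

import numpy as np

def leapfrog(q, p, grad_u, epsilon, n_steps, m=1.0):
    # Leapfrog integration of Hamilton's equations, as written above.
    # grad_u returns dU/dq for U(q) = -log(p(q|X)p(q)); epsilon may be a
    # scalar or a per-dimension vector; m is the particle 'mass'.
    q, p = np.array(q, dtype=float), np.array(p, dtype=float)
    for _ in range(n_steps):
        p -= 0.5 * epsilon * grad_u(q)    # half-step in momentum
        q += epsilon * p / m              # full step in position
        p -= 0.5 * epsilon * grad_u(q)    # half-step in momentum
    return q, p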

Extends SingleChainMCMC.

References

[1](1, 2) “MCMC using Hamiltonian dynamics”. Radford M. Neal, Chapter 5 of the Handbook of Markov Chain Monte Carlo by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng.
ask()[source]

See SingleChainMCMC.ask().

divergent_iterations()[source]

Returns the iteration number of any divergent iterations

epsilon()[source]

Returns epsilon used in leapfrog algorithm

hamiltonian_threshold()[source]

Returns threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

leapfrog_step_size()[source]

Returns the step size for the leapfrog algorithm.

leapfrog_steps()[source]

Returns the number of leapfrog steps to carry out for each iteration.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

scaled_epsilon()[source]

Returns scaled epsilon used in leapfrog algorithm

set_epsilon(epsilon)[source]

Sets epsilon for the leapfrog algorithm

set_hamiltonian_threshold(hamiltonian_threshold)[source]

Sets threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [leapfrog_steps, leapfrog_step_size].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_leapfrog_step_size(step_size)[source]

Sets the step size for the leapfrog algorithm.

set_leapfrog_steps(steps)[source]

Sets the number of leapfrog steps to carry out for each iteration.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

Metropolis-Adjusted Langevin Algorithm (MALA) MCMC

class pints.MALAMCMC(x0, sigma0=None)[source]

Metropolis-Adjusted Langevin Algorithm (MALA), an MCMC sampler as described in [1].

This method involves simulating Langevin diffusion such that the solution to the time evolution equation (the Fokker-Planck PDE) is a stationary distribution that equals the target density (in Bayesian problems, the posterior distribution). The stochastic differential equation (SDE) given below ensures that if \(u(\theta, 0) = \pi(\theta)\), then \(\partial u / \partial t = 0\),

\[\mathrm{d}\Theta_t = 1/2 \nabla \; \text{log} \pi(\Theta_t) \mathrm{d}t + \mathrm{d}W_t\]

where \(\pi(\theta)\) is the target density and \(W\) is a standard multivariate Wiener process.

In general, the above SDE cannot be solved exactly and the below first-order Euler discretisation is used instead,

\[\theta^* = \theta_t + \epsilon^2 1/2 \nabla \; \text{log} \pi(\theta_t) + \epsilon z\]

where \(z \sim \mathcal{N}(0, I)\) resulting in a mean \(\mu(\theta^*) = \theta_t + \epsilon^2 1/2 \nabla \; \text{log} \pi(\theta_t)\).

To correct for first-order integration error that is introduced from discretisation, a Metropolis-Hastings acceptance probability is calculated after a step,

\[\alpha = \frac{\pi(\theta^*)q(\theta_t|\theta^*)}{\pi(\theta_t) q(\theta^*|\theta_t)}\]

where \(q(\theta_2|\theta_1) = \mathcal{N}(\theta_2|\mu(\theta_1), \epsilon I)\) and \(\theta^*\) is accepted with probability \(\text{min}(1, \alpha)\).

Here we consider a slight variant of the above method discussed in [1], which is to use a preconditioning matrix \(M\) to allow differing degrees of freedom in each dimension.

\[\theta^* = \theta_t + \epsilon'^2 1/2 \nabla \; \text{log} \pi(\theta_t) + \epsilon' z\]

leading to \(q(\theta_2|\theta_1) = \mathcal{N}(\theta_2|\mu(\theta_1), \epsilon')\), where \(\epsilon' = \epsilon \sqrt{M}\) is given by the initial value of sigma0.
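
For illustration, a single MALA step as a standalone NumPy function based on the discretisation and acceptance probability above (not pints' internal code). log_pi and grad_log_pi stand for user-supplied callables; the proposal density is evaluated with standard deviation epsilon, matching the proposal mechanism.

import numpy as np

def mala_step(theta, log_pi, grad_log_pi, epsilon, rng=None):
    # One MALA step: Euler proposal plus Metropolis-Hastings correction.
    # epsilon may be a scalar or a per-dimension vector.
    rng = rng or np.random.default_rng()

    def mean(x):                          # mu(x) = x + (epsilon^2 / 2) grad log pi(x)
        return x + 0.5 * epsilon ** 2 * grad_log_pi(x)

    def log_q(x_to, x_from):              # log N(x_to | mu(x_from), epsilon^2 I), up to a constant
        return -0.5 * np.sum(((x_to - mean(x_from)) / epsilon) ** 2)

    theta_star = mean(theta) + epsilon * rng.standard_normal(np.shape(theta))
    log_alpha = (log_pi(theta_star) + log_q(theta, theta_star)
                 - log_pi(theta) - log_q(theta_star, theta))
    if np.log(rng.uniform()) < log_alpha:
        return theta_star
    return theta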

Extends SingleChainMCMC.

References

[1](1, 2) Girolami, M. and Calderhead, B., 2011. Riemann manifold langevin and hamiltonian monte carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2), pp.123-214. https://doi.org/10.1111/j.1467-9868.2010.00765.x
acceptance_rate()[source]

Returns the current (measured) acceptance rate.

ask()[source]

See SingleChainMCMC.ask().

epsilon()[source]

Returns epsilon which is the effective step size used in proposals.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_epsilon(epsilon=None)[source]

Sets epsilon, which is the effective step size used in proposals. If epsilon is not specified, then epsilon = 0.2 * diag(sigma0) will be used.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [epsilon].

The effective step size (epsilon) is step_size * scale_vector.

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

Metropolis Random Walk MCMC

class pints.MetropolisRandomWalkMCMC(x0, sigma0=None)[source]

Metropolis Random Walk MCMC, as described in [1].

Metropolis MCMC using a multivariate Gaussian distribution as the proposal step, also known as Metropolis Random Walk MCMC. In each iteration (t) of the algorithm, the following occurs:

propose x' ~ N(x_t, Sigma)
generate u ~ U(0, 1)
calculate r = pi(x') / pi(x_t)
if r > u, x_t+1 = x'; otherwise, x_t+1 = x_t

Here, Sigma is the covariance matrix of the proposal.
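
The iteration above, written as a standalone NumPy function for illustration (not pints' internal code); log_pi stands for the log of the unnormalised target, and working with logs avoids overflow in the ratio r.

import numpy as np

def metropolis_step(x, log_pi, sigma, rng=None):
    # One Metropolis random walk iteration with Gaussian proposal N(x_t, Sigma).
    rng = rng or np.random.default_rng()
    x_new = rng.multivariate_normal(x, sigma)   # propose x' ~ N(x_t, Sigma)
    log_r = log_pi(x_new) - log_pi(x)           # log r = log pi(x') - log pi(x_t)
    if np.log(rng.uniform()) < log_r:           # accept if r > u
        return x_new
    return x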

Extends SingleChainMCMC.

References

[1]“Equation of state calculations by fast computing machines”. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953) The journal of chemical physics, 21(6), pp.1087-1092 https://doi.org/10.1063/1.1699114
acceptance_rate()[source]

Returns the current (measured) acceptance rate.

ask()[source]

See SingleChainMCMC.ask().

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current=None, current_log_pdf=None, proposed=None)[source]

See pints.SingleChainMCMC.replace().

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

Monomial-Gamma Hamiltonian MCMC

class pints.MonomialGammaHamiltonianMCMC(x0, sigma0=None)[source]

Implements Monomial Gamma HMC as described in [1] - a generalisation of HMC as described in [2] - involving a non-physical kinetic energy term.

Uses a physical analogy of a particle moving across a landscape under Hamiltonian dynamics to aid efficient exploration of parameter space. Introduces an auxiliary variable – the momentum (p_i) of a particle moving in dimension i of negative log posterior space – which supplements the position (q_i) of the particle in parameter space. The particle’s motion is dictated by solutions to Hamilton’s equations,

\[dq_i/dt = \partial H/\partial p_i, dp_i/dt = - \partial H/\partial q_i.\]

The Hamiltonian is given by,

\[H(q,p) = U(q) + K(p) = -log(p(q|X)p(q)) + \Sigma_{i=1}^{d} ( -g(p_i) + (2/c) \text{log}(1 + \text{exp}(cg(p_i))))\]

where d is the dimensionality of the model, U is the potential energy and K is the kinetic energy term. Note the kinetic energy is the ‘soft’ version described in [1], where,

\[g(p_i) = (1 / m_i) \text{sign}(p_i)|p_i|^{1 / a}\]

To numerically integrate Hamilton’s equations, it is essential to use a symplectic discretisation routine, of which the most typical approach is the leapfrog method,

\[\begin{split}p_i(t + \epsilon/2) &= p_i(t) - (\epsilon/2) dU(q_i)/ dq_i\\ q_i(t + \epsilon) &= q_i(t) + \epsilon d K(p_i(t + \epsilon/2))/dp_i\\ p_i(t + \epsilon) &= p_i(t + \epsilon/2) - (\epsilon/2) dU(q_i + \epsilon)/ dq_i\end{split}\]

The derivative of the soft kinetic energy term is given by,

\[d K(p_i)/dp_i = |p_i|^{-1 + 1 / a}\text{sign}(p_i) \times \text{tanh}(c|p_i|^{1/a}\text{sign}(p_i) / {2 m_i}) / {a m_i}\]

In particular, the algorithm we implement follows eqs. (4.14)-(4.16) in [2], since we allow a different epsilon for each dimension.
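
For illustration, the soft kinetic energy and its derivative, as given above, written as a standalone NumPy function (not pints' internal code); a common scalar mass m is assumed for all dimensions.

import numpy as np

def soft_kinetic_energy(p, a, c, m=1.0):
    # Soft kinetic energy K(p) and its gradient dK/dp for a momentum vector p.
    g = np.sign(p) * np.abs(p) ** (1.0 / a) / m
    k = np.sum(-g + (2.0 / c) * np.log1p(np.exp(c * g)))
    dk = (np.abs(p) ** (1.0 / a - 1.0) * np.sign(p)
          * np.tanh(c * np.abs(p) ** (1.0 / a) * np.sign(p) / (2.0 * m))
          / (a * m))
    return k, dk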

Extends SingleChainMCMC.

References

[1]“Towards Unifying Hamiltonian Monte Carlo and Slice Sampling”. Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin. Advances in Neural Information Processing Systems (NIPS).
[2](1, 2) “MCMC using Hamiltonian dynamics”. Radford M. Neal, Chapter 5 of the Handbook of Markov Chain Monte Carlo by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng.
a()[source]

Returns a in kinetic energy function.

ask()[source]

See SingleChainMCMC.ask().

c()[source]

Returns c in kinetic energy function.

divergent_iterations()[source]

Returns the iteration number of any divergent iterations.

epsilon()[source]

Returns epsilon used in leapfrog algorithm.

hamiltonian_threshold()[source]

Returns threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

leapfrog_step_size()[source]

Returns the step size for the leapfrog algorithm.

leapfrog_steps()[source]

Returns the number of leapfrog steps to carry out for each iteration.

mass()[source]

Returns mass m in kinetic energy function.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

scaled_epsilon()[source]

Returns scaled epsilon used in leapfrog algorithm.

set_a(a)[source]

Sets a in kinetic energy function.

set_c(c)[source]

Sets c in kinetic energy function.

set_epsilon(epsilon)[source]

Sets epsilon for the leapfrog algorithm.

set_hamiltonian_threshold(hamiltonian_threshold)[source]

Sets threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [leapfrog_steps, leapfrog_step_size, a, c, mass].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_leapfrog_step_size(step_size)[source]

Sets the step size for the leapfrog algorithm.

set_leapfrog_steps(steps)[source]

Sets the number of leapfrog steps to carry out for each iteration.

set_mass(m)[source]

Sets mass m in kinetic energy function.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

No-U-Turn MCMC Sampler

class pints.NoUTurnMCMC(x0, sigma0=None)[source]

Implements the No U-Turn Sampler (NUTS) with dual averaging, as described in Algorithm 6 in [1].

Implements the multinomial sampling suggested in [2]. Implements a mass matrix for the dynamics, which is detailed in [2]. Both the step size and the mass matrix are adapted using a combination of the dual averaging detailed in [1] and the windowed adaption for the mass matrix and step size implemented in the Stan library (https://github.com/stan-dev/stan).

Like Hamiltonian Monte Carlo, NUTS imagines a particle moving over negative log-posterior (NLP) space to generate proposals. Naturally, the particle tends to move to locations of low NLP – meaning high posterior density. Unlike HMC, NUTS allows the number of steps taken through parameter space to depend on position, allowing local adaptation.

Note: This sampler is only supported on Python versions 3.3 and newer.

Extends SingleChainMCMC.

References

[1](1, 2) Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.
[2](1, 2) Betancourt, M. (2018). A Conceptual Introduction to Hamiltonian Monte Carlo, https://arxiv.org/abs/1701.02434.
ask()[source]

See SingleChainMCMC.ask().

delta()[source]

Returns delta used in leapfrog algorithm.

divergent_iterations()[source]

Returns the iteration number of any divergent iterations.

hamiltonian_threshold()[source]

Returns threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

max_tree_depth()[source]

Returns the maximum tree depth D for the algorithm. For each iteration, the number of leapfrog steps will not be greater than 2^D.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

number_adaption_steps()[source]

Returns number of adaption steps used in the NUTS algorithm.

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_delta(delta)[source]

Sets delta for the NUTS algorithm. This is the target acceptance probability for the algorithm, used to set the scalar magnitude of the leapfrog step size.

set_hamiltonian_threshold(hamiltonian_threshold)[source]

Sets threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [number_adaption_steps].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_max_tree_depth(max_tree_depth)[source]

Sets the maximum tree depth D for the algorithm. For each iteration, the number of leapfrog steps will not be greater than 2^D.

set_number_adaption_steps(n)[source]

Sets the number of adaption steps in the NUTS algorithm. This is the number of MCMC steps that are used to determine the best value for epsilon, the scalar magnitude of the leapfrog step size.

set_use_dense_mass_matrix(use_dense_mass_matrix)[source]

If use_dense_mass_matrix is False then the algorithm uses a diagonal mass matrix. If True then a fully dense mass matrix is used.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

use_dense_mass_matrix()[source]

Returns True if the algorithm uses a dense mass matrix, and False if it uses a diagonal one.

Population MCMC

class pints.PopulationMCMC(x0, sigma0=None)[source]

Creates a chain of samples from a target distribution, using the population MCMC (simulated tempering) routine described in algorithm 1 in [1].

This method uses several chains internally, but only a single one is updated per iteration, and only a single one is returned at the end, hence this method is classified here as a single chain MCMC method.

The algorithm goes through the following steps (after initialising N internal chains):

1. Mutation: randomly select chain i and update the chain using a Markov kernel that admits p_i as its invariant distribution.

2. Exchange: Select another chain j at random from the remaining chains and swap the parameter vectors of i and j with probability min(1, A),

A = p_i(x_j) * p_j(x_i) / (p_i(x_i) * p_j(x_j))

where x_i and x_j are the current values of chains i and j, respectively, and p_i = p(theta|data) ^ (1 - T_i), with p(theta|data) the target distribution and T_i a tempering parameter bounded between 0 and 1.

We use a range of T = (0,delta_T,...,1), where delta_T = 1 / num_temperatures, and the chain with T_i = 0 is the one whose target distribution we want to sample.
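
For illustration, the exchange step written as a standalone NumPy function operating on the log of the target density; this is a sketch of the formula above, not pints' internal code.

import numpy as np

def exchange_step(x, log_pdfs, temperatures, i, j, rng=None):
    # Swap chains i and j with probability min(1, A), where
    # log p_k(theta) = (1 - T_k) * log p(theta|data). x has shape
    # (n_chains, d); log_pdfs holds log p(theta|data) at each chain's point.
    rng = rng or np.random.default_rng()
    log_a = ((1 - temperatures[i]) * (log_pdfs[j] - log_pdfs[i])
             + (1 - temperatures[j]) * (log_pdfs[i] - log_pdfs[j]))
    if np.log(rng.uniform()) < log_a:
        x[[i, j]] = x[[j, i]]                    # swap parameter vectors
        log_pdfs[[i, j]] = log_pdfs[[j, i]]
    return x, log_pdfs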

Extends SingleChainMCMC.

References

[1]“On population-based simulation for static inference”, Ajay Jasra, David A. Stephens and Christopher C. Holmes, Statistics and Computing, 2007. https://doi.org/10.1007/s11222-007-9028-9
ask()[source]

See SingleChainMCMC.ask().

in_initial_phase()[source]

See MCMCController.in_initial_phase().

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()[source]

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [n_temperatures], where n_temperatures is an integer that will be passed to set_temperature_schedule().

Note that, since the hyper-parameter vector should be 1d (without nesting), setting an explicit temperature schedule is not supported via the hyper-parameter interface.

See TunableMethod.set_hyper_parameters().

set_initial_phase(phase)[source]

See MCMCController.set_initial_phase().

set_temperature_schedule(schedule=10)[source]

Sets a temperature schedule.

If schedule is an int it is interpreted as the number of temperatures and a schedule is generated accordingly.

If schedule is a list (or array) it is interpreted as a custom temperature schedule.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

temperature_schedule()[source]

Returns the temperature schedule used in the tempering algorithm. Each temperature T pertains to a particular chain whose stationary distribution is p(theta|data) ^ (1 - T).

Rao-Blackwell ACMC

class pints.RaoBlackwellACMC(x0, sigma0=None)[source]

Rao-Blackwell adaptive MCMC, as described by Algorithm 3 in [1]. After initialising mu0 and sigma0, in each iteration (t) after the initial phase, the following steps occur:

theta* ~ N(theta_t, lambda * sigma0)
alpha(theta_t, theta*) = min(1, p(theta*|data) / p(theta_t|data))
u ~ uniform(0, 1)
if alpha(theta_t, theta*) > u:
    theta_t+1 = theta*
else:
    theta_t+1 = theta_t
mu_t+1 = mu_t + gamma_t+1 * (theta_t+1 - mu_t)
sigma_t+1 = sigma_t + gamma_t+1 *
                (bar((theta_t+1 - mu_t)(theta_t+1 - mu_t)') - sigma_t)

where:

bar(theta_t+1) = alpha(theta_t, theta*) theta* +
                    (1 - alpha(theta_t, theta*)) theta_t

Note that we deviate from the paper in two places:

gamma_t = t^-eta
Y_t+1 ~ N(theta_t, lambda * sigma0) rather than
    Y_t+1 ~ N(theta_t, sigma0)

Extends AdaptiveCovarianceMC.

References

[1]“A tutorial on adaptive MCMC”. Christophe Andrieu and Johannes Thoms (2008) Statistics and Computing, 18: 343-373. https://doi.org/10.1007/s11222-008-9110-y
acceptance_rate()

Returns the current (measured) acceptance rate.

ask()

See SingleChainMCMC.ask().

eta()

Returns eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

in_initial_phase()

See pints.MCMCSampler.in_initial_phase().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

See pints.MCMCSampler.needs_initial_phase().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

See pints.SingleChainMCMC.replace().

set_eta(eta)

Updates eta which controls the rate of adaptation decay adaptations**(-eta), where eta > 0 to ensure asymptotic ergodicity.

set_hyper_parameters(x)

The hyper-parameter vector is [eta].

See TunableMethod.set_hyper_parameters().

set_initial_phase(initial_phase)

See pints.MCMCSampler.set_initial_phase().

set_target_acceptance_rate(rate=0.234)

Sets the target acceptance rate.

target_acceptance_rate()

Returns the target acceptance rate.

tell(fx)[source]

See pints.AdaptiveCovarianceMC.tell().

Relativistic MCMC

class pints.RelativisticMCMC(x0, sigma0=None)[source]

Implements Relativistic Monte Carlo as described in [1].

Uses a physical analogy of a particle moving across a landscape under Hamiltonian dynamics to aid efficient exploration of parameter space. Introduces an auxiliary variable – the momentum (p_i) of a particle moving in dimension i of negative log posterior space – which supplements the position (q_i) of the particle in parameter space. The particle’s motion is dictated by solutions to Hamilton’s equations,

\[\begin{split}dq_i/dt &= \partial H/\partial p_i\\ dp_i/dt &= - \partial H/\partial q_i.\end{split}\]

The Hamiltonian is given by,

\[\begin{split}H(q,p) &= U(q) + KE(p)\\ &= -\text{log}(p(q|X)p(q)) + mc^2 (\Sigma_{i=1}^{d} p_i^2 / (m^2 c^2) + 1)^{0.5}\end{split}\]

where d is the dimensionality of the model, m is the scalar ‘mass’ given to each particle (chosen to be 1 by default) and c is the speed of light (chosen to be 10 by default).

To numerically integrate Hamilton’s equations, it is essential to use a symplectic discretisation routine, of which the most typical approach is the leapfrog method,

\[\begin{split}p_i(t + \epsilon/2) &= p_i(t) - (\epsilon/2) d U(q_i(t))/dq_i\\ q_i(t + \epsilon) &= q_i(t) + \epsilon M^{-1}(p_i(t + \epsilon/2)) p_i(t + \epsilon/2)\\ p_i(t + \epsilon) &= p_i(t + \epsilon/2) - (\epsilon/2) d U(q_i(t + \epsilon))/dq_i\end{split}\]

where relativistic mass (a scalar) is,

\[M(p) = m (\Sigma_{i=1}^{d} p_i^2 / (m^2 c^2) + 1)^{0.5}\]

In particular, the algorithm we implement follows eqs. in section 2.1 of [1].
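
For illustration, the relativistic mass and the corresponding leapfrog position update written as standalone NumPy functions (not pints' internal code); grad_u stands for a user-supplied gradient of the potential and is not part of the pints API.

import numpy as np

def relativistic_mass(p, m=1.0, c=10.0):
    # Relativistic mass M(p), as above (defaults m=1, c=10 as in the text).
    return m * np.sqrt(np.sum(p ** 2) / (m ** 2 * c ** 2) + 1.0)

def relativistic_leapfrog(q, p, grad_u, epsilon, n_steps, m=1.0, c=10.0):
    # Leapfrog steps with the relativistic position update given above.
    q, p = np.array(q, dtype=float), np.array(p, dtype=float)
    for _ in range(n_steps):
        p -= 0.5 * epsilon * grad_u(q)
        q += epsilon * p / relativistic_mass(p, m, c)
        p -= 0.5 * epsilon * grad_u(q)
    return q, p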

Extends SingleChainMCMC.

References

[1](1, 2) “Relativistic Monte Carlo”. Xiaoyu Lu, Valerio Perrone, Leonard Hasenclever, Yee Whye Teh, Sebastian J. Vollmer, 2017, Proceedings of Machine Learning Research.
ask()[source]

See SingleChainMCMC.ask().

divergent_iterations()[source]

Returns the iteration number of any divergent iterations

epsilon()[source]

Returns epsilon used in leapfrog algorithm

hamiltonian_threshold()[source]

Returns threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

leapfrog_step_size()[source]

Returns the step size for the leapfrog algorithm.

leapfrog_steps()[source]

Returns the number of leapfrog steps to carry out for each iteration.

mass()[source]

Returns the mass, which is the rest mass of the particle.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

scaled_epsilon()[source]

Returns scaled epsilon used in leapfrog algorithm

set_epsilon(epsilon)[source]

Sets epsilon for the leapfrog algorithm

set_hamiltonian_threshold(hamiltonian_threshold)[source]

Sets threshold difference in Hamiltonian value from one iteration to next which determines whether an iteration is divergent.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [leapfrog_steps, leapfrog_step_size, mass, c].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_leapfrog_step_size(step_size)[source]

Sets the step size for the leapfrog algorithm.

set_leapfrog_steps(steps)[source]

Sets the number of leapfrog steps to carry out for each iteration.

set_mass(mass)[source]

Sets scalar mass.

set_speed_of_light(c)[source]

Sets speed of light.

speed_of_light()[source]

Returns speed of light.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

Slice Sampling - Doubling MCMC

class pints.SliceDoublingMCMC(x0, sigma0=None)[source]

Implements Slice Sampling with Doubling, as described in [1].

This is a univariate method, which is applied in a Slice-Sampling-within-Gibbs framework to allow MCMC sampling from multivariate models.

Generates samples by sampling uniformly from the volume underneath the posterior (\(f\)). It does so by introducing an auxiliary variable (\(y\)) and by defining a Markov chain.

If the distribution is univariate, sampling follows:

  1. Calculate the pdf (\(f(x0)\)) of the current sample (\(x0\)).
  2. Draw a real value (\(y\)) uniformly from (0, f(x0)), defining a horizontal “slice”: \(S = {x: y < f (x)}\). Note that \(x0\) is always within S.
  3. Find an interval (\(I = (L, R)\)) around \(x0\) that contains all, or much, of the slice.
  4. Draw a new point (\(x1\)) from the part of the slice within this interval.

If the distribution is multivariate, we apply the univariate algorithm to each variable in turn, where the other variables are set at their current values.

This implementation uses the “Doubling” method to estimate the interval \(I = (L, R)\), as described in [1] Fig. 4. pp.715 and consists of the following steps:

  1. \(U \sim uniform(0, 1)\)
  2. \(L = x_0 - wU\)
  3. \(R = L + w\)
  4. \(K = p\)
  5. while \(K > 0\) and \({y < f(L) or y < f(R)}\):
    1. \(V \sim uniform(0, 1)\)
    2. if \(V < 0.5\), then \(L = L - (R - L)\) else, \(R = R + (R - L)\)
    3. \(K = K - 1\)

Intuitively, the interval I is estimated by expanding the initial interval by producing a sequence of intervals, each twice the size of the previous one, until an interval is found with both ends outside the slice, or until a pre-determined limit is reached. The parameters p (an integer, which determines the limit of slice size) and w (the estimate of typical slice width) are hyperparameters.
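
For illustration, the doubling expansion above written as a standalone NumPy function (not pints' internal code); f stands for the user's unnormalised target density.

import numpy as np

def doubling_interval(x0, y, f, w, p, rng=None):
    # 'Doubling' expansion of the interval I = (L, R), following the steps
    # above: y is the slice height, w the estimated typical slice width and
    # p the expansion limit.
    rng = rng or np.random.default_rng()
    left = x0 - w * rng.uniform()
    right = left + w
    k = p
    while k > 0 and (y < f(left) or y < f(right)):
        if rng.uniform() < 0.5:
            left -= right - left          # double by extending to the left
        else:
            right += right - left         # double by extending to the right
        k -= 1
    return left, right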

To sample from the interval \(I = (L, R)\), such that the sample \(x\) satisfies \(y < f(x)\), we use the “Shrinkage” procedure, which reduces the size of the interval after rejecting a trial point, as defined in [1] Fig. 5. pp.716. This algorithm consists of the following steps:

  1. \(\bar{L} = L\) and \(\bar{R} = R\)
  2. Repeat:
    1. \(U \sim uniform(0, 1)\)
    2. \(x_1 = \bar{L} + U (\bar{R} - \bar{L})\)
    3. if \(y < f(x_1)\) and \(Accept(x_1)\), exit loop; else if \(x_1 < x_0\), then \(\bar{L} = x_1\), else \(\bar{R} = x_1\)

Intuitively, we uniformly sample a trial point from the interval I, and subsequently shrink the interval each time a trial point is rejected.

The Accept(x_1) check is required to guarantee detailed balance. We shall refer to this check as the Acceptance Check. Intuitively, it tests whether starting the doubling expansion at x_1 leads to an earlier termination compared to starting it from the current state x_0. The procedure works backward through the intervals that the doubling expansion would pass through to arrive at I when starting from x_1, checking that none of them has both ends outside the slice. The algorithm is described in [1] Fig. 6. pp.717 and it consists of the following steps:

  1. \(\hat{L} = L\) and \(\hat{R} = R\) and \(D = False\)
  2. while \(\hat{R} - \hat{L} > 1.1w\):
    1. \(M = (\hat{L} + \hat{R})/2\)
    2. if {\(x_0 < M\) and \(x_1 >= M\)} or {\(x_0 >= M\) and \(x_1 < M\)}, then \(D = True\)
    3. if \(x_1 < M\), then \(\hat{R} = M\), else \(\hat{L} = M\)
    4. if \(D\) and \(y >= f(\hat{L})\) and \(y >= f(\hat{R})\), then reject proposal
  3. If the proposal is not rejected in the previous loop, accept it

The multiplication by 1.1 in the while condition in Step 2 guards against possible round-off errors. The variable D tracks whether the intervals that would be generated from x_1 differ from those leading to x_0: when they don’t, time is saved by omitting the subsequent check.

To avoid floating-point underflow, we implement the suggestion advanced in [1] pp.712. We use the log pdf of the un-normalised posterior (\(g(x) = log(f(x))\)) instead of \(f(x)\). In doing so, we use an auxiliary variable \(z = log(y) = g(x0) - \epsilon\), where \(\epsilon \sim \text{exp}(1)\) and define the slice as \(S = {x : z < g(x)}\).

Extends SingleChainMCMC.

References

[1]Neal, R.M., 2003. Slice sampling. The annals of statistics, 31(3), pp.705-767. https://doi.org/10.1214/aos/1056562461
ask()[source]

See SingleChainMCMC.ask().

current_slice_height()[source]

Returns current height value used to define the current slice.

expansion_steps()[source]

Returns integer used for limiting interval expansion.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated logpdf.

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_expansion_steps(p)[source]

Set integer for limiting interval expansion.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [width, expansion steps].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_width(w)[source]

Sets the width for generating the interval.

This can either be a single number or an array with the same number of elements as the number of variables to update.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

width()[source]

Returns the width used for generating the interval.

Slice Sampling - Rank Shrinking MCMC

class pints.SliceRankShrinkingMCMC(x0, sigma0=None)[source]

Implements Covariance-Adaptive slice sampling by “rank shrinking”, as introduced in [1] with pseudocode given in Fig. 5.

This is an adaptive multivariate method which uses additional points, called “crumbs”, and rejected proposals to guide the selection of samples.

It generates samples by sampling uniformly from the volume underneath the posterior (\(f\)). It does so by introducing an auxiliary variable (\(y\)) that guides the path of a Markov chain.

Sampling follows:

  1. Calculate the pdf (\(f(x_0)\)) of the current sample (\(x_0\)).
  2. Draw a real value (\(y\)) uniformly from \((0, f(x_0))\), defining a horizontal "slice": \(S = {x: y < f(x)}\). Note that \(x_0\) is always within \(S\).
  3. Draw the first crumb (\(c_1\)) from a Gaussian distribution with mean \(x_0\) and precision matrix \(W_1\).
  4. Draw a new point (\(x_1\)) from a Gaussian distribution with mean \(c_1\) and precision matrix \(W_2\).

New crumbs are drawn until a new proposal is accepted. In particular, after sampling \(k\) crumbs from Gaussian distributions with mean \(x0\) and precision matrices \((W_1, ..., W_k)\), the distribution for the kth proposal sample is:

\[x_k \sim Normal(\bar{c}_k, \Lambda^{-1}_k)\]

where:

\(\Lambda_k = W_1 + ... + W_k\) \(\bar{c}_k = \Lambda^{-1}_k * (W_1 * c_1 + ... + W_k * c_k)\)
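
As an illustration of these formulas, a small NumPy sketch (hypothetical names; Ws is a list of crumb precision matrices and cs the corresponding crumbs) that combines k crumbs into the mean and covariance of the k-th proposal:

import numpy as np

def proposal_moments(Ws, cs):
    Lambda = np.sum(Ws, axis=0)                        # Lambda_k = W_1 + ... + W_k
    weighted_sum = np.sum([W @ c for W, c in zip(Ws, cs)], axis=0)
    covariance = np.linalg.inv(Lambda)                 # Lambda_k^-1
    c_bar = covariance @ weighted_sum                  # bar{c}_k
    return c_bar, covariance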

This method aims to conveniently modify the (k+1)th proposal distribution to increase the likelihood of sampling an acceptable point. It does so by calculating the gradient (\(g(f(x))\)) of the unnormalised posterior (\(f(x)\)) at the last rejected point (\(x_k\)). It then sets the conditional variance of the (k + 1)th proposal distribution in the direction of the gradient \(g(f(x_k))\) to 0. This is reasonable in that the gradient at a proposal probably points in a direction where the variance is small, so it is more efficient to move in a different direction.

To avoid floating-point underflow, we implement the suggestion advanced in [2] pp.712. We use the log pdf of the un-normalised posterior (\(\text{log} f(x)\)) instead of \(f(x)\). In doing so, we use an auxiliary variable \(z = log(y) = log f(x_0) - \epsilon\), where \(\epsilon \sim \text{exp}(1)\), and define the slice as \(S = {x : z < log f(x)}\).

Extends SingleChainMCMC.

References

[1]“Covariance-Adaptive Slice Sampling”, 2010, M Thompson and RM Neal, Technical Report No. 1002, Department of Statistics, University of Toronto
[2]“Slice sampling”, 2003, Neal, R.M., The annals of statistics, 31(3), pp.705-767. https://doi.org/10.1214/aos/1056562461
ask()[source]

See SingleChainMCMC.ask().

current_slice_height()[source]

Returns the height of the current slice.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()[source]

See pints.MCMCSampler.needs_sensitivities().

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [sigma_c]. See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_sigma_c(sigma_c)[source]

Sets standard deviation of initial crumb distribution.

sigma_c()[source]

Returns standard deviation of initial crumb distribution.

tell(reply)[source]

See pints.SingleChainMCMC.tell().

Slice Sampling - Stepout MCMC

class pints.SliceStepoutMCMC(x0, sigma0=None)[source]

Implements Slice Sampling with Stepout, as described in [1].

This is a univariate method, which is applied in a Slice-Sampling-within-Gibbs framework to allow MCMC sampling from multivariate models.

Generates samples by sampling uniformly from the volume underneath the posterior (f). It does so by introducing an auxiliary variable (y) and by defining a Markov chain.

If the distribution is univariate, sampling follows:

  1. Calculate the PDF (\(f(x0)\)) of the current sample (\(x0\)).
  2. Draw a real value (\(y\)) uniformly from \((0, f(x0))\), defining a horizontal 'slice' \(S = {x: y < f(x)}\). Note that \(x0\) is always within \(S\).
  3. Find an interval (\(I = (L, R)\)) around \(x0\) that contains all, or much, of the slice.
  4. Draw a new point (\(x1\)) from the part of the slice within this interval.

If the distribution is multivariate, we apply the univariate algorithm to each variable in turn, where the other variables are set at their current values.

This implementation uses the “Stepout” method to estimate the interval \(I = (L, R)\), as described in [1] Fig. 3. pp.715 and consists of the following steps:

  1. \(U \sim uniform(0, 1)\)
  2. \(L = x_0 - wU\)
  3. \(R = L + w\)
  4. \(V \sim uniform(0, 1)\)
  5. \(J = floor(mV)\)
  6. \(K = (m - 1) - J\)
  7. while \(J > 0\) and \(y < f(L), L = L - w, J = J - 1\)
  8. while \(K > 0\) and \(y < f(R), R = R + w, K = K - 1\)

Intuitively, the interval I is estimated by expanding the initial interval by a width w in each direction until both edges fall outside the slice, or until a pre-determined limit is reached. The parameters m (an integer, which determines the limit of slice size) and w (the estimate of typical slice width) are hyperparameters.
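
A minimal, illustrative sketch of this stepout procedure (not the pints implementation; f evaluates the target density and y is the slice height):

import numpy as np

def stepout(f, y, x0, w, m):
    # Place an interval of width w randomly around x0
    L = x0 - w * np.random.uniform()
    R = L + w
    # Split the expansion budget m randomly between the two sides
    J = int(np.floor(m * np.random.uniform()))
    K = (m - 1) - J
    # Expand each side until it leaves the slice or its budget runs out
    while J > 0 and y < f(L):
        L -= w
        J -= 1
    while K > 0 and y < f(R):
        R += w
        K -= 1
    return L, R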

To sample from the interval \(I = (L, R)\), such that the sample x satisfies \(y < f(x)\), we use the “Shrinkage” procedure, which reduces the size of the interval after rejecting a trial point, as defined in [1] Fig. 5. pp.716. This algorithm consists of the following steps:

  1. \(\bar{L} = L\) and \(\bar{R} = R\)
  2. Repeat:
    1. \(U \sim uniform(0, 1)\)
    2. \(x_1 = \bar{L} + U (\bar{R} - \bar{L})\)
    3. if \(y < f(x_1)\) accept \(x_1\) and exit loop, else: if \(x_1 < x_0\), \(\bar{L} = x_1\) else \(\bar{R} = x_1\)

Intuitively, we uniformly sample a trial point from the interval I, and subsequently shrink the interval each time a trial point is rejected.
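
A matching sketch of the shrinkage procedure (again illustrative only):

import numpy as np

def shrink(f, y, x0, L, R):
    l, r = L, R
    while True:
        # Propose uniformly from the current interval
        x1 = l + np.random.uniform() * (r - l)
        if y < f(x1):
            return x1      # inside the slice: accept
        # Otherwise shrink the interval towards x0
        if x1 < x0:
            l = x1
        else:
            r = x1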

The following implementation includes the possibility of carrying out “overrelaxed” slice sampling steps, as described in [1] pp. 726. Overrelaxed steps increase sampling efficiency in highly correlated unimodal distributions by suppressing the random walk behaviour of single-variable slice sampling: each variable is still updated in turn, but rather than drawing a new value for a variable from its conditional distribution independently of the current value, the new value is instead chosen to be on the opposite side of the mode from the current value. The interval I is still calculated via Stepout, and the edges l,r are used to estimate the slice endpoints via bisection. To obtain a full sampling scheme, overrelaxed updates are alternated with normal Stepout updates. To obtain the full benefits of overrelaxation, [1] suggests to set almost every update to being overrelaxed and to set the limit m for finding I to infinity. The algorithm consists of the following steps:

  1. \(\bar{L} = L, \bar{R} = R, \bar{w} = w, \bar{a} = a\)
  2. while \(R - L < 1.1 * w\):
    1. \(M = (\bar{L} + \bar{R})/ 2\)
    2. if \(\bar{a} = 0\) or \(y < f(M)\), exit loop
    3. if \(x_0 > M\), \(\bar{L} = M\) else, \(\bar{R} = M\)
    4. \(\bar{a} = \bar{a} - 1\)
    5. \(\bar{w} = \bar{w} / 2\)
  3. \(\hat{L} = \bar{L}, \hat{R} = \bar{R}\)
  4. while \(\bar{a} > 0\):
    1. \(\bar{a} = \bar{a} - 1\)
    2. \(\bar{w} = \bar{w} / 2\)
    3. if \(y >= f(\hat{L} + \bar{w})\), then \(\hat{L} = \hat{L} + \bar{w}\)
    4. if \(y >= f(\hat{R} - \bar{w})\), then \(\hat{R} = \hat{R} - \bar{w}\)
  5. \(x_1 = \hat{L} + \hat{R} - x_0\)
  6. if \(x_1 < \bar{L}\) or \(x_1 >= \bar{R}\) or \(y >= f(x_1)\), then \(x_1 = x_0\)

The probability of pursuing an overrelaxed step and the number of bisection iterations are hyperparameters.

To avoid floating-point underflow, we implement the suggestion advanced in [1] pp.712. We use the log pdf of the un-normalised posterior (\(g(x) = log(f(x))\)) instead of \(f(x)\). In doing so, we use an auxiliary variable \(z = log(y) = g(x0) - \epsilon\), where \(\epsilon \sim \text{exp}(1)\) and define the slice as \(S = {x : z < g(x)}\).
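
A minimal usage sketch, assuming a pints.LogPDF (for example a LogPosterior) called log_posterior and a list xs of one starting point per chain:

import pints

mcmc = pints.MCMCController(
    log_posterior, 3, xs, method=pints.SliceStepoutMCMC)
mcmc.set_max_iterations(2000)
chains = mcmc.run()   # array of shape (n_chains, n_iterations, n_parameters)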

Extends SingleChainMCMC.

References

[1](1, 2) Neal, R.M., 2003. “Slice sampling”. The annals of statistics, 31(3), pp.705-767. https://doi.org/10.1214/aos/1056562461
ask()[source]

See SingleChainMCMC.ask().

bisection_steps()[source]

Returns the integer used to limit the overrelaxation endpoint accuracy to 2^(-bisection steps) * width.

current_slice_height()[source]

Returns current height value used to define the current slice.

expansion_steps()[source]

Returns integer used for limiting interval expansion.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.MCMCSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example an adaptation-free period for adaptive covariance methods, or a warm-up phase for DREAM.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated logpdf.

prob_overrelaxed()[source]

Returns probability of carrying out an overrelaxed step.

replace(current, current_log_pdf, proposed=None)

Replaces the internal current position, current LogPDF, and proposed point (if any) by the user-specified values.

This method can only be used once the initial position and LogPDF have been set (so after at least 1 round of ask-and-tell).

This is an optional method, and some samplers may not support it.

set_bisection_steps(a)[source]

Set integer for limiting the bisection process in overrelaxed steps.

set_expansion_steps(m)[source]

Set integer for limiting the interval expansion.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [width, expansion steps, prob_overrelaxed, bisection steps]. See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_prob_overrelaxed(prob)[source]

Set the probability of a step being overrelaxed.

set_width(w)[source]

Sets the width for generating the interval.

This can either be a single number or an array with the same number of elements as the number of variables to update.

tell(fx)[source]

See pints.SingleChainMCMC.tell().

width()[source]

Returns the width used for generating the interval.

MCMC Summary

class pints.MCMCSummary(chains, time=None, parameter_names=None)[source]

Calculates and prints key summaries of posterior samples and diagnostic quantities from MCMC chains.

These include the posterior mean, standard deviation, quantiles, rhat, effective sample size and (if running time is supplied) effective samples per second.

Parameters:
  • chains – An array or list of chains returned by an MCMC sampler.
  • time (float) – The time taken for the run, in seconds (optional).
  • parameter_names (sequence) – A list of parameter names (optional).
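
Example

An illustrative sketch (assuming chains were returned by a pints.MCMCController called mcmc, and with hypothetical parameter names):

import time
import pints

t0 = time.time()
chains = mcmc.run()
elapsed = time.time() - t0

summary = pints.MCMCSummary(
    chains, time=elapsed, parameter_names=['k', 'sigma'])
print(summary.mean())             # posterior means
print(summary.rhat())             # Gelman-Rubin diagnostics
print(summary.ess_per_second())   # effective samples per second of run time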

References

[1]“Inference from iterative simulation using multiple sequences”, A Gelman and D Rubin, 1992, Statistical Science.
[2](1, 2) “Bayesian data analysis”, 3rd edition, CRC Press., A Gelman et al., 2014.
chains()[source]

Returns posterior samples from all chains separately.

ess()[source]

Return the effective sample size for each parameter as defined in [2].

ess_per_second()[source]

Return the effective sample size (as defined in [2]) per second of run time for each parameter.

This is only defined if a run time was passed in at construction time, if no run time is known None is returned.

mean()[source]

Return the posterior means of all parameters.

quantiles()[source]

Return the 2.5%, 25%, 50%, 75% and 97.5% posterior quantiles.

rhat()[source]

Return Gelman and Rubin’s rhat value as defined in [1]. If a single chain is used, the chain is split into two halves and rhat is calculated using these two parts.

std()[source]

Return the posterior standard deviation of all parameters.

summary()[source]

Return a list of the parameter name, posterior mean, posterior std deviation, the 2.5%, 25%, 50%, 75% and 97.5% posterior quantiles, rhat, effective sample size (ess) and ess per second of run time.

time()[source]

Return the run time taken for sampling.

Nested samplers

Nested sampler base class

class pints.NestedSampler(log_prior)[source]

Abstract base class for nested samplers.

Parameters:log_prior (pints.LogPrior) – A logprior to draw proposal samples from.
active_points()[source]

Returns the active points from nested sampling run.

ask()[source]

Proposes new point at which to evaluate log-likelihood.

in_initial_phase()[source]

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

min_index()[source]

Returns index of sample with lowest log-likelihood.

n_active_points()[source]

Returns the number of active points that will be used in next run.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

Name of sampler

needs_initial_phase()[source]

Returns True if this method needs an initial phase, for example ellipsoidal nested sampling has a period of running rejection sampling before it starts to fit ellipsoids to points.

needs_sensitivities()[source]

Determines whether sampler uses sensitivities of the solution.

running_log_likelihood()[source]

Returns current value of the threshold log-likelihood value.

set_hyper_parameters(x)[source]

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)[source]

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_n_active_points(active_points)[source]

Sets the number of active points for the next run.

tell(fx)[source]

If a single evaluation is provided as arguments, a single point is accepted and returned if its likelihood exceeds the current threshold; otherwise None is returned.

If multiple evaluations are provided as arguments (for example, if running the algorithm in parallel), None is returned if no points have likelihood exceeding threshold; if a single point passes the threshold, it is returned; if multiple points pass, one is selected uniformly at random and returned and the others are stored for later use.

In all cases, two objects are returned: the proposed point (which may be None) and an array of other points that also pass the threshold (which is empty for single evaluation mode but may be non-empty for multiple evaluation mode).

class pints.NestedController(log_likelihood, log_prior, method=None)[source]

Uses nested sampling to sample from a posterior distribution.

Parameters:
  • log_likelihood (pints.LogPDF) – A pints.LogPDF used to evaluate the log-likelihood of points in the parameter space.
  • log_prior (pints.LogPrior) – A pints.LogPrior on the same parameter space, used to draw proposal samples.
  • method – The class of pints.NestedSampler to use (optional).

References

[1]“Nested Sampling for General Bayesian Computation”, John Skilling, Bayesian Analysis 1:4 (2006). https://doi.org/10.1214/06-BA127
[2]“Multimodal nested sampling: an efficient and robust alternative to Markov chain Monte Carlo methods for astronomical data analyses” F. Feroz and M. P. Hobson, 2008, Mon. Not. R. Astron. Soc.
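
Example

An illustrative sketch (assuming a pints.LogPDF log-likelihood called log_likelihood and a matching log_prior; the method argument is optional):

import pints

controller = pints.NestedController(
    log_likelihood, log_prior, method=pints.NestedEllipsoidSampler)
controller.set_iterations(2000)
controller.set_n_posterior_samples(500)
samples = controller.run()                      # see run() below
print(controller.marginal_log_likelihood())     # log marginal likelihood estimate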
active_points()[source]

Returns the active points from nested sampling.

effective_sample_size()[source]

Calculates the effective sample size of posterior samples from a nested sampling run using the formula:

\[ESS = \exp \left( -\sum_{i=1}^{m} p_i \log p_i \right),\]

in other words, the information, as given by Eqn. (39) in [1].

inactive_points()[source]

Returns the inactive points from nested sampling.

iterations()[source]

Returns the total number of iterations that will be performed in the next run.

log_likelihood_vector()[source]

Returns vector of log likelihoods for each of the stacked [m_active, m_inactive] points.

marginal_log_likelihood()[source]

Calculates the marginal log likelihood of nested sampling run.

marginal_log_likelihood_standard_deviation()[source]

Calculates standard deviation in marginal log likelihood as in [2].

marginal_log_likelihood_threshold()[source]

Returns threshold for determining convergence in estimate of marginal log likelihood which leads to early termination of the algorithm.

n_posterior_samples()[source]

Returns the number of posterior samples that will be returned (see set_n_posterior_samples()).

parallel()[source]

Returns the number of parallel worker processes this routine will be run on, or False if parallelisation is disabled.

posterior_samples()[source]

Returns posterior samples generated during run of nested sampling object.

prior_space()[source]

Returns a vector of X samples which approximates the proportion of prior space compressed.

run()[source]

Runs the nested sampling routine and returns a tuple of the posterior samples and an estimate of the marginal likelihood.

sample_from_posterior(posterior_samples)[source]

Draws posterior samples based on nested sampling run using importance sampling. This function is automatically called in NestedController.run() but can also be called afterwards to obtain new posterior samples.

set_iterations(iterations)[source]

Sets the total number of iterations to be performed in the next run.

set_log_to_file(filename=None, csv=False)[source]

Enables logging to file when a filename is passed in, disables it if filename is False or None.

The argument csv can be set to True to write the file in comma separated value (CSV) format. By default, the file contents will be similar to the output on screen.

set_log_to_screen(enabled)[source]

Enables or disables logging to screen.

set_marginal_log_likelihood_threshold(threshold)[source]

Sets threshold for determining convergence in estimate of marginal log likelihood which leads to early termination of the algorithm.

set_n_posterior_samples(posterior_samples)[source]

Sets the number of posterior samples to generate from points proposed by the nested sampling algorithm.

set_parallel(parallel=False)[source]

Enables/disables parallel evaluation.

If parallel=True, the method will run using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.

time()[source]

Returns the time needed for the last run, in seconds, or None if the controller hasn’t run yet.

Nested ellipsoid sampler

class pints.NestedEllipsoidSampler(log_prior)[source]

Creates a nested sampler that estimates the marginal likelihood and generates samples from the posterior.

This is the form of nested sampler described in [1], where an ellipsoid is drawn around surviving particles (typically with an enlargement factor to avoid missing prior mass), and random samples are then drawn from within the bounds of the ellipsoid. By sampling within the region occupied by the surviving particles, this algorithm aims to be more efficient than simple rejection sampling. This algorithm has the following steps:

Initialise:

Z_0 = 0
X_0 = 1

Draw samples from prior:

for i in 1:n_active_points:
    theta_i ~ p(theta), i.e. sample from the prior
    L_i = p(theta_i|X)
endfor
L_min = min(L)
indexmin = min_index(L)

Run rejection sampling for n_rejection_samples to generate an initial sample, along with updated values of L_min and indexmin.

Fit the active points using a minimum volume bounding ellipsoid. In our approach, we do this with the following procedure (which we term minimum_volume_ellipsoid in what follows), which returns the positive definite matrix A and centre c that together define the ellipsoid by \((x - c)^t A (x - c) = 1\):

cov = covariance(transpose(active_points))
cov_inv = inv(cov)
c = mean(points)
for i in n_active_points:
    dist[i] = (points[i] - c) * cov_inv * (points[i] - c)
endfor
enlargement_factor = max(dist)
A = (1.0 / enlargement_factor) * cov_inv
return A, c

From then on, in each iteration (t), the following occurs:

if mod(t, ellipsoid_update_gap) == 0:
    A, c = minimum_volume_ellipsoid(active_points)
else:
    if dynamic_enlargement_factor:
        enlargement_factor *= (
            exp(-(t + 1) / n_active_points)**alpha
        )
    endif
endif
theta* = ellipsoid_sample(enlargement_factor, A, c)
while p(theta*|X) < L_min:
    theta* = ellipsoid_sample(enlargement_factor, A, c)
endwhile
theta_indexmin = theta*
L_indexmin = p(theta*|X)

If the parameter dynamic_enlargement_factor is true, the enlargement factor is shrunk as the sampler runs, to avoid inefficiencies in later iterations. By default, the enlargement factor begins at 1.1.

In ellipsoid_sample, a point is drawn uniformly from within the minimum volume ellipsoid, whose volume is increased by a factor enlargement_factor.

At the end of iterations, there is a final Z increment:

Z = Z + (1 / n_active_points) * (L_1 + L_2 + ... + L_n_active_points)

The posterior samples are generated as described in [2] on page 849 by weighting each dropped sample in proportion to the volume of the posterior region it was sampled from. That is, the probability for drawing a given sample j is given by:

p_j = L_j * w_j / Z

where j = 1, …, n_iterations.

Extends NestedSampler.

References

[1]“A nested sampling algorithm for cosmological model selection”, Pia Mukherjee, David Parkinson, Andrew R. Liddle, 2006, arXiv:astro-ph/0508461. https://doi.org/10.1086/501068
[2]“Nested Sampling for General Bayesian Computation”, John Skilling, Bayesian Analysis 1:4 (2006). https://doi.org/10.1214/06-BA127
active_points()

Returns the active points from nested sampling run.

alpha()[source]

Returns alpha which controls rate of decline of enlargement factor with iteration (when dynamic_enlargement_factor is true).

ask(n_points)[source]

If in the initial phase, rejection sampling is used. Afterwards, points are drawn from within an ellipsoid (this requires the uniform sampling regime).

dynamic_enlargement_factor()[source]

Returns dynamic enlargement factor.

ellipsoid_update_gap()[source]

Returns the ellipsoid update gap used in the algorithm (see set_ellipsoid_update_gap()).

enlargement_factor()[source]

Returns the enlargement factor used in the algorithm (see set_enlargement_factor()).

in_initial_phase()[source]

See pints.NestedSampler.in_initial_phase().

min_index()

Returns index of sample with lowest log-likelihood.

n_active_points()

Returns the number of active points that will be used in next run.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

n_rejection_samples()[source]

Returns the number of rejection samples used in the algorithm (see set_n_rejection_samples()).

name()[source]

See pints.NestedSampler.name().

needs_initial_phase()[source]

See pints.NestedSampler.needs_initial_phase().

needs_sensitivities()

Determines whether sampler uses sensitivities of the solution.

running_log_likelihood()

Returns current value of the threshold log-likelihood value.

set_alpha(alpha)[source]

Sets alpha which controls rate of decline of enlargement factor with iteration (when dynamic_enlargement_factor is true).

set_dynamic_enlargement_factor(dynamic_enlargement_factor)[source]

Sets dynamic enlargement factor

set_ellipsoid_update_gap(ellipsoid_update_gap=100)[source]

Sets the frequency with which the minimum volume ellipsoid is re-estimated as part of the nested rejection sampling algorithm.

Updating the ellipsoid more frequently means each sample is produced more efficiently, but because re-computing the ellipsoid is costly it can be better not to update it at every iteration; instead, updates are performed with gaps of ellipsoid_update_gap iterations between them. By default, the ellipsoid is updated every 100 iterations.

set_enlargement_factor(enlargement_factor=1.1)[source]

Sets the factor (>1) by which to enlarge the minimum volume ellipsoid in rejection sampling.

A higher value means it is less likely that areas of high probability mass will be missed. A low value means that rejection sampling is more efficient.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [# active points, # rejection samples, enlargement factor, ellipsoid update gap, dynamic enlargement factor, alpha].

See TunableMethod.set_hyper_parameters().

set_initial_phase(in_initial_phase)[source]

See pints.NestedSampler.set_initial_phase().

set_n_active_points(active_points)

Sets the number of active points for the next run.

set_n_rejection_samples(rejection_samples=200)[source]

Sets the number of rejection samples to take, which will be assigned weights and ultimately produce a set of posterior samples.

tell(fx)

If a single evaluation is provided as arguments, a single point is accepted and returned if its likelihood exceeds the current threshold; otherwise None is returned.

If multiple evaluations are provided as arguments (for example, if running the algorithm in parallel), None is returned if no points have likelihood exceeding threshold; if a single point passes the threshold, it is returned; if multiple points pass, one is selected uniformly at random and returned and the others are stored for later use.

In all cases, two objects are returned: the proposed point (which may be None) and an array of other points that also pass the threshold (which is empty for single evaluation mode but may be non-empty for multiple evaluation mode).

Nested rejection sampler

class pints.NestedRejectionSampler(log_prior)[source]

Creates a nested sampler that estimates the marginal likelihood and generates samples from the posterior.

This is the simplest form of nested sampler and involves using rejection sampling from the prior, as described in the algorithm on page 839 in [1], to estimate the marginal likelihood and to generate the weights and preliminary samples (with their respective likelihoods) required to generate posterior samples.

The posterior samples are generated as described in [1] on page 849 by randomly sampling the preliminary points, accounting for their weights and likelihoods.

Initialise:

Z = 0
X_0 = 1

Draw samples from prior:

for i in 1:n_active_points:
    theta_i ~ p(theta), i.e. sample from the prior
    L_i = p(theta_i|X)
endfor

In each iteration of the algorithm (t):

L_min = min(L)
indexmin = min_index(L)
X_t = exp(-t / n_active_points)
w_t = X_t - X_t-1
Z = Z + L_min * w_t
theta* ~ p(theta)
while p(theta*|X) < L_min:
    theta* ~ p(theta)
endwhile
theta_indexmin = theta*
L_indexmin = p(theta*|X)

At the end of iterations, there is a final Z increment:

Z = Z + (1 / n_active_points) * (L_1 + L_2 + ... + L_n_active_points)

The posterior samples are generated as described in [1] on page 849 by weighting each dropped sample in proportion to the volume of the posterior region it was sampled from. That is, the probability for drawing a given sample j is given by:

p_j = L_j * w_j / Z

where j = 1, …, n_iterations.

Extends NestedSampler.

References

[1](1, 2) “Nested Sampling for General Bayesian Computation”, John Skilling, Bayesian Analysis 1:4 (2006). https://doi.org/10.1214/06-BA127
active_points()

Returns the active points from nested sampling run.

ask(n_points)[source]

Proposes new point(s) by sampling from the prior.

in_initial_phase()

For methods that need an initial phase (see needs_initial_phase()), this method returns True if the method is currently configured to be in its initial phase. For other methods a NotImplementedError is raised.

min_index()

Returns index of sample with lowest log-likelihood.

n_active_points()

Returns the number of active points that will be used in next run.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See pints.NestedSampler.name().

needs_initial_phase()

Returns True if this method needs an initial phase, for example ellipsoidal nested sampling has a period of running rejection sampling before it starts to fit ellipsoids to points.

needs_sensitivities()

Determines whether sampler uses sensitivities of the solution.

running_log_likelihood()

Returns current value of the threshold log-likelihood value.

set_hyper_parameters(x)[source]

Hyper-parameter vector is: [active_points_rate]

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters
set_initial_phase(in_initial_phase)

For methods that need an initial phase (see needs_initial_phase()), this method toggles the initial phase algorithm. For other methods a NotImplementedError is raised.

set_n_active_points(active_points)

Sets the number of active points for the next run.

tell(fx)

If a single evaluation is provided as arguments, a single point is accepted and returned if its likelihood exceeds the current threshold; otherwise None is returned.

If multiple evaluations are provided as arguments (for example, if running the algorithm in parallel), None is returned if no points have likelihood exceeding threshold; if a single point passes the threshold, it is returned; if multiple points pass, one is selected uniformly at random and returned and the others are stored for later use.

In all cases, two objects are returned: the proposed point (which may be None) and an array of other points that also pass the threshold (which is empty for single evaluation mode but may be non-empty for multiple evaluation mode).

Noise generators

Pints contains a module, pints.noise, with methods that generate different kinds of noise. This can be added to simulation output to create “realistic” experimental data.

Overview:

pints.noise.ar1(rho, sigma, n)[source]

Generates first-order autoregressive (AR1) noise that can be added to a vector of simulated data.

The generated noise follows the distribution

\[e(t) = \rho e(t - 1) + v(t),\]

where \(v(t) \stackrel{\text{iid}}{\sim }\mathcal{N}(0, \sigma \sqrt{1 - \rho ^2})\).

Returns an array of length n containing the generated noise.

Parameters:
  • rho – Determines the magnitude of the noise \(\rho\) (see above). Must be less than 1.
  • sigma – The marginal standard deviation \(\sigma\) of e(t) (see above). Must be greater than zero.
  • n – The length of the signal. (Only single time-series are supported.)

Example

values = model.simulate(parameters, times)
noisy_values = values + noise.ar1(0.9, 5, len(values))
pints.noise.ar1_unity(rho, sigma, n)[source]

Generates noise following an autoregressive order 1 process of mean 1, that a vector of simulated data can be multiplied with.

Returns an array of length n containing the generated noise.

Parameters:
  • rho – Determines the magnitude of the noise (see ar1()). Must be less than or equal to 1.
  • sigma – The marginal standard deviation of e(t) (see ar1()). Must be greater than 0.
  • n (int) – The length of the signal. (Only single time-series are supported.)

Example

values = model.simulate(parameters, times)
noisy_values = values * noise.ar1_unity(0.5, 0.8, len(values))
pints.noise.arma11(rho, theta, sigma, n)[source]

Generates an ARMA(1,1) error process of the form:

\[e(t) = (1 - \rho) + \rho * e(t - 1) + v(t) + \theta * v(t-1),\]

where \(v(t) \stackrel{\text{iid}}{\sim }\mathcal{N}(0, \sigma ')\), and

\[\sigma ' = \sigma \sqrt{\frac{1 - \rho ^ 2}{1 + 2 \theta \rho + \theta ^ 2}}.\]
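
Example

An illustrative sketch (with arbitrary example values for rho, theta and sigma):

values = model.simulate(parameters, times)
noisy_values = values + noise.arma11(0.5, 0.7, 0.8, len(values))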
pints.noise.arma11_unity(rho, theta, sigma, n)[source]

Generates an ARMA(1,1) error process of the form:

e(t) = (1 - rho) + rho * e(t - 1) + v(t) + theta * v[t-1],

where v(t) ~ iid N(0, sigma'),

and sigma' = sigma * sqrt((1 - rho^2) / (1 + 2 * theta * rho + theta^2)).

Returns an array of length n containing the generated noise.

Parameters:
  • rho – Determines the long-run persistence of the noise (see ar1()). Must be less than 1.
  • theta – Contributes to first order autocorrelation of noise. Must be less than 1.
  • sigma – The marginal standard deviation of e(t) (see ar1()). Must be greater than 0.
  • n (int) – The length of the signal. (Only single time-series are supported.)

Example

values = model.simulate(parameters, times)
noisy_values = values * noise.arma11_unity(0.5, 0.7, 0.8, len(values))
pints.noise.independent(sigma, shape)[source]

Generates independent Gaussian noise iid \(\mathcal{N}(0,\sigma)\).

Returns an array of shape shape containing the generated noise.

Parameters:
  • sigma – The standard deviation of the noise. Must be zero or greater.
  • shape – A tuple (or sequence) defining the shape of the generated noise array.

Example

values = model.simulate(parameters, times)
noisy_values = values + noise.independent(5, values.shape)
pints.noise.multiplicative_gaussian(eta, sigma, f)[source]

Generates multiplicative Gaussian noise for a single output.

With multiplicative noise, the measurement error scales with the magnitude of the output. Given a model taking the form,

\[X(t) = f(t; \theta) + \epsilon(t)\]

multiplicative Gaussian noise models the noise term as:

\[\epsilon(t) = f(t; \theta)^\eta v(t)\]

where v(t) is iid Gaussian:

\[v(t) \stackrel{\text{ iid }}{\sim} \mathcal{N}(0, \sigma)\]

The output magnitudes f are required as an input to this function. The noise terms are returned in an array of the same shape as f.

Parameters:
  • eta – The exponential power controlling the rate at which the noise scales with the output. The argument must be either a float (for single-output or multi-output noise) or an array_like of floats (for multi-output noise only, with one value for each output).
  • sigma – The baseline standard deviation of the noise (must be greater than zero). The argument must be either a float (for single-output or multi-output noise) or an array_like of floats (for multi-output noise only, with one value for each output).
  • f – A NumPy array giving the time-series for the output over time. For multiple outputs, the array should have shape (n_outputs, n_times).
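
Example

An illustrative sketch (with arbitrary example values for eta and sigma):

values = model.simulate(parameters, times)
noisy_values = values + noise.multiplicative_gaussian(1.0, 0.1, values)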

Optimisers

Pints provides a number of optimisers, all implementing the Optimiser interface, that can be used to find the parameters that minimise an ErrorMeasure or maximise a LogPDF.

The easiest way to run an optimisation is by using the optimise() method or the OptimisationController class.

Running an optimisation

pints.optimise(function, x0, sigma0=None, boundaries=None, transformation=None, method=None)[source]

Finds the parameter values that minimise an ErrorMeasure or maximise a LogPDF.

Parameters:
  • function – An pints.ErrorMeasure or a pints.LogPDF that evaluates points in the parameter space.
  • x0 – The starting point for searches in the parameter space. This value may be used directly (for example as the initial position of a particle in PSO) or indirectly (for example as the center of a distribution in XNES).
  • sigma0 – An optional initial standard deviation around x0. Can be specified either as a scalar value (one standard deviation for all coordinates) or as an array with one entry per dimension. Not all methods will use this information.
  • boundaries – An optional set of boundaries on the parameter space.
  • transformation – An optional pints.Transformation to allow the optimiser to search in a transformed parameter space. If used, points shown or returned to the user will first be detransformed back to the original space.
  • method – The class of pints.Optimiser to use for the optimisation. If no method is specified, CMAES is used.
Returns:

  • xbest (numpy array) – The best parameter set obtained
  • fbest (float) – The corresponding score.
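
Example

An illustrative sketch (assuming a pints.ErrorMeasure called error defined on a two-dimensional parameter space):

import pints

xbest, fbest = pints.optimise(error, [1, 1], method=pints.XNES)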

class pints.OptimisationController(function, x0, sigma0=None, boundaries=None, transformation=None, method=None)[source]

Finds the parameter values that minimise an ErrorMeasure or maximise a LogPDF.

Parameters:
  • function – An pints.ErrorMeasure or a pints.LogPDF that evaluates points in the parameter space.
  • x0 – The starting point for searches in the parameter space. This value may be used directly (for example as the initial position of a particle in PSO) or indirectly (for example as the center of a distribution in XNES).
  • sigma0 – An optional initial standard deviation around x0. Can be specified either as a scalar value (one standard deviation for all coordinates) or as an array with one entry per dimension. Not all methods will use this information.
  • boundaries – An optional set of boundaries on the parameter space.
  • transformation – An optional pints.Transformation to allow the optimiser to search in a transformed parameter space. If used, points shown or returned to the user will first be detransformed back to the original space.
  • method – The class of pints.Optimiser to use for the optimisation. If no method is specified, CMAES is used.
evaluations()[source]

Returns the number of evaluations performed during the last run, or None if the controller hasn't run yet.

iterations()[source]

Returns the number of iterations performed during the last run, or None if the controller hasn't run yet.

max_iterations()[source]

Returns the maximum iterations if this stopping criterion is set, or None if it is not. See set_max_iterations().

max_unchanged_iterations()[source]

Returns a tuple (iterations, threshold) specifying a maximum unchanged iterations stopping criterion, or (None, None) if no such criterion is set. See set_max_unchanged_iterations().

optimiser()[source]

Returns the underlying optimiser object, allowing detailed configuration.

parallel()[source]

Returns the number of parallel worker processes this routine will be run on, or False if parallelisation is disabled.

run()[source]

Runs the optimisation, returns a tuple (xbest, fbest).

An optional callback function can be passed in that will be called at the end of every iteration. The callback should take the arguments (iteration, optimiser), where iteration is the iteration count (an integer) and optimiser is the optimiser object.

set_callback(cb=None)[source]

Allows a “callback” function to be passed in that will be called at the end of every iteration.

This can be used for e.g. visualising optimiser progress.

Example:

def cb(iteration, opt):
    plot(opt.xbest())

opt.set_callback(cb)
set_log_interval(iters=20, warm_up=3)[source]

Changes the frequency with which messages are logged.

Parameters:
  • iters – A log message will be shown every iters iterations.
  • warm_up – A log message will be shown every iteration, for the first warm_up iterations.
set_log_to_file(filename=None, csv=False)[source]

Enables logging to file when a filename is passed in, disables it if filename is False or None.

The argument csv can be set to True to write the file in comma separated value (CSV) format. By default, the file contents will be similar to the output on screen.

set_log_to_screen(enabled)[source]

Enables or disables logging to screen.

set_max_iterations(iterations=10000)[source]

Adds a stopping criterion, allowing the routine to halt after the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_iterations(None).

set_max_unchanged_iterations(iterations=200, threshold=1e-11)[source]

Adds a stopping criterion, allowing the routine to halt if the objective function doesn’t change by more than threshold for the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_unchanged_iterations(None).

set_parallel(parallel=False)[source]

Enables/disables parallel evaluation.

If parallel=True, the method will run using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.

set_threshold(threshold)[source]

Adds a stopping criterion, allowing the routine to halt once the objective function goes below a set threshold.

This criterion is disabled by default, but can be enabled by calling this method with a valid threshold. To disable it, use set_threshold(None).

threshold()[source]

Returns the threshold stopping criterion, or None if no threshold stopping criterion is set. See set_threshold().

time()[source]

Returns the time needed for the last run, in seconds, or None if the controller hasn’t run yet.

class pints.Optimisation(function, x0, sigma0=None, boundaries=None, transformation=None, method=None)[source]

Deprecated alias for OptimisationController.

evaluations()

Returns the number of evaluations performed during the last run, or None if the controller hasn't run yet.

iterations()

Returns the number of iterations performed during the last run, or None if the controller hasn't run yet.

max_iterations()

Returns the maximum iterations if this stopping criterion is set, or None if it is not. See set_max_iterations().

max_unchanged_iterations()

Returns a tuple (iterations, threshold) specifying a maximum unchanged iterations stopping criterion, or (None, None) if no such criterion is set. See set_max_unchanged_iterations().

optimiser()

Returns the underlying optimiser object, allowing detailed configuration.

parallel()

Returns the number of parallel worker processes this routine will be run on, or False if parallelisation is disabled.

run()

Runs the optimisation, returns a tuple (xbest, fbest).

An optional callback function can be passed in that will be called at the end of every iteration. The callback should take the arguments (iteration, optimiser), where iteration is the iteration count (an integer) and optimiser is the optimiser object.

set_callback(cb=None)

Allows a “callback” function to be passed in that will be called at the end of every iteration.

This can be used for e.g. visualising optimiser progress.

Example:

def cb(iteration, opt):
    plot(opt.xbest())

opt.set_callback(cb)
set_log_interval(iters=20, warm_up=3)

Changes the frequency with which messages are logged.

Parameters:
  • iters – A log message will be shown every iters iterations.
  • warm_up – A log message will be shown every iteration, for the first warm_up iterations.
set_log_to_file(filename=None, csv=False)

Enables logging to file when a filename is passed in, disables it if filename is False or None.

The argument csv can be set to True to write the file in comma separated value (CSV) format. By default, the file contents will be similar to the output on screen.

set_log_to_screen(enabled)

Enables or disables logging to screen.

set_max_iterations(iterations=10000)

Adds a stopping criterion, allowing the routine to halt after the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_iterations(None).

set_max_unchanged_iterations(iterations=200, threshold=1e-11)

Adds a stopping criterion, allowing the routine to halt if the objective function doesn’t change by more than threshold for the given number of iterations.

This criterion is enabled by default. To disable it, use set_max_unchanged_iterations(None).

set_parallel(parallel=False)

Enables/disables parallel evaluation.

If parallel=True, the method will run using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0. Parallelisation can be disabled by setting parallel to 0 or False.

set_threshold(threshold)

Adds a stopping criterion, allowing the routine to halt once the objective function goes below a set threshold.

This criterion is disabled by default, but can be enabled by calling this method with a valid threshold. To disable it, use set_threshold(None).

threshold()

Returns the threshold stopping criterion, or None if no threshold stopping criterion is set. See set_threshold().

time()

Returns the time needed for the last run, in seconds, or None if the controller hasn’t run yet.

Optimiser base classes

class pints.Optimiser(x0, sigma0=None, boundaries=None)[source]

Base class for optimisers implementing an ask-and-tell interface.

This interface provides fine-grained control. Users seeking to simply run an optimisation may wish to use the OptimisationController instead.

Optimisation using “ask-and-tell” proceed by the user repeatedly “asking” the optimiser for points, and then “telling” it the function evaluations at those points. This allows a user to have fine-grained control over an optimisation, and implement custom parallelisation, logging, stopping criteria etc. Users who don’t need this functionality can use optimisers via the OptimisationController class instead.

All PINTS optimisers are _minimisers_. To maximise a function simply pass in the negative of its evaluations to tell() (this is handled automatically by the OptimisationController).
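
For example (an illustrative sketch; log_pdf is an assumed pints.LogPDF to be maximised, and xs are points previously requested via ask()):

xs = optimiser.ask()
fs = [-log_pdf(x) for x in xs]   # negate evaluations to turn maximisation into minimisation
optimiser.tell(fs)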

All optimisers implement the pints.Loggable and pints.TunableMethod interfaces.

Parameters:
  • x0 – A starting point for searches in the parameter space. This value may be used directly (for example as the initial position of a particle in PSO) or indirectly (for example as the center of a distribution in XNES).
  • sigma0 – An optional initial standard deviation around x0. Can be specified either as a scalar value (one standard deviation for all coordinates) or as an array with one entry per dimension. Not all methods will use this information.
  • boundaries – An optional set of boundaries on the parameter space.

Example

An optimisation with ask-and-tell proceeds roughly as follows:

optimiser = MyOptimiser()
running = True
while running:
    # Ask for points to evaluate
    xs = optimiser.ask()

    # Evaluate the score function or pdf at these points
    # At this point, code to parallelise evaluation can be added in
    fs = [f(x) for x in xs]

    # Tell the optimiser the evaluations; allowing it to update its
    # internal state.
    optimiser.tell(fs)

    # Check stopping criteria
    # At this point, custom stopping criteria can be added in
    if optimiser.fbest() < threshold:
        running = False

    # Check for optimiser issues
    if optimiser.stop():
        running = False

    # At this point, code to visualise or benchmark optimiser behaviour
    # could be added in, for example by plotting `xs` in the parameter
    # space.
ask()[source]

Returns a list of positions in the search space to evaluate.

fbest()[source]

Returns the objective function evaluated at the current best position.

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()[source]

Returns this method’s full name.

needs_sensitivities()[source]

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated error.

running()[source]

Returns True if an optimisation is in progress.

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
stop()[source]

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

tell(fx)[source]

Performs an iteration of the optimiser algorithm, using the evaluations fx of the points x previously specified by ask.

For methods that require sensitivities (see needs_sensitivities()), fx should be a tuple (objective, sensitivities), containing the values returned by pints.ErrorMeasure.evaluateS1().

xbest()[source]

Returns the current best position.

class pints.PopulationBasedOptimiser(x0, sigma0=None, boundaries=None)[source]

Base class for optimisers that work by moving multiple points through the search space.

Extends Optimiser.

ask()

Returns a list of positions in the search space to evaluate.

fbest()

Returns the objective function evaluated at the current best position.

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()

Returns this method’s full name.

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated error.

population_size()[source]

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()

Returns True if an optimisation is in progress.

set_hyper_parameters(x)[source]

The hyper-parameter vector is [population_size].

See TunableMethod.set_hyper_parameters().

set_population_size(population_size=None)[source]

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

suggested_population_size(round_up_to_multiple_of=None)[source]

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)

Performs an iteration of the optimiser algorithm, using the evaluations fx of the points x previously specified by ask.

For methods that require sensitivities (see needs_sensitivities()), fx should be a tuple (objective, sensitivities), containing the values returned by pints.ErrorMeasure.evaluateS1().

xbest()

Returns the current best position.

Convenience methods

pints.fmin(f, x0, args=None, boundaries=None, threshold=None, max_iter=None, max_unchanged=200, verbose=False, parallel=False, method=None)[source]

Minimises a callable function f, starting from position x0, using a pints.Optimiser.

Returns a tuple (xbest, fbest) with the best position found, and the corresponding value fbest = f(xbest).

Parameters:
  • f – A function or callable class to be minimised.
  • x0 – The initial point to search at. Must be a 1-dimensional sequence (e.g. a list or a numpy array).
  • args – An optional tuple of extra arguments for f.
  • boundaries – An optional pints.Boundaries object or a tuple (lower, upper) specifying lower and upper boundaries for the search. If no boundaries are provided an unbounded search is run.
  • threshold – An optional absolute threshold stopping criterion.
  • max_iter – An optional maximum number of iterations stopping criterion.
  • max_unchanged – A stopping criterion based on the maximum number of successive iterations without a significant change in f (see pints.OptimisationController()).
  • verbose – Set to True to print progress messages to the screen.
  • parallel – Allows parallelisation to be enabled. If set to True, the evaluations will happen in parallel using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0.
  • method – The pints.Optimiser to use. If no method is specified, pints.CMAES is used.

Example

import pints

def f(x):
    return (x[0] - 3) ** 2 + (x[1] + 5) ** 2

xopt, fopt = pints.fmin(f, [1, 1])
pints.curve_fit(f, x, y, p0, boundaries=None, threshold=None, max_iter=None, max_unchanged=200, verbose=False, parallel=False, method=None)[source]

Fits a function f(x, *p) to a dataset (x, y) by finding the value of p for which sum((y - f(x, *p))**2) / n is minimised (where n is the number of entries in y).

Returns a tuple (xbest, fbest) with the best position found, and the corresponding value fbest = f(xbest).

Parameters:
  • f (callable) – A function or callable class to be minimised.
  • x – The values of an independent variable, at which y was recorded.
  • y – Measured values y = f(x, p) + noise.
  • p0 – An initial guess for the optimal parameters p.
  • boundaries – An optional pints.Boundaries object or a tuple (lower, upper) specifying lower and upper boundaries for the search. If no boundaries are provided an unbounded search is run.
  • threshold – An optional absolute threshold stopping criterion.
  • max_iter – An optional maximum number of iterations stopping criterion.
  • max_unchanged – A stopping criterion based on the maximum number of successive iterations without a significant change in f (see pints.OptimisationController()).
  • verbose – Set to True to print progress messages to the screen.
  • parallel – Allows parallelisation to be enabled. If set to True, the evaluations will happen in parallel using a number of worker processes equal to the detected cpu core count. The number of workers can be set explicitly by setting parallel to an integer greater than 0.
  • method – The pints.Optimiser to use. If no method is specified, pints.CMAES is used.
Returns:

  • xbest (numpy array) – The best parameter set obtained.
  • fbest (float) – The corresponding score.

Example

import numpy as np
import pints

def f(x, a, b, c):
    return a + b * x + c * x ** 2

x = np.linspace(-5, 5, 100)
y = f(x, 1, 2, 3) + np.random.normal(0, 1, x.shape)

p0 = [0, 0, 0]
popt, fopt = pints.curve_fit(f, x, y, p0)

Boundary transformations

class pints.TriangleWaveTransform(boundaries)[source]

Transforms from unbounded to (rectangular) bounded parameter space using a periodic triangle-wave transform.

Note: The transform is applied _inside_ optimisation methods, there is no need to wrap this around your own problem or score function.

This can be applied as a transformation on x to implement _rectangular_ boundaries in methods with no natural boundary mechanism. It effectively mirrors the search space at every boundary, leading to a continuous (but non-smooth) periodic landscape. While this effectively creates an infinite number of minima/maxima, each one maps to the same point in parameter space.

It should work well for methods that maintain a single search position or a single search distribution (e.g. CMAES, xNES, SNES), which will end up in one of the many mirror images. However, for methods that use independent search particles (e.g. PSO) it could lead to a scattered population, with different particles exploring different mirror images. Other strategies should be used for such problems.

Bare-bones CMA-ES

class pints.BareCMAES(x0, sigma0=0.1, boundaries=None)[source]

Finds the best parameters using the CMA-ES method described in [1, 2], using a bare bones re-implementation.

For general use, we recommend the pints.CMAES optimiser, which wraps around the cma module provided by the authors of CMA-ES. The cma module provides a battle-tested version of the optimiser.

The role of this class is to provide a simpler implementation of only the core algorithm of CMA-ES, which is easier to read and analyse, and which can be used to compare with bare implementations of other methods.

Extends PopulationBasedOptimiser.

References

[1]The CMA Evolution Strategy: A Tutorial. Nikolaus Hansen, arXiv. https://arxiv.org/abs/1604.00772
[2]Hansen, Mueller, Koumoutsakos (2003) “Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES)”. Evolutionary Computation https://doi.org/10.1162/106365603321828970
ask()[source]

See Optimiser.ask().

cov(decomposed=False)[source]

Returns the current covariance matrix C of the proposal distribution.

If the optional argument decomposed is set to True, a tuple (R, S) will be returned such that R contains the eigenvectors of C while S is a diagonal matrix containing the squares of the eigenvalues of C, such that C = R S S R.T.

fbest()[source]

See Optimiser.fbest().

mean()[source]

Returns the current mean of the proposal distribution.

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell along with the evaluated error.

population_size()

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()[source]

See Optimiser.running().

set_hyper_parameters(x)

The hyper-parameter vector is [population_size].

See TunableMethod.set_hyper_parameters().

set_population_size(population_size=None)

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()[source]

See Optimiser.stop().

suggested_population_size(round_up_to_multiple_of=None)

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

CMA-ES

class pints.CMAES(x0, sigma0=None, boundaries=None)[source]

Finds the best parameters using the CMA-ES method described in [1], [2] and implemented in the cma module [3].

CMA-ES stands for Covariance Matrix Adaptation Evolution Strategy, and is designed for non-linear derivative-free optimization problems.

Extends PopulationBasedOptimiser.

References

[1]The CMA Evolution Strategy: A Tutorial. Nikolaus Hansen, arXiv. https://arxiv.org/abs/1604.00772
[2]Hansen, Mueller, Koumoutsakos (2003) “Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES)”. Evolutionary Computation https://doi.org/10.1162/106365603321828970
[3]PyPi page for cma https://pypi.org/project/cma/
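
In typical use the optimiser is not driven by hand but run through pints' OptimisationController, as in the minimal sketch below (the error measure and starting point are hypothetical, and the cma module must be installed for this method):

import numpy as np
import pints
import pints.toy

# Hypothetical error measure with a minimum at (1, 1)
error = pints.toy.ParabolicError([1, 1])
x0 = np.array([5.0, 5.0])

opt = pints.OptimisationController(error, x0, sigma0=0.5, method=pints.CMAES)
xbest, fbest = opt.run()
print(xbest, fbest)
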
ask()[source]

See Optimiser.ask().

fbest()[source]

See Optimiser.fbest().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated error.

population_size()

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()[source]

See Optimiser.running().

set_hyper_parameters(x)

The hyper-parameter vector is [population_size].

See TunableMethod.set_hyper_parameters().

set_population_size(population_size=None)

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()[source]

See Optimiser.stop().

suggested_population_size(round_up_to_multiple_of=None)

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

Gradient descent (fixed learning rate)

class pints.GradientDescent(x0, sigma0=0.1, boundaries=None)[source]

Gradient-descent method with a fixed learning rate.
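
A short sketch (with a hypothetical starting point) of setting the learning rate, either directly or through the generic hyper-parameter interface described below:

import numpy as np
import pints

# Hypothetical 3-dimensional starting point
opt = pints.GradientDescent(np.zeros(3))

# Set the learning rate directly...
opt.set_learning_rate(0.01)

# ...or via the hyper-parameter vector [learning_rate]
opt.set_hyper_parameters([0.01])

print(opt.learning_rate())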

ask()[source]

See Optimiser.ask().

fbest()[source]

See Optimiser.fbest().

learning_rate()[source]

Returns this optimiser’s learning rate.

n_hyper_parameters()[source]

See pints.TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()[source]

See Optimiser.needs_sensitivities().

running()[source]

See Optimiser.running().

set_hyper_parameters(x)[source]

See pints.TunableMethod.set_hyper_parameters().

The hyper-parameter vector is [learning_rate].

set_learning_rate(eta)[source]

Sets the learning rate for this optimiser.

Parameters:eta (float) – The learning rate, as a float greater than zero.
stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

tell(reply)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

Nelder-Mead

class pints.NelderMead(x0, sigma0=None, boundaries=None)[source]

Nelder-Mead downhill simplex method.

Implementation of the classical algorithm by [1], following the presentation in Algorithm 8.1 of [2].

This is a deterministic local optimiser. In most update steps it performs either 1 evaluation, or 2 sequential evaluations, so that it will not typically benefit from parallelisation.

Generates a “simplex” of n + 1 samples around a given starting point, and evaluates their scores. Next, each iteration consists of a sequence of operations; typically, the worst sample y_worst is replaced with a new point:

y_new = mu + delta * (mu - y_worst)
mu = (1 / n) * sum(y), y != y_worst

where delta has one of four values, depending on the type of operation:

  • Reflection (delta = 1)
  • Expansion (delta = 2)
  • Inside contraction (delta = -0.5)
  • Outside contraction (delta = 0.5)

Note that the delta values here are common choices, but not the only valid choices.

A fifth type of iteration called a “shrink” is occasionally performed, in which all samples except the best sample y_best are replaced:

y_i_new = y_best + ys * (y_i - y_best)

where ys is a parameter (typically ys = 0.5).

The initialisation of the initial simplex was copied from [3].

References

[1]A simplex method for function minimization Nelder, Mead 1965, Computer Journal https://doi.org/10.1093/comjnl/7.4.308
[2]Introduction to derivative-free optimization Andrew R. Conn, Katya Scheinberg, Luis N. Vicente 2009, First edition. ISBN 978-0-898716-68-9 https://doi.org/10.1137/1.9780898718768
[3]SciPy on GitHub https://github.com/scipy/scipy/
ask()[source]

See: pints.Optimiser.ask().

fbest()[source]

See: pints.Optimiser.fbest().

n_hyper_parameters()

Returns the number of hyper-parameters for this method (see TunableMethod).

name()[source]

See: pints.Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated error.

running()[source]

See: pints.Optimiser.running().

set_hyper_parameters(x)

Sets the hyper-parameters for the method with the given vector of values (see TunableMethod).

Parameters:x – An array of length n_hyper_parameters used to set the hyper-parameters.
stop()[source]

See: pints.Optimiser.stop().

tell(fx)[source]

See: pints.Optimiser.tell().

xbest()[source]

See: pints.Optimiser.xbest().

PSO

class pints.PSO(x0, sigma0=None, boundaries=None)[source]

Finds the best parameters using the PSO method described in [1].

Particle Swarm Optimisation (PSO) is a global search method (so refinement with a local optimiser is advised!) that works well for problems in high dimensions and with many local minima. Because it treats each parameter independently, it does not require preconditioning of the search space.

In a particle swarm optimization, the parameter space is explored by n independent particles. The particles perform a pseudo-random walk through the parameter space, guided by their own personal best score and the global optimum found so far.

The method starts by creating a swarm of n particles and assigning each an initial position and initial velocity (see the explanation of the arguments hints and v for details). Each particle’s score is calculated and set as the particle’s current best local score pl. The best score of all the particles is set as the best global score pg.

Next, an iterative procedure is run that updates each particle’s velocity v and position x using:

v[k] = v[k-1] + al * (pl - x[k-1]) + ag * (pg - x[k-1])
x[k] = x[k-1] + v[k]

Here, x[k] is the particle’s current position and v[k] its current velocity. The values al and ag are scalars randomly sampled from a uniform distribution, with values bounded by r * 4.1 and (1 - r) * 4.1. Thus a swarm with r = 1 will only use local information, while a swarm with r = 0 will only use global information. The de facto standard is r = 0.5. The random sampling is done each time al and ag are used: at each time step every particle performs m samplings, where m is the dimensionality of the search space.

Pseudo-code algorithm:

almax = r * 4.1
agmax = 4.1 - almax
while stopping criterion not met:
    for i in [1, 2, .., n]:
        if f(x[i]) < f(p[i]):
            p[i] = x[i]
        pg = min(p[1], p[2], .., p[n])
        for j in [1, 2, .., m]:
            al = uniform(0, almax)
            ag = uniform(0, agmax)
            v[i,j] += al * (p[i,j] - x[i,j]) + ag * (pg[i,j]  - x[i,j])
            x[i,j] += v[i,j]

Extends PopulationBasedOptimiser.

References

[1]Kennedy, Eberhart (1995) Particle Swarm Optimization. IEEE International Conference on Neural Networks https://doi.org/10.1109/ICNN.1995.488968
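
The sketch below (with a hypothetical starting point) shows the two hyper-parameters being set before a run; the vector [population_size, local_global_balance] is described under set_hyper_parameters() below:

import numpy as np
import pints

# Hypothetical 2-dimensional starting point
opt = pints.PSO(np.array([5.0, 5.0]))

# Use 30 particles with an equal local/global balance (r = 0.5)...
opt.set_hyper_parameters([30, 0.5])

# ...which is equivalent to:
opt.set_population_size(30)
opt.set_local_global_balance(0.5)
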
ask()[source]

See Optimiser.ask().

fbest()[source]

See Optimiser.fbest().

n_hyper_parameters()[source]

See TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated error.

population_size()

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()[source]

See Optimiser.running().

set_hyper_parameters(x)[source]

The hyper-parameter vector is [population_size, local_global_balance].

See TunableMethod.set_hyper_parameters().

set_local_global_balance(r=0.5)[source]

Set the balance between local and global exploration for each particle, using a parameter r such that r = 1 is a fully local search and r = 0 is a fully global search.

set_population_size(population_size=None)

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

suggested_population_size(round_up_to_multiple_of=None)

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

SNES

class pints.SNES(x0, sigma0=None, boundaries=None)[source]

Finds the best parameters using the SNES method described in [1], [2].

SNES stands for Separable Natural Evolution Strategy, and is designed for non-linear derivative-free optimization problems in high dimensions and with many local minima [1].

It treats each dimension separately, making it suitable for higher dimensions.

Extends PopulationBasedOptimiser.

References

[1]Schaul, Glasmachers, Schmidhuber (2011) “High dimensions and heavy tails for natural evolution strategies”. Proceedings of the 13th annual conference on Genetic and evolutionary computation. https://doi.org/10.1145/2001576.2001692
[2]PyBrain: The Python machine learning library http://pybrain.org
ask()[source]

See Optimiser.ask().

fbest()[source]

See Optimiser.fbest().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated error.

population_size()

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()[source]

See Optimiser.running().

set_hyper_parameters(x)

The hyper-parameter vector is [population_size].

See TunableMethod.set_hyper_parameters().

set_population_size(population_size=None)

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

suggested_population_size(round_up_to_multiple_of=None)

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

xNES

class pints.XNES(x0, sigma0=None, boundaries=None)[source]

Finds the best parameters using the xNES method described in [1], [2].

xNES stands for Exponential Natural Evolution Strategy, and is designed for non-linear derivative-free optimization problems [1].

Extends PopulationBasedOptimiser.

References

[1]Glasmachers, Schaul, Schmidhuber et al. (2010) “Exponential natural evolution strategies”. Proceedings of the 12th annual conference on Genetic and evolutionary computation. https://doi.org/10.1145/1830483.1830557
[2]PyBrain: The Python machine learning library http://pybrain.org
ask()[source]

See Optimiser.ask().

fbest()[source]

See Optimiser.fbest().

n_hyper_parameters()

See TunableMethod.n_hyper_parameters().

name()[source]

See Optimiser.name().

needs_sensitivities()

Returns True if this method needs sensitivities to be passed in to tell() along with the evaluated error.

population_size()

Returns this optimiser’s population size.

If no explicit population size has been set, None may be returned. Once running, the correct value will always be returned.

running()[source]

See Optimiser.running().

set_hyper_parameters(x)

The hyper-parameter vector is [population_size].

See TunableMethod.set_hyper_parameters().

set_population_size(population_size=None)

Sets a population size to use in this optimisation.

If population_size is set to None, the population size will be set using the heuristic suggested_population_size().

stop()

Checks if this method has run into trouble and should terminate. Returns False if everything’s fine, or a short message (e.g. “Ill-conditioned matrix.”) if the method should terminate.

suggested_population_size(round_up_to_multiple_of=None)

Returns a suggested population size for this method, based on the dimension of the search space (e.g. the parameter space).

If the optional argument round_up_to_multiple_of is set to an integer greater than 1, the method will round up the estimate to a multiple of that number. This can be useful to obtain a population size based on e.g. the number of worker processes used to perform objective function evaluations.

tell(fx)[source]

See Optimiser.tell().

xbest()[source]

See Optimiser.xbest().

Noise model diagnostics

Pints includes functionality to generate diagnostic plots of the residuals. These tools may be useful to evaluate the validity of a noise model.

Plotting functions:

Diagnostics:

Plotting functions

pints.residuals_diagnostics.plot_residuals_autocorrelation(parameters, problem, max_lag=10, thinning=None, significance_level=0.05, posterior_interval=0.95)[source]

Generate an autocorrelation plot of the residuals.

This function can be used to analyse the results of either optimisation or MCMC Bayesian inference. When multiple samples of the residuals are present (corresponding to multiple MCMC samples), the plot illustrates the distribution of autocorrelations across the MCMC samples. At each lag, a point is drawn at the median autocorrelation, and a line is drawn giving the percentile range of the posterior interval specified as an argument (by default, the 2.5th to the 97.5th percentile).

When multiple outputs are present, one residuals plot will be generated for each output.

When a significance level is provided, confidence bounds for the sample autocorrelations under the assumption of IID residuals are drawn on the plot. If many of the observed residual autocorrelations fall outside these bounds, this may be evidence against the residuals being IID.

Under the assumption that the residuals of length \(n\) are IID with mean 0 and variance \(\sigma^2\), for large \(n\) the residuals sample autocorrelations are approximately IID Normal(mean=0, variance=1/n). This result is proved in [1] (see Theorem 7.2.2 and Example 7.2.1). Therefore, the confidence bounds can be calculated by \(\pm z^* n^{-1/2}\) for the appropriate critical value \(z^*\).

This function returns a matplotlib figure.

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • max_lag – Optional int value (default 10). The highest lag to plot.
  • thinning – Optional int value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.
  • significance_level – None or float value (default 0.05). When a significance level is provided, dashed lines for the confidence interval corresponding to that significance level are drawn on the plot. When None, no lines are drawn.
  • posterior_interval – Float value (default 0.95). When multiple samples of the parameter values are provided, this gives the size of the credible region of the posterior to plot.

References

[1]Brockwell, P. J., & Davis, R. A. (1991). Time series: Theory and methods (2nd ed.). New York: Springer.
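
As an illustration, the sketch below (a hypothetical setup using the toy LogisticModel described later in this document) generates noisy data, wraps it in a problem, and plots the residual autocorrelation for a single parameter vector:

import numpy as np
import matplotlib.pyplot as plt
import pints
import pints.toy
import pints.residuals_diagnostics

# Noisy data from a toy model (hypothetical set-up)
model = pints.toy.LogisticModel()
parameters = model.suggested_parameters()
times = model.suggested_times()
values = model.simulate(parameters, times) + np.random.normal(0, 5, len(times))

problem = pints.SingleOutputProblem(model, times, values)

# A single parameter vector must still be passed with shape (1, n_parameters)
fig = pints.residuals_diagnostics.plot_residuals_autocorrelation(
    np.array([parameters]), problem)
plt.show()
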
pints.residuals_diagnostics.plot_residuals_binned_autocorrelation(parameters, problem, thinning=None, n_bins=25)[source]

Plot the autocorrelation of the residuals within bins (i.e. discrete time windows across the series).

Given a time series with observed residuals

\[e_i = y_i - f(t_i; \theta)\]

This method divides the vector of residuals into some number of equally sized bins. The lag 1 autocorrelation is calculated for the residuals within each bin. The plot shows the lag 1 autocorrelation in each bin over time.

This diagnostic is useful for diagnosing time series with noise whose autocorrelation varies over time.

When passing an array of parameters (from an MCMC sampler), this method plots the autocorrelations of the posterior median residual values.

Typically, this diagnostic is called after obtaining the residuals of an IID fit, in order to determine whether the IID fit is satisfactory or a more complex noise model is needed.

This function returns a matplotlib figure.

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • thinning – Optional int value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.
  • n_bins – Optional int value (greater than zero) giving the number of bins into which to divide the time series. By default, it is fixed to 25.
pints.residuals_diagnostics.plot_residuals_binned_std(parameters, problem, thinning=None, n_bins=25)[source]

Plot the standard deviation of the residuals within bins (i.e. discrete time windows across the series).

Given a time series with observed residuals

\[e_i = y_i - f(t_i; \theta)\]

This method divides the vector of residuals into some number of equally sized bins. The standard deviation is calculated for the residuals within each bin. The plot shows the standard deviation in each bin over time.

This diagnostic is particularly useful for diagnosing time series whose noise exhibits a change in variance over time.

When passing an array of parameters (from an MCMC sampler), this method will plot the standard deviation of the posterior median residual values.

Typically, this diagnostic can be called after obtaining the residuals of an IID fit, in order to determine whether the IID fit is satisfactory or a more complex noise model is needed.

This function returns a matplotlib figure.

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • thinning – Optional int value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.
  • n_bins – Optional int value (greater than zero) giving the number of bins into which to divide the time series. By default, it is fixed to 25.
pints.residuals_diagnostics.plot_residuals_distance(parameters, problem, thinning=None)[source]

Plot a distance matrix of the residuals.

Given a time series with observed residuals

\[e_i = y_i - f(t_i; \theta)\]

this function generates and plots the distance matrix \(D\) whose entries are defined by

\[D_{i, j} = |e_i - e_j|\]

The plot of this matrix may be helpful for identifying a time series with correlated noise. When the noise terms are correlated, the distance matrix \(D\) is likely to have a banded appearance.

For problems with multiple outputs, one distance matrix is generated for each output.

When passing an array of parameters (from an MCMC sampler), this method will plot the distance matrix of the posterior median residual values.

Typically, this diagnostic is called after obtaining the residuals of an IID fit, in order to determine whether the IID fit is satisfactory or a more complex noise model is needed.

This function returns a matplotlib figure.

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • thinning – Optional int value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.
pints.residuals_diagnostics.plot_residuals_vs_output(parameters, problem, thinning=None)[source]

Draw a plot of the magnitude of residuals versus the solution output.

This plot is useful to detect any dependence between the error model and the magnitude of the solution. For example, it may help to detect multiplicative Gaussian noise, in which the standard deviation of the error scales with the output.

When multiple samples of the parameters are provided (from an MCMC chain), the residuals are calculated and plotted relative to the posterior median of the solution outputs.

This function returns a matplotlib figure.

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • thinning – Optional, integer value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.

Diagnostics

pints.residuals_diagnostics.acorr(x, max_lag)[source]

Calculate the normalised autocorrelation for a given data series.

This function uses the same procedure as matplotlib.pyplot.acorr, but it just calculates the autocorrelation without plotting anything.

Returns the autocorrelation as a NumPy array.

Parameters:
  • x – A 1d NumPy array containing the time series for which to calculate autocorrelation.
  • max_lag – An int specifying the highest lag to consider.
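
A minimal sketch of calling this helper on a synthetic white-noise series (the layout of the returned array is not specified here, so the example simply prints it):

import numpy as np
import pints.residuals_diagnostics

# White noise should show near-zero autocorrelation at all non-zero lags
x = np.random.normal(size=500)
print(pints.residuals_diagnostics.acorr(x, max_lag=5))
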
pints.residuals_diagnostics.calculate_residuals(parameters, problem, thinning=None)[source]

Calculate the residuals (difference between actual data and the fit).

Either a single set of parameters or a chain of MCMC samples can be provided.

The residuals are returned as a 3-dimensional NumPy array with shape (n_samples, n_outputs, n_times).

Parameters:
  • parameters – The parameter values with shape (n_samples, n_parameters). When passing a single best fit parameter vector, n_samples will be 1.
  • problem – The problem given by a pints.SingleOutputProblem or pints.MultiOutputProblem, with n_parameters greater than or equal to the n_parameters of the parameters. Extra parameters not found in the problem are ignored.
  • thinning – Optional, integer value (greater than zero). If thinning is set to n, only every nth sample in parameters will be used. If set to None (default), some thinning will be applied so that about 200 samples will be used.

Toy problems

The toy module provides toy models, distributions and error measures that can be used for tests and in examples.

Some toy classes provide extra functionality defined in the pints.toy.ToyModel and pints.toy.ToyLogPDF classes.

Toy base classes

class pints.toy.ToyLogPDF[source]

Abstract base class for toy distributions.

Extends pints.LogPDF.

distance(samples)[source]

Calculates a measure of distance from samples to some characteristic of the underlying distribution.

evaluateS1(x)

Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (L, L') where L is a scalar value and L' is a sequence of length n_parameters.

Note that the derivative returned is of the log-pdf, so L' = d/dp log(f(p)), evaluated at p=x.

This is an optional method that is not always implemented.

n_parameters()

Returns the dimension of the space this LogPDF is defined over.

sample(n_samples)[source]

Generates independent samples from the underlying distribution.

suggested_bounds()[source]

Returns suggested boundaries for a prior.

class pints.toy.ToyModel[source]

Defines an interface for toy problems.

Note that toy models should extend both ToyModel and one of the forward model classes, e.g. pints.ForwardModel.

suggested_parameters()[source]

Returns a NumPy array of the parameter values that are representative of the model.

For example, these parameters might reproduce a particular result that the model is famous for.

suggested_times()[source]

Returns a NumPy array of time points that is representative of the model.

class pints.toy.ToyODEModel[source]

Defines an interface for toy problems where the underlying model is an ordinary differential equation (ODE) that describes some time-series generating model.

Note that toy ODE models should extend both pints.ToyODEModel and one of the forward model classes, e.g. pints.ForwardModel or pints.ForwardModelS1.

To use this class as the basis for a pints.ForwardModel, the method _rhs() should be reimplemented.

Models implementing _rhs(), jacobian() and _dfdp() can be used to create a pints.ForwardModelS1.

_dfdp(y, t, p)[source]

Returns the derivative of the ODE RHS at time t, with respect to model parameters p.

Parameters:
  • y – The state vector at time t (with length n_outputs).
  • t – The time to evaluate at (as a scalar).
  • p – A vector of model parameters (of length n_parameters).
Returns: A matrix of dimensions n_outputs by n_parameters.

_rhs(y, t, p)[source]

Returns the evaluated RHS (dy/dt) for a given state vector y, time t, and parameter vector p.

Parameters:
  • y – The state vector at time t (with length n_outputs).
  • t – The time to evaluate at (as a scalar).
  • p – A vector of model parameters (of length n_parameters).
Returns: A vector of length n_outputs.

initial_conditions()[source]

Returns the initial conditions of the model.

jacobian(y, t, p)[source]

Returns the Jacobian (the derivative of the RHS ODE with respect to the outputs) at time t.

Parameters:
  • y – The state vector at time t (with length n_outputs).
  • t – The time to evaluate at (as a scalar).
  • p – A vector of model parameters (of length n_parameters).
Returns: A matrix of dimensions n_outputs by n_outputs.

n_states()[source]

Returns the number of states in the underlying ODE. Note: this will not be the same as n_outputs() for models where only a subset of states are observed.

set_initial_conditions(y0)[source]

Sets the initial conditions of the model.

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulateS1(parameters, times)[source]

See pints.ForwardModelS1.simulateS1().

suggested_parameters()

Returns a NumPy array of the parameter values that are representative of the model.

For example, these parameters might reproduce a particular result that the model is famous for.

suggested_times()

Returns a NumPy array of time points that is representative of the model.

Annulus Distribution

class pints.toy.AnnulusLogPDF(dimensions=2, r0=10, sigma=1)[source]

Toy distribution based on a d-dimensional distribution of the form

\[f(x|r_0, \sigma) \propto e^{-(|x|-r_0)^2 / {2\sigma^2}}\]

where \(x\) is a d-dimensional real, and \(|x|\) is the Euclidean norm.

This distribution is roughly a one-dimensional Gaussian distribution centred on \(r_0\), smeared over the surface of a hypersphere of the same radius. In two dimensions, the density looks like a circular annulus.

Extends pints.LogPDF.

Parameters:
  • dimensions (int) – The dimensionality of the space.
  • r0 (float) – The radius of the hypersphere; this is approximately the mean normed distance from the origin.
  • sigma (float) – The width of the annulus; approximately the standard deviation of the normed distance.
distance(samples)[source]

Calculates a measure of the normed distance of samples from the exact mean and covariance matrix, assuming a uniform prior with bounds given by suggested_bounds().

See ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

mean()[source]

Returns the mean of this distribution.

mean_normed()[source]

Returns the mean of the normed distance from the origin.

moment_normed(order)[source]

Returns a given moment of the normed distance from the origin.

n_parameters()[source]

Returns the dimension of the space this LogPDF is defined over.

r0()[source]

Returns r0.

sample(n_samples)[source]

See ToyLogPDF.sample().

sigma()[source]

Returns sigma.

suggested_bounds()[source]

See ToyLogPDF.suggested_bounds().

var_normed()[source]

Returns the variance of the normed distance from the origin.

Beeler-Reuter Action Potential Model

class pints.toy.ActionPotentialModel(y0=None)[source]

The 1977 Beeler-Reuter model of the mammalian ventricular action potential (AP).

This model is written as an ODE with 8 states and several intermediary variables: for the full model equations, please see the original paper [1].

The model contains 5 ionic currents, each described by a sub-model with several kinetic parameters, and a maximum conductance parameter that determines its magnitude. Only the 5 conductance parameters are varied in this ToyModel, all other parameters are fixed and assumed to be known. To aid in inference, a parameter transformation is used: instead of specifying the maximum conductances directly, their natural logarithm should be used. In other words, the parameter vector passed to simulate() should contain the logarithm of the five conductances.

As outputs, we use the AP and the calcium transient, as these are the only two states (out of the total of eight) with a physically observable counterpart. This makes this a fairly hard problem.

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:y0 – The initial state of the observables V and Ca_i, where Ca_i must be 0 or greater. If not given, the defaults are -84.622 and 2e-7.

References

[1]Reconstruction of the action potential of ventricular myocardial fibres. Beeler, Reuter (1977) Journal of Physiology https://doi.org/10.1113/jphysiol.1977.sp011853
initial_conditions()[source]

Returns the initial conditions of this model.

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

set_initial_conditions(y0)[source]

Changes the initial conditions for this model.

set_solver_tolerances(rtol=0.0001, atol=1e-06)[source]

Updates the solver tolerances. See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.odeint.html

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulate_all_states(parameters, times)[source]

Runs a simulation and returns all state variables, including the ones that do not have a physically observable counterpart.

suggested_parameters()[source]

Returns suggested parameters for this model. The returned vector is already log-transformed, and can be passed directly to simulate().

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

Cone Distribution

class pints.toy.ConeLogPDF(dimensions=2, beta=1)[source]

Toy distribution based on a d-dimensional distribution of the form,

\[f(x) \propto e^{-|x|^\beta}\]

where x is a d-dimensional real, and |x| is the Euclidean norm. The mean and variance that are returned relate to expectations of |x|, not of the multidimensional x.

Extends pints.LogPDF.

Parameters:
  • dimensions (int) – The dimensionality of the cone.
  • beta (float) – The power to which |x| is raised in the exponential term, which must be positive.
CDF(x)[source]

Returns the cumulative distribution function in terms of |x|.

beta()[source]

Returns the exponent in the pdf.

distance(samples)[source]

Calculates a measure of the normed distance of samples from the exact mean and covariance matrix, assuming a uniform prior with bounds given by suggested_bounds().

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

mean_normed()[source]

Returns the mean of the normed distance from the origin.

n_parameters()[source]

Returns the dimension of the space this LogPDF is defined over.

sample(n_samples)[source]

See ToyLogPDF.sample().

suggested_bounds()[source]

See ToyLogPDF.suggested_bounds().

var_normed()[source]

Returns the variance of the normed distance from the origin.

Constant Model

class pints.toy.ConstantModel(n, force_multi_output=False)[source]

Toy model that is constant over time and linear in the parameters, mostly useful for unit testing.

For an n-dimensional model, evaluated with parameters p = [p_1, p_2, ..., p_n], the simulated values are time-invariant, so that for any time t

\[f(t) = (p_1, 2 p_2, 3 p_3, ..., n p_n)\]

The derivatives with respect to the parameters are time-invariant, and simply equal

\[\begin{split}\frac{\partial f_i(t)}{\partial p_j} = \begin{cases} i, & i = j\\ 0, & i \neq j \end{cases}\end{split}\]

Extends pints.ForwardModelS1.

Parameters:
  • n (int) – The number of parameters (and outputs) the model should have.
  • force_multi_output (boolean) – Set to True to always return output of the shape (n_times, n_outputs), even if n_outputs == 1.

Example

import numpy as np
import pints.toy

times = np.linspace(0, 1, 100)
m = pints.toy.ConstantModel(2)
m.simulate([1, 2], times)

In this example, the returned output is [1, 4] at every point in time.

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulateS1(parameters, times)[source]

See pints.ForwardModelS1.simulateS1().

Eight Schools distribution

class pints.toy.EightSchoolsLogPDF(centered=True)[source]

The classic Eight Schools example that is discussed in [1].

The aim of this model (implemented as a pints.ToyLogPDF) is to determine the effects of coaching on SAT scores in 8 schools (each school being denoted by subscript j in the following equations). It is used by statisticians to illustrate how hierarchical models can quite easily become unidentified, making inference hard.

This model is hierarchical and takes the form,

\[\begin{split}\begin{align} \mu &\sim \mathcal{N}(0, 5) \\ \tau &\sim \text{Cauchy}(0, 5) \\ \theta_j &\sim \mathcal{N}(\mu, \tau) \\ y_j &\sim \mathcal{N}(\theta_j, \sigma_j), \\ \end{align}\end{split}\]

where \(\sigma_j\) is known. The user may choose between the “centered” parameterisation of the model (which exactly mirrors the statistical model), and the “non-centered” parameterisation, which introduces auxiliary variables to improve chain mixing. The non-centered model takes the form,

\[\begin{split}\begin{align} \mu &\sim \mathcal{N}(0, 5) \\ \tau &\sim \text{Cauchy}(0, 5) \\ \tilde{\theta}_j &\sim \mathcal{N}(0, 1) \\ \theta_j &= \mu + \tilde{\theta}_j \tau \\ y_j &\sim \mathcal{N}(\theta_j, \sigma_j). \\ \end{align}\end{split}\]

Note that, in the non-centered case, the parameter samples correspond to \(\tilde{\theta}\) rather than \(\theta\).

The model uses a 10-dimensional parameter vector, composed of

  • mu, the population-level score
  • tau, the population-level standard deviation
  • theta_j, school j’s mean score (for each of the 8 schools).

Extends pints.toy.ToyLogPDF.

Parameters:centered (bool) – Whether or not to use the centered formulation.

References

[1]“Bayesian data analysis”, 3rd edition, 2014, Gelman, A. et al.
data()[source]

Returns data used to fit model from [1].

distance(samples)

Calculates a measure of distance from samples to some characteristic of the underlying distribution.

evaluateS1(x)[source]

See pints.LogPDF.evaluateS1().

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n_samples)

Generates independent samples from the underlying distribution.

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

Fitzhugh-Nagumo Model

class pints.toy.FitzhughNagumoModel(y0=None)[source]

Fitzhugh-Nagumo model of the action potential [1].

Has two states and three phenomenological parameters: a, b, c. All states are visible.

\[\frac{d \mathbf{y}}{dt} = \mathbf{f}(\mathbf{y},\mathbf{p},t)\]

where

\[\begin{split}\mathbf{y} &= (V,R)\\ \mathbf{p} &= (a,b,c)\end{split}\]

The RHS, Jacobian, and derivative of the RHS with respect to the parameters are given by

\[\begin{split}\begin{align} \mathbf{f}(\mathbf{y},\mathbf{p},t) &= \left[\begin{matrix} c \left(R - V^{3}/3+V\right) \\ - \frac{1}{c} \left(R b + V - a\right) \end{matrix}\right] \\ \frac{\partial \mathbf{f}}{\partial \mathbf{y}} &= \left[\begin{matrix} c \left(1- V^{2}\right) & c \\ - \frac{1}{c} & - \frac{b}{c} \end{matrix}\right] \\ \frac{\partial \mathbf{f}}{\partial \mathbf{p}} &= \left[\begin{matrix} 0 & 0 & R - V^{3}/3 + V\\ \frac{1}{c} & - \frac{R}{c} & \frac{1}{c^{2}} \left(R b + V - a\right) \end{matrix}\right] \end{align}\end{split}\]

Extends pints.ForwardModelS1, pints.toy.ToyODEModel.

Parameters:y0 – The system’s initial state. If not given, the default [-1, 1] is used.

References

[1]A kinetic model of the conductance changes in nerve membrane Fitzhugh (1965) Journal of Cellular and Comparative Physiology. https://doi.org/10.1002/jcp.1030660518
initial_conditions()

Returns the initial conditions of the model.

jacobian(y, t, p)[source]

See pints.ToyODEModel.jacobian().

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

n_states()

Returns the number of states in the underlying ODE. Note: this will not be the same as n_outputs() for models where only a subset of states are observed.

set_initial_conditions(y0)

Sets the initial conditions of the model.

simulate(parameters, times)

See pints.ForwardModel.simulate().

simulateS1(parameters, times)

See pints.ForwardModelS1.simulateS1().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

Gaussian distribution

class pints.toy.GaussianLogPDF(mean=[0, 0], sigma=[1, 1])[source]

Toy distribution based on a multivariate (unimodal) Normal/Gaussian distribution.

Extends pints.toy.ToyLogPDF.

Parameters:
  • mean – The distribution mean (specified as a vector).
  • sigma – The distribution’s covariance matrix. Can be given as either a matrix or a vector (in which case diag(sigma) will be used). Should be symmetric and positive-semidefinite.
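
A short sketch of how this toy distribution can be used to check the output of a sampler, using the sample() and kl_divergence() methods listed below (the sample size is arbitrary):

import pints.toy

pdf = pints.toy.GaussianLogPDF(mean=[0, 0], sigma=[1, 1])

# Draw samples from the underlying distribution and score them:
# the KL divergence should be close to zero for a good set of samples
samples = pdf.sample(1000)
print(pdf.kl_divergence(samples))
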
distance(samples)[source]

Returns the Kullback-Leibler divergence.

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See pints.LogPDF.evaluateS1().

kl_divergence(samples)[source]

Calculates the Kullback-Leibler divergence between a given list of samples and the distribution underlying this LogPDF.

The returned value is (near) zero for perfect sampling, and then increases as the error gets larger.

See: https://en.wikipedia.org/wiki/Kullback-Leibler_divergence

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n)[source]

See pints.toy.ToyLogPDF.sample().

suggested_bounds()

Returns suggested boundaries for a prior.

German Credit Hierarchical Logistic Distribution

class pints.toy.GermanCreditHierarchicalLogPDF(x=None, y=None, download=False)[source]

Toy distribution based on a hierarchical logistic regression model, which takes the form,

\[f(z, y|\beta) \propto \text{exp}(-\sum_{i=1}^{N} \text{log}(1 + \text{exp}(-y_i z_i.\beta)) - \beta.\beta/2\sigma^2 - N/2 \text{log }\sigma^2 - \lambda \sigma^2)\]

The data \((z, y)\) are a matrix of individual predictors (with 1s in the first column) and responses (1 if the individual should receive credit and -1 if not) respectively; \(\beta\) is a 325x1 vector of coefficients and \(N=1000\); \(z\) is the design matrix formed by creating all interactions between individual variables and themselves as defined in [2].

Extends pints.LogPDF.

Parameters:theta (float) – vector of coefficients of length 326 (first dimension is sigma; other entries make up beta)

References

[1]“UCI machine learning repository”, 2010. A. Frank and A. Asuncion.
[2]“The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo”, 2014, M.D. Hoffman and A. Gelman.
data()[source]

Returns data used to fit model: x, y and z.

distance(samples)

Calculates a measure of distance from samples to some characteristic of the underlying distribution.

evaluateS1(theta)[source]

See LogPDF.evaluateS1().

n_parameters()[source]

Returns the dimension of the space this LogPDF is defined over.

sample(n_samples)

Generates independent samples from the underlying distribution.

suggested_bounds()[source]

See ToyLogPDF.suggested_bounds().

German Credit Logistic Distribution

class pints.toy.GermanCreditLogPDF(x=None, y=None, download=False)[source]

Toy distribution based on a logistic regression model, which takes the form,

\[f(x, y|\beta) \propto \text{exp}(-\sum_{i=1}^{N} \text{log}(1 + \text{exp}(-y_i x_i.\beta)) - \beta.\beta/2\sigma^2)\]

The data \((x, y)\) are a matrix of individual predictors (with 1s in the first column) and responses (1 if the individual should receive credit and -1 if not) respectively; \(\beta\) is a 25x1 vector of coefficients and \(\sigma^2=100\). The dataset here is from [1] but the test problem is defined in [2].

Extends pints.LogPDF.

Parameters:beta (float) – vector of coefficients of length 25.

References

[1]“UCI machine learning repository”, 2010. A. Frank and A. Asuncion.
[2]“The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo”, 2014, M.D. Hoffman and A. Gelman.
data()[source]

Returns data used to fit model.

distance(samples)

Calculates a measure of distance from samples to some characteristic of the underlying distribution.

evaluateS1(beta)[source]

See LogPDF.evaluateS1().

n_parameters()[source]

Returns the dimension of the space this LogPDF is defined over.

sample(n_samples)

Generates independent samples from the underlying distribution.

suggested_bounds()[source]

See ToyLogPDF.suggested_bounds().

Goodwin oscillator model

class pints.toy.GoodwinOscillatorModel[source]

Three-state Goodwin oscillator toy model introduced in [1], [2], but best described in [3]. The model considers the level of mRNA, \(x\), which is translated into protein \(y\), which, in turn, stimulates production of protein \(z\) that inhibits the production of mRNA. The ODE system is described by the following equations,

\[\begin{split}\dot{x} &= 1 / (1 + z^{10}) - m_1 x \\ \dot{y} &= k_2 x - m_2 y \\ \dot{z} &= k_3 y - m_3 z\end{split}\]

Parameters are \([k_2, k_3, m_1, m_2, m_3]\). The initial conditions are hard-coded at [0.0054, 0.053, 1.93].

Extends pints.ForwardModelS1, pints.toy.ToyODEModel.

References

[1]Oscillatory behavior in enzymatic control processes. Goodwin (1965) Advances in enzyme regulation. https://doi.org/10.1016/0065-2571(65)90067-1
[2]Mathematics of cellular control processes I. Negative feedback to one gene. Griffith (1968) Journal of theoretical biology. https://doi.org/10.1016/0022-5193(68)90189-6
[3]Estimating Bayes factors via thermodynamic integration and population MCMC. Ben Calderhead and Mark Girolami, 2009, Computational Statistics and Data Analysis.
initial_conditions()

Returns the initial conditions of the model.

jacobian(state, time, parameters)[source]

See pints.ToyODEModel.jacobian().

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

n_states()

Returns the number of states in the underlying ODE. Note: this will not be the same as n_outputs() for models where only a subset of states are observed.

set_initial_conditions(y0)

Sets the initial conditions of the model.

simulate(parameters, times)

See pints.ForwardModel.simulate().

simulateS1(parameters, times)

See pints.ForwardModelS1.simulateS1().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

HES1 Michaelis-Menten Model

class pints.toy.Hes1Model(m0=None, fixed_parameters=None)[source]

HES1 Michaelis-Menten model of regulatory dynamics [1].

This model describes the expression level of the transcription factor Hes1.

\[\begin{split}\frac{dm}{dt} &= -k_{deg}m + \frac{1}{1 + (p_2/P_0)^h} \\ \frac{dp_1}{dt} &= -k_{deg} p_1 + \nu m - k_1 p_1 \\ \frac{dp_2}{dt} &= -k_{deg} p_2 + k_1 p_1\end{split}\]

The system is determined by 3 state variables \(m\), \(p_1\), and \(p_2\). It is assumed that only \(m\) can be observed, that is, only \(m\) is an observable. The initial conditions of the other two state variables and \(k_{deg}\) are treated as implicit parameters of the system. The input order of parameters of interest is \(\{ P_0, \nu, k_1, h \}\).

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:
  • m0 (float) – The initial condition of the observable m. Requires m0 >= 0.
  • fixed_parameters – The fixed parameters of the model which are not inferred, given as a vector [p1_0, p2_0, k_deg] with p1_0, p2_0, k_deg >= 0.

References

[1]Silk, D., et al. 2011. Designing attractive models via automated identification of chaotic and oscillatory dynamical regimes. Nature communications, 2, p.489. https://doi.org/10.1038/ncomms1496
fixed_parameters()[source]

Returns the fixed parameters of the model which are not inferred, given as a vector [p1_0, p2_0, k_deg].

initial_conditions()

Returns the initial conditions of the model.

jacobian(state, time, parameters)[source]

See pints.ToyODEModel.jacobian().

m0()[source]

Returns the initial conditions of the m variable.

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

n_states()[source]

See pints.ToyODEModel.n_states().

set_fixed_parameters(k)[source]

Changes the implicit parameters for this model.

set_initial_conditions(y0)

Sets the initial conditions of the model.

set_m0(m0)[source]

Sets the initial conditions of the m variable.

simulate(parameters, times)

See pints.ForwardModel.simulate().

simulateS1(parameters, times)

See pints.ForwardModelS1.simulateS1().

simulate_all_states(parameters, times)[source]

Returns all state variables that simulate() does not return.

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

suggested_values()[source]

Returns a suggested set of values that matches suggested_times().

High dimensional Gaussian distribution

class pints.toy.HighDimensionalGaussianLogPDF(dimension=20, rho=0.5)[source]

High-dimensional zero-mean multivariate Gaussian log pdf, with off-diagonal correlations.

Specifically, the covariance matrix Sigma is constructed so that diagonal elements are integers: Sigma_i,i = i and off-diagonal elements are Sigma_i,j = rho * sqrt(i) * sqrt(j).

Extends pints.toy.ToyLogPDF.

Parameters:
  • dimension (int) – Dimensions of multivariate Gaussian distribution (which must exceed 1).
  • rho (float) – The correlation between pairs of parameter dimensions. Note that this must be between -1 / (dimension - 1) and 1 so that the covariance matrix is positive semi-definite.
distance(samples)[source]

Returns approximate Kullback-Leibler divergence between samples and underlying distribution.

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See pints.LogPDF.evaluateS1().

kl_divergence(samples)[source]

Returns approximate Kullback-Leibler divergence between samples and underlying distribution.

The returned value is (near) zero for perfect sampling, and then increases as the error gets larger.

See: https://en.wikipedia.org/wiki/Kullback-Leibler_divergence

n_parameters()[source]

See pints.LogPDF.n_parameters().

rho()[source]

Returns rho (the correlation between dimensions).

sample(n_samples)[source]

See pints.toy.ToyLogPDF.sample().

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

Hodgkin-Huxley IK Experiment Model

class pints.toy.HodgkinHuxleyIKModel(initial_condition=0.3)[source]

Toy model based on the potassium current experiments used for Hodgkin and Huxley’s 1952 model of the action potential of a squid’s giant axon [1].

A voltage-step protocol is created and applied to an axon, and the elicited potassium current (\(I_\text{K}\)) is given as model output.

The model equations are

\[\begin{split}\alpha &= p_1 \frac{-V - 75 + p_2}{\exp[(-V - 75 + p_2) / p_3] - 1} \\ \beta &= p_4 \exp[(-V - 75) / p_5] \\ \frac{dn}{dt} &= \alpha \cdot (1 - n) - \beta \cdot n \\ E_\text{K} &= -88 \\ g_\text{max} &= 36 \\ I_\text{K} &= g_\text{max} \cdot n^4 \cdot (V - E_\text{K})\end{split}\]

Where \(p_1, p_2, ..., p_5\) are the parameters varied in this toy model.

During simulation, the membrane potential \(V\) is varied by holding it at -75mV for 90ms, then at a “step potential” for 10ms. The step potentials are based on the values used in the original paper, and are -69, -64, -56, -49, -43, -37, -24, -12, 1, 13, 25, and 34mV. The protocol is applied in the interval \(t = [0, 1200]\), so sampling outside this interval will not provide new information.

With the parameter values from suggested_parameters(), simulation results will match those in [1].

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:initial_condition (float) – The initial value of the state variable \(n\).

References

[1]A quantitative description of membrane currents and its application to conduction and excitation in nerve. Hodgkin, Huxley (1952d) Journal of Physiology. https://doi.org/10.1113/jphysiol.1964.sp007378

Example usage:

import pints.toy

model = pints.toy.HodgkinHuxleyIKModel()

p0 = model.suggested_parameters()
times = model.suggested_times()
values = model.simulate(p0, times)

import matplotlib.pyplot as plt
plt.figure()
plt.plot(times, values)
plt.show()

Alternatively, the data can be displayed using the fold() method:

plt.figure()
for t, v in model.fold(times, values):
    plt.plot(t, v)
plt.show()
fold(times, values)[source]

Takes a set of times and values as returned by this model, and “folds” the individual currents over each other, to create a very common plot in electrophysiology.

Returns a list of tuples (times, values) for each different voltage step.

n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

suggested_duration()[source]

Returns the duration of the experimental protocol modeled in this toy model.

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

Returns an array with the original model parameters used by Hodgkin and Huxley.

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

Logistic model

class pints.toy.LogisticModel(initial_population_size=2)[source]

Logistic model of population growth [1].

\[\begin{split}f(t) &= \frac{k}{1+(k/p_0 - 1) \exp(-r t)} \\ \frac{\partial f(t)}{\partial r} &= \frac{k t (k / p_0 - 1) \exp(-r t)} {((k/p_0-1) \exp(-r t) + 1)^2} \\ \frac{\partial f(t)}{ \partial k} &= -\frac{k \exp(-r t)} {p_0 ((k/p_0-1)\exp(-r t) + 1)^2} + \frac{1}{(k/p_0 - 1)\exp(-r t) + 1}\end{split}\]

Has two model parameters: a growth rate \(r\) and a carrying capacity \(k\). The initial population size \(p_0 = f(0)\) is a fixed (known) parameter in the model.

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:initial_population_size (float) – Sets the initial population size \(p_0\).

References

[1]https://en.wikipedia.org/wiki/Population_growth
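
A minimal sketch of simulating this model with its suggested parameters and times (plotting via matplotlib is assumed):

import matplotlib.pyplot as plt
import pints.toy

model = pints.toy.LogisticModel(initial_population_size=2)
parameters = model.suggested_parameters()
times = model.suggested_times()
values = model.simulate(parameters, times)

plt.figure()
plt.plot(times, values)
plt.show()
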
n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulateS1(parameters, times)[source]

See pints.ForwardModelS1.simulateS1().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

Lotka-Volterra model

class pints.toy.LotkaVolterraModel(y0=None)[source]

Lotka-Volterra model of predator-prey relationships [1].

This model describes cyclical fluctuations in the populations of two interacting species.

\[\begin{split}\frac{dx}{dt} = ax - bxy \\ \frac{dy}{dt} = -cy + dxy\end{split}\]

where x is the number of prey, and y is the number of predators.

Real data is included via suggested_values(), which was taken from [2], and includes hare and lynx pelt count data collected by the Hudson’s Bay Company in Canada in the early twentieth century.

Extends pints.ForwardModelS1, pints.toy.ToyODEModel.

Parameters:y0 – The initial population, given as a vector [a, b] such that a >= 0 and b >= 0.

References

[1]https://en.wikipedia.org/wiki/Lotka-Volterra_equations
[2]Howard, P. (2009). Modeling basics. Lecture Notes for Math 442, Texas A&M University
initial_conditions()[source]

Returns the current initial conditions.

jacobian(z, t, p)[source]

See pints.ToyODEModel.jacobian().

n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

n_states()

Returns the number of states in the underlying ODE. Note: this will not be the same as n_outputs() for models where only a subset of states are observed.

set_initial_conditions(y0)[source]

Changes the initial conditions for this model.

simulate(parameters, times)

See pints.ForwardModel.simulate().

simulateS1(parameters, times)

See pints.ForwardModelS1.simulateS1().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

suggested_values()[source]

Returns hare-lynx pelt count data collected by the Hudson’s Bay Company in Canada in the early twentieth century, which is taken from [2]. The data given here corresponds to annual observations taken from 1900-1920 (inclusive).

Multimodal Gaussian distribution

class pints.toy.MultimodalGaussianLogPDF(modes=None, covariances=None)[source]

Multimodal (un-normalised) multivariate Gaussian distribution.

By default, the distribution is on a 2-dimensional space, with modes at (0, 0) and (10, 10) with independent unit covariance matrices.

Examples:

import pints.toy

# Default 2d, bimodal
f = pints.toy.MultimodalGaussianLogPDF()

# 3d bimodal
f = pints.toy.MultimodalGaussianLogPDF([[0, 1, 2], [10, 10, 10]])

# 2d with 3 modes
f = pints.toy.MultimodalGaussianLogPDF([[0, 0], [5, 5], [5, 0]])

Extends pints.toy.ToyLogPDF.

Parameters:
  • modes – A list of points that will form the modes of the distribution. Must all have the same dimension. If not set, the method will revert to the bimodal distribution described above.
  • covariances – A list of covariance matrices, one for each mode. If not set, a unit matrix will be used for each.
distance(samples)[source]

Calculates the approximate KL divergence for each mode, then sums these.

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

kl_divergence(samples)[source]

Calculates the approximate Kullback-Leibler divergence between a given list of samples and the distribution underlying this LogPDF. It does this by first assigning each point to its most likely mode then calculating KL for each mode separately. If one mode is found with no near samples then all the samples are used to calculate KL for this mode.

The returned value is (near) zero for perfect sampling, and then increases as the error gets larger.

See: https://en.wikipedia.org/wiki/Kullback-Leibler_divergence

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n_samples)[source]

See pints.toy.ToyLogPDF.sample().

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

Neal’s Funnel Distribution

class pints.toy.NealsFunnelLogPDF(dimensions=10)[source]

Toy distribution based on a d-dimensional distribution of the form,

\[f(x_1, x_2,...,x_d,\nu) = \left[\prod_{i=1}^d\mathcal{N}(x_i|0,e^{\nu/2})\right] \times \mathcal{N}(\nu|0,3)\]

where x is a d-dimensional real vector. This distribution was introduced in [1].

Extends pints.toy.ToyLogPDF.

Parameters:dimensions (int) – The dimensionality of the funnel (10 by default), which must exceed 1.

References

[1]“Slice sampling”. R. Neal, Annals of statistics, 705 (2003) https://doi.org/10.1214/aos/1056562461
distance(samples)[source]

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

kl_divergence(samples)[source]

Calculates the KL divergence of samples of the \(\nu\) parameter of Neal’s funnel from the analytic \(\mathcal{N}(0, 3)\) result.

marginal_log_pdf(x, nu)[source]

Returns the marginal log-density \(\log p(x_i, \nu)\).

mean()[source]

Returns the mean of the target distribution in each dimension.

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n_samples)[source]

See pints.toy.ToyLogPDF.sample().

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

var()[source]

Returns the variance of the target distribution in each dimension. Note that \(\nu\) is the last entry.

Parabolic error

class pints.toy.ParabolicError(c=[0, 0])[source]

Error measure based on a simple parabola centered on a user-specified point.

\[f(x) = \sum (x - c)^2\]

Extends pints.ErrorMeasure.

Parameters:c (sequence) – The center of the parabola.
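
A small illustrative sketch (assuming, as elsewhere in pints, that ErrorMeasure objects can be called directly):

import pints.toy

f = pints.toy.ParabolicError([1, 2])
f.optimum()   # [1, 2], where the error is zero
f([2, 3])     # (2 - 1)^2 + (3 - 2)^2 = 2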
evaluateS1(x)[source]

See pints.ErrorMeasure.evaluateS1().

n_parameters()[source]

See pints.ErrorMeasure.n_parameters().

optimum()[source]

Returns the global optimum for this function.

Repressilator model

class pints.toy.RepressilatorModel(y0=None)[source]

The “Repressilator” model describes oscillations in a network of proteins that suppress their own creation [1], [2].

The formulation used here is taken from [3] and analysed in [4]. It has three protein states (\(p_i\)), each encoded by mRNA (\(m_i\)). Once expressed, they suppress each other:

\[ \begin{align}\begin{aligned}\dot{m_0} = -m_0 + \frac{\alpha}{1 + p_2^n} + \alpha_0\\\dot{m_1} = -m_1 + \frac{\alpha}{1 + p_0^n} + \alpha_0\\\dot{m_2} = -m_2 + \frac{\alpha}{1 + p_1^n} + \alpha_0\\\dot{p_0} = -\beta (p_0 - m_0)\\\dot{p_1} = -\beta (p_1 - m_1)\\\dot{p_2} = -\beta (p_2 - m_2)\end{aligned}\end{align} \]

With parameters alpha_0, alpha, beta, and n.

Only the mRNA states are visible as output.

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:y0 – The system’s initial state, must have 6 entries all >=0.

References

[1]A Synthetic Oscillatory Network of Transcriptional Regulators. Elowitz, Leibler (2000) Nature. https://doi.org/10.1038/35002125
[2]https://en.wikipedia.org/wiki/Repressilator
[3]Dynamic models in biology. Ellner, Guckenheimer (2006) Princeton University Press
[4]Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Toni, Welch, Strelkowa, Ipsen, Stumpf (2009) J. R. Soc. Interface. https://doi.org/10.1098/rsif.2008.0172
n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

Rosenbrock function

class pints.toy.RosenbrockError[source]

Error measure based on the Rosenbrock function [1].

\[f(x,y) = (1 - x)^2 + 100(y - x^2)^2\]

Extends pints.ErrorMeasure.

References

[1]https://en.wikipedia.org/wiki/Rosenbrock_function
evaluateS1(x)

Evaluates this error measure, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data has the shape (e, e') where e is a scalar value and e' is a sequence of length n_parameters.

This is an optional method that is not always implemented.

n_parameters()[source]

See pints.ErrorMeasure.n_parameters().

optimum()[source]

Returns the global optimum for this function.

class pints.toy.RosenbrockLogPDF[source]

Unnormalised LogPDF based on the Rosenbrock function [2] with an addition of 1 on the denominator to avoid a discontinuity:

\[f(x,y) = -log[1 + (1 - x)^2 + 100(y - x^2)^2 ]\]

Extends pints.toy.ToyLogPDF.

References

[2]https://en.wikipedia.org/wiki/Rosenbrock_function
distance(samples)[source]

Calculates a normed distance between the samples and the exact mean and covariance matrix, assuming a uniform prior with bounds given by suggested_bounds().

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

n_parameters()[source]

See pints.LogPDF.n_parameters().

optimum()[source]

Returns the global optimum for this LogPDF.

sample(n_samples)

Generates independent samples from the underlying distribution.

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

Simple Egg Box Distribution

class pints.toy.SimpleEggBoxLogPDF(sigma=2, r=4)[source]

Two-dimensional multimodal Gaussian distribution, with four more-or-less independent modes, each centered in a different quadrant.

Extends pints.toy.ToyLogPDF.

Parameters:
  • sigma (float) – The variance of each mode.
  • r (float) – Determines the positions of the modes, which will be located at (d, d), (-d, d), (-d, -d), and (d, -d), where d = r * sigma.
distance(samples)[source]

Calculates approximate mode-wise KL divergence.

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

kl_divergence(samples)[source]

Calculates a heuristic score for how well a given set of samples matches this LogPDF’s underlying distribution, based on Kullback-Leibler divergence of the individual modes. This only works well if the modes are nicely separated, i.e. for larger values of r.

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n)[source]

See ToyLogPDF.sample().

suggested_bounds()[source]

See ToyLogPDF.suggested_bounds().

Simple Harmonic Oscillator model

class pints.toy.SimpleHarmonicOscillatorModel[source]

Simple harmonic oscillator model for a particle that experiences a force in proportion to its displacement from an equilibrium position, and, in addition, a friction force. The system’s behaviour is determined by a second order ordinary differential equation (from Newton’s second law):

\[\frac{d^2y}{dt^2} = -y(t) - \theta \frac{dy(t)}{dt}\]

Here it has been assumed that the particle has unit mass and that the restoring force has a constant of proportionality equal to 1.

The model has three parameters: the initial position of the particle, y(0), its initial momentum, dy/dt(0), and the magnitude of the friction force, theta.

Extends pints.ForwardModel, pints.toy.ToyModel.

References

[1]https://en.wikipedia.org/wiki/Simple_harmonic_motion
n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulateS1(parameters, times)[source]

See pints.ForwardModelS1.simulateS1().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

SIR Epidemiology model

class pints.toy.SIRModel(y0=None)[source]

The SIR model of infectious disease models the number of susceptible (S), infected (I), and recovered (R) people in a population [1], [2].

The particular model given here is analysed in [3], and is described by the following three-state ODE:

\[ \begin{align}\begin{aligned}\dot{S} = -\gamma S I\\\dot{I} = \gamma S I - v I\\\dot{R} = v I\end{aligned}\end{align} \]

where the parameters are gamma (the infection rate) and v (the recovery rate). In addition, we assume the initial value of S, S0, is unknown, leading to a three-parameter model (gamma, v, S0).

The number of infected people and recovered people are observable, making this a 2-output system. S can be thought of as an unknown number of susceptible people within a larger population.

The model does not account for births and deaths, which are assumed to happen much slower than the spread of the (non-lethal) disease.

Real data is included via suggested_values(), which was taken from [3], [4], [5].

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:y0 – The system’s initial state, must have 3 entries all >=0.

References

[1]A Contribution to the Mathematical Theory of Epidemics. Kermack, McKendrick (1927) Proceedings of the Royal Society A. https://doi.org/10.1098/rspa.1927.0118
[2]https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology
[3](1, 2) Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Toni, Welch, Strelkowa, Ipsen, Stumpf (2009) J. R. Soc. Interface. https://doi.org/10.1098/rsif.2008.0172
[4](1, 2) A mathematical model of common-cold epidemics on Tristan da Cunha. Hammond, Tyrrell (1971) Epidemiology & Infection. https://doi.org/10.1017/S0022172400021677
[5](1, 2) Common colds on Tristan da Cunha. Shybli, Gooch, Lewis, Tyrell (1971) Epidemiology & Infection. https://doi.org/10.1017/S0022172400021483
n_outputs()[source]

See pints.ForwardModel.n_outputs().

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

suggested_parameters()[source]

Returns a suggested set of parameters for this toy model.

suggested_times()[source]

Returns a suggested set of simulation times for this toy model.

suggested_values()[source]

Returns the data from a common-cold outbreak on the remote island of Tristan da Cunha, as given in [3], [4], [5].

Stochastic degradation model

class pints.toy.StochasticDegradationModel(initial_molecule_count=20)[source]

Stochastic degradation model of a single chemical reaction starting from an initial molecule count \(A(0)\) and degrading to 0 with a fixed rate \(k\):

\[A \xrightarrow{k} 0\]

Simulations are performed using Gillespie’s algorithm [1], [2]:

  1. Sample a random value \(r\) from a uniform distribution
\[r \sim U(0,1)\]
  2. Calculate the time \(\tau\) until the next single reaction as
\[\tau = \frac{-\ln(r)}{A(t) k}\]
  3. Update the molecule count \(A\) at time \(t + \tau\) as:
\[A(t + \tau) = A(t) - 1\]
  4. Return to step (1) until the molecule count reaches 0.

The model has one parameter, the rate constant \(k\).
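
For illustration, the steps above can be written as a short stand-alone sketch, independent of this class (the function name and arguments are hypothetical):

import numpy as np

def gillespie_degradation(A0, k, seed=None):
    # Simulates A -> 0 with rate k, returning reaction times and molecule counts
    rng = np.random.default_rng(seed)
    t, A = 0.0, A0
    times, counts = [t], [A]
    while A > 0:
        r = rng.uniform()             # step 1: sample r ~ U(0, 1)
        tau = -np.log(r) / (A * k)    # step 2: time until the next reaction
        t += tau
        A -= 1                        # step 3: one molecule degrades
        times.append(t)
        counts.append(A)
    return np.array(times), np.array(counts)

times, counts = gillespie_degradation(A0=20, k=0.1)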

Extends pints.ForwardModel, pints.toy.ToyModel.

Parameters:initial_molecule_count – The initial molecule count \(A(0)\).

References

[1]A Practical Guide to Stochastic Simulations of Reaction Diffusion Processes. Erban, Chapman, Maini (2007). arXiv:0704.1908v2 [q-bio.SC] https://arxiv.org/abs/0704.1908
[2]A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Gillespie (1976). Journal of Computational Physics https://doi.org/10.1016/0021-9991(76)90041-3
interpolate_mol_counts(time, mol_count, output_times)[source]

Takes the raw reaction times and molecule counts and returns the values interpolated at output_times.

mean(parameters, times)[source]

Returns the deterministic mean of infinitely many stochastic simulations, which follows \(A(0) \exp(-kt)\).

n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

simulate_raw(parameters)[source]

Returns the raw times and molecule counts at which reactions occur.

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

variance(parameters, times)[source]

Returns the deterministic variance of infinitely many stochastic simulations, which follows \(\exp(-2kt)(-1 + \exp(kt))A(0)\).

Stochastic Logistic Model

class pints.toy.StochasticLogisticModel(initial_molecule_count=50)[source]

This model describes the growth of a population of individuals. The per capita birth rate, initially \(b_0\), decreases to \(0\) as the population size \(\mathcal{C}(t)\), starting from an initial population size \(n_0\), approaches a carrying capacity \(k\). The process occurs at a rate given by [1]

\[A \xrightarrow{b_0(1-\frac{\mathcal{C}(t)}{k})} 2A.\]

The model is simulated using the Gillespie stochastic simulation algorithm [2], [3].

Extends: pints.ForwardModel, pints.toy.ToyModel.

Parameters:initial_molecule_count (float) – Sets the initial population size \(n_0\).

References

[1]Simpson, M. et al. 2019. Process noise distinguishes between indistinguishable population dynamics. bioRxiv. https://doi.org/10.1101/533182
[2]Gillespie, D. 1976. A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reactions. Journal of Computational Physics. 22 (4): 403-434. https://doi.org/10.1016/0021-9991(76)90041-3
[3]Erban R. et al. 2007. A practical guide to stochastic simulations of reaction-diffusion processes. arXiv. https://arxiv.org/abs/0704.1908v2
mean(parameters, times)[source]

Computes the deterministic mean of infinitely many stochastic simulations with times \(t\) and parameters (\(b\), \(k\)), which follows: \(\frac{kC(0)}{C(0) + (k - C(0)) \exp(-bt)}\).

Returns an array with the same length as times.

n_outputs()

Returns the number of outputs this model has. The default is 1.

n_parameters()[source]

See pints.ForwardModel.n_parameters().

simulate(parameters, times)[source]

See pints.ForwardModel.simulate().

suggested_parameters()[source]

See pints.toy.ToyModel.suggested_parameters().

suggested_times()[source]

See pints.toy.ToyModel.suggested_times().

variance(parameters, times)[source]

Returns the deterministic variance of infinitely many stochastic simulations.

Twisted Gaussian distribution

class pints.toy.TwistedGaussianLogPDF(dimension=10, b=0.1, V=100)[source]

Twisted multivariate Gaussian ‘banana’ with un-normalised density [1]:

\[p(x_1, x_2, x_3, ..., x_n) \propto \pi(\phi(x_1, x_2, x_3, ..., x_n))\]

where pi is the multivariate Gaussian density with covariance matrix \(\Sigma=\text{diag}(100, 1, 1, ..., 1)\) and

\[\phi(x_1,x_2,x_3,...,x_n) = (x_1, x_2 + b x_1^2 - V b, x_3, ..., x_n).\]

Extends pints.toy.ToyLogPDF.

Parameters:
  • dimension (int) – Problem dimension (n), must be 2 or greater.
  • b (float) – “Bananicity”: b = 0.01 induces mild non-linearity in the target density, while b = 0.1 gives high non-linearity. Must be greater than or equal to zero.
  • V (float) – Offset (see equation).

References

[1]Adaptive proposal distribution for random walk Metropolis algorithm. Haario, Saksman, Tamminen (1999) Computational Statistics. https://doi.org/10.1007/s001800050022
distance(samples)[source]

Returns the approximate Kullback-Leibler divergence of the samples from the underlying distribution.

See pints.toy.ToyLogPDF.distance().

evaluateS1(x)[source]

See LogPDF.evaluateS1().

kl_divergence(samples)[source]

Calculates the approximate Kullback-Leibler divergence between a given list of samples and the distribution underlying this LogPDF.

The returned value is (near) zero for perfect sampling, and then increases as the error gets larger.

See: https://en.wikipedia.org/wiki/Kullback-Leibler_divergence

n_parameters()[source]

See pints.LogPDF.n_parameters().

sample(n)[source]

See pints.toy.ToyLogPDF.sample().

suggested_bounds()[source]

See pints.toy.ToyLogPDF.suggested_bounds().

untwist(samples)[source]

De-transforms (or “untwists”) a list of samples from the twisted distribution, which should result in a simple multivariate Gaussian again.

Transformations

Transformation objects provide methods to transform between different representations of a parameter space; for example from a “model space” (\(p\)) where parameters have units and some physical counterpart, to a “search space” (e.g. \(q = \log(p)\)) where parameters are non-dimensionalised and less recognisable to the modeller. The transformed space may in many cases prove simpler to work with for inference, leading to more effective and efficient optimisation and sampling.

To perform optimisation or sampling in a transformed space, users can choose to write their pints.ForwardModel in “search space” directly, but then the quantities being inferred are no longer the “model parameters”. An alternative is to write the ForwardModel in terms of the model parameters and pass a Transformation object to e.g. an OptimisationController or MCMCController. Using the Transformation object ensures users get the correct statistics about the model parameters (rather than the search space parameters).

Parameter transformation can be useful in many situations, for example transforming from a constrained parameter space to an unconstrained search space using RectangularBoundariesTransformation leads to crucial performance improvements for many methods.

Example:

# Assuming a log-posterior 'log_posterior' over 'n_parameters' parameters,
# a number of chains 'n_chains', and initial points 'x0' are already defined:
transform = pints.LogTransformation(n_parameters)
mcmc = pints.MCMCController(log_posterior, n_chains, x0, transform=transform)

Overview:

class pints.ComposedTransformation(*transformations)[source]

N-dimensional Transformation composed of one or more other \(N_i\)-dimensional sub-transformations, so that \(\sum _i N_i = N\).

The dimensionality of the individual transformations does not have to be the same, i.e. \(N_i\neq N_j\) is allowed.

For example, a composed transformation:

t = pints.ComposedTransformation(
    transformation_1, transformation_2, transformation_3)

where transformation_1, transformation_2, and transformation_3 have dimension 1, 2 and 1 respectively, will have dimension N=4.

The evaluation and transformation of the composed transformations assume that the input transformations are all independent from each other.

The input parameters of the ComposedTransformation are ordered in the same way as the individual transformations for the parameter vector. In the above example the transformation may be performed by t.to_search(p), where:

p = [parameter_1_for_transformation_1,
     parameter_1_for_transformation_2,
     parameter_2_for_transformation_2,
     parameter_1_for_transformation_3]

Extends Transformation.

convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

See Transformation.to_search().

class pints.IdentityTransformation(n_parameters)[source]

A Transformation that returns the input (untransformed) parameters, i.e. the search space under this transformation is the same as the model space. Its Jacobian matrix is the identity matrix.

Extends Transformation.

Parameters:n_parameters – Number of model parameters this transformation is defined over.
convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

See Transformation.to_search().

class pints.LogitTransformation(n_parameters)[source]

Logit (or log-odds) transformation of the model parameters.

The transformation is given by

\[q = \text{logit}(p) = \log(\frac{p}{1 - p}),\]

where \(p\) is the model parameter vector and \(q\) is the search space vector.

The Jacobian adjustment of the logit transformation is given by

\[|\frac{d}{dq} \text{logit}^{-1}(q)| = \text{logit}^{-1}(q) \times (1 - \text{logit}^{-1}(q)).\]

And its derivative is given by

\[\frac{d^2}{dq^2} \text{logit}^{-1}(q) = \frac{d f^{-1}(q)}{dq} \times \left( \frac{\exp(-q) - 1}{\exp(-q) + 1} \right).\]

The first order derivative of the log determinant of the Jacobian is

\[\frac{d}{dq} \log(|J(q)|) = 2 \times \exp(-q) \times \text{logit}^{-1}(q) - 1.\]

Extends Transformation.

Parameters:n_parameters – Number of model parameters this transformation is defined over.
convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

See Transformation.to_search().

class pints.LogTransformation(n_parameters)[source]

Logarithm transformation of the model parameters:

The transformation is given by

\[q = \log(p),\]

where \(p\) is the model parameter vector and \(q\) is the search space vector.

The Jacobian adjustment of the log transformation is given by

\[|\frac{d}{dq} \exp(q)| = \exp(q).\]

And its derivative is given by

\[\frac{d^2}{dq^2} \exp(q) = \exp(q).\]

The first order derivative of the log determinant of the Jacobian is

\[\frac{d}{dq} \log(|J(q)|) = 1.\]

Extends Transformation.

Parameters:n_parameters – Number of model parameters this transformation is defined over.
convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

See Transformation.to_search().

class pints.RectangularBoundariesTransformation(lower_or_boundaries, upper=None)[source]

A generalised version of the logit transformation for the model parameters, which transforms an interval or rectangular boundaries \([a, b)\) to all real numbers.

The transformation is given by

\[q = f(p) = \text{logit}\left(\frac{p - a}{b - a}\right) = \log(p - a) - \log(b - p),\]

where \(p\) is the model parameter vector and \(q\) is the search space vector. Note that LogitTransformation is a special case where \(a = 0\) and \(b = 1\).

The Jacobian adjustment of the transformation is given by

\[|\frac{d}{dq} f^{-1}(q)| = \frac{b - a}{\exp(q) (1 + \exp(-q)) ^ 2}.\]

And its derivative is given by

\[\frac{d^2}{dq^2} f^{-1}(q) = \frac{d f^{-1}(q)}{dq} \times \left( \frac{\exp(-q) - 1}{\exp(-q) + 1} \right).\]

The log-determinant of the Jacobian matrix is given by

\[\log|\frac{d}{dq} f^{-1}(q)| = \sum_i \left( \log(b_i - a_i) - 2 \times \log(1 + \exp(-q_i)) - q_i \right)\]

The first order derivative of the log determinant of the Jacobian is

\[\frac{d}{dq} \log(|J(q)|) = 2 \times \exp(-q) \times \text{logit}^{-1}(q) - 1.\]

For example, to create a transformation with \(p_1 \in [0, 4)\), \(p_2 \in [1, 5)\), and \(p_3 \in [2, 6)\) use either:

transformation = pints.RectangularBoundariesTransformation([0, 1, 2],
                                                           [4, 5, 6])

or:

boundaries = pints.RectangularBoundaries([0, 1, 2], [4, 5, 6])
transformation = pints.RectangularBoundariesTransformation(boundaries)

Extends Transformation.

convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

Transforms a parameter vector p from the model space to the search space.

class pints.ScalingTransformation(scalings)[source]

A scaling transformation that scales the input parameters by multiplying element-wise with an array scalings. Its Jacobian matrix is a diagonal matrix with the values of 1 / scalings on the diagonal.

Extends Transformation.

convert_boundaries(boundaries)

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)

Returns a transformed error measure class.

convert_log_pdf(log_pdf)

Returns a transformed log-PDF class.

convert_log_prior(log_prior)

Returns a transformed log-prior class.

convert_standard_deviation(s, q)

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

See Transformation.elementwise().

jacobian(q)[source]

See Transformation.jacobian().

jacobian_S1(q)[source]

See Transformation.jacobian_S1().

log_jacobian_det(q)[source]

See Transformation.log_jacobian_det().

log_jacobian_det_S1(q)[source]

See Transformation.log_jacobian_det_S1().

n_parameters()[source]

See Transformation.n_parameters().

to_model(q)[source]

See Transformation.to_model().

to_search(p)[source]

See Transformation.to_search().

class pints.Transformation[source]

Abstract base class for objects that provide transformations between two parameter spaces: the model parameter space and a search space.

If trans is an instance of a Transformation class, you can apply the transformation of a parameter vector from the model space p to the search space q by using q = trans.to_search(p) and the inverse by using p = trans.to_model(q).
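
A minimal round-trip sketch, using the LogTransformation documented above (the parameter values are illustrative):

import numpy as np
import pints

trans = pints.LogTransformation(2)
p = np.array([0.1, 250.0])   # parameter vector in the model space
q = trans.to_search(p)       # log-transformed vector in the search space
p2 = trans.to_model(q)       # recovers the original parameters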

References

[1](1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14) How to Obtain Those Nasty Standard Errors From Transformed Data. Erik Jorgensen and Asger Roer Pedersen. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9023
[2]The Matrix Cookbook. Kaare Brandt Petersen and Michael Syskind Pedersen. 2012.
convert_boundaries(boundaries)[source]

Returns a transformed boundaries class.

convert_covariance_matrix(C, q)[source]

Converts a covariance matrix C from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

convert_error_measure(error_measure)[source]

Returns a transformed error measure class.

convert_log_pdf(log_pdf)[source]

Returns a transformed log-PDF class.

convert_log_prior(log_prior)[source]

Returns a transformed log-prior class.

convert_standard_deviation(s, q)[source]

Converts standard deviation s, either a scalar or a vector, from the model space to the search space around a parameter vector q provided in the search space.

The transformation is performed using a first order linear approximation [1] with the Jacobian \(\mathbf{J}\):

\[\begin{split}\mathbf{C}(\boldsymbol{q}) &= \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \mathbf{C}(\boldsymbol{p}) \left( \frac{d\boldsymbol{g}(\boldsymbol{p})}{d\boldsymbol{p}} \right)^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2) \\ &= \mathbf{J}^{-1}(\boldsymbol{q}) \mathbf{C}(\boldsymbol{p}) (\mathbf{J}^{-1}(\boldsymbol{q}))^T + \mathcal{O}(\mathbf{C}(\boldsymbol{p})^2).\end{split}\]

Using the property that \(\mathbf{J}^{-1} = \frac{d\boldsymbol{g}}{d\boldsymbol{p}}\), from the inverse function theorem, i.e. the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.

To transform the provided standard deviation \(\boldsymbol{s}\), we assume the covariance matrix \(\mathbf{C}(\boldsymbol{p})\) above is a diagonal matrix with \(\boldsymbol{s}^2\) on the diagonal, such that

\[s_i(\boldsymbol{q}) = \left( \mathbf{J}^{-1} (\mathbf{J}^{-1})^T \right)^{1/2}_{i, i} s_i(\boldsymbol{p}).\]
elementwise()[source]

Returns True if the transformation is element-wise.

An element-wise transformation is a transformation \(\boldsymbol{f}\) that can be carried out element by element: for a parameter vector \(\boldsymbol{p}\) in the model space and a parameter vector \(\boldsymbol{q}\) in the search space, it satisfies

\[q_i = f(p_i),\]

where \(x_i\) denotes the \(i^{\text{th}}\) element of the vector \(\boldsymbol{x}\), as opposed to a transformation in which multiple elements are combined to create the transformed elements.

jacobian(q)[source]

Returns the Jacobian matrix of the transformation calculated at the parameter vector q in the search space. For a transformation \(\boldsymbol{q} = \boldsymbol{f}(\boldsymbol{p})\), the Jacobian matrix is defined as

\[\mathbf{J} = \left[\frac{\partial \boldsymbol{f}^{-1}}{\partial q_1} \quad \frac{\partial \boldsymbol{f}^{-1}}{\partial q_2} \quad \cdots \right].\]

This is an optional method. It is needed by Transformation.convert_standard_deviation() and Transformation.convert_covariance_matrix(), and whenever evaluateS1() is required.

jacobian_S1(q)[source]

Computes the Jacobian matrix of the transformation calculated at the parameter vector q in the search space, and returns the result along with the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (S, S') where S is an n_parameters by n_parameters matrix and S' is a sequence of n_parameters matrices.

This is an optional method. It is needed when the transformation is used along with a non-element-wise transformation in ComposedTransformation.

log_jacobian_det(q)[source]

Returns the logarithm of the absolute value of the determinant of the Jacobian matrix of the transformation Transformation.jacobian() calculated at the parameter vector q in the search space.

The default implementation numerically calculates the determinant of the full matrix which only works if the optional method Transformation.jacobian() is implemented. If there is an analytic expression for the specific transformation, a reimplementation of this method may be preferred.

This is an optional method. It is needed when a transformation is applied to a LogPDF and/or when evaluateS1() is required; for example, it is not needed when transforming an ErrorMeasure for which ErrorMeasure.evaluateS1() is never used.

log_jacobian_det_S1(q)[source]

Computes the logarithm of the absolute value of the determinant of the Jacobian, and returns the result plus the partial derivatives of the result with respect to the parameters.

The returned data is a tuple (S, S') where S is a scalar value and S' is a sequence of length n_parameters.

Note that the derivative returned is of the log of the determinant of the Jacobian, so S' = d/dq log(|det(J(q))|), evaluated at input.

The logarithm of the absolute value of the determinant of the Jacobian is provided by Transformation.log_jacobian_det(). The default implementation calculates the derivatives of the log-determinant using [2]

\[\frac{d}{dq} \log(|det(\mathbf{J})|) = trace(\mathbf{J}^{-1} \frac{d}{dq} \mathbf{J}),\]

where the derivative of the Jacobian matrix is provided by Transformation.jacobian_S1() and the matrix inversion is numerically calculated. If there is an analytic expression for the specific transformation, a reimplementation of this method may be preferred.

This is an optional method. It is needed when a transformation is applied to a LogPDF for which evaluateS1() is required.

n_parameters()[source]

Returns the dimension of the parameter space this transformation is defined over.

to_model(q)[source]

Transforms a parameter vector q from the search space to the model space.

to_search(p)[source]

Transforms a parameter vector p from the model space to the search space.

class pints.TransformedBoundaries(boundaries, transformation)[source]

A pints.Boundaries that accepts parameters in a transformed search space.

Extends pints.Boundaries.

Parameters:
  • boundaries – A pints.Boundaries to transform.
  • transformation – A pints.Transformation.
check(q)[source]

See Boundaries.check().

n_parameters()[source]

See Boundaries.n_parameters().

range()[source]

Returns the size of the search space (i.e. upper - lower).

sample(n=1)

Returns n random samples from within the boundaries, for example to use as starting points for an optimisation.

The returned value is a NumPy array with shape (n, d) where n is the requested number of samples, and d is the dimension of the parameter space these boundaries are defined on.

Note that implementing sample() is optional, so some boundary types may not support it.

Parameters:n (int) – The number of points to sample
class pints.TransformedErrorMeasure(error, transformation)[source]

A pints.ErrorMeasure that accepts parameters in a transformed search space.

For the first order sensitivity of a pints.ErrorMeasure \(E\) and a pints.Transformation \(\boldsymbol{q} = \boldsymbol{f}(\boldsymbol{p})\), the transformation is done using

\[\begin{split}\frac{\partial E(\boldsymbol{q})}{\partial q_i} &= \frac{\partial E(\boldsymbol{f}^{-1}(\boldsymbol{q}))}{\partial q_i}\\ &= \sum_l \frac{\partial E(\boldsymbol{p})}{\partial p_l} \frac{\partial p_l}{\partial q_i}.\end{split}\]

Extends pints.ErrorMeasure.

Parameters:
  • error – A pints.ErrorMeasure to transform.
  • transformation – A pints.Transformation.
evaluateS1(q)[source]

See ErrorMeasure.evaluateS1().

n_parameters()[source]

See ErrorMeasure.n_parameters().

class pints.TransformedLogPDF(log_pdf, transformation)[source]

A pints.LogPDF that accepts parameters in a transformed search space.

When a TransformedLogPDF object (initialised with a pints.LogPDF of \(\pi(\boldsymbol{p})\) and a Transformation of \(\boldsymbol{q} = \boldsymbol{f}(\boldsymbol{p})\)) is called with a vector argument \(\boldsymbol{q}\) in the search space, it returns \(\log(\pi(\boldsymbol{q}))\) where \(\pi(\boldsymbol{q})\) is the transformed unnormalised PDF of the input PDF, using

\[\pi(\boldsymbol{q}) = \pi(\boldsymbol{f}^{-1}(\boldsymbol{q})) \,\, |det(\mathbf{J}(\boldsymbol{f}^{-1}(\boldsymbol{q})))|.\]

\(\mathbf{J}\) is the Jacobian matrix:

\[\mathbf{J} = \left[\frac{\partial \boldsymbol{f}^{-1}}{\partial q_1} \quad \frac{\partial \boldsymbol{f}^{-1}}{\partial q_2} \quad \cdots \right].\]

Hence

\[\log(\pi(\boldsymbol{q})) = \log(\pi(\boldsymbol{f}^{-1}(\boldsymbol{q}))) + \log(|det(\mathbf{J}(\boldsymbol{f}^{-1}(\boldsymbol{q})))|).\]

For the first order sensitivity, the transformation is done using

\[\frac{\partial \log(\pi(\boldsymbol{q}))}{\partial q_i} = \frac{\partial \log(\pi(\boldsymbol{f}^{-1}(\boldsymbol{q})))}{\partial q_i} + \frac{\partial \log(|det(\mathbf{J})|)}{\partial q_i}.\]

The first term can be calculated using the chain rule

\[\frac{\partial \log(\pi(\boldsymbol{f}^{-1}(\boldsymbol{q})))}{\partial q_i} = \sum_l \frac{\partial \log(\pi(\boldsymbol{p}))}{\partial p_l} \frac{\partial p_l}{\partial q_i}.\]

Extends pints.LogPDF.

Parameters:
  • log_pdf – A pints.LogPDF to transform.
  • transformation – A pints.Transformation.
evaluateS1(q)[source]

See LogPDF.evaluateS1().

n_parameters()[source]

See LogPDF.n_parameters().

class pints.TransformedLogPrior(log_prior, transformation)[source]

A pints.LogPrior that accepts parameters in a transformed search space.

Extends pints.LogPrior, pints.TransformedLogPDF.

Parameters:
  • log_prior – A pints.LogPrior to transform.
  • transformation – A pints.Transformation.
cdf(x)

Returns the cumulative density function at point(s) x.

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_from_unit_cube(u)

Converts samples u uniformly drawn from the unit cube into those drawn from the prior space, typically by transforming using LogPrior.icdf().

u should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

convert_to_unit_cube(x)

Converts samples from the prior x to be drawn uniformly from the unit cube, typically by transforming using LogPrior.cdf().

x should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

evaluateS1(q)

See LogPDF.evaluateS1().

icdf(p)

Returns the inverse cumulative density function at cumulative probability/probabilities p.

p should be an n x d array, where n is the number of input samples and d is the dimension of the parameter space.

mean()

Returns the analytical value of the expectation of a random variable distributed according to this LogPDF.

n_parameters()

See LogPDF.n_parameters().

sample(n)[source]

See pints.LogPrior.sample().

Note that this does not sample from the transformed log-prior but simply transforms the samples from the original log-prior.

Utilities

Overview:

pints.strfloat(x)[source]

Converts a float to a string, with maximum precision.

class pints.Loggable[source]

Interface for classes that can log to a Logger.

_log_init(logger)[source]

Adds this Loggable's fields to a Logger.

_log_write(logger)[source]

Logs data for each of the fields specified in _log_init().

class pints.Logger[source]

Logs numbers to screen and/or a file.

Example

log = pints.Logger()
log.add_counter('id', width=2)
log.add_float('Length')
log.log(1, 1.23456)
log.log(2, 7.8901)
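
To also write the log to a file, a filename can be set before any data is logged; a brief sketch (the filename is illustrative), using set_filename() as documented below:

log = pints.Logger()
log.set_filename('log.csv', csv=True)   # write comma-separated values to disk
log.add_counter('id', width=2)
log.add_float('Length')
log.log(1, 1.23456)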
add_counter(name, width=5, max_value=None, file_only=False)[source]

Adds a field for positive integers.

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • width (int) – A hint for the width of this column. If numbers exceed this width, the layout will break, but no information will be lost.
  • max_value (int|None) – A hint for the maximum number this field will need to display.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
add_float(name, width=9, file_only=False)[source]

Adds a field for a floating point number.

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • width (int) – A hint for the field’s width. The minimum width is 7.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
add_int(name, width=5, file_only=False)[source]

Adds a field for a (positive or negative) integer.

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • width (int) – A hint for the width of this column. If numbers exceed this width, the layout will break, but no information will be lost.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
add_long_float(name, file_only=False)[source]

Adds a field for a maximum precision floating point number.

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
add_string(name, width, file_only=False)[source]

Adds a field showing (at most width characters of) string values.

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • width (int) – The maximum width for strings to display.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
add_time(name, file_only=False)[source]

Adds a field showing a formatted time (given in seconds).

Returns this Logger object.

Parameters:
  • name (str) – This field’s name. Will be displayed in the header.
  • file_only (boolean) – If set to True, this field will not be shown on screen.
log(*data)[source]

Logs a new row of data.

set_filename(filename=None, csv=False)[source]

Enables logging to a file if a filename is passed in. Logging to file can be disabled by passing filename=None.

Usually, file logging happens in the same format as logging to screen. To obtain CSV logs instead, set csv=True.

set_stream(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]

Enables logging to screen if an output stream is passed in. Logging to screen can be disabled by passing stream=None.

class pints.Timer(output=None)[source]

Provides accurate timing.

Example

timer = pints.Timer()
print(timer.format(timer.time()))
format(time=None)[source]

Formats a (non-integer) number of seconds, returns a string like “5 weeks, 3 days, 1 hour, 4 minutes, 9 seconds”, or “0.0019 seconds”.

reset()[source]

Resets this timer’s start time.

time()[source]

Returns the time (in seconds) since this timer was created, or since :meth:`reset()` was last called.
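
For example, a longer-running block of code could be timed as follows (the computation itself is a placeholder):

timer = pints.Timer()
# ... run some expensive computation here ...
print('Run took ' + timer.format(timer.time()))
timer.reset()   # start timing again from zero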

pints.matrix2d(x)[source]

Copies x and returns a 2d read-only NumPy array of floats with shape (m, n).

Raises a ValueError if x has an incompatible shape.

pints.vector(x)[source]

Copies x and returns a 1d read-only NumPy array of floats with shape (n,).

Raises a ValueError if x has an incompatible shape.
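
For instance (shapes shown in the comments):

x = pints.vector([1.0, 2.0, 3.0])         # read-only float array, shape (3,)
A = pints.matrix2d([[1, 2], [3, 4]])      # read-only float array, shape (2, 2)
# The returned arrays are read-only, so e.g. x[0] = 5 raises an error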

pints.sample_initial_points(function, n_points, random_sampler=None, boundaries=None, max_tries=50, parallel=False, n_workers=None)[source]

Samples n_points parameter values to use as starting points in a sampling or optimisation routine on the given function.

How the initial points are determined depends on the arguments supplied. In order of precedence:

  1. If a method random_sampler is provided then this will be used to draw the random samples.
  2. If no sampler method is given but function is a LogPosterior then the method function.log_prior().sample() will be used.
  3. If no sampler method is supplied, function is not a LogPosterior, and boundaries are provided, then the method boundaries.sample() will be used to draw samples.

A ValueError is raised if none of the above options are available.

Each sample x is tested to ensure that function(x) returns a finite result and, if boundaries are supplied, that x lies within them. If not, a new sample is drawn. This is repeated at most max_tries times, after which an error is raised.

Parameters:
  • function – A pints.ErrorMeasure or a pints.LogPDF that evaluates points in the parameter space. If the latter, function may (but need not) be a LogPosterior.
  • n_points (int) – The number of initial values to generate.
  • random_sampler – A function that, when called, returns draws from a probability distribution with the same dimensionality as function. Its only argument should be an integer specifying the number of draws.
  • boundaries – An optional set of boundaries on the parameter space of class pints.Boundaries.
  • max_tries (int) – Number of attempts to find a finite initial value across all n_points. By default this is 50 per point.
  • parallel (bool) – Whether to evaluate function in parallel (defaults to False).
  • n_workers (int) – Number of workers on which to run parallel evaluation.
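
A hedged sketch of typical usage follows; log_posterior and error_measure are hypothetical objects of type pints.LogPosterior and pints.ErrorMeasure respectively, assumed here to be two-dimensional.

# Rule 2 above: the LogPosterior's prior is used to draw candidate points
x0 = pints.sample_initial_points(log_posterior, 3)

# Rule 3 above: for a plain ErrorMeasure, boundaries (or a random_sampler)
# must be supplied
boundaries = pints.RectangularBoundaries([0, 0], [10, 10])
x0 = pints.sample_initial_points(error_measure, 3, boundaries=boundaries)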

Hierarchy of methods

Pints contains different types of methods that can be roughly arranged into a hierarchy, as follows.

Sampling

  1. MCMC without gradients
  2. Nested sampling
  3. Particle-based samplers
    • SMC
  4. Likelihood-free sampling (needs a distance between data and states, e.g. least squares)
    • ABC-MCMC
    • ABC-SMC
  5. First-order sensitivity MCMC samplers (need derivatives of the LogPDF)
  6. Differential geometric methods (need the Hessian of the LogPDF)
    • smMALA
    • RMHMC

Optimisation

All methods shown here are derivative-free methods that work on any ErrorMeasure or LogPDF.

  1. Particle-based methods
    • Evolution strategies (global/local methods)
    • PSO (global method)

Problems in Pints

Pints defines single and multi-output problem classes that wrap around models and data, and over which error measures or log-likelihoods can be defined.
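
As a minimal sketch (assuming model is a single-output pints.ForwardModel, and times and values hold the observed time series), a problem and an error measure over it could be set up as:

problem = pints.SingleOutputProblem(model, times, values)
error = pints.SumOfSquaresError(problem)   # an error measure defined over the problem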

To find the appropriate type of Problem to use, see the overview below:

  1. Systems with a single observable output
  2. Systems with multiple observable outputs
    • Single data set: Use a MultiOutputProblem and any of the appropriate error measures or log-likelihoods