Log-likelihoods¶
The classes below all implement the ProblemLogLikelihood
interface,
and can calculate a log-likelihood based on some time-series Problem
and an assumed noise model.
Example:
logpdf = pints.GaussianLogLikelihood(problem)
x = [1, 2, 3]
fx = logpdf(x)
Overview:
AR1LogLikelihood
ARMA11LogLikelihood
CauchyLogLikelihood
ConstantAndMultiplicativeGaussianLogLikelihood
GaussianIntegratedUniformLogLikelihood
GaussianKnownSigmaLogLikelihood
GaussianLogLikelihood
KnownNoiseLogLikelihood
MultiplicativeGaussianLogLikelihood
ScaledLogLikelihood
StudentTLogLikelihood
UnknownNoiseLogLikelihood
-
class
pints.
AR1LogLikelihood
(problem)[source]¶ Calculates a log-likelihood assuming AR(1) (autoregressive order 1) errors.
In this error model, the ith error term \(\epsilon_i = x_i - f_i(\theta)\) is assumed to obey the following relationship.
\[\epsilon_i = \rho \epsilon_{i-1} + \nu_i\]where \(\nu_i\) is IID Gaussian white noise with variance \(\sigma^2 (1-\rho^2)\). Therefore, this likelihood is appropriate when error terms are autocorrelated, and the parameter \(\rho\) determines the level of autocorrelation.
This model is parameterised as such because it leads to a simple marginal distribution \(\epsilon_i \sim N(0, \sigma)\).
This class treats the error at the first time point (i=1) as fixed, which simplifies the calculations. For sufficiently long time-series, this conditioning on the first observation has at most a small effect on the likelihood. Further details as well as the alternative unconditional likelihood are available in [1] , chapter 5.2.
Noting that
\[\nu_i = \epsilon_i - \rho \epsilon_{i-1} \sim N(0, \sigma^2 (1-\rho^2))\]we thus calculate the likelihood as the product of normal likelihoods from \(i=2,...,N\), for a time series with N time points.
\[L(\theta, \sigma, \rho|\boldsymbol{x}) = -\frac{N-1}{2} \log(2\pi) - (N-1) \log(\sigma') - \frac{1}{2\sigma'^2} \sum_{i=2}^N (\epsilon_i - \rho \epsilon_{i-1})^2\]for \(\sigma' = \sigma \sqrt{1-\rho^2}\).
Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem two parameters are added (rho, sigma), for a multi-output problem 2 *n_outputs
parameters are added.References
[1] Hamilton, James D. Time series analysis. Vol. 2. New Jersey: Princeton, 1994. -
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
-
-
class
pints.
ARMA11LogLikelihood
(problem)[source]¶ Calculates a log-likelihood assuming ARMA(1,1) errors.
The ARMA(1,1) model has 1 autoregressive term and 1 moving average term. It assumes that the errors \(\epsilon_i = x_i - f_i(\theta)\) obey
\[\epsilon_i = \rho \epsilon_{i-1} + \nu_i + \phi \nu_{i-1}\]where \(\nu_i\) is IID Gaussian white noise with standard deviation \(\sigma'\).
\[\sigma' = \sigma \sqrt{\frac{1 - \rho^2}{1 + 2 \phi \rho + \phi^2}}\]This model is parameterised as such because it leads to a simple marginal distribution \(\epsilon_i \sim N(0, \sigma)\).
Due to the complexity of the exact ARMA(1,1) likelihood, this class calculates a likelihood conditioned on initial values. This topic is discussed further in [2] , chapter 5.6. Thus, for a time series defined at points \(i=1,...,N\), summation begins at \(i=3\), and the conditional log-likelihood is
\[L(\theta, \sigma, \rho, \phi|\boldsymbol{x}) = -\frac{N-2}{2} \log(2\pi) - (N-2) \log(\sigma') - \frac{1}{2\sigma'^2} \sum_{i=3}^N (\nu_i)^2\]where the values of \(\nu_i\) are calculated from the observations according to
\[\nu_i = \epsilon_i - \rho \epsilon_{i-1} - \phi (\epsilon_{i-1} - \rho \epsilon_{i-2})\]Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem three parameters are added (rho, phi, sigma), for a multi-output problem 3 *n_outputs
parameters are added.References
[2] Hamilton, James D. Time series analysis. Vol. 2. New Jersey: Princeton, 1994. -
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
-
-
class
pints.
CauchyLogLikelihood
(problem)[source]¶ Calculates a log-likelihood assuming independent Cauchy-distributed noise at each time point, and adds one parameter: the scale (
sigma
).For a noise characterised by
sigma
, the log-likelihood is of the form:\[\log{L(\theta, \sigma)} = -N\log \pi - N\log \sigma -\sum_{i=1}^N\log(1 + \frac{x_i - f(\theta)}{\sigma}^2)\]Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem one parameter is addedsigma
, wheresigma
is scale, for a multi-output problemn_outputs
parameters are added.-
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
-
-
class
pints.
ConstantAndMultiplicativeGaussianLogLikelihood
(problem)[source]¶ Calculates the log-likelihood assuming a mixed error model of a Gaussian base-level noise and a Gaussian heteroscedastic noise.
For a time series model \(f(t| \theta)\) with parameters \(\theta\) , the ConstantAndMultiplicativeGaussianLogLikelihood assumes that the model predictions \(X\) are Gaussian distributed according to
\[X(t| \theta , \sigma _{\text{base}}, \sigma _{\text{rel}}) = f(t| \theta) + (\sigma _{\text{base}} + \sigma _{\text{rel}} f(t| \theta)^\eta ) \, \epsilon ,\]where \(\epsilon\) is a i.i.d. standard Gaussian random variable
\[\epsilon \sim \mathcal{N}(0, 1).\]For each output in the problem, this likelihood introduces three new scalar parameters: a base-level scale \(\sigma _{\text{base}}\); an exponential power \(\eta\); and a scale relative to the model output \(\sigma _{\text{rel}}\).
The resulting log-likelihood of a constant and multiplicative Gaussian error model is
\[\log L(\theta, \sigma _{\text{base}}, \eta , \sigma _{\text{rel}} | X^{\text{obs}}) = -\frac{n_t}{2} \log 2 \pi -\sum_{i=1}^{n_t}\log \sigma _{\text{tot}, i} - \sum_{i=1}^{n_t} \frac{(X^{\text{obs}}_i - f(t_i| \theta))^2} {2\sigma ^2_{\text{tot}, i}},\]where \(n_t\) is the number of measured time points in the time series, \(X^{\text{obs}}_i\) is the observation at time point \(t_i\), and \(\sigma _{\text{tot}, i}=\sigma _{\text{base}} +\sigma _{\text{rel}} f(t_i| \theta)^\eta\) is the total standard deviation of the error at time \(t_i\).
For a system with \(n_o\) outputs, this becomes
\[\log L(\theta, \sigma _{\text{base}}, \eta , \sigma _{\text{rel}} | X^{\text{obs}}) = -\frac{n_tn_o}{2} \log 2 \pi -\sum_{j=1}^{n_0}\sum_{i=1}^{n_t}\log \sigma _{\text{tot}, ij} - \sum_{j=1}^{n_0}\sum_{i=1}^{n_t} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {2\sigma ^2_{\text{tot}, ij}},\]where \(n_o\) is the number of outputs of the model, \(X^{\text{obs}}_{ij}\) is the observation at time point \(t_i\) of output \(j\), and \(\sigma _{\text{tot}, ij}=\sigma _{\text{base}, j} + \sigma _{\text{rel}, j}f_j(t_i| \theta)^{\eta _j}\) is the total standard deviation of the error at time \(t_i\) of output \(j\).
Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem three parameters are added (\(\sigma _{\text{base}}\), \(\eta\), \(\sigma _{\text{rel}}\)), for a multi-output problem \(3n_o\) parameters are added (\(\sigma _{\text{base},1},\ldots , \sigma _{\text{base},n_o}, \eta _1,\ldots , \eta _{n_o}, \sigma _{\text{rel},1}, \ldots , \sigma _{\text{rel},n_o})\).-
evaluateS1
(parameters)[source]¶ See
LogPDF.evaluateS1()
.The partial derivatives of the log-likelihood w.r.t. the model parameters are
\[\begin{split}\frac{\partial \log L}{\partial \theta _k} =& -\sum_{i,j}\sigma _{\text{rel},j}\eta _j\frac{ f_j(t_i| \theta)^{\eta _j-1}} {\sigma _{\text{tot}, ij}} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} + \sum_{i,j} \frac{X^{\text{obs}}_{ij} - f_j(t_i| \theta)} {\sigma ^2_{\text{tot}, ij}} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} \\ &+\sum_{i,j}\sigma _{\text{rel},j}\eta _j \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j-1} \frac{\partial f_j(t_i| \theta)}{\partial \theta _k} \\ \frac{\partial \log L}{\partial \sigma _{\text{base}, j}} =& -\sum ^{n_t}_{i=1}\frac{1}{\sigma _{\text{tot}, ij}} +\sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}} \\ \frac{\partial \log L}{\partial \eta _j} =& -\sigma _{\text{rel},j}\eta _j\sum ^{n_t}_{i=1} \frac{f_j(t_i| \theta)^{\eta _j}\log f_j(t_i| \theta)} {\sigma _{\text{tot}, ij}} + \sigma _{\text{rel},j}\eta _j \sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j} \log f_j(t_i| \theta) \\ \frac{\partial \log L}{\partial \sigma _{\text{rel},j}} =& -\sum ^{n_t}_{i=1} \frac{f_j(t_i| \theta)^{\eta _j}}{\sigma _{\text{tot}, ij}} + \sum ^{n_t}_{i=1} \frac{(X^{\text{obs}}_{ij} - f_j(t_i| \theta))^2} {\sigma ^3_{\text{tot}, ij}}f_j(t_i| \theta)^{\eta _j},\end{split}\]where \(i\) sums over the measurement time points and \(j\) over the outputs of the model.
-
n_parameters
()¶
-
-
class
pints.
GaussianIntegratedUniformLogLikelihood
(problem, lower, upper)[source]¶ Calculates a log-likelihood assuming independent Gaussian-distributed noise at each time point where \(\sigma\sim U(a,b)\) has been integrated out of the joint posterior of \(p(\theta,\sigma|X)\),
\[\begin{split}\begin{align} p(\theta|X) &= \int_{0}^{\infty} p(\theta, \sigma|X) \mathrm{d}\sigma\\ &\propto \int_{0}^{\infty} p(X|\theta, \sigma) p(\theta, \sigma) \mathrm{d}\sigma,\end{align}\end{split}\]Note that this is exactly the same statistical model as
pints.GaussianLogLikelihood
with a uniform prior on \(\sigma\).A possible advantage of this log-likelihood compared with using a
pints.GaussianLogLikelihood
, is that it has one fewer parameters (\(sigma\)) which may speed up convergence to the posterior distribution, especially for multi-output problems which will haven_outputs
fewer parameter dimensions.The log-likelihood is given in terms of the sum of squared errors:
\[SSE = \sum_{i=1}^n (f_i(\theta) - y_i)^2\]and is given up to a normalisation constant by:
\[\begin{split}\begin{align} \text{log} L = & - n / 2 \text{log}(\pi) \\ & - \text{log}(2 (b - a) \sqrt(2)) \\ & + (1 / 2 - n / 2) \text{log}(SSE) \\ & + \text{log}\left[\Gamma((n - 1) / 2, \frac{SSE}{2 b^2}) - \Gamma((n - 1) / 2, \frac{SSE}{2 a^2}) \right] \end{align}\end{split}\]where \(\Gamma(u,v)\) is the upper incomplete gamma function as defined here: https://en.wikipedia.org/wiki/Incomplete_gamma_function
This log-likelihood is inherently a Bayesian method since it assumes a uniform prior on \(\sigma\sim U(a,b)\). However using this likelihood in optimisation routines should yield the same estimates as the full
pints.GaussianLogLikelihood
.Extends
ProblemLogLikelihood
.Parameters: - problem – A
SingleOutputProblem
orMultiOutputProblem
. - lower – The lower limit on the uniform prior on sigma. Must be non-negative.
- upper – The upper limit on the uniform prior on sigma.
-
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
- problem – A
-
class
pints.
GaussianKnownSigmaLogLikelihood
(problem, sigma)[source]¶ Calculates a log-likelihood assuming independent Gaussian noise at each time point, using a known value for the standard deviation (sigma) of that noise:
\[\log{L(\theta | \sigma,\boldsymbol{x})} = -\frac{N}{2}\log{2\pi} -N\log{\sigma} -\frac{1}{2\sigma^2}\sum_{i=1}^N{(x_i - f_i(\theta))^2}\]Extends
ProblemLogLikelihood
.Parameters: - problem – A
SingleOutputProblem
orMultiOutputProblem
. - sigma – The standard devation(s) of the noise. Can be a single value or a sequence of sigma’s for each output. Must be greater than zero.
-
evaluateS1
(x)[source]¶ See
LogPDF.evaluateS1()
.
-
n_parameters
()¶
- problem – A
-
class
pints.
GaussianLogLikelihood
(problem)[source]¶ Calculates a log-likelihood assuming independent Gaussian noise at each time point, and adds a parameter representing the standard deviation (sigma) of the noise on each output.
For a noise level of
sigma
, the likelihood becomes:\[L(\theta, \sigma|\boldsymbol{x}) = p(\boldsymbol{x} | \theta, \sigma) = \prod_{j=1}^{n_t} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left( -\frac{(x_j - f_j(\theta))^2}{2\sigma^2}\right)\]leading to a log likelihood of:
\[\log{L(\theta, \sigma|\boldsymbol{x})} = -\frac{n_t}{2} \log{2\pi} -n_t \log{\sigma} -\frac{1}{2\sigma^2}\sum_{j=1}^{n_t}{(x_j - f_j(\theta))^2}\]where
n_t
is the number of time points in the series,x_j
is the sampled data at timej
andf_j
is the simulated data at timej
.For a system with
n_o
outputs, this becomes\[\log{L(\theta, \sigma|\boldsymbol{x})} = -\frac{n_t n_o}{2}\log{2\pi} -\sum_{i=1}^{n_o}{ {n_t}\log{\sigma_i} } -\sum_{i=1}^{n_o}{\left[ \frac{1}{2\sigma_i^2}\sum_{j=1}^{n_t}{(x_j - f_j(\theta))^2} \right]}\]Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem a single parameter is added, for a multi-output problemn_outputs
parameters are added.-
evaluateS1
(x)[source]¶ See
LogPDF.evaluateS1()
.
-
n_parameters
()¶
-
-
class
pints.
KnownNoiseLogLikelihood
(problem, sigma)[source]¶ Deprecated alias of
GaussianKnownSigmaLogLikelihood
.-
evaluateS1
(x)¶ See
LogPDF.evaluateS1()
.
-
n_parameters
()¶
-
-
class
pints.
MultiplicativeGaussianLogLikelihood
(problem)[source]¶ Calculates the log-likelihood for a time-series model assuming a heteroscedastic Gaussian error of the model predictions \(f(t, \theta )\).
This likelihood introduces two new scalar parameters for each dimension of the model output: an exponential power \(\eta\) and a scale \(\sigma\).
A heteroscedascic Gaussian noise model assumes that the observable \(X\) is Gaussian distributed around the model predictions \(f(t, \theta )\) with a standard deviation that scales with \(f(t, \theta )\)
\[X(t) = f(t, \theta) + \sigma f(t, \theta)^\eta v(t)\]where \(v(t)\) is a standard i.i.d. Gaussian random variable
\[v(t) \sim \mathcal{N}(0, 1).\]This model leads to a log likelihood of the model parameters of
\[\log{L(\theta, \eta , \sigma | X^{\text{obs}})} = -\frac{n_t}{2} \log{2 \pi} -\sum_{i=1}^{n_t}{\log{f(t_i, \theta)^\eta \sigma}} -\frac{1}{2}\sum_{i=1}^{n_t}\left( \frac{X^{\text{obs}}_{i} - f(t_i, \theta)} {f(t_i, \theta)^\eta \sigma}\right) ^2,\]where \(n_t\) is the number of time points in the series, and \(X^{\text{obs}}_{i}\) the measurement at time \(t_i\).
For a system with \(n_o\) outputs, this becomes
\[\log{L(\theta, \eta , \sigma | X^{\text{obs}})} = -\frac{n_t n_o}{2} \log{2 \pi} -\sum ^{n_o}_{j=1}\sum_{i=1}^{n_t}{\log{f_j(t_i, \theta)^\eta \sigma _j}} -\frac{1}{2}\sum ^{n_o}_{j=1}\sum_{i=1}^{n_t}\left( \frac{X^{\text{obs}}_{ij} - f_j(t_i, \theta)} {f_j(t_i, \theta)^\eta \sigma _j}\right) ^2,\]where \(n_o\) is the number of outputs of the model, and \(X^{\text{obs}}_{ij}\) the measurement of output \(j\) at time point \(t_i\).
Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem two parameters are added (\(\eta\), \(\sigma\)), for a multi-output problem 2 times \(n_o\) parameters are added.-
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
-
-
class
pints.
ScaledLogLikelihood
(log_likelihood)[source]¶ Calculates a log-likelihood based on a (conditional)
ProblemLogLikelihood
divided by the number of time samples.The returned value will be
(1 / n) * log_likelihood(x|problem)
, wheren
is the number of time samples multiplied by the number of outputs.This log-likelihood operates on both single and multi-output problems.
Extends
ProblemLogLikelihood
.Parameters: log_likelihood – A ProblemLogLikelihood
to scale.-
evaluateS1
(x)[source]¶ See
LogPDF.evaluateS1()
.This method only works if the underlying
LogPDF
object implements the optional methodLogPDF.evaluateS1()
!
-
n_parameters
()¶
-
-
class
pints.
StudentTLogLikelihood
(problem)[source]¶ Calculates a log-likelihood assuming independent Student-t-distributed noise at each time point, and adds two parameters: one representing the degrees of freedom (
nu
), the other representing the scale (sigma
).For a noise characterised by
nu
andsigma
, the log likelihood is of the form:\[\log{L(\theta, \nu, \sigma|\boldsymbol{x})} = N\frac{\nu}{2}\log(\nu) - N\log(\sigma) - N\log B(\nu/2, 1/2) -\frac{1+\nu}{2}\sum_{i=1}^N\log(\nu + \frac{x_i - f(\theta)}{\sigma}^2)\]where
B(.,.)
is a beta function.Extends
ProblemLogLikelihood
.Parameters: problem – A SingleOutputProblem
orMultiOutputProblem
. For a single-output problem two parameters are added(nu, sigma)
, wherenu
is the degrees of freedom andsigma
is scale, for a multi-output problem2 * n_outputs
parameters are added.-
evaluateS1
(x)¶ Evaluates this LogPDF, and returns the result plus the partial derivatives of the result with respect to the parameters.
The returned data is a tuple
(L, L')
whereL
is a scalar value andL'
is a sequence of lengthn_parameters
.Note that the derivative returned is of the log-pdf, so
L' = d/dp log(f(p))
, evaluated atp=x
.This is an optional method that is not always implemented.
-
n_parameters
()¶
-
-
class
pints.
UnknownNoiseLogLikelihood
(problem)[source]¶ Deprecated alias of
GaussianLogLikelihood
-
evaluateS1
(x)¶ See
LogPDF.evaluateS1()
.
-
n_parameters
()¶
-