Wednesday, January 3, 2018

Plotting MCMC chains in Python using getdist

This is a quick introduction to the getdist package by Antony Lewis, which allows visualizing MCMC chains.
from getdist import plots, MCSamples
import numpy as np

def main():
    mean = [0, 0, 0]
    cov = [[1, 0, 0], [0, 100, 0], [0, 0, 8]]
    x1, x2, x3 = np.random.multivariate_normal(mean, cov, 5000).T
    s1 = np.c_[x1, x2, x3]

    mean = [0.2, 0.5, 2.]
    cov = [[0.7, 0.3, 0.1], [0.3, 10, 0.25], [0.1, 0.25, 7]]
    x1, x2, x3 = np.random.multivariate_normal(mean, cov, 5000).T
    s2 = np.c_[x1, x2, x3]

    names = ['x_1', 'x_2', 'x_3']
    samples1 = MCSamples(samples=s1, labels=names, label='sample1')
    samples2 = MCSamples(samples=s2, labels=names, label='sample2')

    g = plots.getSubplotPlotter()
    g.triangle_plot([samples1, samples2], filled=True)
    g.export('triangle_plot.png')
    g.export('triangle_plot.pdf')
    return 

if __name__ == "__main__":
    main()
which results in

We can also plot the constraints on the individual parameters
def get_constraints(samples):
    for i, mean in enumerate(samples.getMeans()):
        upper = samples.confidence(i, upper=True, limfrac=0.05)
        print("\nupper limit 95 C.L. = %f" % upper)
        lower = samples.confidence(i, upper=False, limfrac=0.05)
        print("lower limit 95 C.L. = %f" % lower)
        print("%s = %f +/- %f +/- %f" % (samples.parLabel(i),\
        mean, mean - samples.confidence(i, limfrac=0.16),\
        mean - samples.confidence(i, limfrac=0.025)) )
    return 
which in our case looks like this
upper limit 95 C.L. = 1.653261
lower limit 95 C.L. = -1.637202
x_1 = 0.000167 +/- 0.996626 +/- 1.958270

upper limit 95 C.L. = 16.365135
lower limit 95 C.L. = -16.484502
x_2 = -0.010400 +/- 9.874866 +/- 19.682727

upper limit 95 C.L. = 4.637981
lower limit 95 C.L. = -4.656869
x_3 = -0.009280 +/- 2.827152 +/- 5.571246
While in our case we set the variance and covariance of the parameters manually at the beginning to generate the samples, usually one gets an MCMC chain and wants to determine these parameters. To get the covariance matrix between two parameters we have to use the cov() function and provide the index of the parameters
print("cov(x_1, x_3) = ", samples2.cov(pars=[0,2]))
print("cov(x_1, x_2) = ", samples2.cov(pars=[0,1]))
which in our case just reproduces the input values
cov(x_1, x_3) =  [[ 0.69364079  0.10129875]
 [ 0.10129875  7.13148423]]
cov(x_1, x_2) =  [[  0.69364079   0.30451831]
 [  0.30451831  10.05805092]]
I posted the example code above on GitHub
best
Florian

1 comment:

  1. Thanks for sharing. It was really helpful.
    Also, I am trying to get estimates for my model with 5 parameters using Hubble and Sn1 data(It works for the standard model with two parameters). However, my values of all parameters have values like 2 with errors ranging from +1.5 to -1.2, How do I constrain my parameters more accurately?

    ReplyDelete