Monday, October 9, 2017

Study the Universe with Python tutorial, part 2 -- power spectrum

In the first part of this series we discussed how to download the galaxy catalogue of the Baryon Oscillation Spectroscopic Survey (BOSS), and we made some plots of the galaxy distribution. In this blog post we will calculate a summary statistic of the dataset, which is the first step towards using the data to constrain cosmological parameters such as the amounts of dark matter and dark energy.

As we saw in the last blog post, the BOSS dataset contains about 1 million galaxies and their distribution in the Universe. The positions of these galaxies are what carry the cosmological information. Imagine there were more dark matter in the Universe. The additional matter would exert additional gravitational force, clumping the material together more strongly. With less dark matter, on the other hand, the galaxies would be more spread out.
Figure 1: Distribution of galaxies in a Universe with 10% of its total energy in the form of dark matter (left).
Figure 2: Distribution of galaxies in a Universe with 30% of its total energy in the form of dark matter (right).
Even though the two Figures above look very similar, you can see that the points are more spread out in the left Figure than in the right one. The only difference is that on the left 10% of all the energy in the Universe is in the form of dark matter ($\Omega_{cdm} = 0.1$), while on the right it is 30% ($\Omega_{cdm} = 0.3$).

We can now compare these distributions with our data from the BOSS survey and depending on which of the two distributions looks more like the data, we can determine how much dark matter there is in the Universe.

In practice we do not want to compare the actual distributions of galaxies. Instead we compare a summary statistic, meaning a measurement derived from these galaxies which carries the information we need. There are many choices for such a summary statistic, but here we will use the power spectrum.
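The idea behind the power spectrum can be sketched with plain numpy before we use nbodykit below: assign the points to a 3D grid, Fourier transform the density contrast, and average $|\delta_k|^2$ in spherical bins of wavenumber $k$. This is only a toy estimator (it ignores the survey geometry and weights, which nbodykit handles properly); the box size, mesh size, and number of bins are arbitrary choices for illustration.

```python
import numpy as np

def toy_power_spectrum(positions, boxsize=1000.0, nmesh=64, nbins=20):
    ''' Toy P(k) estimator for points in a periodic box (no survey geometry). '''
    # Assign points to a 3D grid (simple histogram = nearest-grid-point)
    grid, _ = np.histogramdd(positions, bins=nmesh, range=[(0, boxsize)] * 3)
    # Density contrast delta = n / nbar - 1
    delta = grid / grid.mean() - 1.0
    # Fourier transform; normalise so P(k) has units of volume
    delta_k = np.fft.rfftn(delta)
    power = np.abs(delta_k) ** 2 * boxsize ** 3 / nmesh ** 6
    # Wavenumber of every grid cell in Fourier space
    kf = 2 * np.pi * np.fft.fftfreq(nmesh, d=boxsize / nmesh)
    kf_r = 2 * np.pi * np.fft.rfftfreq(nmesh, d=boxsize / nmesh)
    kx, ky, kz = np.meshgrid(kf, kf, kf_r, indexing='ij')
    kmag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)
    # Spherically average |delta_k|^2 in bins between the fundamental
    # mode and the Nyquist frequency of the mesh
    bins = np.linspace(2 * np.pi / boxsize, np.pi * nmesh / boxsize, nbins + 1)
    idx = np.digitize(kmag.ravel(), bins)
    pk = np.array([power.ravel()[idx == i].mean() for i in range(1, nbins + 1)])
    k_centres = 0.5 * (bins[1:] + bins[:-1])
    return k_centres, pk
```

For purely random (unclustered) points this returns an approximately flat spectrum, the shot noise; clustered points show excess power on large scales.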

To calculate the power spectrum we first need to download two more catalogues:
wget -N https://data.sdss.org/sas/dr12/boss/lss/random0_DR12v5_CMASSLOWZTOT_North.fits.gz -P path/to/folder/
wget -N https://data.sdss.org/sas/dr12/boss/lss/random0_DR12v5_CMASSLOWZTOT_South.fits.gz -P path/to/folder/
which contain a random distribution of points. These distributions are needed to calibrate the data catalogues.

To use these catalogues we write a quick read-in function, which should look like this
import os
import nbodykit.lab

def read_ran(filename):
    ''' Read the random catalogues and cut them to the survey redshift range '''
    # base_dir is the folder holding the downloaded catalogues (set in part 1)
    randoms_cat = nbodykit.lab.FITSCatalog(os.path.join(base_dir, filename))
    print('randoms_cat.columns = ', randoms_cat.columns)
    # Keep only objects within the redshift range covered by BOSS
    randoms_cat = randoms_cat[(randoms_cat['Z'] > 0.01) & (randoms_cat['Z'] < 0.9)]
    return randoms_cat
Now we can combine the data catalogues and random catalogue, assign the point distributions to a 3D grid and calculate the power spectrum, all using nbodykit.
# Combine data and random catalogue
fkp = nbodykit.lab.FKPCatalog(data, random)
# Assign point distribution to 3D grid
mesh = fkp.to_mesh(Nmesh=512, nbar='NZ', comp_weight='WEIGHT', fkp_weight='WEIGHT_FKP')
# Calculate power spectrum (monopole only)
r = nbodykit.lab.ConvolvedFFTPower(mesh, poles=[0], dk=0.01, kmin=0.01)
Here we use a grid with 512 grid points in each of the 3 dimensions. The calculation also includes two weightings, given by the 'WEIGHT' and 'WEIGHT_FKP' columns. The first weight corrects for incompleteness in the data catalogue, while the second optimises the signal-to-noise ratio.
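The FKP weight does not need to be computed here, since the BOSS catalogues already provide it, but its form is simple: following Feldman, Kaiser & Peacock (1994) it downweights regions of high number density, $w_{FKP} = 1/(1 + n(z)\,P_0)$, where $n(z)$ is the tracer number density and $P_0 \approx 10000\,(\mathrm{Mpc}/h)^3$ is the value commonly used for BOSS (treat that number as an assumption in this sketch):

```python
import numpy as np

def fkp_weight(nz, P0=10000.0):
    ''' FKP weight w = 1 / (1 + n(z) * P0).

    nz: tracer number density at each object's redshift, in (h/Mpc)^3
        (this is the 'NZ' column of the BOSS catalogues).
    P0: fiducial power spectrum amplitude in (Mpc/h)^3.
    '''
    nz = np.asarray(nz, dtype=float)
    return 1.0 / (1.0 + nz * P0)
```

Dense regions (large $n(z)$) get a small weight so that their sample-variance-dominated modes do not dominate the measurement, while sparse regions get a weight close to 1.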

If we use this code to measure the BOSS power spectrum for the northern and southern parts we get
Figure 3: Power spectrum measurements for the northern (orange)
and southern (blue) parts of the BOSS dataset. 
Here we see that the power spectra of the northern and southern parts are slightly different. Whether these differences are significant is not clear though, since we don't have uncertainties on these measurements (yet).

We could now compare these measurements with models for the power spectrum like
Figure 4: Power spectrum models with different amounts of dark
matter, computed with the CLASS Boltzmann code.
This would allow us to determine the amount of dark matter in the Universe and would be much more practical compared to the comparison of point distributions which we looked at above.
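Once uncertainties are available, such a model comparison boils down to something like a chi-square: for each candidate model power spectrum, sum the squared, error-weighted differences to the measurement and pick the parameter value that minimises it. A toy illustration (all numbers here are invented; real model spectra would come from a Boltzmann code such as CLASS):

```python
import numpy as np

def chi_square(pk_data, pk_model, sigma):
    # Error-weighted squared differences, assuming uncorrelated k bins
    return np.sum(((pk_data - pk_model) / sigma) ** 2)

# Fake 'measurement' with 5% errors, and two fake 'model' spectra
# labelled by a hypothetical dark matter fraction
k = np.linspace(0.01, 0.2, 20)
pk_data = 2e4 * np.exp(-k / 0.1)
sigma = 0.05 * pk_data
models = {0.1: 2e4 * np.exp(-k / 0.08),
          0.3: 2e4 * np.exp(-k / 0.1)}

# Best-fitting parameter = smallest chi-square
best = min(models, key=lambda om: chi_square(pk_data, models[om], sigma))
```

In this toy setup the 0.3 model matches the fake data exactly and therefore wins; with real data the minimum is found by scanning (or sampling) the parameter space.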

However, before we can go ahead and constrain cosmological parameters we have to get an estimate of the uncertainty on the measurements. That will be the subject of the next post.
You can find the code for this project on GitHub.
cheers
Florian
