First, we need to download the data. We are going to use the dataset produced by the Baryon Oscillation Spectroscopic Survey (BOSS) collaboration which represents the largest dataset of its kind to date. You can download all necessary data with the following 2 wget commands, just provide the destination path at the end. This will download a bit less than 3Gb of zipped data which turns into 8Gb after you unzip it.
wget -N https://data.sdss.org/sas/dr12/boss/lss/galaxy_DR12v5_CMASSLOWZTOT_North.fits.gz -P path/to/folder/ wget -N https://data.sdss.org/sas/dr12/boss/lss/galaxy_DR12v5_CMASSLOWZTOT_South.fits.gz -P path/to/folder/
conda create --name nbodykit-env python=3 source activate nbodykit-env conda install -c bccp nbodykit
pip install nbodykit
We start by reading in the data catalog using the fitsio package, which is included in nbodykit
import nbodykit.lab def read_data(filename): ''' Here we read the data catalogues ''' galaxies_cat = nbodykit.lab.FITSCatalog(os.path.join(base_dir, filename)) print('galaxies_cat.columns = ', galaxies_cat.columns) galaxies_cat = galaxies_cat[(galaxies_cat['Z'] > 0.01) & (galaxies_cat['Z'] < 0.9)] galaxies_cat['Weight'] = galaxies_cat['WEIGHT_SYSTOT'] * (galaxies_cat['WEIGHT_NOZ'] + galaxies_cat['WEIGHT_CP'] - 1.0) return galaxies_cat
galaxies_cat.columns = ['AIRMASS', 'CAMCOL', 'COMP', 'DEC', 'DEVFLUX', 'EB_MINUS_V', 'EXPFLUX', 'EXTINCTION', 'FIBER2FLUX', 'FIBERID', 'FIELD', 'FINALN', 'FRACPSF', 'ICHUNK', 'ICOLLIDED', 'ID', 'IMAGE_DEPTH', 'IMATCH', 'INGROUP', 'IPOLY', 'ISECT', 'MJD', 'MODELFLUX', 'MULTGROUP', 'NZ', 'PLATE', 'PSFFLUX', 'PSF_FWHM', 'RA', 'RERUN', 'RUN', 'R_DEV', 'SKYFLUX', 'SPECTILE', 'Selection', 'TILE', 'Value', 'WEIGHT_CP', 'WEIGHT_FKP', 'WEIGHT_NOZ', 'WEIGHT_SEEING', 'WEIGHT_STAR', 'WEIGHT_SYSTOT', 'Weight', 'Z']
In this tutorial, we will use only a subset of the columns given here. Namely, we will only use the positions of the galaxies given by the angles on the sky in 'RA' (right ascension) and 'DEC' (declination) as well as the redshift 'Z'. The redshift describes how much the light spectrum of the galaxy has been shifted in wavelength due to the expansion of the Universe. Most of the galaxies in this dataset are several billion light-years away from us, meaning the light travelled a significant fraction of the existence of the Universe before reaching us. Galaxies which are further away have a larger redshift and we will use the redshift later together with the right ascension and declination to get a 3D position for all galaxies.
Let's explore the dataset a bit before we go any further. We can read in the data
filename = 'galaxy_DR12v5_CMASSLOWZTOT_South.fits' data_south = read_data(filename) filename = 'galaxy_DR12v5_CMASSLOWZTOT_North.fits' data_north = read_data(filename) plot_data(data_south, data_north)
def plot_data(cat1, cat2): ''' Plot the catalogues ''' cat1 = down_sample(cat1, 1000) cat1['RA'] -= 180 cat2 = down_sample(cat2, 1000) cat2['RA'] -= 180 plt.clf() plt.subplot(projection="aitoff") plt.title("BOSS DR12 survey footprint", y=1.1) plt.plot(np.radians(cat1['RA'].compute()), np.radians(cat1['DEC'].compute()), '.') plt.plot(np.radians(cat2['RA'].compute()), np.radians(cat2['DEC'].compute()), '.') plt.grid(True) plt.show()
def down_sample(cat, N): x = np.random.uniform(size=cat.size) return cat[x < float(N) / cat.size]
Figure 1: The sky distribution of galaxies in the BOSS galaxy survey using Aitoff projection.
Only 1000 points for each region are plotted.
|
The northern part has about twice as many galaxies as the southern part and in total this dataset covers about 1 quarter of the sky.
We can also plot the redshift distribution
bins = np.arange(0., 0.9, 0.01) plt.hist(cat1['Z'].compute(), weights=cat1['Weight'].compute(), bins=bins, alpha=0.4, label='BOSS South') plt.hist(cat2['Z'].compute(), weights=cat2['Weight'].compute(), bins=bins, alpha=0.4, label='BOSS North') plt.xlabel("redshift") plt.legend(loc=0) plt.show()
Figure 2: Redshift distribution of the BOSS galaxy sample. Redshift
0.7 corresponds to about 6 000 000 000 (6 billion) light-years. So when the light from
these galaxies was sent out the Universe was only about 7 billion years old,
while now it is 13 billion years old.
while now it is 13 billion years old.
The next step is to turn the RA, DEC, z system into Cartesian coordinates. However, this transformation depends on the cosmological model. Using the current standard model of cosmology (also called $\Lambda$CDM) we can set up a cosmological framework by fixing a small set of parameters
(1) The amount of baryonic matter (Omega0_b)
(2) The amount of cold dark matter (Omega0_cdm)
(1) The amount of baryonic matter (Omega0_b)
(2) The amount of cold dark matter (Omega0_cdm)
(3) The Hubble parameter h (or more precisely the Hubble parameter divided by 100, h = H/100)
In principle, there are more free parameters, but these are the ones that matter to this analysis. Using nbodykit we can set a cosmology with
Here we also specified that we want to use a flat cosmology, which has no curvature which means that the dark energy parameter is automatically fixed through
cosmo = nbodykit.lab.cosmology.Cosmology(Omega0_cdm=0.265, Omega0_b=0.045, h=0.7)
\[
\Omega_{DarkEnergy} = 1 - \Omega_{cdm} + \Omega_b.
\]These parameters constrain the expansion history of the Universe, including its size and age. The dataset we just read with our Python script is powerful enough to constrain these parameters and therefore teach us something about dark matter and dark energy. We will get to that in a later post.
Here we want to convert the angles (RA, DEC) together with the redshift to a 3D position for the galaxies. To do this we can just call a function provided by nobodykit
with this, we can now make a 3D plot of all galaxies
\Omega_{DarkEnergy} = 1 - \Omega_{cdm} + \Omega_b.
\]These parameters constrain the expansion history of the Universe, including its size and age. The dataset we just read with our Python script is powerful enough to constrain these parameters and therefore teach us something about dark matter and dark energy. We will get to that in a later post.
Here we want to convert the angles (RA, DEC) together with the redshift to a 3D position for the galaxies. To do this we can just call a function provided by nobodykit
galaxies_cat_1['Position'] = nbodykit.transform.SkyToCartesian(galaxies_cat_1['RA'],\ galaxies_cat_1['DEC'], galaxies_cat_1['Z'], cosmo=cosmo) galaxies_cat_2['Position'] = nbodykit.transform.SkyToCartesian(galaxies_cat_2['RA'],\ galaxies_cat_2['DEC'], galaxies_cat_2['Z'], cosmo=cosmo)
Figure 3: 3D distribution of BOSS galaxies. Only 1000 randomly selected
galaxies out of more than 1 million in this dataset are included in this plot.
Milky Way is located in the middle and the galaxies in the northern (yellow)
and southern (blue) part are observed in two opposite directions.
And this concludes this first look at the BOSS dataset. This dataset has only been released last year (2016) and represents cutting edge cosmological research. How to use this dataset to constrain dark matter and dark energy will be the subject of the next blog posts.
The entire Python code for this project is available on GitHub.
The entire Python code for this project is available on GitHub.
cheers
Florian
Florian
Just a small typo, "from nbodykit.lab" should probably be "import nbodykit.lab". Also for some strange reason, something in nbodykit is raising strange errors if you try to run this in a Jupyter notebook. Do you understand why?
ReplyDeleteHuge Thanks for this site, it is very useful to the begin the studies with cosmology.
ReplyDelete