4th Blog Post: Studying CR7 with ALMA

We began this week by analysing the negative slices, using the same methodology as the week before. The intention was to measure the extent to which noise could have affected our detections in the positive maps. The principle underpinning this technique is that any negative flux within our slices is of nonphysical origin. Hence, by calculating its significance in an aperture set to the size of the PSF (point spread function), we can begin a statistical analysis of our data and start computing a luminosity function for the region around CR7.


The above graph is a histogram of the signal to noise of all the apertures in our sample above 3σ. Clumps A, B and C of CR7 are highlighted by the lines. The positive apertures are shown in red, and those from the negative map in blue. At the lower end there are roughly as many bins with more positive than negative detections as the other way round, but towards the highest signal to noise we see an excess of positive over negative detections. These are expected to correspond to real detections. Even more interestingly, the highest signal to noise source we detected was in fact a negative one, at 5.21σ. Noise this significant has interesting implications for the nature of the noise in the ALMA data, and we wonder whether it has the same origin as the noise in the rest of the cube or comes from something completely different to do with the observational apparatus.
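One way to put a number on the comparison between the red and blue histograms is to count positive and negative apertures above a given cut and take 1 - N_neg/N_pos as a rough estimate of the fraction of positive detections that are real (sometimes called fidelity or purity). This is only a sketch of that idea, not the script we used, and the S/N values in it are made up for illustration.

```python
def count_above(snr_values, threshold):
    """Number of apertures with S/N at or above the threshold."""
    return sum(1 for s in snr_values if s >= threshold)


def fidelity(pos_snr, neg_snr, threshold):
    """Rough fraction of positive detections expected to be real above a cut.

    Returns None when no positive aperture passes the cut.
    """
    n_pos = count_above(pos_snr, threshold)
    n_neg = count_above(neg_snr, threshold)
    if n_pos == 0:
        return None
    # More negatives than positives would give a negative value, so clip at 0.
    return max(0.0, 1.0 - n_neg / n_pos)


# Made-up S/N values, for illustration only (not our measurements).
pos_snr = [3.1, 3.2, 3.4, 3.6, 3.9, 4.1, 4.3, 4.6, 4.9]
neg_snr = [3.0, 3.3, 3.5, 3.7, 4.0, 5.2]

for cut in (3.0, 4.0, 4.5):
    print(cut, count_above(pos_snr, cut), count_above(neg_snr, cut),
          fidelity(pos_snr, neg_snr, cut))
```

A single very significant negative, like our 5.21σ one, drags this estimate down at high cuts, which is one reason to look at the whole histogram rather than a single threshold.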

What we want to achieve with this data is to compute a ‘CII Luminosity Function’ for the region surrounding CR7, and to compare it to theoretical models for this place and time in the Universe. A luminosity function relates the number density of galaxies to the luminosity they emit. This will allow us to see whether this is an overdense region, which would have implications for the nature of reionization in the Universe. One of the first steps in achieving this is to calculate the volume of the region covered by the cube, so that we can determine the number density of galaxies in this region of space.
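As a rough sketch of what computing a luminosity function involves: bin the sources in log luminosity, then divide the count in each bin by the survey volume and the bin width. The luminosities below are hypothetical, the bin edges are arbitrary, and the errors are simple Poisson errors; the volume is the cube volume quoted in this post. No completeness or noise correction is applied here.

```python
import math


def luminosity_function(luminosities, volume_mpc3, log_lo, log_hi, n_bins):
    """Binned luminosity function.

    Returns a list of (bin_centre_logL, phi, phi_err), with phi in
    Mpc^-3 dex^-1. Sources outside [log_lo, log_hi) are ignored.
    """
    width = (log_hi - log_lo) / n_bins
    counts = [0] * n_bins
    for lum in luminosities:
        i = int((math.log10(lum) - log_lo) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    result = []
    for i, n in enumerate(counts):
        centre = log_lo + (i + 0.5) * width
        phi = n / (volume_mpc3 * width)
        phi_err = math.sqrt(n) / (volume_mpc3 * width)  # Poisson error
        result.append((centre, phi, phi_err))
    return result


# Hypothetical CII luminosities in solar units (not our catalogue).
example_luminosities = [3e7, 5e7, 6e7, 2e8, 4e8]
lf = luminosity_function(example_luminosities, volume_mpc3=12.33,
                         log_lo=7.0, log_hi=9.0, n_bins=2)
for centre, phi, phi_err in lf:
    print(f"log L = {centre:.1f}: phi = {phi:.3f} +/- {phi_err:.3f} Mpc^-3 dex^-1")
```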

In order to calculate the volume of the ALMA data cube, we started by looking at the redshift difference between the near end and the far end of the cube. Using z=6.604 as the redshift at the centre of the cube, we could calculate the redshift at each end from the velocity offset. In one direction the velocity offset was 957 km/s and in the other -1289 km/s. Using Δz = Δv/c, where c is the speed of light, we calculated the redshift at each end of the cube. We could then use Ned Wright’s JavaScript Cosmological Calculator to find the co-moving volume out to each redshift, with the difference between the two giving us the volume of the shell within which the cube lies. To find the volume of the cube, we then multiplied this shell volume by the area of one image slice divided by the 4π steradians of the whole sky (this is the fraction of the sky that the cube takes up). Each slice was found, from GAIA, to have an area of 51.2 arcseconds × 51.2 arcseconds. The total volume of our cube was calculated to be 12.33 cubic megaparsecs.
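The steps above can be sketched in a few lines of Python. This is not the calculation we actually ran (we used the online calculator); it assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3, which are assumed values and not necessarily the calculator's settings, and it integrates the comoving distance numerically. It follows the Δz = Δv/c relation used in the text; a (1+z) factor is often included in that conversion and would change the answer, but it is not applied here.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """Comoving distance in Mpc for flat LambdaCDM (Simpson's rule, even steps)."""
    omega_l = 1.0 - omega_m

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (C_KM_S / h0) * total * h / 3.0


def full_sky_volume_mpc3(z, **cosmo):
    """Comoving volume of the whole sky out to redshift z (flat universe)."""
    return 4.0 / 3.0 * math.pi * comoving_distance_mpc(z, **cosmo) ** 3


def cube_volume_mpc3(z_centre, dv_a, dv_b, side_arcsec, **cosmo):
    """Volume of a square field between two velocity offsets (km/s)."""
    z_a = z_centre + dv_a / C_KM_S  # delta z = delta v / c, as in the text
    z_b = z_centre + dv_b / C_KM_S
    shell = abs(full_sky_volume_mpc3(z_a, **cosmo)
                - full_sky_volume_mpc3(z_b, **cosmo))
    side_rad = math.radians(side_arcsec / 3600.0)
    sky_fraction = side_rad ** 2 / (4.0 * math.pi)  # small-angle approximation
    return shell * sky_fraction


volume = cube_volume_mpc3(6.604, 957.0, -1289.0, 51.2)
print(f"{volume:.2f} Mpc^3")
```

With these assumed parameters the result should land in the same ballpark as the figure quoted above; small differences are expected from the choice of cosmology.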


3rd Blog Post: Studying CR7 with ALMA

Following on from last week, we began trying to automate the whole method so that we could identify sources in the slices of the cube more quickly and easily. To do this we needed to edit the original scripts we were given by David so that they could automatically import data from a table and print the results into a file once we had identified all the possible sources on each slice in GAIA. We began by learning more about coding with the help of Google, and after many failed attempts we got the script to calculate the signal to noise ratio and the star formation rate at the positions of the potential sources.

We began our analysis of the slices by running the noise measuring script over 50,000 apertures to give us the best value of 3σ, which we used as a cut in GAIA to suppress most of the noise and make spotting candidates for real sources an easier job. This script printed the RMS (standard deviation) in Jy km/s and in counts/pixel, which was used as sigma. We then wanted to extract the X and Y pixel coordinates from the image, to be read by our modified script, which creates its own apertures, measures the CII flux and calculates other properties of each aperture such as star formation rate (SFR) and signal to noise.

We used GAIA’s aperture photometry tool to extract X and Y pixel coordinates for any candidate sources because this tool, when placing the apertures, ‘snaps’ on to the region of highest flux. This gives us the best possible coordinates at which to measure the CII flux through a given aperture using our script. We placed these apertures over every single point, however bright or faint, knowing that the script will measure the signal to noise regardless of the position. We can later determine a signal to noise threshold below which we discard all data, to maximize the statistical chance of the sources we observe being real.

Figure 1: 1-Raw ‘noisy’ data slice. 2-Removing pixel counts below 3σ. 3-Candidate sources. 4-GAIA apertures (PSF)

We each followed this exact method through each of the ‘slices’ of the ALMA cube, so that we could then compare our results to improve their accuracy and demonstrate the reproducibility of the experiment. The script could now print into a file a table of the coordinates of each source, its flux, luminosity, signal to noise ratio and star formation rate.

This data was then imported into Topcat so that we could plot the cube with the positions of all the potential sources with signal to noise greater than 3σ, 4σ and 5σ. This gave an image like the one below.

Figure 2: 1-3σ cut-off for our sources. 2-4σ cut-off for our sources. 3-5σ cut-off for our sources.

In each slice we found many sources with 3σ significance and on average perhaps two to three with 4σ, and in only two slices did we find 5σ sources, as can be seen in the gif above. Next week, we will be determining a cutoff point by looking at the negatives of these slices. The negatives allow us to do this because, by reversing the sign of the image values, all of the actual sources are removed and the most negative (and hence noisiest) regions become visible above the 3σ limit. By analysing these negative images we can determine an S/N threshold above which we can pick out our most robust sources.

2nd Blog Post: Studying CR7 with ALMA

After two productive weeks working with the team on the ALMA project, we have developed a good level of insight into the fundamental principles underlying the method we will use to search for CII line emitters in the region around CR7.

This has included expanding on our method for determining the noise in the CII image. We began taking more localized noise measurements around anything we identified as a potential source. This is because the noise may be non-uniform across the image, and some regions may be systematically noisier than others. Several problems arose from this, however. We found that many of the sub-3σ sources that we identified became significant where the local noise was lower, which is fine, but many of the 3σ sources, including clump A of CR7, were now discounted using this method. This was because we were only making 50-100 apertures for each individual noise measurement, and only about 300 for the overall image. This can be fixed by using a script to repeat the process. David sent us a script for this purpose which can be set to repeat the process for any number of apertures, and it generates an astoundingly clean Gaussian distribution in our histogram, which is what we expected. Using statistics from this distribution, this more accurate method of determining the noise level will form part of the basis of the rest of our project.
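To illustrate the idea (this is not David's script), here is a minimal sketch that drops many circular apertures at random positions on an image, sums the pixels in each, and takes the standard deviation of the sums as σ. The image is synthetic, uncorrelated Gaussian noise, whereas real ALMA maps have noise that is correlated on the scale of the beam, and the aperture radius and number of apertures are arbitrary choices.

```python
import random
import statistics


def aperture_offsets(radius):
    """Pixel offsets (dx, dy) that fall inside a circular aperture."""
    r = int(radius)
    return [(dx, dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if dx * dx + dy * dy <= radius * radius]


def random_aperture_sums(image, radius, n_apertures, rng):
    """Summed flux in n_apertures apertures placed at random positions."""
    ny, nx = len(image), len(image[0])
    r = int(radius)
    offsets = aperture_offsets(radius)
    sums = []
    for _ in range(n_apertures):
        # Keep the whole aperture inside the image.
        x = rng.randint(r, nx - 1 - r)
        y = rng.randint(r, ny - 1 - r)
        sums.append(sum(image[y + dy][x + dx] for dx, dy in offsets))
    return sums


rng = random.Random(1)
# Synthetic 100x100 noise image with unit standard deviation per pixel.
image = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(100)]

sums = random_aperture_sums(image, radius=3, n_apertures=5000, rng=rng)
sigma = statistics.stdev(sums)
print(f"mean = {statistics.mean(sums):.2f}, sigma = {sigma:.2f}")
```

With only a few hundred apertures the estimate of σ jumps around from run to run, which is the problem we hit when doing this by hand; with thousands, the histogram of `sums` settles towards a Gaussian shape.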

Histogram showing a Gaussian distribution for noise measurements.

We have also been considering what physical properties can be inferred from the data extracted from these images. In particular, how CII luminosity can be measured, and how this CII luminosity can then be used to estimate a star formation rate. For CII luminosity, we found that ALMA data is calibrated using nearby sources of known luminosity such as bright quasars; by comparing these to the flux observed from a source, a CII luminosity can be calculated. GAIA can in fact read this calibrated information, which is stored within the .fits files. This is in units of mJy (milli-Jansky), a measure of spectral flux density used in astrophysics. Another script sent to us by David is made to measure the flux from a source. This script calculated the CII luminosity of clump A of CR7 to be L=0.50×10^8 solar luminosities.
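For reference, a commonly used relation for converting a velocity-integrated line flux into a line luminosity is L = 1.04×10^-3 × SΔv × ν_obs × D_L^2, with SΔv in Jy km/s, ν_obs in GHz, D_L in Mpc and L in solar luminosities. The sketch below applies it to [CII]; we have not confirmed that David's script uses exactly this form. The flux and the luminosity distance in the example are assumed round numbers, not our measurements, and the distance should be replaced by a value from a cosmology calculator.

```python
NU_CII_REST_GHZ = 1900.537  # rest frequency of the [CII] 158 micron line


def line_luminosity_lsun(flux_jy_kms, z, d_l_mpc, nu_rest_ghz=NU_CII_REST_GHZ):
    """Line luminosity in solar luminosities.

    flux_jy_kms : velocity-integrated line flux in Jy km/s
    z           : redshift of the source
    d_l_mpc     : luminosity distance in Mpc (supplied by the caller)
    """
    nu_obs_ghz = nu_rest_ghz / (1.0 + z)
    return 1.04e-3 * flux_jy_kms * nu_obs_ghz * d_l_mpc ** 2


# Assumed inputs for illustration: 0.05 Jy km/s and D_L of about 6.5e4 Mpc.
lum = line_luminosity_lsun(0.05, 6.604, 6.5e4)
print(f"L_CII ~ {lum:.2e} L_sun")
```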

This information can then be used to infer a star formation rate, using an equation from a paper by Vallini et al. in which a CII luminosity-star formation rate relationship is inferred from the best fit line to observed data, taking into account how star formation rate varies with CII luminosity and metallicity of the source. Using observed data and taking a lower bound on the metallicity, this equation gave a star formation rate for clump A of CR7 of ~44 solar masses per year. This is a reasonable result, of the right order of magnitude, and varies only according to which initial mass function you use.
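As a sketch of how such a relation is inverted for the star formation rate: the fitting formula is linear in log SFR at fixed metallicity, so it can be solved directly. The coefficients below are the fitting formula of Vallini et al. (2015) as we understand it and should be checked against the paper before being relied on. The metallicity of 0.05 solar is an assumed value standing in for the ‘lower bound’ mentioned above.

```python
import math


def log_lcii_vallini(log_sfr, log_z):
    """log10 of CII luminosity (solar units) from log10 SFR and log10 Z.

    Coefficients to be checked against Vallini et al. (2015).
    SFR is in solar masses per year, Z in solar units.
    """
    return (7.0 + 1.2 * log_sfr + 0.021 * log_z
            + 0.012 * log_sfr * log_z - 0.74 * log_z ** 2)


def sfr_from_lcii(l_cii_lsun, metallicity_zsun):
    """Invert the relation for SFR at a fixed metallicity."""
    log_z = math.log10(metallicity_zsun)
    const = 7.0 + 0.021 * log_z - 0.74 * log_z ** 2
    slope = 1.2 + 0.012 * log_z
    return 10 ** ((math.log10(l_cii_lsun) - const) / slope)


# Luminosity of clump A from the text; Z = 0.05 solar is an assumption.
sfr = sfr_from_lcii(0.50e8, 0.05)
print(f"SFR ~ {sfr:.0f} solar masses per year")
```

The result is very sensitive to the assumed metallicity, so a different lower bound shifts the star formation rate considerably.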

We will be using methods similar to those outlined in Emma’s blog post from last week, using both Python scripts and inspection by eye to analyze ‘slices’ from a cube of ALMA data. This ‘cube’ is essentially an image with a third dimension of distance, which comes from small changes in the redshift of the emission line. This 3D image can be split into ‘slices’ of space which we can analyze individually, first determining the noise using a scripted version of the method outlined in Emma’s previous blog post. Using statistical data from the noise, we can use GAIA’s features, such as aperture photometry and SExtractor, to identify potential candidates for CII line emitters in the region around CR7. The unique feature of a cube is that we can create a 3D map from the catalog of any observed emitters. We will also be using a CII flux measurement script to calculate the CII flux in a given aperture size (set to the point spread function of the image) to measure the CII luminosity.

After generating a catalog of sources from this ALMA data cube, we can begin exploring data from Hubble and search for different emission lines that we can use to confirm which of the catalog entries actually correspond to real sources. We are hoping to find a few sources in the region surrounding CR7 using this method.

The ALMA data cube.

1st Blog Post: Studying CR7 with ALMA

At the beginning of the first week I wasn’t sure what to expect and was worried about being the only first year. This is my first time completing an internship and my first taste of what being a researcher involves.

On the first day, we downloaded all the software onto our laptops and started to have a look at what it could do. We are using GAIA, DS9 and Topcat to look at the 3D cubes from ALMA (Atacama Large Millimeter/submillimeter Array). We also did a lot of background reading about CR7. CR7 is a high redshift Lyman alpha emitter galaxy discovered by David Sobral and is the focus of our research for the next few weeks. The galaxy is modelled to be made up of three clumps, A, B and C with each region giving off different emissions.

The 3D cubes from ALMA are composed of different frequencies which are collapsed into an image that can be analysed using GAIA or DS9. We are looking at CII maps of CR7, as CII emission is a good indicator of star formation rates in galaxies, and we started by using GAIA to look at the sources and determine whether they are real or just noise.


In order to do this, we used aperture photometry, which consists of summing the pixel counts within a circular aperture of fixed size and subtracting the average sky count. (Count is the number of photons incident on a point and is related to flux.) In GAIA, we used the Gaussian sky setting, which assumes all errors are Gaussian. We took into account the effects of the size of the aperture, matching it to the size of clump A using contours. We repeated the procedure many times to be able to quantify the noise and look at the spread of the noise in different regions of the image. Later we will run scripts that can do this for us, but we started off carrying it out by hand in order to get an understanding.
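To make the idea concrete, here is a minimal sketch of aperture photometry on a small made-up image: sum the pixels inside a circle, estimate the mean sky from a surrounding annulus, and subtract the sky contribution. This is not what GAIA does internally (its sky estimators are more sophisticated); the function names, the annulus radii and the image are all hypothetical.

```python
def pixels_within(image, x0, y0, r_in, r_out):
    """Values of pixels whose centres lie between r_in and r_out of (x0, y0)."""
    values = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            d2 = (x - x0) ** 2 + (y - y0) ** 2
            if r_in ** 2 <= d2 <= r_out ** 2:
                values.append(value)
    return values


def aperture_flux(image, x0, y0, radius, sky_in, sky_out):
    """Sky-subtracted sum of the pixels within `radius` of (x0, y0)."""
    source = pixels_within(image, x0, y0, 0, radius)
    sky = pixels_within(image, x0, y0, sky_in, sky_out)
    sky_mean = sum(sky) / len(sky)
    return sum(source) - sky_mean * len(source)


# Made-up 11x11 image: flat sky of 2 counts per pixel plus a source of
# 10 counts in the central pixel.
image = [[2.0] * 11 for _ in range(11)]
image[5][5] += 10.0

flux = aperture_flux(image, x0=5, y0=5, radius=2, sky_in=3, sky_out=5)
print(flux)
```

Repeating this at many empty positions and looking at the spread of the results is what gives the noise estimate described above.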

In conclusion, I found 3 potential sources in the image, with source 2 corresponding to the location of clump A in CR7.