In this project we are looking for young, primeval galaxies, so this week we focused on understanding how and why the data collected from the GOODS-S field is filtered to leave a high proportion of these galaxies. We can identify them from the wavelengths of the light they emit and from the strength of that light. Young galaxies form lots of stars, so they emit strongly in emission lines from certain elements at specific wavelengths: Hydrogen alpha, Oxygen II and III, Carbon IV and Lyman alpha. Measuring the strength of the light at these wavelengths helps us identify these galaxies. To see galaxies this young we have to look very far back in time, and because light travels at a finite speed, that means looking at galaxies that are very far away. These primeval galaxies are significantly further away than most objects in the night sky, so to identify them we need a way of estimating how far away the sources are, and therefore how far back in time we are seeing them.
Equivalent width (EW) is a measure of the strength of an emission line. It is found by plotting intensity against wavelength and finding the area of the line above the continuum level, as shown in figure 1. A rectangle with the same area is then drawn, with its height equal to the continuum level; the width of this rectangle is the equivalent width. We will pick a minimum value of the observed equivalent width, EW_obs, and use it to cut out some of the noise.
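As a rough sketch of this idea (not the code we used on the GOODS-S data), the equivalent width of a line can be estimated by numerically integrating the flux above the continuum, divided by the continuum level. The spectrum below is a made-up Gaussian line on a flat continuum, and the function name and all numbers are assumptions chosen for illustration. We also assume the convention that emission lines have positive EW.

```python
import math

def equivalent_width(wavelengths, fluxes, continuum):
    """Trapezoidal integral of (flux - continuum) / continuum over wavelength."""
    total = 0.0
    for i in range(len(wavelengths) - 1):
        left = (fluxes[i] - continuum) / continuum
        right = (fluxes[i + 1] - continuum) / continuum
        total += 0.5 * (left + right) * (wavelengths[i + 1] - wavelengths[i])
    return total

# Toy spectrum (made-up numbers): flat continuum plus a Gaussian emission line.
continuum = 2.0   # continuum flux density, arbitrary units
amplitude = 10.0  # peak height of the line above the continuum
centre = 4840.0   # line centre in Angstroms
width = 5.0       # Gaussian standard deviation in Angstroms

wavelengths = [4740.0 + 0.1 * i for i in range(2001)]  # 4740 to 4940 Angstroms
fluxes = [
    continuum + amplitude * math.exp(-0.5 * ((w - centre) / width) ** 2)
    for w in wavelengths
]

ew_numeric = equivalent_width(wavelengths, fluxes, continuum)
# For a Gaussian line the area is amplitude * width * sqrt(2 * pi),
# so dividing by the continuum gives the expected equivalent width.
ew_analytic = amplitude * width * math.sqrt(2 * math.pi) / continuum
print(f"numeric EW = {ew_numeric:.2f} A, analytic EW = {ew_analytic:.2f} A")
```

Comparing the numerical result with the analytic value for a Gaussian is a quick check that the integration is behaving as expected.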
Sigma quantifies the significance of the colour excess, which is the difference between the colour observed from a source and the colour emitted by the source. Over long distances, light from a source appears redder, due to a combination of redshift and interstellar extinction.
Redshift occurs when a source of light is moving away from the detector, or vice versa. Because the universe is expanding, distant galaxies are moving away from us, and the further away they are the faster they recede. This means that the further away a source is, the more it is redshifted. Because light travels at a finite speed, the further away a galaxy is, the further back in time we see it, so the young galaxies we are looking for are seen as they were when the universe itself was young. This means that primeval galaxies have very high redshifts, which is one way of identifying them.
Interstellar extinction is the absorption and scattering of light by dust in space. Light with a shorter wavelength (i.e. bluer light) is more likely to be scattered, which means that distant light sources are likely to appear redder. Primeval galaxies are very far away, so extinction makes them appear redder.
The combination of redshift and interstellar extinction means that primeval galaxies are observed to be a lot redder than the light we think they emit. Colour excess is the difference between the colour a source is observed to be and the colour of the light emitted by the source, so we expect primeval galaxies to have a high colour excess. Σ (sigma) represents the significance of a source's colour excess compared with the mean for the whole sample. Most of the sample is noise, so to find primeval galaxies we need to select data with a significantly higher colour excess than the rest of the sample (i.e. a higher sigma). As with the equivalent width, we will select a minimum value of sigma and use it to cut out some of the noise.
We call the sources that emit the right wavelengths ‘emitters’. The data that we have collected is split up into different wavelength filters, each of which only observes photons in a small range of wavelengths. For example, the filter called IA484 is centred on 4840 Å, but the range of wavelengths it picks up is 4734.65 to 4963.75 Å. We know the rest wavelengths that the emitters emit at, and we know the range of wavelengths each filter picks up; from this we can calculate the range of redshifts at which each emission line would fall inside each filter. Therefore, we can use the redshift to identify possible emitters. Not all of the possible emitters will definitely be primeval galaxies, so we will do a visual check later on in the project, after we have decided on the best cut.
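The redshift calculation can be sketched as follows, using the relation 1 + z = (observed wavelength) / (rest wavelength) and the IA484 limits quoted above. The rest wavelengths are standard approximate values that we have filled in for illustration (the Oxygen II doublet is treated as a single line at 3727 Å), and the function name is our own, so this is a minimal example and not our actual pipeline.

```python
# Rest-frame wavelengths in Angstroms (standard approximate values, assumed here).
REST_WAVELENGTHS = {
    "Lyman alpha": 1215.67,
    "C IV": 1549.0,
    "[O II]": 3727.0,
    "[O III]": 5007.0,
    "H alpha": 6562.8,
}

# Wavelength range picked up by the IA484 filter, in Angstroms (from the text).
IA484_MIN, IA484_MAX = 4734.65, 4963.75

def redshift_window(rest, filter_min, filter_max):
    """Return (z_min, z_max) for which a line at `rest` falls inside the filter.

    Returns None if the line would need a negative redshift (a blueshift)
    to land in the filter, i.e. the filter lies entirely blueward of the line.
    """
    z_min = filter_min / rest - 1.0
    z_max = filter_max / rest - 1.0
    if z_max < 0:
        return None
    return max(z_min, 0.0), z_max

windows = {
    name: redshift_window(rest, IA484_MIN, IA484_MAX)
    for name, rest in REST_WAVELENGTHS.items()
}

for name, window in windows.items():
    if window is None:
        print(f"{name}: cannot be seen in IA484")
    else:
        print(f"{name}: z = {window[0]:.3f} to {window[1]:.3f}")
```

In this sketch, lines with rest wavelengths longer than the red edge of the filter (such as H alpha) return None, because an expanding universe only shifts lines to longer wavelengths.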
We are judging how effective each cut is by looking at how much noise it eliminates, and also at how many emitters it eliminates. Ideally, we want a cut that gets rid of as much noise as possible without getting rid of too many emitters. The two values we use to measure this are purity and completeness. Purity is the fraction of the sources remaining after the cut that are real galaxies, and completeness is the fraction of all the real galaxies in the sample before the cut that are still present after the cut.
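These two definitions can be written as a short function. This is a minimal sketch with made-up data: the function name and the two lists (which sources are real emitters, and which sources pass the cut) are hypothetical and only there to show the arithmetic.

```python
def purity_and_completeness(is_emitter, passes_cut):
    """Compute purity and completeness from two equal-length lists of booleans.

    is_emitter[i] says whether source i is a real emitter;
    passes_cut[i] says whether source i survives the cut.
    """
    real_and_kept = sum(1 for real, kept in zip(is_emitter, passes_cut) if real and kept)
    n_kept = sum(passes_cut)
    n_real = sum(is_emitter)
    # Guard against dividing by zero when nothing passes the cut
    # or when the sample contains no real emitters.
    purity = real_and_kept / n_kept if n_kept else 0.0
    completeness = real_and_kept / n_real if n_real else 0.0
    return purity, completeness

# Toy sample: 10 sources, of which 4 are real emitters.
is_emitter = [True, True, True, True, False, False, False, False, False, False]
# A cut that keeps 5 sources, 3 of which are real emitters.
passes_cut = [True, True, True, False, True, True, False, False, False, False]

purity, completeness = purity_and_completeness(is_emitter, passes_cut)
print(f"purity = {purity:.2f}, completeness = {completeness:.2f}")
```

Here 3 of the 5 surviving sources are real, and 3 of the 4 real emitters survive, so the cut trades some completeness for a cleaner sample.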
We spent some time calculating these values, both by hand and using code. For each combination of EW and sigma cuts, we calculated the ratio of purity and completeness. We are still evaluating whether purity and completeness should be weighted equally, or whether we should prioritise one over the other. We plotted some graphs comparing the EW cut, the sigma cut, and the ratio of purity and completeness, as shown in figure 3. Because our first draft of the code was not very efficient, we only plotted a limited number of results. Our aim for next week is to make our code more efficient so that we can analyse a larger number of possible cuts.
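A scan over combinations of cuts might look something like the sketch below. It is not our actual code: the catalogue is a handful of made-up sources, and the cut values and names are assumptions for illustration. It reports purity and completeness separately for each pair of cuts, leaving open the question of how to combine or weight them.

```python
from itertools import product

# Hypothetical catalogue: (EW_obs in Angstroms, sigma, is_real_emitter).
# All values are invented for this example.
CATALOGUE = [
    (120.0, 5.2, True),
    (80.0, 4.1, True),
    (60.0, 3.5, True),
    (30.0, 3.2, False),
    (25.0, 2.1, False),
    (90.0, 1.5, False),
    (15.0, 4.0, False),
    (55.0, 2.8, True),
]

def score_cut(catalogue, ew_min, sigma_min):
    """Return (purity, completeness) for sources with EW >= ew_min and sigma >= sigma_min."""
    kept = [real for ew, sigma, real in catalogue if ew >= ew_min and sigma >= sigma_min]
    n_real_total = sum(real for _, _, real in catalogue)
    n_real_kept = sum(kept)
    purity = n_real_kept / len(kept) if kept else 0.0
    completeness = n_real_kept / n_real_total if n_real_total else 0.0
    return purity, completeness

# Candidate minimum values to try (assumed values, not our final cuts).
EW_CUTS = [20.0, 50.0, 100.0]
SIGMA_CUTS = [2.0, 3.0, 4.0]

# Score every combination of EW cut and sigma cut.
results = {
    (ew_min, sigma_min): score_cut(CATALOGUE, ew_min, sigma_min)
    for ew_min, sigma_min in product(EW_CUTS, SIGMA_CUTS)
}

for (ew_min, sigma_min), (purity, completeness) in results.items():
    print(f"EW >= {ew_min:5.1f}, sigma >= {sigma_min:.1f}: "
          f"purity = {purity:.2f}, completeness = {completeness:.2f}")
```

Keeping the scoring in one function means the grid of cuts can be made finer just by extending the two lists of candidate values.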