Will and I spent the last couple of weeks writing a programme to automatically detect sources in the cube. Ultimately, this was incredibly frustrating and rather fruitless, but it did finally allow us to detect the source around which our whole cube was centred, the galaxy VR7.
We used a programme called SExtractor, which is able to automatically detect sources within a 2D image. This meant we first had to bin our cube into smaller slices and run the programme on each one. A lot of time was spent carefully choosing the parameters used by SExtractor so that, when applied to a white light image, it would detect all the sources with minimal inclusion of noise. Sadly, any choice of parameters seemed to give us an unreasonable amount of noise round the edge of the cube, so ultimately the decision was made to simply cut these edges away. Eventually we outputted 286 catalogues (one for each slice of 8 wavelength values) with everything SExtractor detected at a sigma level of 2 over 7 or more pixels.
Obviously we didn’t want to sort through each of these catalogues individually, so we wrote another programme which, in its early stages, was so inefficient that we were blamed for a city-wide power cut. The point of this programme was to compile the individual catalogues, such that each distinct source (detections within one arcsecond of each other counted as the same source) only appeared once. This outputted a catalogue of one hundred and something blindly found sources, with the number of times they had been detected and the wavelength at which they were first detected. Unfortunately, most of these turned out to be noise once their spectra were taken.
However, there were five survivors of the noise cull, which showed strong asymmetric emission lines. These are hypothesised to be Lyman-alpha emitters, which are high redshift galaxies offering a glimpse into the history of the universe. However, the limited wavelength range over which our datacube runs means that it is very difficult to confirm this: if we see the Lyman-alpha peak in a spectrum, we are likely to see only that peak, so we have no further evidence with which to confirm our redshift.
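The compiling step can be sketched roughly as below. This is a minimal illustration and not our actual programme: the tuple layout (RA and Dec in degrees, plus the slice wavelength in Å), the function names and the flat-sky separation formula are all assumptions made for the example.

```python
import math

ARCSEC = 1.0 / 3600.0  # one arcsecond expressed in degrees


def separation_deg(ra1, dec1, ra2, dec2):
    """Small-angle separation in degrees (flat-sky approximation)."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra1 - ra2) * math.cos(mean_dec)
    d_dec = dec1 - dec2
    return math.hypot(d_ra, d_dec)


def merge_catalogues(detections, radius_deg=ARCSEC):
    """Collapse per-slice detections into one entry per distinct source.

    detections: iterable of (ra_deg, dec_deg, wavelength) tuples
    (a hypothetical layout, not the real SExtractor output format).
    """
    merged = []
    # Sorting by wavelength means the first time a source is seen
    # is also the lowest wavelength at which it was detected.
    for ra, dec, wavelength in sorted(detections, key=lambda d: d[2]):
        for source in merged:
            if separation_deg(ra, dec, source["ra"], source["dec"]) <= radius_deg:
                source["n_detections"] += 1
                break
        else:
            merged.append({
                "ra": ra,
                "dec": dec,
                "first_wavelength": wavelength,
                "n_detections": 1,
            })
    return merged
```

Each new detection is compared against every source already in the list, so the cost grows quickly with catalogue size, which is probably part of why an early version of a programme like this can be so slow.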
One of these Lyman-alpha emitters is, of course, VR7, right at the centre of the cube. As one would expect, this showed a very clear spectrum with a strong Lyman-alpha peak, giving us a redshift of 6.5345, right at the edge of our ‘PIG galaxy’. For context, this corresponds to a luminosity distance of over 65,000 Mpc. For further context, that is 1.34*10^16 times the distance from Earth to the Sun, or more than 80,000 times the distance from us to our nearest large neighbour galaxy, Andromeda.
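For anyone wanting to check numbers like these, here is a rough sketch of how a redshift turns into a luminosity distance. It assumes a flat Lambda-CDM universe with H0 = 70 km/s/Mpc and a matter density of 0.3, which are illustrative choices and not necessarily the values behind the figure quoted above, so the answer should land in the same region without matching exactly. The function name and the simple midpoint integration are also just for the example.

```python
import math

C_KM_S = 299792.458        # speed of light in km/s
AU_PER_MPC = 2.06264806e11  # astronomical units in one megaparsec
LYA_REST = 1215.67         # rest wavelength of Lyman-alpha in Å


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=10000):
    """Luminosity distance in Mpc for a flat Lambda-CDM universe.

    h0 and omega_m are assumed cosmological parameters.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    # Midpoint-rule integral of 1/E(z) from 0 to z
    integral = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        integral += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + omega_l)
    return (1.0 + z) * (C_KM_S / h0) * integral


z_vr7 = 6.5345
observed_lya = LYA_REST * (1.0 + z_vr7)  # where the line lands in the cube
d_l = luminosity_distance_mpc(z_vr7)

print(f"Lyman-alpha observed at about {observed_lya:.0f} Å")
print(f"Luminosity distance of about {d_l:.0f} Mpc")
print(f"65,000 Mpc is about {65000 * AU_PER_MPC:.3g} AU")
```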
The incredible distance at which we managed to find a source almost makes all the frustration worth it and nevertheless the experience taught me valuable lessons about the patience needed in programming and in research in general.
I have been trying to investigate the properties of the galaxies we have found. The production of emission lines due to recombination requires the presence of ionised gas. Two main sources of ionisation are high-temperature (O/B type) stars and active galactic nuclei (AGN). AGN are small, incredibly luminous regions of a galaxy, powered by matter heating up as it accretes around the galaxy’s supermassive black hole.
The primary source of ionisation in a galaxy may be distinguished using a BPT (Baldwin, Phillips and Terlevich) diagram. The two lines represent two different ways of differentiating the two types – the solid line is from theoretical modelling (Kewley et al., 2001), the dashed line is based on observations (Kauffmann et al., 2003). Galaxies with ionisation from hot stars appear under the line; AGN appear above.
It took a surprisingly long time to get here – lots of wrangling with code, then realising that some of my sources were contaminated by other sources! In the end only five sources remained, as shown above. I think the code is being over-fussy about what counts as a peak, which I should look further at.
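As a rough sketch of how the two dividing lines can be applied in code, the snippet below uses the usual form of the diagram, with log([NII]/H alpha) on the horizontal axis and log([OIII]/H beta) on the vertical axis. That choice of axes, the function names and the ‘composite’ label for sources lying between the two lines are assumptions for the example.

```python
import math


def kewley01(log_n2_ha):
    """Theoretical dividing line (Kewley et al., 2001)."""
    return 0.61 / (log_n2_ha - 0.47) + 1.19


def kauffmann03(log_n2_ha):
    """Observational dividing line (Kauffmann et al., 2003)."""
    return 0.61 / (log_n2_ha - 0.05) + 1.3


def classify_bpt(nii, halpha, oiii, hbeta):
    """Classify a source from four line fluxes (all in the same units)."""
    x = math.log10(nii / halpha)
    y = math.log10(oiii / hbeta)
    # Each curve is only defined to the left of its vertical asymptote,
    # so anything to the right of it counts as being above the line.
    if x >= 0.47 or y > kewley01(x):
        return "AGN"
    if x >= 0.05 or y > kauffmann03(x):
        return "composite"
    return "star-forming"


# Hypothetical fluxes, purely to show the call
print(classify_bpt(nii=0.1, halpha=1.0, oiii=1.0, hbeta=1.0))
```

Because only ratios of neighbouring lines are used, the classification is fairly insensitive to dust and to the overall flux calibration.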
To compare the properties we found to some other source, you need some other galaxies. The CLOUDY simulations provide this. Essentially they are computer-generated stars and AGN. With these, you don’t have to correct for dust absorption, distance etc., so you can obtain all sorts of information.
From the diagram above, you can see four galaxies appear to be dominated by star-formation (one of them is a bit odd, I’ll come back to it later). To see if our estimated metallicities (how much of the stuff isn’t hydrogen or helium) are reasonable, I compared our values to the CLOUDY simulated stars. The metallicities of all of them, bar PIG-7354-1408, fit in nicely. The closest match to the CLOUDY data is a set of galaxies dominated by O and B type stars (the hottest and most massive).
Hot stars die young (still tens of millions of years old, but that’s peanuts to space), so their presence indicates that the galaxy is still forming stars. If a galaxy stopped forming stars, the O and B type stars would die quickly, leaving dimmer stars to dominate the spectrum. So if you see the hot stars you would expect the star formation rate to be quite high. However, I am finding very low values for the star formation rate of our sources (<0.01 solar masses per year). I am still trying to work out what is going on here!
The blue galaxy is quite interesting. From its position on the BPT diagram, it clearly contains an active galactic nucleus of some sort. As I mentioned earlier, these are visible due to a supermassive black hole driving jets of matter out into space towards us. There are different types of AGN, which might be just to do with the orientation of the jets relative to us. A beam that has to pass through dust on its way to us appears different to one that does not.
Looking at the properties (such as the relative fluxes of different spectral lines and the galaxy’s luminosity), I have concluded that this galaxy is most likely a ‘LINER’. This is a fairly weak type of AGN.
The final source (PIG-7404-1368) is a real muddle. Depending on which features you look at, it appears to be star-forming, a LINER or a Seyfert galaxy (a more powerful type of AGN). Looking at the galaxy in Hubble, I think it is contaminated by its massive neighbour, so this could be the cause of some of the confusion! See the image below, where PIG-7404-1368 is shown by the thick blue circle, and other sources with thin circles.
While difficult, I have found classifying these galaxies very satisfying. I am still unsure about some things – whether I can classify more sources, the strange-looking sources and low star formation rates, but I am happy to have achieved some things. It has made me really appreciate how difficult it is to classify galaxies properly, and amazed at how we manage to do it with so little information.
One of the most interesting PIGs may not be a PIG at all. This stripe looks very much like a galaxy that’s close by, but it does not appear in the image taken by the Hubble Space Telescope whatsoever. This could be due to a low continuum with very bright emission lines, which would mean only MUSE can see it. When we searched for a particular wavelength at which it was especially bright, we found nothing, which ruled out the emission theory. We were inclined to believe it was just an imaging error or some very unlikely noise pattern until SExtractor (our automatic searching script) also decided it could be a source. This PIG is going to need more investigation…~Will
Analysing the properties of galaxies is where the fun really begins! However, before we could really get into some proper analysis of the spectra we had a few things which needed fixing first.
Many of the spectra we had taken in the previous week had a lot of noise, which comes from various sources, including light emitted by particles in the atmosphere and the telescope itself. As the noise appears as additional, smaller emission and absorption lines in the spectra themselves, it makes identifying the emissions of the fainter galaxies very hard, as the emissions get hidden in all the noise. To try to get some better spectra for the galaxies in our catalogue, we found more appropriate minicube sizes for each source. This took a while, as we had to go through the whole catalogue and see what minicube size captures the whole source and little of the background or other sources, using the noise-reduced image we had produced in the first week. Once we had found better sizes and updated our catalogue to include them, we altered the code to take different sizes of minicube and then ran it for the whole catalogue again!
Once we had our updated spectra, we had to go through each one and find any new emission lines that had appeared and, from them, the redshift of the galaxy, now that the noise should play a smaller part in the shape of the spectra. Unfortunately the difference was not as noticeable as we had hoped, but for a few galaxies we were able to identify at least one emission line that we had been unable to see before. Some distinct emission lines are most likely from light emitted by the atmosphere and appear in many of the spectra we have collected. These ‘Sky Lines’ can be found as a distinct peak at a wavelength of 5578.75Å (with a trough at 5580Å) and three close peaks towards the end of the spectrum at 9307.5Å, 9313.75Å and 9323.75Å, the last of which appears most often across all spectra.
Next we found the flux of distinct emissions in the spectra and calculated the Metallicity and Star Formation Rate (SFR) of the galaxies. These show you how much metal (in Astrophysics, any element which is not Hydrogen or Helium is called a metal) is present in the galaxy and how quickly the galaxy is creating stars, which can give you an indication of the age and size of the galaxy. Younger galaxies have more dust and gas so form more stars, and a galaxy with a low metallicity is generally smaller as it cannot retain as much metal (it is worth noting, however, that the early universe was very metal poor, so galaxies with a high redshift could potentially have a lower metallicity as well).
Above Figure: Ignoring two points at high SFR, the other points seem to show a correlation. We hypothesised that the two points at high SFR are star forming galaxies and the rest are AGN. This is because
We ran into a few issues with calculating the Metallicity of galaxies with too few emission lines, as the formula we had required H alpha, which is not visible in a lot of our spectra. So we found some additional formulae which could be used to calculate the Metallicity, albeit with increasingly low accuracy. We also tried to calculate the Dust Extinction of the spectra (the reduction in the brightness of the galaxies due to light being scattered or absorbed by gas and dust between us and the galaxy). This involved some hazy calculations, and as we got some negative values for the dust extinction, I am not sure our current method for calculating it is at all useful; if we want to obtain better values for the dust extinction, we are going to need a better formula. Nevertheless, we collected the results of the calculations into one file and tried plotting some graphs relating the quantities. These didn’t seem to show much, but did provide an interesting visual representation of the differences in the galaxies and how their properties might relate to each other.
Above Left: Plotting the extinction rate against the metallicity and removing any points at which either value cannot be found, the points appear to follow some correlation, though perhaps a non-linear correlation is more accurate.
Above Right: Plotting metallicity against SFR, we expected something similar to the plot of dust extinction against SFR, but with our current data there seems to be much less correlation between the Metallicity and SFR.
Next we tried Binning the spectra. This is where you average the signal over a few points and plot that value instead, which should reduce the effect of the noise. However, this caused us some confusion: as we found out when binning some spectra, because emission peaks are very narrow, this method flattens them too much for them to remain visible amongst the noise. Binning will, however, make any absorption lines in the spectra more obvious, which will help when looking at objects with more continuous spectra, as absorption lines are wider than emission lines. The difference in width arises because the absorption lines come from the absorption of light by the metals in the atmospheres of stars, rather than from the emissions of H II regions or other regions of ionised metals. For the moment we have not looked further into the absorption lines of the spectra we have collected; however, it is worth doing, as the absorption lines can tell you about the type of stars which dominate the spectrum of the galaxy, which can tell you a lot about the galaxy itself. It should be noted, however, that absorption lines could also come from absorption by dust and gas between us and the galaxy that produced the spectra, which means they are just more noise in our spectra.
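On the negative dust extinction values mentioned earlier: one common way to estimate dust extinction is from the ratio of H alpha to H beta (the Balmer decrement), and that estimate goes negative whenever the measured ratio falls below the theoretical value of about 2.86, which noise in faint lines can easily cause. The sketch below shows that estimate. It is only an assumption that something like it applies to our calculation; the function name and the reddening-curve values `k_hbeta` and `k_halpha` (roughly those of a Calzetti-type curve) are illustrative.

```python
import math


def ebv_from_balmer(halpha_flux, hbeta_flux, intrinsic_ratio=2.86,
                    k_hbeta=4.60, k_halpha=3.33):
    """Colour excess E(B-V) from the observed H-alpha / H-beta flux ratio.

    intrinsic_ratio: expected ratio with no dust (assumed value).
    k_hbeta, k_halpha: reddening-curve values at each line (assumed values).
    """
    observed_ratio = halpha_flux / hbeta_flux
    return 2.5 / (k_hbeta - k_halpha) * math.log10(observed_ratio / intrinsic_ratio)


# Hypothetical flux ratios, purely to show the sign of the result
for ratio in (2.0, 2.86, 4.0):
    print(ratio, round(ebv_from_balmer(ratio, 1.0), 3))
```

A ratio below the intrinsic value gives a negative result, which is not physical, so in practice such values are often treated as consistent with zero dust within the measurement errors.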
Above Figure: Binned spectrum of a PIG, found by dividing the noise value by the square root of the width of the bin. We are unsure if this was the correct method, however it does show some more distinct absorption lines just after the continuum stops.
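A minimal sketch of the binning described above is given below. The function name and the choice to drop any leftover points at the end are assumptions. The noise is combined in quadrature, which for points with equal noise reduces to dividing the noise by the square root of the bin width.

```python
import math


def bin_spectrum(flux, noise, width):
    """Average `flux` in consecutive bins of `width` points.

    Returns (binned_flux, binned_noise). Leftover points at the end that
    do not fill a whole bin are dropped.
    """
    binned_flux, binned_noise = [], []
    for start in range(0, len(flux) - width + 1, width):
        chunk = flux[start:start + width]
        chunk_noise = noise[start:start + width]
        binned_flux.append(sum(chunk) / width)
        # Error on a mean of independent points: add in quadrature, divide by N
        binned_noise.append(math.sqrt(sum(n * n for n in chunk_noise)) / width)
    return binned_flux, binned_noise


# A single narrow peak of height 10 is flattened to 2.5 by a bin of width 4
print(bin_spectrum([0.0, 0.0, 10.0, 0.0], [1.0, 1.0, 1.0, 1.0], 4))
```

The last line illustrates the problem with narrow emission peaks: the peak drops by a factor of four while the noise only drops by a factor of two.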
Some key things to look for in the binned spectra are the continuum stopping below a wavelength of approximately 4000Å and, just to the left of that, the Calcium H and K absorption lines (despite the letters, both come from ionised Calcium rather than Hydrogen and Potassium, although a Hydrogen line does sit almost on top of Calcium H), along with various other absorption features. This method of analysis works best when the spectrum is continuous, in other words has a distinct curve in its signal as the wavelength increases.
We had a look at the sources which produced continuous spectra, comparing the position of the source in the MUSE image and the Hubble image of the same region of sky. This shows that some of the sources which gave a continuum are most likely Stars in our own galaxy. This can be seen in the Hubble image, as these Stars produced diffraction spikes (where the light from the source is diffracted by the struts that hold up the secondary mirror inside the Hubble telescope into four lines out from the object in the image). Diffraction spikes are only seen when an object is both point-like and very bright. A galaxy cannot manage both at once: a galaxy bright enough to produce spikes is no longer point-like (you can see the shape of bright galaxies with Hubble), while a galaxy that is point-like is no longer bright enough to produce diffraction spikes.
Much of what we have been doing is fixing problems in our Python code and trying to get better spectra. Hopefully this will mean that we can get stuck into some proper analysis of the galaxies in our catalogue soon. The next step as well is to look into automatically finding galaxies in the data cube so we can add even more objects to the PIGEON catalogue!
There has been a lot of methodically going over sources and their spectra, which has been slow, but finding a spectrum with many emission lines and looking into all the information we can find out from a single spectrum is pretty amazing (not to mention that my coding skills are definitely improving!). I certainly look forward to investigating many of these galaxies further, and maybe we will find some very distant galaxies in our catalogue!
This week we looked at some of the properties of the galaxies in our image, for example how much metal they contain and the rate at which they form stars (SFR). PIG-7395-1374 had an SFR of over 20 Solar masses per year! It also has the highest metallicity of any galaxy we found. This is because the galaxy is really big and so holds in all the metal, while also having lots of dust to form stars with.
PIG-7444-1418 raised some issues in our survey. We had an automatic script to generate a minicube (a smaller file containing just the source, cut from the full cube) of every source we found. This galaxy, however, is too close to the edge of the full cube and so gave us a lot of errors and empty data points. We had an interesting time modifying the script to combat any errors like this.
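One way to guard against the edge problem is to clip the cut-out to the limits of the cube before slicing. The sketch below only works out the pixel bounds; the function name, the argument names and the choice of a square cut-out are assumptions, not the script we actually used.

```python
def minicube_bounds(x, y, half_size, nx, ny):
    """Pixel bounds of a square cut-out around (x, y), clipped to the cube.

    nx, ny: size of the full cube in pixels along each spatial axis.
    Returns (x_min, x_max, y_min, y_max) with the max values exclusive,
    so they can be used directly as slice limits.
    """
    x_min = max(x - half_size, 0)
    x_max = min(x + half_size + 1, nx)
    y_min = max(y - half_size, 0)
    y_max = min(y + half_size + 1, ny)
    if x_min >= x_max or y_min >= y_max:
        raise ValueError("source lies outside the cube")
    return x_min, x_max, y_min, y_max


# A hypothetical source two pixels from the left-hand edge of a 300 x 300 cube
print(minicube_bounds(2, 150, 5, 300, 300))
```

The cut-out for an edge source simply comes out smaller than requested, instead of reaching for pixels that do not exist.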
Now that we had all the galaxies, it was time to start exploring them.
The focus of this week has been analysing the spectra of our sources. With a spectrum, you can deduce all sorts of properties – whether a galaxy is close or distant, metal-rich or metal poor, star-forming or dead.
In a lab on Earth, you find that each element emits light at specific wavelengths. Energised electrons in atoms can jump to a lower energy, releasing their energy as light. On a spectrum, these appear as bright ‘emission lines’ against a dark background.
You find how far away a source is by finding its ‘redshift’. As the universe is expanding, light from distant galaxies is stretched out as it travels towards us. This stretching increases the wavelength – in other words the light appears redder (hence ‘redshift’). When looking at their spectra, all the emission lines appear shifted towards the red. By identifying the correct line and comparing where it is to where it should be, you can find the redshift. Light from more distant sources is stretched out more, so finding the shift also lets you calculate the distance.
A nice example emission spectrum, from source PIG-7354-1408. The clearest spectral lines are labelled.
It turns out there is no easy way to code this, so we have been identifying peaks by eye. Many of our spectra are currently full of noise, or only have one line, which could be almost anything. Despite this, we have now identified over 30 sources with two or more identifiable spectral lines, allowing us to calculate the redshift (and therefore distance) with confidence!
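For what it is worth, the reason two lines give a confident redshift can be shown with a small sketch. Every line in a spectrum is stretched by the same factor, so the redshift z = (observed wavelength / rest wavelength) - 1 has to come out the same for both peaks. The short line list, the tolerance and the function name below are assumptions for the example, and this is not a substitute for finding the peaks in the first place.

```python
from itertools import combinations

# Rest wavelengths in Å of a few strong optical emission lines
# (approximate values; [OII] is really a close doublet)
REST_LINES = {
    "[OII]": 3727.0,
    "H-beta": 4861.3,
    "[OIII]": 5006.8,
    "H-alpha": 6562.8,
}


def match_line_pair(peak_a, peak_b, tolerance=0.002):
    """Find pairs of rest lines that put two observed peaks at one redshift.

    Returns a list of (bluer_line, redder_line, z) candidates.
    """
    low, high = sorted((peak_a, peak_b))
    ordered = sorted(REST_LINES.items(), key=lambda item: item[1])
    candidates = []
    for (name1, rest1), (name2, rest2) in combinations(ordered, 2):
        z_low = low / rest1 - 1.0
        z_high = high / rest2 - 1.0
        if z_low >= 0 and abs(z_low - z_high) <= tolerance:
            candidates.append((name1, name2, 0.5 * (z_low + z_high)))
    return candidates


# Two hypothetical peaks, as they would appear for a source at z = 0.3
print(match_line_pair(6319.69, 8531.64))
```

With a single peak there is nothing to compare against, which is why a lone line ‘could be almost anything’.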
At the moment most of our redshifts are quite small (less than 1), which is quite close in astronomical terms, but we expect to see more distant sources once we reduce the noise further.
Lines kept appearing at the same points from sources at completely different locations. These could only be actual lines if each source was at exactly the same distance, which seems suspicious. We tried inventing a new element (pigeonium) to explain them, but we realised the truth was much less exciting and sadly not worthy of a Nobel Prize. Sky-lines are emissions from our atmosphere that appear exactly as emission lines would when you look at the spectra (see Katie’s post for more details).
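The same reasoning can be turned into a simple check: a peak that turns up at the same observed wavelength in a large fraction of unrelated sources is far more likely to come from the sky than from the galaxies. The sketch below is only an illustration; the dictionary layout, the function name, the 1.25Å bin width and the 50% threshold are all assumptions.

```python
from collections import Counter


def find_sky_lines(peaks_by_source, bin_width=1.25, min_fraction=0.5):
    """Flag observed wavelengths that show up in a large fraction of sources.

    peaks_by_source: dict mapping a source name to a list of observed peak
    wavelengths in Å (a hypothetical layout).
    """
    counts = Counter()
    for peaks in peaks_by_source.values():
        # Count each wavelength bin at most once per source
        bins = {round(wavelength / bin_width) for wavelength in peaks}
        counts.update(bins)
    threshold = min_fraction * len(peaks_by_source)
    return sorted(b * bin_width for b, n in counts.items() if n >= threshold)


# Made-up peak lists: one shared peak, and one peak unique to each source
peaks = {
    "A": [5578.75, 6563.0],
    "B": [5578.75, 7000.0],
    "C": [5578.75, 8000.0],
}
print(find_sky_lines(peaks))
```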
By finding the area under a spectral line, we found its ‘flux’ (roughly speaking, how bright the source appears on Earth). Since we know the distance, we can find the actual luminosity of the source, and not just how bright it appears to us.
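In code, these two steps might look something like the sketch below. The function names, the flat continuum and the trapezium rule are assumptions for the example; the luminosity follows from spreading the light over a sphere whose radius is the luminosity distance, L = 4 pi d^2 F.

```python
import math

CM_PER_MPC = 3.0857e24  # centimetres in one megaparsec


def line_flux(wavelengths, fluxes, continuum=0.0):
    """Trapezium-rule area under a line after subtracting a flat continuum."""
    total = 0.0
    for i in range(len(wavelengths) - 1):
        width = wavelengths[i + 1] - wavelengths[i]
        height = 0.5 * (fluxes[i] + fluxes[i + 1]) - continuum
        total += width * height
    return total


def luminosity(flux_cgs, distance_mpc):
    """Luminosity in erg/s from a flux in erg/s/cm^2 and a luminosity distance in Mpc."""
    d_cm = distance_mpc * CM_PER_MPC
    return 4.0 * math.pi * d_cm ** 2 * flux_cgs


# A made-up triangular line profile, purely to show the call
print(line_flux([0.0, 1.0, 2.0], [0.0, 2.0, 0.0]))
```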
Since then we have started finding more properties about the sources, such as star formation rate (SFR) (mass of stars formed in a galaxy per unit time) and metallicity (the percentage of mass in a galaxy that isn’t hydrogen or helium – yes, any element other than those two counts as a ‘metal’ in astrophysics). There are fairly simple mathematical relations between the fluxes of particular spectral lines and these properties.
In our first attempt at finding the SFR of a galaxy
Again, it turns out we had not made a major discovery. Instead, we had overlooked a factor of 10^20 in our units! Correcting for this, we got the much more reasonable value of 3.9 solar masses per year. I have made plenty of these sorts of mistakes, but it’s all valuable learning!
Discovering these galaxies is very exciting for me. It is brilliant to learn from the academics and PhD students here, who have far better knowledge of all this stuff than I do! I have only just begun to appreciate how much you can find out from a simple spectrum. Think of it this way – that picture above holds almost all of the information that we can ever hope to find about these galaxies! I am looking forward to next week, as we delve deeper into the properties of galaxies that we ourselves have found.
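A factor like this can creep in when a cube stores its fluxes in scaled units. The sketch below assumes the fluxes are in units of 10^-20 erg/s/cm^2 (worth checking against the cube header) and uses the widely quoted Kennicutt (1998) conversion from H alpha luminosity to SFR. It is an assumption that this resembles our calculation; the function and parameter names are illustrative.

```python
import math

CM_PER_MPC = 3.0857e24      # centimetres in one megaparsec
KENNICUTT_HALPHA = 7.9e-42  # solar masses per year per (erg/s), Kennicutt (1998)


def sfr_from_halpha(flux, distance_mpc, unit_scale=1e-20):
    """Star formation rate in solar masses per year from an H-alpha line flux.

    flux: integrated line flux in units of `unit_scale` erg/s/cm^2.
    unit_scale: assumed to be 1e-20 here; check the cube header.
    distance_mpc: luminosity distance in Mpc.
    """
    flux_cgs = flux * unit_scale
    d_cm = distance_mpc * CM_PER_MPC
    lum = 4.0 * math.pi * d_cm ** 2 * flux_cgs
    return KENNICUTT_HALPHA * lum


# Hypothetical flux and distance, with and without the unit factor
good = sfr_from_halpha(1e4, 1000.0)
bad = sfr_from_halpha(1e4, 1000.0, unit_scale=1.0)
print(f"with the factor: {good:.3g}, without it: {bad:.3g}")
```

Keeping the scale as an explicit argument makes it harder to forget, and an answer that is twenty orders of magnitude too large is at least easy to spot.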
One of the more interesting galaxies found this week was PIG-7351-1417. The spectrum for it looked very clear, with some very obvious emission lines. Using an image taken by Hubble we discovered that it was actually 3 galaxies in the process of merging together. The spectra for all of these galaxies combine to produce the huge peaks we had observed. Here you can see a close up of the Hubble image showing the merging galaxies.
I was uncharacteristically calm before starting this internship. Clearly my subconscious knew that I would be going into a fun, supportive and (relative to the preceding stress of third year exams) relaxed environment. The work so far has been interesting and enjoyable, if a little frustrating at times. This frustration has mainly been due to my lack of proficiency with computers, and so I really feel like my computer skills are going to improve drastically over the next few weeks. Outside of the lab all the interns have been getting on well, and we’ve been warmly welcomed into the wider research group and department, having been invited to tea and coffee breaks, the weekly journal club and a talk from a visiting academic from the University of Tokyo.
Our group’s first week has been focussed on getting to grips with the datacube. This is a 2.9GB file, containing a 2D image of a very small section of the sky, ‘taken to 3D’ by recording the incoming photon flux at a range of wavelengths. The data comes from The Multi Unit Spectroscopic Explorer (MUSE), an instrument of the Very Large Telescope (VLT) at the European Southern Observatory (ESO). (Astrophysicists like acronyms it would seem). We originally scanned through the wavelengths manually, identifying sources (i.e. galaxies) at random. We saw that there were both sources that appeared continuously throughout the range of wavelengths and ones which only appeared at very specific wavelengths. These specific wavelengths correspond to specific emission lines and eventually by extracting the spectra of each source and comparing with common galaxy emission lines, we hope to be able to determine the redshift of all the galaxies, as well as other interesting properties.
However, after each obtaining a random selection of galaxies, very few of which overlapped, we soon realised we needed a more systematic approach. We wrote (ok, selectively copied) a programme to compress the image in the wavelength axis, simultaneously removing noise by only recording counts above the three sigma level. This three sigma noise level was itself determined from another programme we wrote. All of this allowed us to see all the sources at once. We split this image into five ‘streets’ (literally named after the streets each of us live on because why not) and each went through our streets recording every source that appeared. It was quite difficult to objectively determine sources from noise, even with the removal of most of said noise by the three sigma limit, so we used an image from the Hubble telescope to compare against our MUSE data. We have ended this week with a (hopefully) comprehensive catalogue of around 150 galaxies. This number is both exciting and intimidating and has really reminded me how incredibly massive and exciting our universe is.
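The idea behind compressing the cube can be sketched in a few lines. This is a simplified illustration and not the programme we adapted: it assumes a single noise level for the whole cube and uses nested lists indexed as cube[wavelength][y][x], both of which are choices made for the example.

```python
def collapse_cube(cube, sigma, n_sigma=3.0):
    """Sum a cube along wavelength, keeping only values above n_sigma * sigma.

    cube: nested list indexed as cube[wavelength][y][x] (hypothetical layout).
    sigma: a single noise level for the whole cube (a simplification).
    """
    n_y, n_x = len(cube[0]), len(cube[0][0])
    image = [[0.0] * n_x for _ in range(n_y)]
    threshold = n_sigma * sigma
    for plane in cube:
        for y in range(n_y):
            for x in range(n_x):
                if plane[y][x] > threshold:
                    image[y][x] += plane[y][x]
    return image


# A tiny made-up cube: 2 wavelengths, 1 row, 2 columns
cube = [[[1.0, 5.0]], [[4.0, 2.0]]]
print(collapse_cube(cube, sigma=1.0))
```

On a real cube the loops would be far too slow; with an array library such as NumPy the same operation can be written as a single masked sum along the wavelength axis.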
My favourite part of the week was working out a naming convention for all our newly discovered galaxies. After Google determined that there is no completely standardised naming convention, we just decided to have fun with it. Somehow, as a group, we have become extremely fixated on pigeons (it’s best not to ask), so we originally tried to work an acronym around this. However, PIGEON (PickInG lettErs frOm aNywhere) was decided to be a bit long for a catalogue name, so we shortened it to PIG (Potentially Interesting Galaxy). We then uniquely identified each galaxy using the first four digits after the decimal point of the RA and Dec angles – e.g. PIG-7286-1237.
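The naming rule is simple enough to write down as a short function. The sketch below assumes the RA and Dec are given in decimal degrees and that the digits are truncated, not rounded; the function names and the example coordinates are made up.

```python
from decimal import Decimal, ROUND_DOWN


def first_four_decimals(angle_deg):
    """First four digits after the decimal point, truncated rather than rounded."""
    # Going through str() avoids floating-point surprises such as 0.7286
    # being stored as 0.72859999...
    fraction = abs(Decimal(str(angle_deg))) % 1
    digits = (fraction * 10000).to_integral_value(rounding=ROUND_DOWN)
    return f"{int(digits):04d}"


def pig_name(ra_deg, dec_deg):
    """Build a catalogue name such as PIG-7286-1237 from RA and Dec in degrees."""
    return f"PIG-{first_four_decimals(ra_deg)}-{first_four_decimals(dec_deg)}"


# Hypothetical coordinates chosen to reproduce the example name
print(pig_name(334.7286, 0.1237))
```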
Finally, here are three of the most important lessons I feel I have learnt this week:
- If you know what you want to do with your code but just can’t get it to work, then ask for help. Preferably before you’ve wasted a whole extremely frustrating morning on it.
- Macs are actually quite good and I probably shouldn’t have avoided them for the first 3 years of my degree.
- Everything can be related to pigeons.