*Having thoroughly researched what defines a “habitable planet,” we settled on our research question. The end goal of our project is to estimate f_sp from the CETI equation. This equation aims to calculate the number of intelligent alien civilisations currently in the galaxy with whom we could communicate. According to the equation, the number of CETI (Communicating Extra-Terrestrial Intelligent civilisations) depends on several factors, including f_sp: the fraction of stars in the galaxy older than 5 Gyr which host at least one suitable planet, in a habitable zone, which could support life.*

*To estimate this, we need to analyse as much data about exoplanets in our galaxy as we can and filter it according to our definitions of “suitable planet, in a habitable zone.” We will do this using data from the NASA Exoplanet Archive. Among a number of other resources, the archive includes a catalogue of data for exoplanets which have been discovered and about which there is published research. This database contains a wealth of information about both exoplanets and the stars they orbit, including but definitely not limited to the masses, radii and orbital periods of planets, and the temperatures and spectral types of their host stars. It has been specifically designed to be easy for researchers to work with, with so many characteristics all in one place.*

*You might think, therefore, that it would be relatively easy to add some simple filters based on the constraints we researched in the previous week, count how many planets fit all of our criteria for “habitable” and then calculate a value for f_sp (this being our ultimate project aim).*

*If only it were that simple…*

*Because the planets were detected using a variety of methods, not every planet has an entry for every variable, leading to “holes” in the data. This is where the bulk of our scientific work has been targeted so far, and it is likely to stay that way for most of the project. Some variables have established relationships to each other, based on well-understood physics. For example, Kepler’s 3rd Law directly relates the semi-major axis of an elliptical orbit to the orbital period, so for a host star of known mass, knowing one of these variables should allow the other to be found with relative ease. Others are a little more complex.*
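
*As a rough illustration of the Kepler’s 3rd Law step, here is a minimal sketch in Python. It assumes the planet’s mass is negligible compared with its star’s and works in solar units (AU, years, solar masses), where the law reduces to a³ = M·P². The function names are hypothetical and are not taken from the project’s code or the archive’s column names.*

```python
import math

DAYS_PER_YEAR = 365.25  # Julian year, used to convert periods given in days


def semi_major_axis_au(period_days, stellar_mass_msun):
    """Semi-major axis in AU from the orbital period in days.

    Assumes the planet's mass is negligible next to the star's,
    so that a^3 = M * P^2 with a in AU, M in solar masses, P in years.
    """
    period_years = period_days / DAYS_PER_YEAR
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


def orbital_period_days(axis_au, stellar_mass_msun):
    """Orbital period in days from the semi-major axis in AU (same assumption)."""
    period_years = math.sqrt(axis_au ** 3 / stellar_mass_msun)
    return period_years * DAYS_PER_YEAR
```

*For an Earth-like orbit, `semi_major_axis_au(365.25, 1.0)` gives 1 AU, and `orbital_period_days` inverts the calculation when it is the period that is missing.*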

*One of the criteria we identified as necessary for humans to survive is to have a surface gravity strength within the correct range. The surface gravity of a planet depends on its mass and radius according to Newton’s Law of Universal Gravitation, but not every exoplanet in the table has entries for both mass and radius. Since the density of an exoplanet (i.e. the relationship between mass and radius) is dependent on its material composition, which is as yet unknown for many of these, it is not possible to directly infer one from the other. Our solution was to derive a relationship between mass and radius for all those that had entries for both, and then apply this relationship to the rest of the planets to fill in the gaps, making it then possible to estimate the surface gravity for every planet.*
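
*To make the surface gravity step concrete, here is a minimal sketch. Newton’s law gives g = GM/R², so with mass and radius expressed in Earth units the result is simply Earth’s surface gravity scaled by M/R². The function names are hypothetical, the value of 9.81 m/s² for Earth is approximate, and the acceptable range is left as an input because it depends on the criteria chosen.*

```python
EARTH_SURFACE_GRAVITY = 9.81  # m/s^2, approximate value for Earth


def surface_gravity_ms2(mass_earth, radius_earth):
    """Surface gravity in m/s^2 from mass and radius in Earth units.

    Uses g = G M / R^2, which in Earth units becomes g = g_earth * M / R^2.
    """
    if mass_earth <= 0 or radius_earth <= 0:
        raise ValueError("mass and radius must be positive")
    return EARTH_SURFACE_GRAVITY * mass_earth / radius_earth ** 2


def gravity_in_range(gravity_ms2, lower_ms2, upper_ms2):
    """True if the gravity lies within the caller's chosen limits (inclusive)."""
    return lower_ms2 <= gravity_ms2 <= upper_ms2
```

*A planet with eight times Earth’s mass and twice its radius would, on this estimate, have twice Earth’s surface gravity.*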

*Harry and Amaia have spent weeks 3 and 4 doing exactly that. Harry found a linear log-log relationship between the planets’ masses and radii, and used it to find the missing values for the other exoplanets. He then used these values to estimate the surface gravity of these planets. The plot for the known masses and radii (from which the relationship was derived) can be seen below. Although the actual distribution of points looks a bit more like a kingfisher than a straight line, the simple linear relationship is a good enough approximation to first order.*
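
*For anyone curious what a linear log-log fit involves, here is a minimal sketch using ordinary least squares on log₁₀(mass) and log₁₀(radius). This is an illustration of the general technique rather than Harry’s actual code, the function names are hypothetical, and any numbers fed into it in a demonstration would be synthetic, not archive data.*

```python
import math


def fit_log_log(masses, radii):
    """Least-squares fit of log10(R) = intercept + slope * log10(M).

    Returns (slope, intercept). Inputs must be positive and equally long.
    """
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(r) for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def radius_from_mass(mass, slope, intercept):
    """Estimate a missing radius from a known mass using the fitted line."""
    return 10 ** (intercept + slope * math.log10(mass))


def mass_from_radius(radius, slope, intercept):
    """Estimate a missing mass from a known radius by inverting the fitted line."""
    return 10 ** ((math.log10(radius) - intercept) / slope)
```

*Planets with both values known go into `fit_log_log`, and the resulting slope and intercept are then passed to whichever of the other two functions fills the gap for a given planet.*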

*When it came to calculating the errors in these new values, though, they ran into some initial difficulty. Originally, the errors were calculated in such a way that they were not independent of one another and changed depending on whether the missing radii or the missing masses were filled in first. This was clearly not ideal. The next plan was to calculate the errors individually for each exoplanet, but this led to complications with the code and was overly detailed. After some consultation with David, Amaia is now working on code that will assign mass and radius errors based on the range in which the values lie. The hope is that once this code has been used successfully on one pair of variables, it can be repurposed relatively easily for other combinations. This has been an excellent example of how projects like this are not always smooth sailing, but always contain opportunities to learn.*
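
*One way a range-based error assignment could look is sketched below. This is a guess at the general shape of the idea, not Amaia’s code: the bin edges and fractional errors are made-up placeholders, and the function name is hypothetical. Because each error depends only on the value itself, the result does not change with the order in which gaps are filled.*

```python
from bisect import bisect_right

# Placeholder values for illustration only; real bins and errors would
# come from the scatter of the measured data in each range.
BIN_EDGES = [1.0, 10.0, 100.0]
FRACTIONAL_ERRORS = [0.10, 0.20, 0.30, 0.40]  # one entry per range


def assign_error(value, bin_edges=BIN_EDGES, fractional_errors=FRACTIONAL_ERRORS):
    """Absolute error for a value, based on the range the value falls in.

    bin_edges must be sorted; fractional_errors needs one more entry than
    bin_edges (below the first edge, between edges, above the last edge).
    """
    if len(fractional_errors) != len(bin_edges) + 1:
        raise ValueError("need exactly one fractional error per range")
    index = bisect_right(bin_edges, value)
    return value * fractional_errors[index]
```

*The same function works for any variable once suitable edges and errors are supplied, which is the sort of reuse hoped for with other pairs of variables.*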

*Our plan for the next week is to finalise the error estimates for the newly calculated mass and radius values, which will allow us to accurately calculate the errors in the gravitational field strength estimates. We also hope to use Kepler’s 3rd Law to begin filling in missing values for orbital radius, which can be used to determine whether or not a planet orbits in its host star’s habitable zone.*