What seems like an age ago (October last year) I finished my PhD and submitted my thesis for examination (a month before my deadline 🙂 ). Then, after a shocking 8 months (no idea why it took that long…), I had my viva, which went quite well, and I thankfully only had a few typos to correct, which I diligently fitted in around my other commitments. Once finished, I gave my thesis back to my internal examiner, who then sat on it for a further 2 months before finally signing off all my corrections a few weeks ago.
So now I am totally, finally, completely done *yay*. Just last week I got my final hardbound copy back from the printers, and my thesis is, as I speak, being archived in our library, which means…
…I can now release my thesis as open access for anyone to read! (Click the link below; the download is at the bottom of the page.)
But wait… if that’s not exciting enough, I also wanted to go a bit further than just releasing my thesis – I am also going to release all the data that is associated with the work.
I am a firm believer in open data access in papers, theses and books, and I am very pleased that my sponsoring company has agreed to let me release large amounts of the data used to generate my thesis for anyone to look at or use. I am not perfect and didn’t have time to run every type of analysis possible on my data. By releasing it, I not only allow others to challenge my interpretations, but I also hope that, just maybe, someone else might discover something new in my work that I’ve missed.
I’ve hosted the data on figshare, which is a fantastic place to keep things like this and lets people comment on and reference publicly available data. The link below will take you to a repository in which each set of data is listed separately, so you don’t need to download the whole set just to look at the numbers behind one graph.
Data – Repository on figshare
Obviously, I haven’t shared every set of data collected during my PhD. For a start, that would be a 15 GB download, but much of it also does not appear in my thesis, and copying all the associated metadata from my lab book would be a fairly epic task. As I’ve now moved to an electronic lab book, hopefully this won’t be a problem next time I want to share a huge trove of data.
I have done my best to include all the raw data that helped me reach the conclusions in my thesis, and while the set is not perfectly complete, it covers everything I could include.
In addition to the data, I also realise that it’s pretty important to share any calculation and modelling tools I used. Figshare is currently better for sharing data than code, so I’ve elected to put any calculation code up on GitHub. GitHub is great both for sharing code with others and for letting people suggest changes and modifications that might improve it.
Scripts – @ GitHub/MCeeP
So enjoy trawling through my data. If you find something missing, please ask me and I’ll dig it out of the vast repository of data I seem to have amassed. Similarly, if you need more details about any particular bit of work, I’d be happy to pull out my lab notes and expand on it.
Footnote: releasing all the data to the internet along with my thesis is slightly more nerve-wracking than having my viva.