Note: This article is re-blogged from my column in Laboratory News – go get your free subscription here
After almost a decade in science, I am beginning to realise that I am incurably cursed with data envy.
Now this isn’t the kind of data envy you might expect – I have plenty of data, gigabytes of it in fact. If anything, a little less data might be nice – especially from those experiments where I forgot to turn on one of the bits of kit and had to restart it. Nor do I want my data to be ‘good’ data, as I’m pretty happy even when things don’t work. No, I don’t have any normal, sensible kind of data envy, I have ‘organised’ data envy.
Almost any research project these days generates huge volumes of data – some of it useful, some of it forever consigned to the back of a hard drive. But all of it will be hoarded away in the hope that one day some tiny nugget of it will shine through as a great discovery. Even if you don’t feel this squirrel-like urge, then I would expect your employer certainly does – no company or university wants to accidentally throw away vital data.
These days that hoarded data exists in two places – lab-books and electronic data sets. Lab-book quality varies GREATLY between employers and employees. I was taught great lab-book housekeeping at my first job, which would have been a real boon if I had ever actually followed any of the advice…
So for me and many others, the electronic copy of the data is the go-to source for information on which experiments you ran in the past. Unfortunately for future me, past me is absolutely terrible at data organisation.
It’s not that I don’t try to be nice and organised. I’ve lost track of how many times I’ve started a new research project thinking “this time will be different, this time I’ll keep an organised data system”. I get positively excited about how organised my data will be – I have reached the point now where I’ll even plan out file systems in advance so that there is no way they can fail to keep me organised and on track. However, no matter how much planning and care goes into these systems, they always fall apart the same way.
First comes the temp folder. At some point I’ll be busy and not have time to put things in their proper place. So I think to myself ‘hmm best store this and I’ll index it later’. The temp folder is like an infectious virus that soon begins to consume all the data that I produce. The reasoning being that I’ve not sorted out the previous temp data yet so I better just add to it.
Next comes the inevitable abandonment of the original system. So now with a giant temp folder to deal with, I begin to re-plan my original system – maybe I’ll do it differently and that will somehow fix everything. So now I’ll have two file systems in the same folder – the original one and its new and improved sister.
This new, improved system lasts about a day until a temp folder appears and the cycle of decline starts again. By the end of the project all I have is strange letters, file creation dates and the burning hatred of everyone who manages this better than me.
So if you show me your neatly organised file systems, please don’t fear the burning stare in my eyes that seems to be saying ‘die die die’. I’m just envious of your data system – it’s probably not personal…