Scientific progress is a tricky thing. Despite what you might think, the direction of science is not always forwards – sometimes as a species we unlearn things which then take us hundreds of years to rediscover. Sometimes this is simply because an idea isn’t publicised enough and slips through the cracks. But more often, it’s because someone disagrees with the outcome and wilfully decides to ignore or deflect it.

Now there are different scales to this. On the one hand, you can have people holding international positions say that CO₂ isn’t a greenhouse gas – which, personally, I think is akin to looking at a bear covered in blood and surrounded by shredded tents and saying “the campers probably just tripped and got horribly mutilated on their own”. But on a smaller scale, there are cases where lots of scientists set back science and possibly don’t even realise they are doing it.

In science, results can do three things. They can teach you something, they can teach you nothing (rare) and they can mislead you.

Results that teach you something don’t have to be good results. Even failed experiments can teach you plenty – even if it is only how NOT to do something. Results that teach you nothing are often repeats of experiments that are now decades old, which you might be doing as due diligence.

Results that mislead you are results that are actually lies. Experiments don’t lie; researchers lie when they don’t like the outcome of an experiment and ‘correct’ or crop the data.

It can start off fairly simple. As I explained in my previous hamster-based article on data manipulation, even minor ‘fudging’ of data points can seem like the right thing at the time but can obscure deeper information you might currently be missing. This can take the form of anything from deleting or repeating outliers without taking the time to think about why they are there, to removing a dataset because the other two don’t agree with it, to the wholesale invention of a dataset because you didn’t have time to run a repeat.

Whatever the reason, what is happening now is that, instead of showing actual data doing actual data things (like being weird and unreliable), you are showing false, made-up data.

Obviously this is a bad thing to do. I don’t hear many people argue that it’s okay to ‘fix’ data (although I have had a few people try and argue it, at length). But I do often hear people say that it’s a known problem, just a background level of people not helping science progress. I very strongly disagree with that – massaging data doesn’t leave scientific progress stationary, it actively sends it backwards.

Every experiment that is misrepresented and then published is data that no one can reproduce without crafting their own data manually. It’s data that doesn’t say “We don’t know”; it’s data that says “We do know and the answer is blue” while describing the optical properties of a red pot of paint. Future researchers whose data says it’s “Red” are no longer starting from the position of “We don’t know”; they are now starting from “You’re wrong, it’s not blue”, which is that much harder a place to argue from.

As a researcher I want to make progress. Not every experiment I do advances scientific knowledge at large, or even my own scientific knowledge, all that much. But I can say that my experiments don’t actively make scientific progress go backwards. To me, doing otherwise is like being a lifeguard who is not only bad at saving people but actively drowns the occasional slow swimmer just to make sure it looks like he’s doing something…


1 Comment

Joaquin Barroso · 5 April 2017 at 20:15

Spot on as usual! Yet I would point out two categories of misleading data: first, there is data that misleads through bias, whether from statistics or from interpretation, in which case the conclusions are misleading through no necessary fault of the data; the second category is the manipulated data you talk about, which is just plain wrong.
