We are, I hardly need point out, living in a golden age of big data. It is completely changing the way most science is done; indeed some science can only exist because of our relatively recent ability to handle enormous data sets.
This is a marvellous, exciting and vital thing for science – it has, however, given me a rather unusual problem. I find myself more and more consumed by the superlatives and analogies used to describe big data.
Over the last two years alone, 90% of the data in the world was generated… today’s data centres occupy an area of land equal in size to almost 6,000 football fields… the number of bits of information is thought to have exceeded the number of stars in the physical universe in 2007. Lovely trivia titbits, each and every one. And there are just so many of them. Frankly, it’s a little overwhelming.
In an irony that seems quite ridiculous, I have even considered setting up a database to capture all of the facts contained within the various infographics I am bombarded with on an almost daily basis. Honestly, I think I need a big data approach to deal with the ever-increasing volume of big data descriptions.
So, if my (distinctly average) mind can barely keep up with the descriptive terms of all this data, how can it be expected to actually handle the data? How could anyone’s?
Of course it couldn’t – the key to unlocking the information held in these enormous data sets wasn’t just the technology to store and process them, nor the methodology to collate them in the first place; we had to learn how to extract meaning and display it in an accessible way. We had to turn big data back into digestible knowledge.
It’ll only get bigger…
And there is no better example of what an enormous challenge this is than the Large Hadron Collider. The experiments attached to this most famous of proton lobbers generate an almost obscene amount of data. Now, as plans begin for its replacement – the Future Circular Collider – that challenge will only increase.
As if the front-line science at what would be the world’s biggest experiment weren’t collaborative enough, the handling of its data will see an even wider net cast. It seems to me that collaboration billows outward from big data generators like the future FCC, like ripples in a pond… such a wonderful thought that it has almost distracted me from the latest bit of trivia.