Today, just as yesterday, making sense of the available data to create information and knowledge matters as much as ever.
The difference is that far more data is available, and it is much more accessible to anybody. Statistical analysis tools for Internet-based communication, like Google Analytics, are available for free. And a lot of people try to use them to increase their own impact or that of their marketing.
Then come questions like “What is the best time for me to tweet?” It is possible to run that analysis, as Chris Penn’s blog post “When is the best time to tweet” shows. But what does this analysis mean? Is the result statistically significant? Did we effectively control all the other parameters that influence it? And what are the assumptions? Here, the assumption is clearly that people live in real time, so you want to tweet at a moment when they are connected. Is that true? Personally, I look at Twitter once a day and catch up on all the day’s tweets…
This excellent post by Tom Webster about “Social Media data dredging” shows very clearly how these challenges affect the interpretation of the data.
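To make the dredging risk concrete, here is a minimal sketch of my own (it is not taken from either post). It assumes a made-up world where the hour of the day has no effect at all on engagement: every tweet draws its click count from the same normal distribution. The numbers (24 hourly slots, 30 tweets per slot, mean 10, standard deviation 3) and the function names are purely illustrative. The code then counts how often the best-looking hour would pass a naive 5% significance check, and how often it would pass once the threshold is adjusted for having looked at 24 hours (a Bonferroni correction).

```python
import math
import random
from statistics import NormalDist, mean

# All values below are illustrative assumptions, not real Twitter data.
HOURS = 24
TWEETS_PER_HOUR = 30
TRUE_MEAN, TRUE_SD = 10.0, 3.0  # identical for every hour: no real "best time"


def best_hour_z_score(rng):
    """Simulate one dataset and return the z-score of its best-looking hour."""
    standard_error = TRUE_SD / math.sqrt(TWEETS_PER_HOUR)
    hourly_means = []
    for _ in range(HOURS):
        clicks = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(TWEETS_PER_HOUR)]
        hourly_means.append(mean(clicks))
    return (max(hourly_means) - TRUE_MEAN) / standard_error


def false_winner_rate(z_threshold, runs=500, seed=0):
    """Share of simulated datasets in which some hour beats the threshold."""
    rng = random.Random(seed)
    hits = sum(best_hour_z_score(rng) > z_threshold for _ in range(runs))
    return hits / runs


# One-sided 5% threshold, as if we had tested a single hour chosen in advance.
naive_threshold = NormalDist().inv_cdf(0.95)
# Bonferroni: split the 5% error budget across the 24 hours we actually examined.
bonferroni_threshold = NormalDist().inv_cdf(1 - 0.05 / HOURS)

naive_rate = false_winner_rate(naive_threshold)
corrected_rate = false_winner_rate(bonferroni_threshold)

print(f"Datasets with a 'significant' best hour (naive):      {naive_rate:.0%}")
print(f"Datasets with a 'significant' best hour (Bonferroni): {corrected_rate:.0%}")
```

With 24 independent chances to clear a 5% bar, the chance that at least one hour clears it by luck is 1 − 0.95^24, roughly 70%, so a “best time to tweet” shows up in most datasets even though none exists in this simulated world. The corrected threshold brings that back to around 5%.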
In conclusion: running statistical analysis on heaps of data is easier than ever in the Fourth Revolution. That makes the conclusions we draw all the more dangerous. The good old principles of statistical control and design of experiments are still valid, and more needed than ever. They should be part of basic literacy in the Collaborative Age.