Lies, Damn Lies And The Myth Of Following Big Data

We are told to follow the data and the truth will be revealed, but data tells many tales, and which one you hear depends on the data you choose and how you interpret it. It makes me wonder whether anything is definitive if you can present two similar sets of data and draw wildly different conclusions, depending on your emphasis.

That’s because data is a tool in the hands of humans and we can interpret it as we choose. And to be clear, this isn’t necessarily because we choose to be deliberately deceptive, although that’s probably true sometimes. It’s because, being human, we can bring unintended biases to the data.

It’s a huge conundrum in the age of big data. How do you find definitive answers when you can look at different data points on the same topic and come to different interpretations?

DATA SCIENTISTS MATTER

Pam Baker, author of the book Data Divination: Big Data Strategies, looks at it from a data science perspective, but she still acknowledges you have to ask the right questions to get good answers.

“Data is pulled according to its relevancy to the precise question being asked of the data. Algorithms are written to include several inputs as identified as necessary to answering the question,” Baker explained to me in an email.

She says data scientists have a number of tools at their disposal to do this work, but mistakes are always possible. “There is always room for error, of course, but data science and statistics have hammered out many of these issues long before big data came to be. But it is true that if the wrong data points are used in the algorithm or the data is flawed in some way, the algorithm output (answer) will be wrong or flawed too.”

That’s useful as far as it goes, but we know there is a shortage of data scientists. I’ve heard there is one or none at the vast majority of companies, so there is all this data, but companies lack the expertise to help them understand it, and data can be manipulated to give you the answers you want.
I listened to a speaker earlier this week at the Gilbane Conference in Boston give a bunch of statistics that suggested people didn’t use that many apps and most had fewer than 10. He also suggested 90 percent of users didn’t mind receiving spam SMS messages. Not coincidentally, he worked for a company that offered an SMS advertising solution. He shared a bunch of data that suggested you would be foolish to build an app if you wanted to get a customer’s attention.

The speaker who followed displayed a data point that indicated we download 154,000 apps a minute. So which is it? How can you have fewer than 10 apps and at the same time be downloading apps at that pace?

When you have clearly conflicting data like this, it makes it hard to answer questions definitively, suggesting once again that the old axiom of ‘lies, damn lies and statistics’ could be truer than we imagine.

LINES OF BUSINESS FACE A DATA CHALLENGE

And when we put data into the hands of people other than the data scientists Baker recommended, it could get even dicier, especially when those folks are in marketing and trying to use data to put their products and services in the best possible light. It could get even worse if they try to draw conclusions about their markets based on bad information.

via TechCrunch.

