How do we squeeze value out of Data Science?
It is well established that data science plays a key role in almost every specialized field. Proper classification of potential voters can swing the outcome of an election from a loss to a win. The ability to predict user engagement yields large gains in advertising. In the manufacturing and transport sectors, being able to forecast a machine failure or vehicle breakdown is a great advantage. What generates excitement and holds promise is that data science can give a competitive edge to almost any business that acquires the right data and hires the right talent.

However, we first need to clear up a few common misconceptions regarding the value of data science. The common perception is that data-driven companies are miles ahead of their contemporaries in terms of performance. We need only consider Netflix, Amazon and Google. All that is required, supposedly, is high-quality data with the right velocity, variety, and volume, as well as skilled data scientists who can unearth hidden patterns and tell convincing stories about what those patterns really imply. These companies have a much greater competitive advantage and perform optimally thanks to the insights their data scientists produce. That’s the story doing the rounds. Sounds good?

Actually, on the ground, things are a bit different…

First and foremost, let’s tackle what to look for in a data scientist. An internet search on the skills required of a data scientist will reveal a heavy focus on algorithms. It’s a common assumption that data science is mostly about creating and running advanced analytics algorithms.

Secondly, the story line gives no credence to the subtle, yet very persistent, tendency of human beings to reject things they don’t like. It’s a common assumption that, once an insight has been gained from a pattern found in the data, getting someone to accept it is just a matter of telling a good story.
This is the “last mile” assumption, made after a great deal of effort has gone into analyzing the data. Often, what happens instead is that the requester questions the assumptions, the data, the methods, or the interpretation of the data scientist, who then runs around putting in follow-up research effort until they either tell the requester what the requester already believed or just give up and find a new project!

A different story line for deriving value out of data science

The prerequisite to getting that competitive edge through data science is to have a good definition of what a data scientist really is. Data scientists are, primarily, scientists. They use the scientific method (research methodology). They take an educated guess at hypotheses based on the evidence gathered, and thereafter draw conclusions. Data scientists specialize in the study of data, rather than in a particular domain such as earthquakes or aquatic life. Like any other scientist, their job is to create and test hypotheses. This boils down to data scientists needing a falsifiable hypothesis before they can do their job. This puts them at odds with the commonly perceived story line.

Thus, in order to develop a competitive advantage through data science, the data scientist needs a falsifiable hypothesis about what will create that advantage, takes an educated guess at it, and then immerses himself or herself in the work of trying to confirm or refute it. The many possible specific hypotheses all share the same general form:

It’s more effective to do X than to do Y

For example: Our company will sell more widgets if we increase delivery capabilities in Asia Pacific.
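
To make "falsifiable" concrete, here is a minimal sketch of how a hypothesis of the form "X is more effective than Y" might be put to a test. It is not a method described in this article: the choice of a one-sided permutation test, the function name `permutation_test`, and the weekly sales figures are all assumptions invented purely for illustration.

```python
import random
from statistics import mean


def permutation_test(x_outcomes, y_outcomes, n_resamples=10_000, seed=0):
    """One-sided permutation test of the claim "X yields higher outcomes than Y".

    Returns (observed_difference_in_means, p_value). A small p-value means
    the observed advantage of X would be unlikely if X and Y were equivalent.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(x_outcomes) - mean(y_outcomes)
    pooled = list(x_outcomes) + list(y_outcomes)
    n_x = len(x_outcomes)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the labels X and Y are interchangeable,
        # so shuffle the pooled outcomes and re-split them at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_x]) - mean(pooled[n_x:])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical weekly widget sales, made up for this example only.
sales_with_expanded_delivery = [132, 141, 128, 150, 139, 145, 137, 148]  # X
sales_without_expansion = [120, 125, 131, 118, 127, 122, 129, 124]       # Y

observed_diff, p_value = permutation_test(
    sales_with_expanded_delivery, sales_without_expansion
)
print(f"Observed difference in mean weekly sales: {observed_diff}")
print(f"Estimated one-sided p-value: {p_value:.4f}")
```

The point of the sketch is that the hypothesis can lose: if the p-value comes out large, the data do not support "X is more effective than Y". A comparison like this on observational data would still not show that expanded delivery caused the difference, so in practice the data scientist would also need to think about how the two groups were formed.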