There is debate about the details of the data scientist role, but in the end it is all about getting business value from big data. Do we really need to call them data scientists? How close is their job to that of a scientist? What does the word “scientist” mean to a business person? To understand the data scientist, let us start by understanding the role of the scientist.
Scientists predict ill effects from global warming, and they tell us that genetically modified foods are good and that vaccines are safe. How do we know they are right? Why do we believe them? One reason is that we have observed their predictions hold up over a long time.
The main reason we believe science is that it follows the scientific method. What is the scientific method? Scientists follow this method, and the method validates the truth of their claims. This is how it goes: the scientist develops a hypothesis based on assumptions, works out the consequences that follow from the hypothesis, and observes those consequences over a long period to check whether they validate the hypothesis. Great. Are there no risks or pitfalls in the scientific method?
- The hypothesis may be built on wrong assumptions.
- Even if all the expected consequences are observed, can we logically prove that the theory is correct? The observed consequences are only a subset, seen through a specific tool, and some observations may have been ignored.
- The scientist may start by continuously observing what is happening and collecting data, and only after collecting a lot of data propose a hypothesis and claim it is validated.
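The last pitfall can be shown with a small simulation. What follows is a minimal Python sketch, not a real study: the data is pure random noise, and the names pearson and dredge, along with the row and feature counts, are assumptions made up for illustration. It collects many unrelated measurements, picks the one that happens to line up best with the outcome, and then measures the same thing again on fresh data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def dredge(n_rows=30, n_features=200, seed=0):
    """Collect data first, pick the hypothesis afterwards, then try to replicate.

    Everything here is random noise (an assumption of this sketch), so any
    relationship that shows up is a coincidence.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    features = [[rng.gauss(0, 1) for _ in range(n_rows)]
                for _ in range(n_features)]

    # "Hypothesis" chosen after looking at all the data: the best-looking feature.
    scores = [abs(pearson(f, outcome)) for f in features]
    best = max(range(n_features), key=scores.__getitem__)

    # Independent replication: measure the same feature and outcome again.
    fresh_outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    fresh_feature = [rng.gauss(0, 1) for _ in range(n_rows)]
    replicated = abs(pearson(fresh_feature, fresh_outcome))

    return best, scores[best], replicated


if __name__ == "__main__":
    best, found, replicated = dredge()
    print(f"feature {best}: |r| = {found:.2f} in collected data, "
          f"{replicated:.2f} on fresh data")
```

With this many candidate features and so few rows, the best-looking correlation is usually sizeable even though nothing real is there, and it typically does not survive the fresh measurement. That is why a hypothesis proposed after the data was collected cannot be validated by that same data.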
How do scientists mitigate these risks? Who is the judge who decides what is right and wrong? Other scientists. They judge a hypothesis by examining the evidence, scrutinizing it, and sometimes questioning it. Scientists also collaborate to judge the evidence, like a jury, and they have a wide range of choices. Any such validation starts from a place of distrust, and it is difficult to get a “yes” to new things from scientists. The authority that accepts evidence as true and valid is a community of scientists who have worked on the problem, and sometimes their work spans more than 100 years.
Hmm. Are there no risks or pitfalls in the collaborative method? Collaboration alone is not a sufficient condition to claim there are no risks, since all of the people who collaborated may have fallen into one of the traps above.
Not good news. What else can be done? First, observe the consequences over a longer duration to confirm that the evidence holds on a sustained basis. Second, document and share the method so that others can replicate it independently and continue to observe the evidence. Third, check whether the observed evidence continues to stand up to scrutiny as verification of the hypothesis.
For years, people wanted to find the reasons for infant mortality. For different reasons, none of the hypotheses led to a reduction in the infant mortality rate. Then, based on one person's intuition and observation, there arrived the hypothesis that “sterilizing hands after attending to one patient and before moving to the next helps infants stay alive”. He made the effort to put the hypothesis into action in his hospital and observed the consequence: the infant mortality rate fell. This evidence easily stood up to scrutiny, and the same evidence was found to hold in other hospitals as well.
Will a business wait this long to implement an insight provided by a data scientist? Data scientists need to explain what they know and also how they know it. Is there a risk that a business takes action based on an insight and that it leads to a bad customer experience? Yes. In firms with one data scientist, who is going to scrutinize? The question “who will scrutinize the evidence” remains open.