Big data is not the master

Extracting tangible business value and insights from high-quality, integrated data is more important than the volume, velocity, or variety of Big Data. Business users who consume Big Data don’t know or care about its bigness. They want the right data for their particular business problem, and they want to be able to trust that data and the analysis derived from it.

Data and models sit within the context of a larger decision support system. The quality of the data and the quality of the model are two independent limiting factors of such a system, so one should always begin the decision support process with questions like:
• What problems must the business address?
• What questions does the business need answered?
• What insights does the business need for innovation?
• What business decisions and actions need quantitative support?

First, develop models that support the business needs above, and only then decide which data to pursue. Big data is the servant, not the master. The real value in decision making is strongly influenced by the organizational decision support process, which can start in one of two ways.

The first way is to start with an assumption instead of a hypothesis (similar to being on a mission to “prove” something). The team has an outcome in mind and aligns the data and analysis, doing whatever is necessary to reach a result that matches the assumption. Starting with assumptions leads people to cherry-pick data, models, algorithms, and visualizations, and even to share information selectively, all with the objective to “prove”.

The second way is to start with a hypothesis to be proved or disproved. Testing happens. Verification happens. Multiple models are generally used instead of one. Here more focus is spent on the “whys” in addition to the “whats.” Even teams that are open to critical thinking risk slipping into bias here, but the genuine focus remains on the accuracy of the result.

Big Data Cycle

Big Data brings focus to three “V’s” (volume, velocity, and variety), but more of the value from Big Data comes from variety. With little or no domain expertise, technologists tend to focus system design on handling larger data volumes. Business users therefore need to emphasize data variety, and need to be educated to think less about big volume and more about big variety.

  • Getting value out of variety is a data integration task. Integrate existing data silos first, and ensure that no new “Big Silos” get created.
  • Integrate far-flung, disparate systems to generate insights with a holistic view of customer and product attributes, along with sales data by channel, region, and brand.
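
As a rough illustration of the integration task described above, here is a minimal Python sketch that joins three separate silos (customers, products, sales) and summarizes sales by channel, region, and brand. The record layouts, field names, and sample values are all hypothetical; real systems would need key matching and cleansing well beyond this.

```python
from collections import defaultdict

# Hypothetical records from three separate silos (illustrative values only).
customers = {"c1": {"segment": "retail"}, "c2": {"segment": "wholesale"}}
products = {"p1": {"brand": "Acme"}, "p2": {"brand": "Zenith"}}
sales = [
    {"customer_id": "c1", "product_id": "p1", "channel": "web", "region": "EU", "amount": 120.0},
    {"customer_id": "c2", "product_id": "p1", "channel": "store", "region": "EU", "amount": 80.0},
    {"customer_id": "c1", "product_id": "p2", "channel": "web", "region": "US", "amount": 50.0},
    {"customer_id": "c9", "product_id": "p2", "channel": "web", "region": "US", "amount": 10.0},
]


def integrate(sales, customers, products):
    """Join each sale to its customer and product attributes.

    Returns (joined, unmatched); sales whose keys are missing from a silo
    are set aside instead of being silently dropped.
    """
    joined, unmatched = [], []
    for sale in sales:
        customer = customers.get(sale["customer_id"])
        product = products.get(sale["product_id"])
        if customer is None or product is None:
            unmatched.append(sale)
            continue
        joined.append({**sale, **customer, **product})
    return joined, unmatched


def sales_by(joined, *keys):
    """Total the sales amount for each combination of the given attributes."""
    totals = defaultdict(float)
    for row in joined:
        totals[tuple(row[k] for k in keys)] += row["amount"]
    return dict(totals)


joined, unmatched = integrate(sales, customers, products)
summary = sales_by(joined, "channel", "region", "brand")
```

The `unmatched` list matters as much as the summary: sales that cannot be linked to a known customer or product are often the first visible sign that a silo has not been fully integrated.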

Even when variety takes precedence over volume, data is still collected at an exponential rate, so data error correction has to be handled on an exception basis. We cannot scale manual data correction to keep up with increased data volumes. We must automate data quality processes to catch and fix errors up front, with tools at least as robust as our data collection and storage resources. Unstructured data, too, must meet the same data quality standards that are applied to more traditional transaction data. To achieve data quality, it is important to understand how business users interact with the data.
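
One way to picture exception-based correction is a small rule pipeline: simple fixes are applied automatically, every record is validated, and only the failures are routed to a person. The Python sketch below assumes hypothetical rules, field names, and region codes purely for illustration; it is not a reference to any particular data quality tool.

```python
VALID_REGIONS = {"EU", "US", "APAC"}  # hypothetical reference data


def normalize(record):
    """Apply automatic fix-ups before validation (trim and upper-case region)."""
    fixed = dict(record)
    if isinstance(fixed.get("region"), str):
        fixed["region"] = fixed["region"].strip().upper()
    return fixed


def check_amount(record):
    """Return an error message if the amount is not a non-negative number."""
    amount = record.get("amount")
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    if not is_number or amount < 0:
        return "amount must be a non-negative number"
    return None


def check_region(record):
    """Return an error message if the region is not in the reference set."""
    if record.get("region") not in VALID_REGIONS:
        return "unknown region"
    return None


RULES = [check_amount, check_region]


def run_quality_checks(records):
    """Split records into clean rows and exceptions needing human review."""
    clean, exceptions = [], []
    for record in records:
        fixed = normalize(record)
        errors = [msg for msg in (rule(fixed) for rule in RULES) if msg is not None]
        if errors:
            exceptions.append({"record": record, "errors": errors})
        else:
            clean.append(fixed)
    return clean, exceptions


records = [
    {"id": 1, "region": " eu ", "amount": 10.0},  # fixable automatically
    {"id": 2, "region": "US", "amount": -5},      # fails the amount rule
    {"id": 3, "region": "Mars", "amount": 7},     # fails the region rule
]
clean, exceptions = run_quality_checks(records)
```

In this sketch the first record is repaired without human involvement, while the other two land in `exceptions` with the reasons attached, so manual effort grows with the number of genuine problems rather than with total data volume.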

Traditionally, applications were designed, developed, and then consumed by users until it became apparent that they needed to be modified or replaced. Computing’s role was to tell us what happened in the past and leave humans to speculate about what would happen in the future. Today, predictive analytics lets the machine speculate about the future and leaves humans to take the decision or action.

Predictive analytics needs continuous process evolution: “It is not a requirement followed by a solution. It is a process and a journey, not a project. It is not plugging in some technology.” When predictive analytics helps us understand customers better, the focus needs to be on “getting inside their souls” and on continually tuning and adapting to changes in the business and in customer activity.

  • Which should take priority: the data present in the report, or the person consuming the report?
  • Does applying analytics change things for you in your role as a professional and in your role as a citizen? Are you comfortable with the change in both roles?
  • What did you do with data 5 years ago, compared to what you are doing with it today?
  • As effective data usage is a moving target, does “more data” become a more realistic desire than “all data”?
  • Does velocity matter more than having data in a format that’s actionable and in a format that maps to business objectives?