Big Data Reference Model

A project that approaches Big Data as a purely technical challenge will not deliver results. Big Data is about more than massive Hadoop clusters and number-crunching. To deliver value, a Big Data project has to enable change and adaptation, which requires known problems to solve. Yet identifying the problem can be the hardest part. Often you have to collect some information just to discover what problem to solve. Deciding how to solve that problem then creates a need for more information and analysis. This is an empirical discovery loop, similar to that found in any research project or Six Sigma initiative.

Diagram of the Big Data Reference Model

Handling the data itself is a technical challenge, and it can be a big one. Still, the other aspects (empirical learning, human decision-making, and problem identification) must also be addressed.

This model depicts, in schematic form, the different aspects to consider, both human and technological. It helps discussions by guiding the team to consider the problem space first: the business context, needs, and decision cycles. Then it moves into the solution space: collection, analysis, visualization, and feedback mechanisms. The model encourages project teams to pause before implementation and make sure they understand the problem to be solved. They may find that the problem has yet to be identified.

The model is divided into three areas with mutual dependencies and feedback loops. The following sections will discuss each of these areas.
