7 characteristics to differentiate BI, Data Mining and Big Data

Hi everybody

One of the most frequent questions in our day-to-day work at Aquarela stems from a common confusion between the concepts of Business Intelligence (BI), Data Mining, and Big Data. Since all three deal with exploratory data analysis, it is not surprising to see widespread misunderstanding. The purpose of this post is therefore to quickly illustrate the most striking features of each one, helping readers define their information strategy, which depends on the organization’s strategy, maturity level, and context.

The basics of each involve the following steps:

  1. Survey questions: What does the customer want to learn (find out) about his/her business? For example: How many customers do we serve each month? What is the average value of the product? Which product sells best?
  2. Study of data sources: What internal and external data are available to answer the business questions? Where are the data? How can I access these data? How can I process them?
  3. Setting the size (scope) of the project: Who will be involved in the project? What is the size of the analysis or the sample? Which tools will be used? How much will be charged for it?
  4. Development: putting the strategy into operation by performing several data transformations, processing the data, and interacting with the stakeholders to validate the results and assumptions, checking whether the business questions were well addressed and the results are consistent.
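
As a rough illustration of step 1, the example questions can be answered with simple aggregations. The sketch below uses Python and a small made-up list of sales records; the field names (`month`, `customer`, `product`, `price`) and the function names are assumptions chosen for illustration, not part of any specific tool.

```python
from collections import Counter, defaultdict

# Hypothetical sales records; all values are made up for illustration.
SALES = [
    {"month": "2015-01", "customer": "ana", "product": "coffee", "price": 10.0},
    {"month": "2015-01", "customer": "bob", "product": "tea", "price": 6.0},
    {"month": "2015-01", "customer": "ana", "product": "coffee", "price": 10.0},
    {"month": "2015-02", "customer": "carla", "product": "coffee", "price": 10.0},
    {"month": "2015-02", "customer": "bob", "product": "cake", "price": 14.0},
]


def customers_per_month(sales):
    """How many distinct customers do we serve each month?"""
    seen = defaultdict(set)
    for row in sales:
        seen[row["month"]].add(row["customer"])
    return {month: len(customers) for month, customers in seen.items()}


def average_price(sales):
    """What is the average value of the products sold (mean price per sale)?"""
    return sum(row["price"] for row in sales) / len(sales)


def best_selling_product(sales):
    """Which product sells best (highest number of sales)?"""
    counts = Counter(row["product"] for row in sales)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(customers_per_month(SALES))
    print(average_price(SALES))
    print(best_selling_product(SALES))
```

Questions of this kind have a known shape before the data is touched, so each one maps onto a count, an average or a ranking.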

Up to this point BI, Data Mining and Big Data look virtually the same, right? So, in the table below we summarize what makes them different from each other in seven characteristics, followed by important conclusions and suggestions.

[Figure: comparative table of BI, Data Mining and Big Data across the seven characteristics]

Conclusions and Recommendations

Although our research is restricted to 7 characteristics, the results show that there are significant differences between BI, Data Mining and Big Data. They serve as an initial framework to help decision makers analyse and decide what best fits their business needs. The most important points are:

  • We see that companies with a consolidated BI solution are more mature and better prepared to embark on extensive Data Mining and/or Big Data projects. Discoveries made by Data Mining or Big Data can be quickly tested and monitored by a BI solution. So, the solutions can and must coexist.
  • Big Data makes sense only with large volumes of data, and the best option for your business depends on what questions are being asked and what data are available. All solutions depend on their input data. Consequently, if the quality of the information sources is poor, chances are that the answer will be wrong: “garbage in, garbage out”.
  • While BI dashboards can help you make sense of your data in a very visual and easy way, you cannot do intense statistical analysis with them. This requires more complex solutions alongside data scientists to enrich the perception of the business reality by finding new correlations and new market segments (classification and prediction), and by designing infographics that show global trends based on multivariate analysis.
  • Big Data extends the analysis to unstructured data, e.g. social network posts, pictures, videos, music, etc. However, the degree of complexity increases significantly, requiring expert data scientists working in close cooperation with business analysts.
  • To avoid frustration, it is important to take into consideration the differences in the value proposition of each solution and its outputs. Do not expect real-time monitoring data from a Data Mining project. In the same sense, do not expect a BI solution to discover new business insights; this is the role of the other two solutions.
  • Big Data can be considered partly the combination of BI and Data Mining. While BI comes with a set of structured data, Data Mining comes with a range of algorithms and data discovery techniques. What makes Big Data a plus is the new large-scale distributed processing, storage and memory technology, which can digest gigantic volumes of heterogeneous data, more specifically unstructured data.
  • The results of all three can generate intelligence for the business, just as good use of a simple spreadsheet can, but it is important to assess whether this is sufficient to meet the ambitions and dilemmas of your business.
  • The true power of Big Data has not yet been fully recognized. However, today’s most technologically advanced companies base their entire strategy on the power and advanced analytics provided by Big Data, and in many cases they offer their services free of charge in order to gather valuable data from their users, e.g. Gmail, Facebook, Twitter and OLX.
  • The complexity of data, as well as its volume and variety of file types, tends to keep growing, as presented in a previous post. This implies a growing demand for Big Data solutions.
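
To make the "finding new correlations" point above a little more concrete, here is a minimal sketch of the kind of statistical step that goes beyond a BI dashboard: computing the Pearson correlation coefficient between two series. It is written in plain Python; the series names (`ad_spend`, `revenue`, `returns`) and their values are hypothetical, and a real Data Mining project would involve far more than this single measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly figures, made up for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.0, 4.0, 6.0, 8.0, 10.0]
returns = [5.0, 4.0, 3.0, 2.0, 1.0]

if __name__ == "__main__":
    print(pearson(ad_spend, revenue))  # series that rise together
    print(pearson(ad_spend, returns))  # series that move in opposite directions
```

A value close to 1 or -1 suggests a strong linear relationship worth investigating, while a value near 0 suggests none; a BI solution could then monitor any relationship that proves useful.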

In the next post we will present interesting sectors for applying exploratory data analysis and how this can be done in each case. Thank you for joining us.

What is Aquarela Advanced Analytics?

Aquarela Analytics is the winner of the CNI Innovation Award in Brazil and a national reference in the application of corporate Artificial Intelligence in industry and large companies. Through the Vorteris platform and the DCM methodology, it serves important clients such as Embraer (aerospace), Scania, Mercedes-Benz, Randon Group (automotive), SolarBR Coca-Cola (food retail), Hospital das Clínicas (healthcare), NTS-Brasil (oil and gas), Auren, SPIC Brasil (energy), and Telefônica Vivo (telecommunications), among others.

