Building on the brief introduction above, and given that this blog is going to talk mostly about Business Intelligence, we would like to look in more detail at two simple questions: What does BI mean? How can we properly understand the BI concept?
BI has multiple definitions in multiple publications. The Wikipedia definition for BI is this: “Business intelligence is a set of theories, methodologies, architectures, and technologies that transform raw data into meaningful and useful information for business purposes.” In our opinion this is a very interesting definition because it shows the full picture of a BI solution, and not only the front-end tools that some definitions focus on. Having a BI solution in place implies following some theories when defining the process, such as a specific data model; applying methodologies that help you achieve efficiency during the implementation project and its later maintenance; defining the correct architecture, one that gives you an appropriate return on investment based on the benefit that you will obtain from the BI project; and finally choosing the set of technologies that meets your requirements, specifications, and budget. In Figure 1 you can see a diagram of the main BI components. Keep it in mind because it will help you understand the whole blog.
Figure 1. BI system components
In Figure 1 you can see that the main information source of the whole system is the ERP (although, as we will see, there can be multiple other sources). Next comes an ODS that contains a direct extraction from the ERP; it can be a separate database or just some tables inside our database, but the concept usually exists, because a direct extraction avoids overloading the source system. With our ETL tool we move information from the ODS to the Staging Area database, where we process it, and finally we insert it into the datawarehouse, which we access with our BI front-end tool. It is quite possible that we have just one database and that the distinction between ODS, Staging Area, and Datawarehouse is only the tables that we use or different schemas inside that database. Finally, we can have a MOLAP system that helps us prepare the budget for the next year. We will see each component in detail throughout this blog.
Over the course of this blog we will analyze some theories, starting in this introduction; we will talk about methodologies; we will see the full architecture of the system; and we will evaluate different technologies.
Other interesting concepts also help to define BI. One of them focuses on the advantage that a company can get from implementing this kind of system: you become more efficient in the administrative task of gathering information and can spend your time analyzing that information and drawing conclusions. It is also important to note that the information we manage can come from both internal and external sources, so that we can analyze how we are performing our own activities but also compare ourselves with our competitors, if they publish information, or analyze data from the target markets that we plan to enter.
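To make the flow described above more concrete, here is a minimal sketch in Python using the standard sqlite3 module. It assumes the single-database case mentioned in the text, where ODS, Staging Area, and data warehouse are just different tables. The table names (ods_sales, stg_sales, dwh_sales), the columns, and the sample ERP rows are all invented for illustration; a real project would use a proper ETL tool and a much richer model.

```python
import sqlite3

def run_pipeline(erp_rows):
    """Move rows ERP extract -> ODS -> Staging Area -> data warehouse."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE ods_sales (ticket_id TEXT, product TEXT, units TEXT, price TEXT);
        CREATE TABLE stg_sales (ticket_id INTEGER, product TEXT, units INTEGER, amount REAL);
        CREATE TABLE dwh_sales (product TEXT PRIMARY KEY, units INTEGER, amount REAL);
    """)
    # 1. ODS: a direct, untransformed copy of the ERP extraction.
    con.executemany("INSERT INTO ods_sales VALUES (?, ?, ?, ?)", erp_rows)
    # 2. Staging Area: cast types, normalize product names, compute amounts.
    con.execute("""
        INSERT INTO stg_sales
        SELECT CAST(ticket_id AS INTEGER),
               UPPER(TRIM(product)),
               CAST(units AS INTEGER),
               CAST(units AS INTEGER) * CAST(price AS REAL)
        FROM ods_sales
    """)
    # 3. Data warehouse: the aggregated table the front-end tool would query.
    con.execute("""
        INSERT INTO dwh_sales
        SELECT product, SUM(units), SUM(amount)
        FROM stg_sales
        GROUP BY product
    """)
    rows = con.execute(
        "SELECT product, units, amount FROM dwh_sales ORDER BY product"
    ).fetchall()
    con.close()
    return rows

# Hypothetical ERP extraction: everything arrives as text, with messy names.
erp_rows = [
    ("1", " apple ", "2", "0.5"),
    ("2", "Apple", "3", "0.5"),
    ("2", "pear", "1", "1.25"),
]
warehouse = run_pipeline(erp_rows)
```

The ODS table deliberately stores everything as text, exactly as extracted, so that all cleaning happens in the Staging Area step and the source system is read only once.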
Another interesting concept is the possibility of forecasting the future. We are not talking about witches or fortune tellers; we are referring to finding the patterns that allow us to anticipate what our sales will be if conditions remain the same. What happens if a new competitor enters our main market and takes 20% of our market share? What would be the result of increasing our sales force by 25%? Here, the key feature is the ability to detect which variables are correlated with each other and which are almost independent.
Interactivity is another aspect that can give you an idea of what BI is. It is really valuable that business analysts can investigate and navigate through the data to discover the hidden patterns that give visibility into the near future.
A final element that appears in different BI definitions is knowledge, which is the result of applying BI techniques to the large amounts of data stored in our databases. Simplifying it into a formula: Data + Analysis = Knowledge.
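As a small illustration of detecting which variables move together, the sketch below computes the Pearson correlation coefficient in plain Python. The three series (size of the sales force, monthly sales, and number of office plants) are made-up numbers chosen only to show one strongly correlated pair and one independent pair. Remember that correlation alone does not prove that one variable causes the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures, invented for this example.
sales_reps = [10, 12, 14, 16, 18]
monthly_sales = [100, 118, 143, 160, 181]
office_plants = [3, 1, 4, 1, 3]

r_sales = pearson(sales_reps, monthly_sales)   # close to 1: strongly correlated
r_plants = pearson(sales_reps, office_plants)  # close to 0: almost independent
```

A coefficient near 1 or -1 suggests a variable worth including in a forecast; a coefficient near 0 suggests it can probably be left out of the model.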
The concept of Business Intelligence as used in this blog was described by Howard Dresner in 1989. He described Business Intelligence as “concepts and methods to improve business decision making by using fact-based support systems.”
Since the late 1990s the term has come into general use, and it is possible to find innumerable references to BI in technical books and online articles.
The appearance of BI is directly related to the consolidation of transactional systems around the world. Before transactional systems were installed everywhere, computerized tools were mainly used for high-level analysis; the amount of information saved in the systems was small enough to be analyzed directly, without any extra tool. When transactional systems appeared in business scenarios, the amount of data to manage increased exponentially. Think about a retail company that used to have monthly information about the units purchased of a given product and the stock that remained in the shop; now it has information about every single ticket of every client, with the detailed products that they purchased. The company can find relationships among products; it can analyze payment methods; and if customers pay with cards, it can get their names and analyze how many times a given customer visits the shop, the type of products they buy, and much more. And this is only one example; you can translate it to your own business and understand why you need BI there.
Most of the recent references are related to specialized BI software, and many consultancy companies have dedicated BI teams and projects that only attend to development requirements on BI tools, treating the rest of the solution as auxiliary components of the BI tool itself. So if you are thinking about hiring consultancy support, be careful to ensure that their cost estimate for the project covers everything your request requires.
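The retail example above can be sketched in a few lines of Python. The ticket structure, customer names, and products below are invented for illustration; the idea is simply that ticket-level data lets us count which products are bought together and how often an identified customer visits the shop.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tickets; customer is None when the payment was anonymous (cash).
tickets = [
    {"customer": "Ana", "products": ["bread", "milk"]},
    {"customer": "Ben", "products": ["bread", "milk", "eggs"]},
    {"customer": "Ana", "products": ["milk", "eggs"]},
    {"customer": None, "products": ["bread"]},
]

def product_pairs(tickets):
    """Count how many tickets contain each pair of products."""
    pairs = Counter()
    for ticket in tickets:
        # sorted(set(...)) gives each pair one canonical order per ticket.
        for pair in combinations(sorted(set(ticket["products"])), 2):
            pairs[pair] += 1
    return pairs

def visits_per_customer(tickets):
    """Count tickets per identified customer (card payments only)."""
    return Counter(t["customer"] for t in tickets if t["customer"] is not None)

pair_counts = product_pairs(tickets)
visits = visits_per_customer(tickets)
```

With only monthly totals per product, neither of these two questions could be answered; that is the difference that transactional detail makes.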
From Strategic to Tactical
BI has also undergone changes in the scope of its projects, spreading across organizations from top management reports and dashboards to daily operational analysis. The benefits of BI have been proven to top managers, and they have noticed that BI offers multiple possibilities for their organizations to profit from BI tools by implementing BI projects from the bottom to the top of their companies. Over the years, implementations have moved from strategic deployments that assist top managers in the decisions they must make to steer the company correctly, to deployments that cover every level of the organization, including the lowest ones, helping employees make decisions such as which products to reorder for the warehouse or which best-selling color to dress the shop mannequin in. BI has moved from supporting strategic decisions to supporting tactical ones.
Initial BI implementations were known by different acronyms that reveal the nature of the target users in those early deployments. One of the first acronyms for this kind of system was DSS, or Decision Support System, which shows that the target audience was decision makers. Of course, every single person in a company makes decisions, but the most important ones are usually made by managers, leaders, and executives. Another interesting acronym that shows the same thing is EIS (Executive Information System), which in this case directly names the target users of the BI platform: the executives of the company.
Nowadays, the most important trend in BI relates to the Big Data and Data Lake concepts. Big Data is based on the possibility of using BI tools and analytical capabilities to extract information from the enormous amount of data generated every day by our employees, customers, and platform users on many different platforms, such as social networks, job networks, forums, blogs, mobile apps and resources, mobile devices, GPS information, etc. This data is saved in unstructured systems that you cannot tackle with the standard datawarehouse toolset, and this is due to the nature of a datawarehouse, as we will analyze in the next sections. A DWH is based on a structured database that contains homogeneous information, loaded into our system using ETL processes that ensure the integrity of the data through multiple checks and validations; these kinds of processes are too heavy to run against Big Data sources, because the processing power they would require is too high. In a Big Data system, accuracy is not as critical as in a DWH scenario. Using a Big Data system to analyze Facebook logs and missing a comment from a potential customer among 1 billion users is something that we can accept, but if you miss a row in your accounting system it will generate an accounting mismatch. Big Data can be used as a source for our BI system: it is an extra component in the infrastructure layer, and it won’t replace the sales, finance, or operations analysis that we may already have in place.
To support Big Data requirements, a new concept different from the DWH is required. This is where the Data Lake comes in. The idea behind this concept is that you don’t need to process all the data that you have available in order to create a structured data source for your BI system. Instead, you access your data source directly in order to fish out the information that you require (from fishing comes the idea of the lake; ingenious, right?).
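To illustrate why a missing row matters in a DWH load, here is a minimal sketch of a strict load with control totals. The function name, the row layout (account, amount), and the idea of passing an expected row count and expected total are assumptions made for this example, not a description of any specific ETL tool. Either the whole batch reconciles or the load is rejected.

```python
from decimal import Decimal

def load_accounting_rows(rows, expected_count, expected_total):
    """Load (account, amount) rows only if the whole batch reconciles."""
    loaded = []
    for line_no, (account, amount) in enumerate(rows, start=1):
        if not account:
            raise ValueError(f"row {line_no}: missing account")
        # Decimal avoids the rounding issues of binary floats in accounting.
        loaded.append((account, Decimal(amount)))
    if len(loaded) != expected_count:
        raise ValueError(f"expected {expected_count} rows, got {len(loaded)}")
    total = sum((amount for _, amount in loaded), Decimal("0"))
    if total != Decimal(expected_total):
        raise ValueError(f"control total mismatch: {total} != {expected_total}")
    return loaded

# Hypothetical batch: a debit and its matching credit, so the total is zero.
batch = [("4000", "100.10"), ("4010", "-100.10")]
accepted = load_accounting_rows(batch, expected_count=2, expected_total="0.00")
```

If one of the two rows were lost on the way, both the row count and the control total would fail, and nothing would reach the warehouse.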
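The fishing idea can be sketched as follows: the raw records stay in the lake exactly as they arrived, and structure is applied only when we read, to the one field we need. The event shapes and the fish function are invented for this example. Note that unreadable records are simply skipped, which reflects the lower accuracy requirement of Big Data analysis discussed earlier.

```python
import json

# Hypothetical raw events, stored untouched, one JSON document per line.
raw_events = [
    '{"source": "forum", "user": "u1", "text": "great product"}',
    '{"source": "gps", "lat": 41.4, "lon": 2.2}',
    'not valid json',
    '{"source": "app", "user": "u2", "text": "app crashes"}',
]

def fish(raw_lines, field):
    """Return the values of `field` from raw JSON lines, read on demand."""
    catch = []
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate garbage: losing one record is acceptable here
        if isinstance(record, dict) and field in record:
            catch.append(record[field])
    return catch

comments = fish(raw_events, "text")
latitudes = fish(raw_events, "lat")
```

No table was designed in advance and no load process ran before the query; the price is that every read has to deal with the mess itself.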
Internet of Things
The Internet of Things can also be considered a source for Big Data analysis, but I would like to discuss it in a separate section because of the possibilities that it can offer to BI projects. The Internet of Things relates to the incredible amount of information that could be extracted from the new electronic devices that are being embedded in all kinds of everyday objects. We now have cars with an Internet connection, fridges that can tell us what is missing inside, washing machines that can send us an SMS when they have finished, and cleaning robots that can be scheduled and launched from a smartphone. Imagine the amount of information that all of this could provide for analysis inside Big Data environments. The Internet of Things opens up vast possibilities for investigation and development, where the knowledge extracted from this information provides inputs that will be extremely interesting to analyze.