Online Analytical Processing, or OLAP, is a computer processing technology that allows rapid execution of complex analytical queries. It is an important part of business intelligence, providing powerful capabilities for data mining and trend analysis. Online analytical processing represents the data in a multidimensional structure, allowing the users to consolidate, detail and rotate it in order to see and analyze it from various perspectives. This multidimensional database structure allows processing ad hoc queries rapidly, which makes OLAP tools a popular choice for business reporting. As a part of BI processes, OLAP applications likewise allow users carry out holistic analysis, examine current trends, and make predictions in order to attain meaningful business insights.
In this article, you will find the answers to multiple questions about OLAP including the main ones:
- What is OLAP?
- How is OLAP used in business?
- What is OLAP analysis?
- Where and when to use OLAP?
- Who can use it?
The term OLAP, as opposed to OLTP, was coined by Edgar Codd in 1993 in his article in Computerworld magazine where he outlined the 12 rules for online analytical processing software. Although the article was later retracted by the magazine due to a potential conflict of interests, its influence on shaping up the online analytical technology is undeniable. In 1995, Nigel Pendse defied the FASMI (Fast Analysis of Shared Multidimensional Information) requirements for OLAP tools. The history of the technology itself dates back to 1970 when Information Resources released Express, the first OLAP server. It was bought by Oracle in 1995 and subsequently became the basis of Oracle online analytical processing, a multidimensional calculation engine that Oracle provides within its database. In 1992, another prominent online analytical processing product, Essbase, was released by Arbor Software (acquired by Oracle in 2007). Microsoft released its online analytical processing server, MS Analysis Services, in 1998. It contributed to the popularity of the technology and prompted the development of other OLAP products. Today it’s easy to name several world-renowned vendors offering OLAP applications including IBM, SAS, SAP, Essbase, Microsoft, Oracle, IcCube etc. Galaktikasoft prides itself on contributing to the development of the high-end BI software and introduces its solution Ranet OLAP to an already OLAP experienced market.
OLAP data cube is the cornerstone of the system. The data in the cube is organized using either a star or a snowflake schema, at the center of which there is a fact table that contains aggregations (measures) and is connected to a number of dimension tables containing information about the measures. Dimensions describe how the measures can be analyzed. If the cube contains more than three dimensions is it often referred to as a hypercube. One of the main features belonging to the OLAP cube is its static character which implies that the cube can’t be changed after being developed. Hence the process of cube build and data model setting is a crucial step on the way to appropriate data processing in the OLAP architecture.
As mentioned above, at the center of the cube structure there is a fact table containing aggregations. Using aggregations is the main reason why queries are processed a lot faster in OLAP tools as compared to OLTP. Aggregations are summaries of data that have been pre-calculated during the data processing. All members stored in the dimension tables define the queries that the cube can receive, and aggregations answer these queries before they are generated. In the cube, aggregations are stored in the cells whose coordinates are specified by particular dimensions. The number of aggregations the cube might contain depends on all possible combinations of dimension members. That is why a typical cube in an OLAP application example can contain an extremely big number of aggregations and precalculating all of them while building the cube doesn’t seem reasonable due to the storage space and build time. Therefore it’s more likely to precalculate only key aggregations which are allocated across the online analytical processing cube that will significantly reduce the time needed to define any aggregations while running the query in a given data model. There are also two options deal with aggregations so that to increase the performance of a ready cube: an opportunity cache aggregations and exploit usage-based aggregations after users’ queries analysis.
One of the main characteristics of online analytical processing is the multidimensional data structure. The cube can have multiple dimensions. The combination of dimension members defined in the query point to a specific cell that contains an aggregated value, allowing the system to process queries almost instantaneously. The multidimensional structure also allows viewing the data from multiple angles. Due to such a data model, the whole process of data mining constitutes simple for managers and executives without IT skills as the objects presented in the cells are real-world business entities readable for analysts. Furthermore, this data model enables users to handle not only structured data but also unstructured and semistructured datasets owing to metadata. All this makes OLAP databases especially popular for data analysis and BI applications.
One of the core components in OLAP concepts is its server. In the traditional three-tier data warehouse architecture, it occupies the middle tire, between the database server and the client-side OLAP tool. Besides aggregating and pre-computing the data from the relational database, it provides advanced calculation and write-back options, additional functions that extend the basic possibilities of the query languages, and other OLAP features. As we’ve already mentioned, the leading OLAP software vendors to date are Microsoft, Oracle, IBM, SAP, and others. Among those, there are plenty of opportunities to choose from while looking for the most suitable and rich OLAP solution. Depending on an OLAP application example picked by a user there are different data models available and various tools including real-time alert, a function to apply ‘what-if’ scenarios, KPI optimization and complex reporting.
There are several types of online analytical processing systems that differ in the way the data is stored, the way access to the data is organized, etc. A user decides upon the OLAP type according to the requirements of the organization and the needs of the employees who are going to adopt the application.
Relational OLAP, or ROLAP, stores all data, including aggregations, in relational databases rather than cubes, and doesn’t use pre-computation. Relational online analytical processing works with SQL tools that send queries directly to the relational database, but do it in a way that creates an appearance of OLAP operations such as slicing, dicing, and drill down. This approach is considered best suited for enterprises with large data warehouses as if offers better scalability and short load times and is good at processing text information. However, the relational database needs to be specifically designed (an OLTP database won’t do), SQL doesn’t support a lot of complex calculations, and it is often required to create aggregate tables to improve query execution. Overall, relational online analytical processing tools usually have lower performance than those based on MOLAP. Nevertheless, multiple companies choose to acquire the ROLAP application as they believe that this type of system will keep the existing relational tables from the data model and exploit them. However, it also reduces performance in comparison with multidimensional online analytical processing which in its turn would require reloading the existing data.
Multidimensional online analytical processing is the traditional OLAP model. All data, including aggregations, is stored in multidimensional data cubes. The data in the cubes is pre-computed, which ensures fast query performance. The multidimensional model makes it easier to locate cells in a cube than in a table, which, along with powerful OLAP techniques and calculation algorithms, allows retrieving and analyzing data significantly faster than when working with relational databases. At the same time, since the data in the cubes is pre-calculated and thus harder to update, such systems are not very scalable. Besides, transferring data from a relational database to a MOLAP engine can lead to data redundancy. Although multidimensional online analytical processing is deemed to be the most rapid when it comes to querying, extremely compact for datasets with low dimension, and efficient while data retrieval when viewed among all online analytical processing applications types offered.
Hybrid OLAP service combines features of the two previous approaches in order to provide fast query processing in combination with high scalability. In hybrid online analytical processing, the data is divided between a relational database and multidimensional cubes. There are two approaches as to how to divide the data:
- Horizontal partitioning implies that some data slices that require faster query processing are stored in multidimensional online analytical processing, and the others are stored in relational online analytical processing.
- In the vertical partitioning, aggregations are stored in MOLAP and detailed data – in ROLAP.
Major vendors now support this approach in one form or another.
Desktop OLAP implies that users can download a part of the cube on their computer and work with the data locally. The main advantage of such online analytical processing applications is that they are easy and cheap to deploy and maintain. The biggest drawback, however, is their very limited functionality. Given that this type of online analytical processing system is operating on the computer, the space for data storage is critically slow which won’t let a user deal with vast amounts of data probably necessary for full understanding of the ongoing business processes.
WOLAP includes web-based OLAP reporting tools that do not provide options for data download or local access to data. Such OLAP solutions have low deployment cost as all the client needs is Internet connection and a web browser, but, just as desktop OLAP, they have limited functionality. Besides, all major providers now offer web-only access to data as a feature of their much broader solutions.
The main feature of Real-Time OLAP (ROLAP) is that aggregations are not stored in relational tables or cubes, but are calculated in-memory during the query execution. The good news about this OLAP example is that it delivers a split-second data output as queries are run on small amounts of data. However, the system may seem not reliable and stable enough due to being solely web available.
Online analytical tools were created due to the inability of OLTP systems to process complex queries fast enough (OLTP stands for online transactional processing). Since then, the two technologies have been contrasted with each other constantly. Here is a short overview of the main differences between OLAP and OLTP tools.
- Queries. Online analytical processing works with a relatively small amount of very complicated read-only queries, while OLTP processes a large volume of simple queries of different kinds, such as insert, update, read, and delete queries.
- Users. Online analytical processing tools are mainly used by data analysts, managers and decision-makers; OLTP systems are used by database administrators and IT professionals.
- Data. Online analytical processing uses consolidated historical data organized according to a star schema or a snowflake schema. OLPT uses operational data stored in many tables. Online analytical processing data is rarely updated and the updating process takes a lot of time, while OLTP data is updated regularly and fast.
- Data representation. Multidimensional vs. flat two-dimensional.
- Database size. As OLAP databases have to store consolidated data, they are significantly larger than OLTP ones, containing millions of record and occupying up to 1 TB as opposed to thousands of records and the size of up to 1 GB of the OLTP databases.
- Purpose. The main purpose of online analytical processing is data analysis and business intelligence operations scaffolding planning, problem-solving and decision-making processes in the company, while OLTP is used for running and controlling ongoing business tasks.
To sum up, OLTP systems are intended to deal with data online so that to modify it, online analytical processing in its turn is aimed at data extracting to examine business performance and strategize. Both of the technologies handle large amounts of data though they serve different purposes: online analytical processing provides deep analysis of the trends while OLTP ensures the personalized approach to customers. Additionally OLTP application act as is a data supplier for online analytical processing.
The data in cubes is organized in dimensions that consist of hierarchies of multiple levels. Such data structure lays down the foundation for operations of OLAP tools that facilitate data analysis. The main OLAP operations include: roll up, drill down, slice, dice, and pivot.
The Drill Down operation allows users to step down the hierarchy, from the most aggregated values to the most detailed ones, for example, “year” to “month” to “day” in the Date dimension, “country” to “province” to “city” in the Geography dimension.
The Slice operation extracts a subcube from a cube by selecting a value for its dimension, for example, we can retrieve a “slice” from a cube showing the number of internet sales in all countries for all years by choosing “Austria” as the value of the country dimension.
The Dice operation is an “extension” of the slice operation as it allows users to extract a subcube by selecting values for several dimensions. Using the example above, we can perform the dicing operation by selecting “Austria” for the country dimension and “2017” for the Date dimension.
Pivot operation figuratively rotates the cube in order to provide a view of the online analytical processing data from an alternative perspective. This operation allows analyzing the data from different perspectives, making it one of the most important online analytical processing features for creating analytical reports. You can see the graphic representation of the main operations in the figure below. These are just the main OLAP operations. Other operations include drill through, scoping, screening, etc. Besides, the OLAP cube will keep all historical data which can be appealed to later. Through the operations, it’s possible to see the required information in a multidimensional, summarized form easy to understand and analyze
The first standard API, OLE DB for online analytical processing (ODBO), was released by Microsoft in 1997. Four years later, XMLA was announced as the standard for data access in online analytical processing tools. To date, they remain the industry standards that are accepted by most vendors. ODBO introduced MDX as the query language for multidimensional cubes. It was written specifically for business analysts and bears a slight resemblance to SQL. MDX is the only query language that is considered standard for XMLA and is used by most OLAP tools examples. Some systems, like Oracle, for example, implement OLAP in SQL language extensions but also support MDX. Due to the API in online analytical processing, there’s an opportunity to receive online analytical processing diagrams, graphs, chart pies etc. The following example displays a potential OLAP diagram view available in Ranet OLAP model API:
Unlike OLTP, online analytical processing supports just read-only queries, which means that there are no OLAP functions for data editing, but only for its selection – select. MDX syntax is complicated, but there are some basic concepts. MDX functions for cube members include .Children, .Parent, .Level, and .Hierarchy. They return the member’s child elements, parent element, hierarchy level and the hierarchy itself, respectively. .Members function can be applied to axes, OLAP hierarchies and hierarchy levels and returns all members. MDX allows you to select Sets (members from the same axis), Tuples (members from different axes), and Sets of tuples (using the Crossjoin function). Filtering in MDX is introduced using where. All cube elements (dimensions, hierarchies, levels, members, etc.) are written in square brackets, while sets and tuples are written in parentheses. MDX also provides functions for calculating custom members (both measures and attributes) that are not currently available in the cube.
As any online analytical processing model, Ranet OLAP helps deal with large amounts of business data. Therefore, corporations, companies, enterprises, holdings, and companies that face big data management turn to Galaktikasoft looking to optimize their business processes. Ranet OLAP is noted for the following features:
- This OLAP model is fast at extracting the requisite data from the warehouse. Commonly the information is available afterwards in the form of a pivot table or any other form of visualization required owing to Ranet OLAP Dashboard.
- The application example keeps a user aware of the business situation as it alerts about current information right when it emerges.
- It doesn’t require programming skills and training.
- This OLAP example discloses hidden facts and brings up significant insights.
- It’s applicable for any industry and company’s sphere.
Nowadays, the market offers a large number of online analytical processing data analysis tools that cater for users’ versatile business needs. There is no shortage in vendors, and customers have a lot of freedom choosing whether they want to work with MDX or SQL and what type of online analytical processing will work best with their system architecture (ROLAP, MOLAP, etc.). We can single out two main types of OLAP tools available on the market. The first type encompasses the server-side tools like the OLAP environment. They occupy the spot between the data warehouse and the client application and aggregate the data from the database into cubes. They also provide advanced calculation options, data access security, and other server-side functionality. The second type includes client-side tools that provide easy and comprehensible interfaces for business users, allowing them to work with data and create OLAP reports without any specific IT or business analysis skills. These tools usually provide such functionality as ad hoc report generation, drag-n-drop interface provided by Dynamic Pivot Table control particularly in Ranet OLAP, data visualization, working with KPIs, etc. Two of such tools are Ranet OLAP and Ranet Analytics.
Ranet tools provide end users with intuitive easy-to-use interfaces to work with their business data. Ranet OLAP tool is a library of powerful OLAP components for data analysis, reporting, KPI analysis, trend analysis, and data visualization. Ranet allows end users to connect instantly to their OLAP data source and create interactive reports by simply dragging and dropping elements from the cube to the report designer area. A built-in cell style designer allows creating custom formatting styles for web pivot table, making it easier to spot underlying data trends. Special extensions allow visualizing your online analytical processing report data on different types of pivot charts. For those users that know MDX syntax and have experience working with MDX parser, Ranet OLAP also provides an option to write custom MDX queries and create custom calculations, which helps users to extract and analyze data slices that are not readily available in the cubes. Ranet Analytics provides end users with interactive dashboards for visualizing the data in their ad hoc reports. It allows analyzing various aspects of their business and determining trends in their data, which helps to make more grounded and informed decisions about the business development. Besides, it also significantly improves the overall look and feel of the OLAP reporting software functionality. Both Ranet OLAP and Ranet Analytics can be easily embedded in the users’ applications and customized to their visual style.
Like any other technology, online analytical processing has its advantages and disadvantages that need to be thoroughly considered before database and data warehouse design and implementation.
The most important advantage of online analytical processing tools is their speed. The main reason for developing it was the inability of other systems to process queries fast enough. Due to aggregation and pre-computation of data, online analytical processing can execute complex queries in just a tiny fraction of time in which the same query will be processed in OLTP. Another considerable advantage is advanced calculating and data analysis options. Online analytical processing is heavily used for data analysis and BI purposes, which incentivizes vendors to improve the computational power of their tools constantly. MDX is a powerful instrument that offers numerous options for data extraction and analytics. Data representation is another important advantage of executing OLAP analytics. Not only is multidimensional data structure convenient for the human brain to visualize and comprehend, but also it offers an opportunity to rotate the data and analyze it from different perspectives. The cube structure provides an option to not only analyze actual data but also create what-if scenarios for business modeling using write-back partitions of the cube. Besides, it has a relatively flat learning curve. Created for business analysts rather than engineers, it allows end users to operate easily comprehensible business terms, and the help of database and IT specialists is rarely required for designing and creating OLAP reports.
When designed and used appropriately, online analytical processing has hardly any disadvantages. However, this technology has a number of limitations that prevent it from being used in certain situations. One of the biggest factors that can become a disadvantage is that it requires a lot of storage space. In most OLAP systems, the data is aggregated and pre-computed before storing in the cubes, which leads to steep increase in the amount of data and can potentially result in data redundancy. In order to avoid it, a lot of online analytical processing tools use powerful data compression algorithms. Besides, some systems don’t aggregate the data at all or aggregate just parts of it. However, depending on the database size, data warehouse architecture, the purpose of the system and other factors, these steps are not always effective. Another disadvantage is that online analytical processing can’t process a large number of queries. It was designed to process a small number of queries that are more complex in nature and will perform poorly when required to execute multiple queries. One more disadvantage is that the data in the cubes cannot be updated often. Again, as the data in the cubes is aggregated and pre-computed, the update process will take a considerable amount of time. OLAP was designed to work with a large amount of historical data that is not supposed to be frequently updated, and if you need to process a lot of current data it will not perform well.
As we have already mentioned above, the main asset of online analytical processing is its powerful data analysis capabilities. It determines extensive use of OLAP in business intelligence and reporting applications. Standard examples of online analytical processing applications include marketing analysis and planning, sales analytics, business modeling and forecasting, financial reporting and other spheres. Online analytical processing tools are widely used in data mining, providing insights into the company’s data trends and delivering information to the management used in the company's’ problem solving and decision-making processes. An application example such as Ranet OLAP enables its users to stay aware of an ongoing situation in the market and take a pivotal decision right away when it’s necessary.