Tо begin to design an effective OLAP application, designers need a background in basic OLAP concepts, as well as knowledge of general OLAP design principles. Next, they need to learn the specifics of their selected OLAP product. Designers should know about the product's features, functionality, and restrictions, so that they can design an OLAP application that takes advantage of the strengths of the product while avoiding potential pitfalls. They should also understand the OLAP product's architecture and how that architecture fits into their existing infrastructure. Designers need to know about the capabilities of the product components that come out of the box—including the tools provided to make the process of designing and building an OLAP solution easier—as well as understand the front-end tools that may be available to support the analysis and reporting needs of business users.
Finally, the developer should work with end users to train them. Often, end users miss many nuances that truly make their lives easier in an attempt to "just get started." A training program should be implemented that starts with the basics, then allows users to get used to the system, then follows up with more advanced features that make tasks easier. Without this approach, end users often get frustrated and do not adopt the system. End-user adoption will be a critical component of the success or failure of the OLAP application.
In this article, we build on some of the OLAP concepts introduced in this blog, showing how they affect the design process. We begin our discussion with some general design principles that are applicable to the design of any OLAP application. We then look at specific design issues for Oracle OLAP and Oracle Essbase. We conclude with a review of product architectures, product components, and compatible Oracle products for Essbase and Oracle OLAP.
General Design Principles intoduction
To reduce the concept of OLAP application design methodology into a few pages is a challenging exercise. As with developing most of types of applications, there are as many ways to design and build an OLAP application as there are OLAP consultants. The purpose of this blog article is to share general principles of successful methodologies.
In this artice, we use the general term OLAP application. For Essbase, the application is made up of one or more Essbase multidimensional databases. For Oracle OLAP, the application is an analytic workspace with multiple cubes and multidimensional objects.
A good OLAP application design methodology considers the following general principles:
- Design is an iterative process, requiring multiple releases of the application.
- User requirements must drive design.
- What is omitted from each release is as important as what goes in.
- Dimension types offer convenience for both the designer and end users.
- Data types improve data quality.
- Different uses require different views of the data.
- User access and security need to be planned in advance.
Applying these principles to the design of either an Essbase database or an Oracle OLAP analytic workspace will go a long way to ensuring the success of your project. In the following pages, we describe each of these general principles in more detail.
Design Is an Iterative Process
One of the most effective approaches to OLAP design is an iterative approach, which seeks feedback from users and incorporates that feedback into the application. Figure 3-1
Figure 3-1. Designing and building an OLAP application is an iterative process.
shows a very simple diagram describing an iterative design and build process. While the diagram in Figure 3-1 was originally created with Essbase in mind, it applies just as well to Oracle OLAP and to OLAP design in general. Of course, the process may be more complex than this figure implies, requiring more than four steps.
There are three primary inputs for the design: the dimensions you identify and use, the source data you select, and the calculations you run. You need to consult with your user base while initially designing each input.
The output is a set of reports, which are used to validate the data and verify that the results match the expectations of business users. The results of the validation step feed back into the other steps of the process, allowing you to tweak your design and implementation to better suit your users.
With time and repetition, you will see incremental improvements to your OLAP application. This iterative process is a powerful way to ensure that your design actually does meet the needs of your users.
User Requirements Drive Design
We have said it before, and we will say it again throughout this book: OLAP applications are about the business user. While IT plays an important role in data management, business users—and their need for capabilities such as fast and easy reports, ad hoc analyses, and scenario modeling—will drive the implementation of an OLAP application. Ignoring this fundamental design principle will inevitably lead to dissatisfied users and the ultimate failure of the system to meet business needs.
With that in mind, the logical place to start the design process is with the tool most used by business users: reports. Long before you ever considered an OLAP system, users were creating reports. The reports might be in spreadsheets or some other reporting front end, but they exist. These reports will tell you a great deal about how to structure your OLAP model.
For example, consider the spreadsheet report shown in Figure 3-2. For the purposes of discovering the important design elements, the numbers are irrelevant. The labels and the delivery vehicle, however, tell us a great deal about the business. You just need to decode it.
When decoding a report, you need to channel your internal Sherlock Holmes. Looking at this report, we can deduce the following:
- There are six, possibly seven, required dimensions: Product (Colas), Region/ Market (East), Customer/Account (N.A. Strategic), Measures (row headers), Scenario (columns Actual, Budget, and Variance), and Time (may be two dimensions, see the next item for an explanation).
- For Time, we can see that the page header denotes fiscal years (FY10), and the column headers specify quarters (Qtr1 and Qtr2). This generally indicates that there is a cross-tab reporting requirement for time. Cross-tab
Figure 3-2. Design begins with reports.
means using a dimension on both the row and column at once. This may or may not need to be done in two dimensions, depending on the tool and the business environment.
- Variance calculations are required in the Scenario and Measures dimensions.
- The sample report is a Microsoft Excel spreadsheet, which may indicate a preference for Excel as a reporting vehicle.
- Report formatting might be a key concern in the delivery (see the previous item).
The point in reviewing existing reports is to start forming assumptions and questions for your end users. The end users will not tell you what dimensions they need. While your initial assumptions may not be correct, reviewing the elements the end users are currently using in their reports will be the best place to start your investigation for the design. You users will validate or invalidate your assumptions down the line.
What's Left Out Is as Important as What Goes In
What you leave out of an OLAP application is just as important as what you put into it. Creating a meaningful application is not about extensive dimensionality or volumes of data. It is not about tricky mathematics or an abundance of available metrics. At the end of the day, no matter how elegant a solution you create, the OLAP application will live and die on the perceptions of the users. Simplicity is often best. This advice does not mean that you cannot have a 30-dimension model.
In fact, we have seen a number of OLAP applications with many dimensions in production. It simply means that you should ensure that the dimensionality you build into your model is meaningful.
For example, consider the following dimensions used in an application whose purpose is to analyze profits: Time, Profit, Customers, Markets, and Weather. The dimensions that reference time, customers, and markets all make sense in respect to profit and in respect to each other. The dimension for weather seems a bit out of place. We are not arguing that weather can never have an impact on profits; depending on the products you sell, analyzing weather conditions may be important. Rather, we are suggesting that, in most cases, weather is not an indicator of profitability. If a dimension is irrelevant to the purpose for the database, omit it from the model. In this scenario, ask the end users to show how weather is being used in the current set of reports. If you hear "We think it would be nice to understand how weather impacts profits," gently push back on the request and suggest that it be added in the next release.
One best practice when building an OLAP application is to solve what is currently being manually analyzed by spreadsheets. After the first delivery of the application, it is very likely the end users will uncover other trends or hypothesize about other dimensions that may be relevant to the analysis. At that point, when the original problem is solved, you can begin a new iteration and consider extending the application to incorporate these other dimensions.
In addition, designers often add dimensions that really are simply extensions of other dimensions or not needed at all. For example, consider a model that has dimensions for Scenario, Project, Business Units, Time, and Accounts. On the surface, all of these dimensions seem to be relevant. However, you might discover that a given project is worked on by only one business unit. Figure 3-3 shows what a report might look like when Project and Unit dimensions are modeled separately in this situation. It can be difficult for users to find the data with this design. Practically speaking, if a project belongs to only one department, you could easily model Unit and Product as a single dimension, as shown in Figure 3-4.
|project 1||project 2||project 3||project 4||project 5||project 6|
Figure 3-3. Project and Unit as separate dimensions results in many null cells.
Figure 3-4. Project may be better modeled as a member of Unit.
You can often spot irrelevant dimensions by considering the purpose of the application—the metrics. If your application is tracking customer profitability, then a dimension with employee numbers may not be relevant.
As a last note about dimensionality, remember that values in an OLAP cube must always be represented as an intersection of a member from all base dimensions. If you put 20 dimensions in your OLAP model, either your user must think across 20 dimensions or you must create a good template-reporting approach to make it easy for the users to focus on the dimensions that are important to them. Again, a step-wise rollout of dimensions to support additional functionality over the multiple releases of the OLAP application can ease this pain.
Dimension Types Offer Convenience
Dimension types provide the OLAP engine with a wealth of information about the dimensions, including metadata and inferences about how to process the data connected to the dimension. For example, dimensions of the time type have a sequential order (January is always before February), values that can be aggregated (rolled up), and members that can be compared to other members within the dimension (Quarter 1 versus Quarter 2). If you create a dimension as a time dimension, you can specify things such as time period ranges (January 2010 to May 2010), year-to-date values (January 2010 through May 2010 summed up), and year-over-year comparisons (January 2010 versus January 2009), and the OLAP engine will know how to store and calculate the data.
When designing your OLAP application, you should be aware of the built-in dimension types offered by the OLAP product and plan to take advantage of the convenience they offer. This can save time for the developer as well as the end users, because OLAP systems have built-in capabilities for certain dimension types.
Data Types Improve Data Quality
Similar to dimension types, data types provide information about the data to the OLAP engine. Setting the appropriate data type for the data in your application can save space and execution time. Data types also help to ensure that only data with the appropriate data type is written back to the database. As with dimension types, you need to learn about the built-in data types offered by the OLAP product.
The most common data type in both Oracle OLAP and Oracle Essbase is the numeric type. Essbase stores numeric data in decimal format, occupying 8 bytes per cell of data. Oracle OLAP defaults to this data type, but can use any Oracle Database data type. Both products also offer a text data type, though with varying restrictions.
Different Uses Require Different Views of the Data
Hierarchical dimensions provide one way of looking at data. Business users often want to look at their data in other ways as well. For example, different analyses call for different rollup structures. In addition, users require an efficient means for selecting specific dimension members for analyses and for summarizing dimension members in different ways.
During the design process, be on the lookout for user requirements that might indicate the need for alternate views of the data. Look at how your business is organized for additional clues on how to define alternate views. For example, the sales team may want to look at the data by sales representative and sales managers, and the product team may want to look at the same data, but by the company's product lines.
In general, there are two methods for repurposing members: attributes and alternate hierarchies, which were discussed in this blog. Recall that an attribute is a tag assigned to a member. Its purpose is to aid an end user in finding that member without needing to navigate the dimensional hierarchies. Alternate hierarchies provide different ways to aggregate a dimension. Oracle OLAP and Oracle Essbase take a similar approach to user-defined attributes but differ in how they support alternate hierarchies. For more information, see the product-specific design sections later in this chapter.
User Access and Security Needs Planning
In OLAP applications, user access rights are often defined on specific dimensions. For example, the Eastern region manager may require access to only the data for the Eastern region and all of the customers in the Eastern region.
While access rights are often considered after the first prototype of a system is built, they should be considered carefully in the design process, as often the data that drives security policies (such as which users map to the Eastern region) is not available in the source data. This data may need to be created and added.
When you design an OLAP application, you should be aware of your organization's policies and infrastructure for user authentication and user roles, as well as the security features included with the OLAP product.
Allow Areas for Training and Testing
As mentioned at the beginning of this blog, training end users is an important part of an OLAP project. You may want to include areas in your application design that are specifically designed for training purposes. Often, these areas can also be used for testing purposes. By incorporating these into your design, you will ease the job of those that are training people and testing the application.