The previous my article focused on designing OLAP applications. In this article, we walk you through the steps required to build an Oracle OLAP analytic workspace. Recall that an Oracle OLAP analytic workspace contains a collection of dimensions and a collection of cubes, where any given cube contains only the dimensions required to describe the measures in that cube. The analytic workspace also holds other multidimensional objects, such as folders and programs, that are required for an OLAP analysis.
The article starts with an overview of the components and files used in the demonstration. During the demonstration, we walk you through how to prepare data, create an analytic workspace, create and populate dimensions, and create and populate cubes.
Oracle OLAP Demonstration Overview
Our goal for this demonstration is to provide an overall workflow and some best practices to get you started with Oracle OLAP. You can use the same workflow to build both simple and complex OLAP models. This walk-through is for demonstration purposes only. Before building an analytic workspace for a production environment, see the Oracle OLAP User's Guide and the Oracle OLAP DML Reference.
In this demonstration, we create an analytic workspace using a four-step process:
- Prepare the data.
- Create an analytic workspace.
- Create and populate dimensions.
- Create and populate cubes.
After we show how to create a cube with dimensions and some simple calculations, we discuss adding more advanced calculations and operations to dimensions and cubes.
Unless otherwise specified, the instructions and screen captures in this article reflect Oracle OLAP tig and Oracle Database tig running in a UNIX- based environment. For those using Oracle OLAP 10g, we point out where version tig is different from version 10g. For instructions for other operating systems, see the Oracle OLAP documentation set.
In the architecture diagrams for Oracle OLAP, we introduced the key components involved in the building process. These components, circled on Figure 4-1, are OWB and AWM.
OWB is a general-purpose warehouse construction and extract, transform, and load (ETL) tool that has the ability to build and maintain OLAP objects, as well as all
Figure 4-1. You can build analytic workspaces using the administration tools.
the other warehouse components. AWM focuses solely on the construction and maintenance of OLAP objects. Both tools are written in Java and use the Java API for Oracle OLAP to perform a bulk of the work. For this article's example, we use AWM. If you want to use OWB, you will find the processes and concepts to be similar, although the user interface is different.
The examples and screen captures used in this article are based on the OLAPTRAIN OLAP demo data. You can download the sample data from the Oracle Press web site.
From Source to Cubes with Analytic Workspace Manager
In this section, we describe how to use AWM to build an analytic workspace and the objects contained within. After an overview of AWM, we walk you through the four steps required to build and populate OLAP cubes, as noted in the previous section.
Getting Started with Analytic Workspace Manager
The first time you start AWM, you are asked to create a database connection. AWM uses either SQL*Net or JDBC to connect to an Oracle database. You will need to know the host, port, and service ID (SID) for the database or the TNS Alias created to connect to the database. Your Oracle Database Administrator can supply this information. After you connect to the database, you are prompted to log in to the database. If you like, you can configure your AWM environment at this time. The following sections describe the steps to get started with AWM.
Creating a Database Connection and Logging in
- To launch AWM, do one of the following:
- On Windows-based systems, click Start I All Programs I Oracle - OraClient Homel I Integrated Management Tools - OLAP Analytic Workspace Manager.
- On UNIX-based systems, open a prompt and run the following shell script:
- If this is the first time you have opened AWM, you are prompted to define a database connection.
- Enter a description to identify the server.
- If you want to use SQL*Net, specify the TNS Alias for the server. Otherwise, specify the host, port, and SID, separated by colons. Then click Create.
- The following image is for illustrative purposes only; do not copy these values.
- Right-click the connection and select Connect Database.
- When prompted, specify the user name and password that you use to connect to the database. For this example, we are using the OLAPTRAIN schema with a password of oracle.
- After you are connected, you will be presented with a list of schemas to which you have access. Select the schema that will contain the new analytic workspace. Expand a schema tree to see what, if any, analytic workspaces exist in those schemas.
Configuring Analytic Workspace Manager
You can configure AWM to suit your environment and preferences. The configuration settings affect the way that AWM behaves. To set your configuration preferences, select Tools | Configuration. We recommend setting the following options:
- Template directory Specify where you want template files stored.
- Enable plugins Select this option to install plug-ins automaticall.
You can specify other options as desired. For descriptions of the options, click Help.
If you are building 10g mode AWs with either Oracle Database 10g or Oracle Database 11 g in 10g mode, and you want to use SQL to access cubes, you need to obtain and install the plug-in that enables the generation of SQL views. You can download the plug-in from the Oracle Technology Network (OTN) web site under Oracle OLAP. In Oracle OLAP 11g connected in 11g mode, SQL views are generated automatically.
Setting Read-Only Access on Analytic Workspaces
Before we get started building the analytic workspace, here is one more helpful tip. Normally, when you attach an analytic workspace, it comes up read-write. This behavior can be changed by modifying the awm.properties file. By adding a setting, you can have AWM prompt you for how to open the analytic workspace. This will give you the option to attach the workspace as read-only, which is useful when you want to explore an analytic workspace without the danger of inadvertently modifying it.
To make this modification, follow these steps:
- Go to the directory where the AWM program is started, usually Oracle Home/olap/awm.
- Open the awm.properties file.
- Insert the following command in the list of settings:
Your file should look like similar to this:
awm.object.display= _aw.model.show.attach=y olap_dml_log.log_results=y awm.hide.map.graphical=n awm.plugin_dirname=/
- Save and close the file.
The next time you attach an analytic workspace, you will be prompted to choose how you want to open it: Read Only, Read Write, or Read Write Exclusive. If you want to save your work, do not select Read Only.
Preparing the Data
Before we jump into building the OLAP analytic workspace, we need to start with some data. In this demonstration, we are working with a relational fact table that contains daily sales data for a computer sales company. The fact table has four lookup tables, as shown in Figure 4-2.
In Figure 4-2, the All level is represented explicitly in the source data. This single value represents the top of the hierarchy and it is repeated in every data row. In Oracle OLAP 11 g, you do not need to include it and can enjoy some space savings by leaving it out. Version 11 g also provides the ability to enter equations in the mapping panel, so that you can calculate source values on the fly from other data without physically storing it. This gives a mechanism for handling missing values. Many reporting tools, including OBIEE, require an All level for user-defined dimensions, so it is a good idea to include this level where possible.
Oracle OLAP uses Oracle tables, views, and materialized views as the primary data sources. These data sources can be represented as a star schema (as in Figure 4-2) or a snowflake schema (as in Figure 4-3), as parent-child relations, or as a collection of tables and views. You can also use flat files as data sources, however, the flat files need to be represented by external tables or loaded into tables via an ETL process. It is possible to use gateways to non-Oracle data as well.
FIGURE 4-2. Demonstration data in a star schema
If a dimension has skip-level or ragged hierarchies, consider using a snowflake schema to represent the data source. AWM automatically handles the relationships between levels in a snowflake schema.
FIGURE 4-3. Demonstration data in snowflake schema
When preparing the dimension table or view, you need to ensure that each member has a key value and description. If you are using a star schema, the dimension sources must have the full parentage in each row. With a snowflake schema, where each level of a dimension is in a separate source table or view, you need to provide an additional column for the parent key of that level. Here is an example of a star schema table for a CHANNEL dimension:
create table channel (all_channels_key number,
class_key number, class_name varchar2(20),
channel name varchar2(20) )
In addition, the following is an example of the snowflake schema tables for a CHANNEL dimension:
create table all_channels
all channel name varchar2(20))
create table channel_class
foreign key ("ALL_CHANNELS_KEY")
("ALL_CHANNELS_KEY") enable ) create table channel
foreign key ("CLASS_KEY")
("CLASS_KEY") enable )
As you can see, the snowflake tables represent the same data as the star table. The same data is spread across three tables linked with parent keys. To support skip and ragged levels, the parent keys can be any parent key above the child level in any table.
Lastly, a parent-child table represents a value-based hierarchy with columns for the member key, parent key, and member description.
Creating an Analytic Workspace
Now we are ready to create an analytic workspace that will hold our OLAP dimensions and cubes. Before we get started, let's review some best practices for creating workspaces and naming the metadata used within the workspace.
Following the advice in this section will help you to create cubes that are easy to understand and use. The naming conventions that we suggest ensure that the generated column names in your views are easy to read, and reduce the chances that generated column names will be truncated to fit within the limits for a column name in the Oracle Database. These naming conventions also make it easier and quicker to map to the columns in AWM, because the screens are less cluttered with long object names.
The following guidelines have proven effective in many Oracle OLAP implementations:
- If possible, create the analytic workspace in its own schema. This helps with security and backing up the analytic workspaces.
- Store the workspaces in a separate tablespace from the relational data. In most cases, you should turn off logging during maintenance to reduce the size of the IO and redo buffers.
- Create simple, meaningful names and avoid the use of the underscore character (_), especially when naming dimensions, hierarchies, levels, and attributes.
- Keep names as short as possible for dimensions, levels, hierarchies, and attributes. In Oracle OLAP 11g, the dimension level creation tool limits the length of level names to 30 characters.
- Di mension names must be unique within an analytic workspace, but level and attribute names do not need to be unique. You can use the same level name, such as TOTAL or ALL, in more than one dimension. The same is true for attributes.
- Do not put the dimension name in level or attribute names.
CLASSis fine to represent
- Do not be too cryptic in your naming standard (for example, do not use just the first two characters of the dimension names in the cube as the name of the cube). This may make sense initially, but it can be very hard to read and find the objects later
|Object Type||Instead Of||Use||Length||Length|
|Cubes||TI_PR_GE_CH or SALES_CUBE||SALES||6-12||<20|
TABLE 4-1. Suggested Naming Conventions
- Do not try to make names too descriptive (for example, avoid putting "DIM" in the name of everything that is a dimension or "ATT" into the name of
all the attributes). Adding prefixes makes names too long and harder to read. The database knows which objects are which, and you can query the analytic workspace if needed. For example, to find the names of all the dimensions in an analytic workspace, you can use a simple SQL command like the following:
select * from all_cube_dimensions
- Avoid using plurals in the names of objects where appropriate. For example, name a dimension PRODUCT instead of PRODUCTS.
Table 4-1 suggests some possible names for common object types and provides guidelines for the number of characters to use for each type of object.
To illustrate the importance of naming your objects appropriately, Figure 4-4 shows two implementations of the same CUSTOMER dimension. The dimension on the right side implements the preceding best practice guidelines.
Figure 4-5 shows the same dimensions in a different view. Notice that in this view, some column names have been truncated due to the maximum length of column names in the database. For the poor names, the truncation makes things even more unreadable.
Here is an example of a SELECT statement to retrieve data from the poor naming convention:
select all_company_customers_di,sales_districts_long_des,sales_managers_long_desc/ customer_sales_represent,sales_customers_long_des from company_customers_dim_mai_view
In addition, the following is an example of a SELECT statement using the recommending naming conventions:
select all_customers_long_descr,district_long_descriptio,managers_long_description, sales_rep_long_descripti,customer_long_descriptio from customer standard view
As you can see, the shorter names make for easier SELECT statements, which are also easier to read and may actually run faster.
Long and short description attributes are created automatically. You can change them if shorter names are desired.
Creating the New Analytic Workspace
Analytic workspaces can become very big, and the maintenance of the analytic workspaces is very disk-intensive. If your analytic workspace will be large, we recommend that you create a separate tablespace to store the analytic workspace. Depending on the update frequency and the size of each update, it may be appropriate to disable logging on this tablespace.
Figure 4-4. Names are easier to read when you follow best practices
Figure 4-5. Truncation can cause readability issues.
To create an analytic workspace:
- In AWM, right-click Analytic Workspaces and select Create Analytic Workspace
2. Specify a name for the analytic workspace. For this example, enter SALESTRACK.
3. Select the tablespace where you want to store the analytic workspace. For this example, use the default tablespace.
4. Click Create. The analytic workspace is created and displayed in the main AWM window.
The next step is to create the dimensions for the cubes.
Creating and Populating Dimensions
You can create your dimensions in any order. AWM will list them in alphabetical order once they are created. Remember that dimensions are created once in the analytic workspace and are reused in the cubes. Dimensions require a default hierarchy and may contain more than one hierarchy. You need to create all dimensions for a cube before you can create the cube itself. The process consists of the following steps:
- Create the dimensions.
- Optionally, define summary levels (not required for value-based hierarchies).
- Organize dimension levels into hierarchies.
- Map to a data source.
- Load the dimensions.
Oracle OLAP has two basic types of dimensions: user-defined and time. User-defined dimensions represent a majority of the dimensions used by the cubes. Time dimensions are specialized dimensions that have additional characteristics that allow for time-series analysis. If the application or the cube does not require time-series analysis, you do not need to create a time dimension.
|Incoming Member Key||Level||Assigned Key|
|ALL_CHANNELS||All channels||All channels All channels|
TABLE 4-2. Assigned Keys When Generate Surrogate Keys Is Selected
The two types of hierarchies are level-based and value-based. Level-based hierarchies, the default, require at least one level. Value- based hierarchies do not require levels; you simply give the hierarchy a name and select the Value Based Hierarchy option in AWM.
Oracle OLAP requires that all keys for a dimension be unique across all levels and hierarchies within that dimension. They do not need to be unique across all dimensions; for example, you can have a customer 1001 and a product 1001. If they are not unique within a dimension, the data loader will not load the dimension members during the maintenance process.
When you create a dimension, you specify how you want Oracle OLAP to handle duplicate members by selecting either the Generate Surrogate Keys option or the Use Keys from Data Source option. As a safety measure, AWM defaults to Generate Surrogate Keys, which means that Oracle OLAP adds the level name as a prefix to all incoming member names to ensure uniqueness across levels. For example, Table 4-2 shows the assigned keys for the members in the CHANNEL dimension.
Surrogate keys are especially helpful if you are not sure that your members are unique across levels. As you can see, there were nonunique NA members in the preceding example. Prefixes create large key values, which can increase the size of the analytic workspace. However, the increase in size due to surrogate keys is very slight, because the dimension members are only stored once (unlike with tables, where the impact on storage is larger because the key repeats for each row in the table). Surrogate keys can also make querying the data via the OLAP command language and possibly SQL map more complex, because you may need to know the level name.
Keys must be unique within a level. For example, you are not allowed to have a city with the key KC that is a child of Missouri if there is a separate city with the key KC that is a child of Kansas. In this case, surrogate keys do not help you, because the key for each city KC would be assigned the key CITY_KC; Oracle OLAP would not be able to differentiate these when loading the data.
Now let's build the dimensions. In this demonstration, we are creating the four dimensions shown in our original star schema (shown in Figure 4-2): CHANNEL, PRODUCT, GEOGRAPHY, and TIME.
To build dimensions:
- Right-click Dimension and select Create Dimension. The Create Dimension dialog box has four tabs where you can enter information for a dimension. However, you do not need to enter information on all the tabs at the time the dimension is first created.
- On the General tab, specify the name of the dimension. In this case, enter CHANNEL. As you type, the Short Label, Long Label, and Description fields are filled in automatically. The name is created in uppercase characters.
The name cannot contain spaces or special characters. Underscores are permitted, though it is a best practice to avoid them where possible. If you use an underscore in the Name field, the underscore is converted to a space in the label and description fields. If desired, edit the labels and description for the dimension.
In Oracle OLAP 10g, the Levels and Materialized Views tabs are not part of the Create Dimension dialog box, and this information is entered separately.
- On the Levels tab, enter ALL_CHANNELS, CLASS, and CHANNEL. If desired, edit the labels and description for the level.
While not required, you can avoid confusion later by entering the levels in a top-down order that reflects the parent-child relationships between members, as shown in the following illustration
- On the Implementation Details tab, if you are sure that the data source contains no duplicate member names (keys), select Use Keys from Data Source. The data for this demonstration uses unique keys.
- Click Create to generate the CHANNEL dimension.
- 6.Repeat this process to create the following dimensions and levels:
|Dimension Name||Level Names|
For this demonstration, we create four level-based hierarchies: one for each of the dimensions. While there is no functional limit to the number of hierarchies in a dimension, you can have only one default hierarchy in each dimension. You can use any name you want for hierarchies, but it is a good idea to establish a naming convention for the default hierarchies, such as STANDARD or DEFAULT.
By default, when you create a new hierarchy, it is a level-based hierarchy. It is also set as the default hierarchy. If you have more than one hierarchy in a dimension, you need to choose one of them to be the default hierarchy.
If you did not create levels in the previous procedure (for example, you are
using Oracle OLAP 10g) or if you want to add new levels, create the levels now.
- For the CHANNEL dimension, right-click Levels and select Create Level.
- Specify the name for the level and fill in short label, long label, and description. Then click Create.
- Repeat this procedure to add the rest of the levels for this dimension and the rest of the dimensions.
You are now ready to create the default dimension hierarchies.
- For the CHANNEL dimension, right-click Hierarchy and select Create Hierarchy.
- Enter the name of the hierarchy. For this example, you may want to use the default name: STANDARD. If desired, change the autogenerated labels and description.
- Select the levels to include in the hierarchy from the Available Levels list and add them to the Selected Levels list. If necessary, change the order of the levels in the Selected Levels list so that the list reflects the desired top- down hierarchy order (and by extension, the desired drill path).
If you created the levels in a top-down order, you will not need to change the order here.
- Click Create to create the default hierarchy for the CHANNEL dimension.
- Repeat this procedure to create default hierarchies for the other dimensions,
|Dimension Name||Default Hierarchy Name||Level Names/Order|
Attributes are used in two ways: to display as dimension member labels, and to use in calculations and when selecting data. For user-defined dimensions, AWM automatically creates the attributes LONG_DESCRIPTION and SHORT_DESCRIPTION. These attributes represent the dimension member labels. These labels can be used by the front end to display a long or short descriptive label for the dimension. The name of the attribute can be changed, but the attribute type should not be changed. You can also create user-defined attributes.
In this example, we create a user-defined attribute called TYPE that can be used in calculations and when selecting data. We define the name, labels, and description, but Oracle OLAP sets the attribute type.
To create the TYPE attribute:
- Right-click Attributes and select Create Attribute.
- Specify a name for the attribute. For this example, enter TYPE. The labels and description are filled in automatically. If desired, modify the labels and description. By default, the new attribute will be included in SQL views (not available in Oracle OLAP 10g). In addition, an index is created to improve the performance of attribute-based queries. You can choose not to create an index by deselecting the Index option.
- Associate the attribute with levels by selecting the levels in the Apply Attributes To area. For this example, select CHANNEL.
The data source will need an attribute value for each level selected here.
- Click Create to create the TYPE attribute.
Repeat this procedure to create additional attributes as desired.
As we described previously, we have source data for the dimension contained in either a table or a view. We must now tell AWM how to relate this source data to the dimension levels. This is done through a process called mapping.
There are three basic steps to mapping dimensions:
- Choose the mapping type.
- Choose the data source.
- Map the source columns to the dimension members and attributes.
The mapping tool supports mapping standard star schemas, snowflake schemas, and source collections. This section covers the two basic types of mappings: star schema and snowflake schema.
Mapping Dimensions Using a Star Schema In the following procedure, we map the demonstration data to dimension members and attributes using a star schema. After the procedure, we discuss the differences that would occur if we selected a snowflake schema instead.
To map to a star schema:
- Click Mappings to display the mapping panel. The window shows a list of available schemas, and the mapping panel displays the dimensions to be mapped. By default, Star Schema is selected as the type of the dimension table, and Oracle OLAP 11g displays information in a table mapping view. Version 10g defaults to the graphical mapping view. You can switch between views in either version. In 11 g, if you use expressions in the mapping, you will not be able to go back to the graphical mapping. The following illustration shows the graphical mapping view in version 11 g.
- In the schema list, find the table or view, and either drag it over to the mapping panel or double-click it. For this example, double-click the CHANNELS table.
- If desired, you can now hide the schema list so there is more room to map to the dimension by shrinking the schema pane or clicking on the shrink icon on the divider.
- Map table columns to dimension members by dragging and dropping the column name to the appropriate dimensional object. Key columns are mapped to level members. Name columns are mapped to the description attributes.
If there is only one description column for each level in the source, map the column to both attributes. It is important that all levels and attributes be mapped, especially if you plan to use cube-organized materialized views, as discussed in the "Working with Cube-Organized Materialized Views" section later in this blog.
For this example, create the following mappings:
The following illustration shows what the resulting mapping looks like in the graphical view.
You may have noticed that we did not map columns to the All Channels level. In Oracle OLAP 11 g, we do not need to have these columns because it has the same value for every row. You can choose to add custom values, as shown in the following steps.
- To add values, switch to the table mapping view by clicking the Table Map button in the toolbar.
- Under ALL_CHANNELS, specify 'ALL_CHANNELS' to the right of Member. The long and short descriptions are filled in automatically. Click Apply to add the value.
Here, we assigned a literal string as a value for the dimension member. However, you can also concatenate columns (nameNtitle), apply functions (upper(name)), or create more complex SQL expressions. Also, we specified only text in this example, but you can also specify equations. We will explore the use of equations in later examples.
In Oracle OLAP 10g, it is not possible to enter text or equations here; the source table or view must contain these values. You may need to create and populate the necessary columns in the source or define a view that contains the necessary columns.
Understanding the Differences for a Snowflake Schema Snowflake schema mapping is used when the relational schema is a snowflake or when the data has ragged or skip-level hierarchies. The information required for a value-based hierarchy is the same as for a snowflake mapping. The source must contain columns for a parent, child, and description.
To change to a snowflake schema, select Snowflake Schema from the Type of Dimension Table(s) drop-down list. In Figure 4-6, there are two hierarchies: Region
Figure 4-6. GEOGRAPHY dimension mapped as a snowflake schema
and Customer. The Zip level (above Customer) is a skip-level hierarchy because not all customers have a ZIP code. When a ZIP code is not provided for a member, the parent key for that member is the State key. This ensures that all the customers will roll up to State and above. When drilling down on a State, both ZIP codes and customers that are children of that state will be shown.
Notice that every level in the mapping has a different source (in this case, views) specified with parent, member, and name fields. If a dimension has more than one hierarchy, the sources can have additional parent keys. For example, GEOG_ REGION_V has two keys—ALL_REGIONS_ID and ALL_CUSTOMERS_ID—to reflect its inclusion in the Region and Customer hierarchies, respectively. You will also notice that there are sources for the All levels. This is required for Oracle OLAP 10g, but in version 11g, they can be replaced with a text literal similar to the one we used for the CHANNEL dimension.
In the case of a value-based hierarchy, the mapping screen requires only the parent and member keys.
Understanding Time Dimensions
A time dimension is a specialized dimension that is similar to a user-defined dimension, except that the time dimension must be level-based, and you need to specify two additional attributes for each level:
- END_DATE, which represents the end date of a period and must be of type date
- TIMESPAN, which is a number representing the number of days in the period
Time dimensions generally have multiple hierarchies to handle items such as calendar year and fiscal year. In addition, if weeks and months are modeled, you need two separate hierarchies if weeks can cross months. It is very simple to support a 4-5-4 calendar, where weeks easily fit within months and quarters. Figure 4-7 shows the additional attributes.
If you never want to be able to combine all years together (display the sum of all years), you do not need to have an All level. In some reporting tools, such as OBIEE, the All level is desired even for time.
You will need to provide the attributes for all of the levels. If these fields are not populated, or populated incorrectly, errors will result when performing a time-series analysis. If the required data is not present in the time sources, you can specify equations for the missing information. For example, in Figure 4-8, we specify an equation for the END_DATE attribute.
In Oracle OLAP 10g, the easiest way to provide data not present in the time table is by using a view.
FIGURE 4-7. Time dimension with four attributes
FIGURE 4-8. A fully mapped time dimension with an equation for END_DATE
As you can see, the values for the ALL_YEARS attributes were created manually using constants or function calls. This All level is occasionally required by some front-end tools, like OBIEE, where a total level with only one member must be provided. The All level allows for the proper reporting of summary data when a time dimension member is not part of the query. An All level is not always required and can be added later if required.
After the dimensions have been defined, the next step is to populate the dimensions with data. You will need to set a synchronization policy, to control whether and how to synchronize the dimension with the source. If synchronization is set to off,
Oracle OLAP retains members that are specified in the dimension, but not in the source. When synchronization is on, Oracle OLAP adds and deletes members from the dimensions to match the source.
You can choose from a variety of synchronization refresh methods. Dimensions usually use the Complete refresh method, which means that all data is cleared, and then loaded and aggregated from the source tables. The other refresh methods are used with cubes. For more information, see the "Populating Cubes" section later in this article.
To populate the dimensions:
- Right-click the dimension and select Maintain. For this example, select CHANNEL.
- The Maintenance Wizard starts. This wizard is used to populate and maintain dimensions and cubes. The selected dimension appears in the Selected Target Objects area on the right. You can add more dimensions as needed from Available Target Objects list. With the desired dimensions selected, click Next.
3. Specify the synchronization option and the refresh method. For this example, choose Yes and Complete. Then click Next.
- Set the processing options as follows, and then click Next to continue.
- Atomic Refresh If you want all updates to the dimensions (and cubes) committed to the database as a single transaction rather than individual transactions, select Atomic Refresh. For this example, select this option.
- Refresh After Errors If you want Oracle OLAP to stop processing individual transactions when an error occurs, select Refresh After Errors. For this example, deselect this option.
If Refresh After Errors is selected, the Atomic Refresh option should be deselected so that if an error occurs, processing is stopped.
- Set when and how to update the dimension:
- Run maintenance task immediately in this session If the maintenance process runs quickly and does not need multiple processors, select this option. For this example, deselect this option.
- Submit the maintenance task to the Oracle Job Queue If the process requires multiple processors or runs a long time, you need to submit as a batch job. Select this option and choose whether to run it immediately or at a specified date and time. For this example, select this option and select Run immediately.
- Maximum number of parallel processes If you are maintaining a cube that has been partitioned, you can also set the number of parallel processors to use. Only partitioned cubes can be run using the parallel processing capabilities of the database. Dimensions and nonpartitioned cubes are always maintained using one processor, regardless of the number of processors selected. This is discussed further when we build cubes.
- Save maintenance task to script If you want to save this maintenance process as a SQL script that can be run from a PL/SQL process or from an external batch process such as cron, select this option and specify (or browse to) a file. For this example, select this option and specify a file name.
- Click Finish to populate the CHANNEL dimension. Because we chose to run the process immediately, the process begins and the dimension is populated with data. When the process ends, the Build Log dialog box is displayed. It shows the run statistics and can show the number of members processed during the run. The build log is beneficial for discovering errors and tracking how long it takes to load the rows from dimension and fact tables.
- Repeat this procedure to populate the PRODUCT, GEOGRAPHY, and TIME dimensions.
This completes the dimension creation portion of the demonstration. However, there are two related topics—templates and security—that we will cover here before moving on to creating a cube.
Saving Dimensions as XML Templates
You can save the dimensions as XML templates. Templates serve as a backup of the workspace definitions and make it easy to move objects from one schema or database to another. When you select the Save as Template option, you can select a location and specify a file name for the XML file, as shown in Figure 4-9. You can also save analytic workspaces, cubes, and even calculated measures as templates using the same dialog box. The XML file stored on the local file system contains only the structure, and not the data for the OLAP object.
Figure 4-9. Saving the CHANNEL dimension as a template
Setting Security on Dimensions
In Oracle OLAP 11 g, you can control security of OLAP objects. In the case of dimensions, you can grant the ability to select, update, delete, and alter a dimension to a role or database user using the Set Object Security dialog box, as shown in Figure 4-10. You can open the dialog box from the dimension context menu.
Creating and Populating Cubes
Now that we have created, mapped, and populated our dimensions, we can use those dimensions to create a cube. In this part of the demonstration, we start by reviewing some key concepts related to cubes. Then we complete the following steps:
- Build a cube using the Create Cube wizard.
- Create measures (stored and calculated).
- Map the cube.
- Populate the cube.
We end with a discussion of cube scripts.
FIGURE 4-10. Setting security on the CHANNEL dimension
Understanding Oracle OLAP Cubes
In this section, we discuss aggregation policies, partitions, and storage for Oracle OLAP cubes. In Oracle OLAP, cubes can be defined as compressed or uncompressed. If the cube is to use compression, then there are some additional restrictions. Compressed cubes must use the same aggregation types for the dimensions and all stored measures in the cube must be the same data type. If you want the measures in the cube to have different data types or aggregation methods, then you must either define a different cube to hold them or use an uncompressed cube.
Aggregation Policies Aggregation policies include the following information:
- Aggregation operators for each dimension
- The order in which to aggregate dimensions
- Whether or not any given dimension or hierarchy should be loaded and aggregated
- The aggregation method
Aggregation operators define what happens when dimensions are rolled up. By default, the values for sibling members are added together and attributed to their common parent member. You can choose from the following predefined aggregation operators:
- Sum (default)
- First Non-NA Data Value / Last Non-NA Data Value
- Hierarchical Average / First Member / Last Member
- Hierarchical Weighted Average / Weighted First / Weighted Last
- Minimum / Maximum
- Non Additive (do not summarize)
- Scaled Sum
- Weighted Average / Weighted First / Weighted Last
- Weighted Sum
If you use an aggregation operator other than Sum, the order in which the dimensions are aggregated can make a difference in the results. It is important that aggregations occur in the order specified. When aggregating across multiple dimensions, you must keep in mind that numbers may not appear to add up the way you look at two-dimensional spreadsheets. This is especially true when aggregation types are different for each dimension. Be aware of this when you start reviewing the results. It may be necessary to change the order of aggregation to achieve the desired results.
By default, all dimensions and hierarchies are loaded and aggregated. In some cases, it may not be required, or desired, to load and aggregate all the dimension hierarchies. Some hierarchies may be used only for loading the data, while other hierarchies are used for aggregation. Restricting the aggregation to only the required dimensions and hierarchies can improve load and aggregation performance of the cube. However, this can also mean that those levels that are not aggregated will be calculated on the fly, which can affect query performance.
A good example of the use of multiple hierarchies is having one hierarchy that is used to load and aggregate the data, and another hierarchy that is used for navigational purposes and may not be used for aggregation. This allows for displaying only levels that are needed to fulfill a business requirement, although the data is still loaded into the cube.
There are two aggregation methods: cost-based aggregation and level-based aggregation. With cost-based aggregation, you can precompute none, some, or all data by specifying a value between 0 and 100. Note that although the scale goes to 100, this value is not a percentage of the cube being precomputed; rather, it is a relative scale—the designers could have just as easily picked 0 to 200. Using a setting of 0 means do nothing to optimize the aggregation of the cube. This represents a low cost for the build, but it is not usually very useful for queries. A setting of 1 causes the cube composite to be built, but none of the measure values are precomputed. Unless the cube is small (like what-if type cubes), you should always choose a value of 1or higher. Values between 2 and 100 precompute progressively more of the measure values. A good starting point is 20. A value of 50 is on the high side.
In cases where cost-based aggregation is not going to be used or if the cubes are uncompressed and level-based aggregation is used, you will need to select which levels should be solved and stored. The default setting is to use skip-level aggregation, which aggregates and stores data at only every other level. This conserves space and still provides reasonable performance. It is also a good practice to select those levels that are most commonly viewed by the end users.
Only level-based aggregation is available in Oracle OLAP 10g.
Partitions If a cube is small and loads quickly, it may not be necessary to define a partition. However, most real-world cubes can benefit from partitioning. Partitioning can be used for life-cycle management, where you can partition by time, allowing for old time periods to be dropped from the cube as new periods are added.
The most common use of partitioning is to improve load and query performance. Without partitioning, you are restricted to the processing power of a single processor. Partitioning enables you to spread data loads and other cube maintenance tasks across multiple processors, which means that the cube can take advantage of parallel processing and complete tasks faster.
Each cube can be partitioned by only one dimension, so it is important to select the correct dimension and level. While there is only one partition specification for a cube, there are still multiple partitions. A cube partitioned by quarter over two years will still have eight partitions for the eight quarters plus a total partition. The number of partitions is maintained automatically, like table-based interval partitioning.
When selecting a dimension for partitioning, you need to choose one that has a level-based hierarchy, because you will need to specify a level along which to partition. It is a good idea to select an upper level within a dimension. Do not choose the top level, as this would not provide any practical advantage. This level should have a number that will divide the data as evenly as possible, so that the aggregation can be spread as evenly as possible across as many processors or processes as would be appropriate for the hardware utilized.
A good dimension to consider partitioning on is a time dimension. When you partition using a time dimension, you are generally loading and using more than one partition, which means that you can take advantage of parallel processing. The Month or Quarter levels are both suitable candidates for partitioning. For example, if the data contains 24 months (eight quarters) of data, you could partition at the Quarter level and split the load process across up to eight processors.
There are reasons to choose other dimensions instead of time. For example, suppose that after the initial data load, you choose to refresh the cube data only for the current month. Because the data is for a single month, you will be using only one partition (the one assigned to that Month or Quarter), which in turns means you will be using only one processor (the processor assigned to that partition). In this case, you should consider a different dimension and level, such as geographical region or product division, where the incremental changes to data will be spread across multiple partitions. It may take some trial runs to determine the best partition to use for the particular data in the cube and refresh methodology used.
You may also consider using partitioning to improve query performance. If a query can be satisfied within a partition, only that partition is accessed, thereby reducing the amount of the cube that is queried. This is known as partitioning pruning.
Before we wrap our discussion of partitions, it is important to understand the relationship between partitioning and the cost-based aggregation method. When you partition a cube, Oracle OLAP creates cost-based aggregation settings for both the bottom and top partitions. The bottom partition includes all the members of the partitioned dimension, up to and including the partition level. For example, if you partition on Quarter, the bottom partition includes the Day through Quarter levels (as well as values from the other dimensions). The top partition includes the levels above the partition dimension, such as Year and All Years.
The top partition can be very large. For example, it could include all years by all other dimension values. Therefore, any significant preaggregation of the top partition can be expensive—great amount of data in a single partition, which often gets bogged down in input to and output from the cube. A good rule of thumb is to partition at as high a level as possible, and set the top partition cost-based aggregation parameter to 0 (no preaggregation) or 1 (minimal preaggregation).
It is not important to partition at a low level to isolate refreshes. A cube is aggregated incrementally within a partition during a refresh (unless the data is cleared out first). Partitioning has more to do with parallel processing and keeping data in memory to reduce cube access.
In Oracle OLAP 11 g, you can run the Cube Partitioning Advisor (located on the Partitioning tab of the Create Cube dialog box) to help you decide how to partition the cube. You need to fully define and map the cube before you can run this advisor.
Cube Storage You have a fair amount of control over how Oracle OLAP cubes are stored. You can choose whether or not the cube is compressed, the order and sparsity of the dimensions, and the base data type of the cube.
Cubes can be defined as compressed or uncompressed. Oracle OLAP compresses cubes by default. Most cubes are very sparse, and compression helps to save disk space and data load time. All stored measures in a compressed cube must be of the same data type, and all dimensions in a compressed cube must use the same aggregation operator. If you need more flexibility, you must either define a different cube to hold the nonconforming stored measures and dimensions or use an uncompressed cube. By default, compressed cubes use the cost-based aggregation method. Level-based aggregation is the default aggregation method for uncompressed cubes.
You can identify dimensions as sparse or dense. A composite is created to contain combinations of all sparse dimension members for which there is data. By default, time is considered dense, because if data exists for one time dimension member, it likely exists for multiple time dimension members. This may not be the case for all cubes. It may be that all dimensions in a given cube are sparse, and all dimensions will be contained in the composite. The order of dimensions is important only for uncompressed cubes and has no effect on compressed cubes. For uncompressed cubes, all dimensions identified as sparse must be grouped together.
All cubes have a default data type. If you are creating a cube to be used as a materialized view for query rewrite (see the "Working with Cube-Organized Materialized Views" section later in this article), you should ensure that the cube uses the same data type as the source data. For uncompressed cubes, the measures will inherit the cube data type, but you can specify a different data type for each measure. Depending on the data type chosen, you can also define the data precision (number of significant digits), scale (number of decimal places), and the maximum number of bytes (for text data types).
Each cube, whether compressed or uncompressed, has at least one composite. Unpartitioned cubes always have only one composite. A partitioned, compressed cube always has a composite for each partition. With partitioned, uncompressed cubes, you have the choice between a single (global) composite and multiple composites.
In Oracle OLAP 11 g, you can run the Cube Storage Advisor (located on the Storage tab of the Create Cube dialog box) to help you select storage parameters. You need to create the cube before you can run the advisor. The Cube Storage Advisor will examine the data and suggest storage options appropriate for your data.
Creating a Cube
Now we use the Create Cube wizard to specify dimensions for the new cube, aggregation policies, partitions, and storage policies.
To create a cube:
- Select Create Cube from the Cube selection menu.
- Specify a name for the cube. The labels and description are filled in automatically. If desired, edit the labels and description.
- Select dimensions for this cube from the Available Dimensions list and add them to the Selected Dimensions list. For this example, the cube has four dimensions: TIME, CHANNEL, PRODUCT, and GEOGRAPHY. Generally speaking, we recommend placing the dense dimension first, if there is one, followed by the user-defined dimensions, in order from fewest members to most members. By default, AWM puts TIME first, because this dimension is usually dense. But that is not always the case and can certainly be changed.
4. Select the Aggregation tab. The Rules subtab should be selected. The default dimension order, aggregation operators, and hierarchy settings are suitable for this example.
5. Select the Precompute tab. Because the cube is compressed, the default aggregation method is cost-based aggregation. The cube is not partitioned, so only the bottom partition value is used. The default bottom partition value is 20, but we changed it to 35 because it is a more reasonable value for the demonstration data set. Even if a cube is partitioned, we generally leave the top partition value at its default value of 0, as explained in the previous section.
6. Select the Partitioning tab. By default, cubes are not partitioned. For this demonstration, we will create a partition. Select the Partition cube checkbox and specify the following partition details:
- Dimension TIME
- Hierarchy CALENDAR
- Level CALENDER_QUARTER
7. Select the Storage tab. By default, all cubes are compressed. You can change this cube to an uncompressed cube by deselecting the Use compression option. We will leave it compressed for this example. The TIME dimension is dense by default. Change it to sparse by selecting the checkbox in the Sparse column in the TIME dimension row. Now the TIME dimension will be included in the composite with all the other dimensions. For our data set, this approach loads faster.
Once a cube has been defined as compressed or uncompressed, it cannot be changed.
8.Click Create to generate the new cube.
After the cube is created, you can return to the Storage tab of the Create Cube dialog box and run the Cube Storage Advisor. It will recommend storage parameters for the cube. After you have defined and mapped measures, you can use the Materialized Views tab, as discussed in the "Working with Cube-Organized Materialized Views" section later in this article.
Recall that in its basic form, a cube is a collection of measures that have the same dimensionality. Oracle OLAP cubes contain two types of measures: stored measures and calculated measures. Stored measures are values saved in the cube. The values are either loaded from a data source or calculated and stored using a program. Calculated measures are defined by calculations or formulas. The calculated measures are not stored in the database, but rather calculated on the fly when requested by the application or an end-user query.
Many traditional data warehouses simply publish basic measures, such as sales, cost, and quantity. Most of these measures are considered stored measures because they come from the fact table. End users can do very little with this basic data. Both business and financial communities usually want to see information like sales compared to the same time last year, rate of growth of sales, and actual sales versus forecast. This means that end users are actually more interested in calculated measures derived from the base data.
Calculated measures are one of the most powerful features of OLAP. While it is fairly easy to do some of these calculations in spreadsheets and other reporting tools, these calculations can be a major performance drain, and the calculations reside only in the reporting repository, limiting access to and use of the measures. Furthermore, many calculated measures, such as time-series analysis, are very hard to compute outside the database.
AWM provides a wizard to help define many common types of business calculations. These are divided into four functional categories of calculations:
- Prior/Future Comparisons, and
- Time Frame.
In addition, Oracle OLAP has an expression language that allows for building just about any calculation desired. The following calculations are available in Oracle OLAP 11 g:
|Percent Difference||Diff from Parallel Period|
|Index||% Diff from Parallel Period|
|Prior Period||Moving Total|
|Diff from Prior Period||Moving Average|
|Percent Diff from Prior Period||Moving Maximum|
|Future Period||Moving Minimum|
|Diff from Future Period||Cumulative Total|
|Period to Date||Cumulative Average|
|Period to Date Period Ago||Cumulative Maximum|
|Percent Diff from Future Period||Cumulative Minimum|
|Diff from Period to Date Period Ago||User Defined Expression|
|% Diff from Period to Date Period Ago|
Oracle OLAP 10g uses a calculation wizard for specifying calculated measures and can use the OLAP DML. Some calculations are not available in Oracle OLAP 10g.
In order to create a calculated measure, you must first have stored measures, so we begin by creating stored measures.
Creating Stored Measures For compressed cubes, creating a stored measure is as simple as opening the Create Measures dialog box and specifying a name for the measure. For uncompressed cubes, however, you can choose to override the cube- level aggregation policy and set one for the measure. The Aggregation tab in the Create Measure dialog box is identical to the same tab in the Create Cube dialog box.
To create a stored measure:
- Right-click Measures and select Create Measure.
- Specify a name for the measure. For this example, enter SALES. The labels and description are filled in automatically. If the cube is compressed, which it is by default, and for this example, you do not need to supply any other information.
3. Click Create to create the SALES measure.
Creating Calculated Measures For this demonstration, we create a calculated measure for profit, where the formula for profit is sales minus costs. To create this calculated measure, follow these steps:
- Right-click Measures and select Create Calculated Measure.
- Specify a name for the measure. For this example, enter PROFIT. The labels and description are filled in automatically.
- Select the type of calculation from Calculation Type drop-down list. For this example, select Subtraction.
4. Under Calculation, click the first SALES measure to view a list of measures, and then select COSTS.
5. Click OK to create the SALES measure. Under Calculation, SALES changes to COSTS. You do not need to change the second SALES measure, because profit is sales minus costs. Notice that the syntax for the calculation is displayed in the Expression field. The syntax is helpful when you want to build complex expressions that cannot be handled by the wizard.
The mapping of cubes is similar to the mapping of dimensions, but generally much simpler.
Creating Joins from Fact Tables For this demonstration, we set join conditions that map data from the fact table for a summary level above the level contained in the fact table. The sample data has a fact table that has data at the day level, but the TIME dimension starts at month. The TIMES table contains the day data, as well as the month ID. We join the TIMES table with the fact table to get the month key.
Join conditions are not available in Oracle OLAP 10g. You need to create a view to accomplish the same task.
To create a join condition:
- Click Mappings under the cube. The right panel displays the cube-mapping panel.
In Oracle OLAP tig, the default mapping type is table view. In Oracle OLAP 10g, the graphical mapping view is the default.
- While in the table view, drag the MONTH key from the TIMES table to the MONTH level source, and then drag the TIMES DAY_KEY to the Join Condition field.
- Drag the DAY_KEY from the SALES_FACT table to the Join Condition field. The join condition appears as OLAPTRAIN.SALES_FACT.DAY_KEY = OLAPTRAIN.TIMES.DAY_KEY.
Notice that the cube-mapping panel in our example contains a similar join condition in the GEOGRAPHY dimension, using the CUSTOMERS table.
When the data is loaded, the load processor will perform a join and summary operation on the data from the fact table and load the results.
Loading Data from Multiple Tables You can load data from multiple tables into the same cube, but only if you do not use materialized views. This is done using the graphical mapping view. You place both sources on the panel, and drag and drop the keys from the fact tables to the associated targets. Oracle OLAP is not able to do any joining for this type of mapping.
Figure 4-11 shows an example of this type of mapping. This example includes an additional time hierarchy called Week that rolls up the data from week to year. This requires loading the data at the Week level as well as Month level. Two views were created to provide data at these levels. The views were then mapped to the appropriate levels of the TIME dimension.
Figure 4-11. Mapping from multiple sources
You might ask, "Why not just load the data at the Day level and then roll up the data to the respective hierarchies?" This is definitely possible, but it would add an additional level in the TIME dimension and 365 members per year. Such a high number of members can have a dramatic impact on space requirements. If there is a need to report data at the daily level, add the extra level to the dimension. Otherwise, save the resources for something that is required.
One nice feature of the mapping tool is the ability to view the table data while you are in the mapping panel. You can use it to verify that there is data in the source or to validate that the correct source and columns are being used. To view the data, right-click the source and select View Data. The first 1,000 rows of the source will be displayed in a tabular format, as shown in Figure 4-12.
You can populate a cube using the same Maintenance Wizard that you use with dimensions. When you select a cube to maintain, you see the cube and all the
FIGURE 4-12. Source data view
FIGURE 4-13. Maintaining a cube
dimensions that the cube uses. As you can see in Figure 4-13, all four dimensions, along with the cube, are selected automatically.
It is generally a good idea to update the dimensions at the same time as the cube, because there are often changes to dimensions if there are changes to the fact data. However, you can remove any dimensions you do not want to maintain with the cube. If you do not want to maintain any dimensions, deselect the Add the Dimensions of the Cubes option before you move the cube over to the target objects.
The Refresh Methods page will now include the cube in the list of objects. With the cube, you have the option to choose a cube script, which is a series of steps required to maintain a cube. For example, a script might load the data, calculate a forecast, and aggregate the data. The cube script list shows the scripts available for this cube. Cube-organized materialized views always use the SYS_DEFAULT script. For more information, see the next section on creating cube scripts.
The cube refresh methods are:
- Complete Clear all data from the cube, then load and aggregate all the data from the source tables. All cubes can use this method.
- Fast Use the materialized view log tables to identify, load, and aggregate only the new and changed data from the source tables. Cubes defined as a fast refresh or a materialized view can use this method.
- Force Use the Fast method if possible; otherwise, use the Complete method.
- Partition Change Tracking Clear, load, and aggregate only the values from an altered partition in the source tables.
- Fast Solve Load all the detail data from the source tables, then aggregate only the new values. Compressed cubes and cube materialized views can use this method.
Some of the cube refresh options in Oracle OLAP 11g are not available in Oracle OLAP 10g.
The remaining pages are the same as those for maintaining dimensions. On the last page, Scheduling, for this cube, you may want to choose to run in batch mode, because the sample data has been partitioned. Running in batch mode allows you to set the amount of parallelism and run on more than one processor. Figure 4-14 shows that two processes will be used. This number can be set from 1 to the maximum number of processors accessible to the database instance. In cases where there are other processes running on the server and there are more than two processors, it is often wise to set the number to be equal to the number of processors minus 1 to reserve a processor for other processes. You can also set this batch process to run at a later time or date.
Figure 4-14. Scheduling maintenance on multiple processors
Figure 4-15. Creating a cube script
Creating Cube Scripts
In Oracle OLAP 11 g, cube scripts give you more control of the maintenance process. As shown in Figure 4-15, you define each script as a sequence of steps. You determine the order in which these steps are to be executed.
A cube script can contain any combination of the following actions:
- Clear Data Clear out the data in the cube. This should not be put after a load step!
- Load Load new data (insert or update) or synchronize data (insert, update, and delete).
- Aggregation Aggregate the data.
- Analyze Analyze the materialized views if the cube is set for use in query rewrite.
- OLAP DML Specify and run OLAP DML commands or programs.
- PL/SQL Specify and run PL/SQL commands.
Cube scripts open up many ways to customize the loading, calculation, and aggregation of the cube data. In the past, external OLAP DML and PL/SQL programs did much of this work. Now you can integrate custom maintenance tasks in the system processes.
Cube scripts are not available in Oracle OLAP 10g.
The Maintenance Wizard and refresh materialized views execute the default cube script when refreshing a cube. The default cube script is named SYS_DEFAULT, and it includes load and aggregation steps. The script is displayed in the navigation tree only after you define a second cube script. During the Maintenance Wizard process, you can explicitly choose which script you want to execute during the load.
To create a cube script:
- Expand the folder for a cube, right-click Cube Scripts and select Create Cube Script. The Create Cube Script dialog box is displayed.
- Fill in the information on the General tab.
- To create a new step, choose the type of step you want to create.
- Complete all tabs associated with the step, and then click Create. The new step is listed on the General tab.
- Create additional steps as desired. You can edit, delete, or reorder the steps at any time. Remember that if you are performing calculations and aggregations, your data must be loaded first, so be careful how you order your steps.
- Click Create to generate the script. The new cube script is displayed as an object in the Cube Scripts folder.
We will revisit cube scripts in the "Creating Advanced Cubes for Typical Business Purposes" section later in this blog.
Summary of the Cube-Building Process
For this demonstration, we completed the following major steps:
- Prepare the data.
- Create an analytic workspace.
- Create and populate dimensions.
- Create and populate a cube.
An analytic workspace generally contains a set of cubes, so in a real-world environment, steps 3 and 4 would be repeated to create the rest of the cubes.
Next, we take a more in depth look at the types of calculated measures that are available to you.
Adding Business-Savvy Calculations to Cubes
This section describes how to create the following business calculations:
- Share calculation
- Percent different prior period or parallel period calculation
- Moving average calculation
- Custom calculated measures
We also review how to modify and manage calculated measures.
Creating a Share Calculation
A very useful calculation is the calculation of share. A share is a dimension member's contribution compared to another member (a baseline), where this baseline is often a parent or ancestor at a level.
You create a share calculation through the Share template, shown in Figure 4-16. The most common use of the Share template is to express the share as a percent of total or percent of parent in the chosen hierarchy. You can also select a dimension member to be used as the baseline of the calculation. This is useful if you want to compare members of the dimension in question to a specific benchmark or model member, such as an established market leading product, flagship store, or key competitor.
FIGURE 4-16. Creating a share calculation
After you select the Share template, you are prompted for the components for the calculation. From here, you fill in values for the various parameters:
- Share of measure A measure or calculated measure (SALES in this example).
- in The hierarchy to be used while calculating the share for the selected dimension (if more than one hierarchy exists in the chosen dimension).
- Of the The d imension for which the share is to be calculated (PRODUCT in this example).
- As a ratio of The dimension member to be used as a baseline to calculate the share.
When calculating a ratio, select one of the following choices (the first three are disabled for a dimension that does not have any hierarchies):
- Top of hierarchy Specifies that the baseline consist of the total of all items on the level that is associated with the current member (that is, the item for which the share is being calculated).
- Member's parent Specifies that the baseline consist of the total on the level of the parent for the current member (that is, the item for which the share is being calculated).
Member's ancestors at level Specifies that the baseline consist of the total of a level to be specified. Choosing this item requires the selection of a value in the associated drop-down list. This list displays the names of levels from the selected hierarchy for the selected dimension that are available for calculating the share.
Member Specifies that the baseline consists of dimension member to be specified. Choosing this item requires the selection of a value in the associated drop-down list. This list displays the names of the dimension members that are available for calculating the share.
In this example, we are calculating the sales share of the member's parent. You can choose to represent the share as a percentage by selecting the Multiply Result by 100 option. Figure 4-17 shows the completed share calculation.
Figure 4-17. The completed share calculation
Figure 4-18. Sample share report
It may be necessary to create multiple calculated measures using the same Share template to provide different results, such as shown in Figure 4-18. The report in Figure 4-18 is showing the SALES base measure and two share calculations: share of product total and share of product parent level. Notice how the SALES SHARE TOT PROD and SALES SHARE PRNT PRODUCT measures return the appropriate results as you drill down the product hierarchy.
Creating a Percent Different Prior or Parallel Period Calculation
Using the Percent Difference from Prior Period template, you can create a calculated measure that indicates growth or decline of a measure over time. This calculation template is found in the Prior/Future Time Period calculation type folder. This template accepts the following inputs:
- For measure Select a measure or a dimension member for which you want to calculate the percentage difference from the prior period.
- in If there is more than one time dimension, you can select the one you want to use. Otherwise, the default time dimension is used.
- And Select the hierarchy for the specified dimension.
- Number of periods ago Enter a specified number of periods ago, for each level (such as Year, Quarter, or Month).
A sample measure is shown in Figure 4-19.
Figure 4-19. Creating a percent difference prior period calculation
A similar, but more complex, calculation is the percent difference between the current time period and a prior parallel period, such as prior year. To do this calculation, we need to use the parallel period calculation not the prior period calculation. To perform this calculation, you provide the following inputs:
- From Select either Parallel period or Parallel period closest.
- For measure Select a measure or a dimension member for which you want to calculate the percentage difference from the parallel period.
- In If there is more than one time dimension, select the desired dimension from the list box. Otherwise, the default time dimension is used.
- Number of periods Enter a number of periods ago.
- Ago Select the level to use as a basis.
- From Select either beginning to ending of period or ending to beginning of period.
Figure 4-20. Creating a percent difference from parallel period calculation
A sample percent difference from parallel period calculation is shown in Figure 4-20.
Figure 4-21 shows a report that contains calculations of a number of alternative percentage differences from prior periods. All the measures automatically handle the situation in which the user needs to drill down into the time dimension and look at time periods at different levels. A single calculated measure in the analytic workspace can be used at any level of time, by any query tool, including SQL tools.
Note the following from Figure 4-21:
- The Sales Pr Period calculation works at all levels of time, and shows the value for the previous time period at the same level.
- The Sales Pr Period Pct Chg calculation works at all levels of time, and compares each time period with the previous period at the same level.
- Similar calculations can be easily generated for costs, quantity, and profit measures.
Figure 4-21. Sample report with prior period calculations
Creating a Moving Average Calculation
The Moving Average template enables you to create moving averages over any of the measures in your analytic workspace. Moving averages are very useful when you analyze volatile data, because they smooth out the peaks and troughs, and enable you to visualize trends in the data more easily. In the Moving Average template, you are prompted to provide the following inputs:
- Measure Select the measure for which you want to calculate a moving average.
- Over time in If there is more than one time dimension, select the desired dimension from the list box. Otherwise, the TIME dimension and the default hierarchy will be used. This is the case for our demonstration.
- include preceding Enter the number of periods preceding the given time period to be included in the moving average calculation.
- include following Enter the number of periods following the given time period to be included in the moving average calculation.
- Within The choices here are level, parent, ancestor at level, Gregorian year, Gregorian week, Gregorian month, and Gregorian quarter.
For example, let's say you want to create a calculation that represents the
moving average of sales for the last three months. When displaying this value for
Figure 4-22. Creating a moving average calculation
March, this would average January, February, and March. When displaying this value for April, this would average February, March, and April. Figure 4-22 shows how this calculation is structured in the Moving Average template.
Similar templates are available for creating moving total, moving maximum, and moving minimum calculations.
Figure 4-23 shows a combination graph reflecting the sample moving average calculation. The fluctuating line is sales, and the smoother line is the three-month average.
Figure 4-23. Graph of moving average
Creating Custom Calculated Measures
Using the Expression template, you can build just about any calculation desired. You can use this template to simply take action on more than one measure or to create very complex calculations. Oracle OLAP has a very powerful calculation engine that supports a huge library of functions:
- Time series
- Date and time
- Data type conversion
Figure 4-24. Expression for sales percent of 2006
Any these functions can be used to create a custom calculated measure, and can be used in a procedural function or program.
A simple custom calculation would be to produce a measure that is the percent of sales for 2006, such as sales year to date divided by total sales for 2006. As shown in Figure 4-24, the expression would look like this:
дГТ 100 * (SALES_CUBE.SALES_YTD / SALES_CUBE.SALES["TIME" = 'CY2006'])
Note that this expression uses an existing calculated measure for
SALES_YTD, and then uses what is called a qualified data reference to find the sales for 2006. (Qualified data reference syntax is discussed in the "Using OLAP DML" section later in this blog.)
You can also edit the expression and define your own calculations using any expression you care to enter as shown in Figure 4-25.
Figure 4-25. Editing an expression
You can also use OLAP DML in the expression. For example, you could use an OLAP_DML_EXPRESSION(,function,,datatype) form such as ('lag(sales,1,time,nostatus)', NUMBER). Furthermore, you can run programs in a calculated measure, such as a forecast program. For example, you could create a forecast measure, such as a crossover linear regression, in an OLAP DML program and then use it in the expression using OLAP_DML_EXPRESSION ('program_name', NUMBER). For more information about these various functions, see the Oracle OLAP DML Reference. We will also demonstrate some complex calculations in the "Advanced Topics" section later in this blog.
Managing Calculated Measures
You can edit existing calculated measures from within AWM. To change a calculated measure, click the calculated measure. You will see the general information displayed on the right. You can change the labels and description, but not the name. You can also change the calculation type and the expression.
In Oracle OLAP 10g, you need to select the Launch Calculation Editor button to change the details of a calculated measure. Note that you can change the details, but not the type, of the measure.
As you have seen, calculated measures are part of the cubes, but occasionally cubes need to be changed, such as by modifying their dimensionality or data type. These changes require that the cube be deleted. There are two ways to delete a cube but keep the calculated measures:
- Save the calculated measures to XML templates.
- Edit the XML definition for the cube.
Saving calculated measures to XML files is always a good idea because this creates a backup of the definition. Once the cube is dropped and rebuilt, you can simply import the calculated measures. However, you can save only one measure per XML template, which is fine if you create the XML template when you define the calculated measure, but not practical if you have many calculated measures in a cube and then decide to save them all to XML templates.
Editing the XML is not something we would normally recommend, but it is possible to move calculated measures from one template file to another using a text editor such as Notepad. Assuming that you do not change the dimensionality, you can cut and paste the XML. The calculated measures are all defined at the bottom of the XML definition file. Each measure is contained in a single block of XML code, using the tag DerivedMeasure. Copy all the DerivedMeasure blocks to your new cube XML template and reload that template to restore all your calculated measures. This work-around works in Oracle OLAP 10g Release 2 and later.
In this section, we cover the following advanced topics:
- Managing workspaces with Oracle OLAP Worksheet
- Working with cube-organized materialized views
- Managing security of cubes and dimensions
- Creating advanced cubes for typical business purposes
- Using SQL with OLAP
Managing Workspaces with OLAP Worksheet
Oracle OLAP Worksheet provides full use of the OLAP DML for users who need to manage the contents of an object or execute a program. It opens in a separate window from the AWM console. As shown in Figure 4-26, OLAP Worksheet provides menus, a toolbar, an input pane for OLAP DML commands on the bottom, and an output pane on the top. You can access the OLAP DML documentation through the Help menu.
In Figure 4-26, the output pane shows that the SALESTRACK workspace is attached with read/write access in both OLAP Worksheet (as shown by the AW LIST command) and AWM. The two applications share the same session, so whatever you do in OLAP Worksheet affects AWM as well. Changing the status or closing the workspace may have consequences on commands that you issue in AWM.
Use the AW LIST command to check which workspaces are attached and in which order, because many OLAP DML commands—like LISTNAMES and DEFINE—operate on only the first workspace.
OLAP Worksheet is an interactive environment for working with analytic workspaces, similar to SQL*Plus Worksheet. In addition to providing easy access to
Figure 4-26. OLAP Worksheet opened from AWM
the OLAP DML, OLAP Worksheet enables you to perform sophisticated business analyses, such as modeling, forecasting, and allocation. You can switch between two different modes: one for working with analytic workspaces in the OLAP DML language and the other for working with relational tables and views in SQL.
You can use OLAP Worksheet to perform the following tasks:
- Connect to an analytic workspace.
- Execute most OLAP DML commands.
- Create and populate data objects.
- Create, modify, compile, and execute DML programs.
- Execute SQL statements.
To open OLAP Worksheet from AWM, after you have connected to your database and opened an analytic workspace, place your cursor on the analytic workspace or an object within the analytic workspace and select Tools | OLAP Worksheet. To execute an OLAP DML command, type it in the input pane at the bottom of the window. For example, to view the list of attached analytic workspaces, issue the command AW LIST.
Using the Editor in OLAP Worksheet
You can use the built-in editor in OLAP Worksheet to change the content of a program, model, or aggregation map. You cannot use the editor to change the contents of a dimension, variable, relation, value set, or other data containers.
To change a program, issue the following command to open the program in the editor window:
For example, enter EDIT ONATTACH to open the ONATTACH program, which is shown in Figure 4-27. The object type PROGRAM is the default type, so you do not need to specify it. You must specify the other object types. For example, to edit an aggregation map, you would issue a command such as EDIT AGGMAP units_ cube_aggmap.
In the editor, type the OLAP DML commands that you want in the program. When you are finished editing the program, save it and close the editor. The status bar at the bottom of the window will tell you if there have been changes that have not been saved.
Figure 4-27. Opening a program in the OLAP Worksheet editor
To execute the program, issue this command:
To issue SQL commands, you need to select SQL Mode from the Options menu. To resume issuing OLAP DML commands, clear the SQL Mode selection.
Using OLAP DML
The ability to manipulate data directly using a stored procedure language is key to developing sophisticated OLAP-based applications. We have previously shown how to do this by creating calculated measures. These calculations use the OLAP stored procedure language, OLAP DML. As you have seen, the custom expression option available for calculated measures allows for entry of a complete OLAP DML script or program.
The OLAP DML is very much like PL/SQL. In fact over the years, Oracle has included many of the same functions and features of SQL in the OLAP DML.
The OLAP DML language is the same language as the Express language mentioned in this blog. There have been many enhancements to the language (mostly new commands and functions), but the language is the same. Originally, entire applications were written in this language, so all of the capabilities that you would expect in a modern computer language are there. Today, most applications use other technology to display the data, so generally, commands that generate output (such as the REPORT command) are not used, but they are still in the language.
OLAP DML programs contain a series of OLAP DML statements and reside in an analytic workspace. An OLAP DML program is an object in the analytic workspace, just like a dimension or variable. In order to change or run a program, you must first attach the analytic workspace that contains the program.
The OLAP DML language is used for multiple purposes:
- Specify a data expression.
- Manipulate OLAP data with commands (potentially in a batch process).
- Diagnose problems.
- Load data.
Table 4-3. Some OLAP DML Data-Manipulation, Numeric, and Text Functions
Using OLAP DML for Expressions BI applications often need to calculate data.
This data can often be calculated on the fly, at run time. The OLAP DML operates as a sophisticated expression language, much like MDX.
Many functions are built in to the OLAP DML, including those specifically designed to manipulate multidimensional data, as well as numeric and text functions. Table 4-3 lists some common OLAP DML functions.
In addition, there are functions to convert from one data type to another, time manipulation functions, database information functions, statistical functions, and more. These functions, along with operators such as + (plus), - (minus), * (multiply), / (divide), ** (exponent), and others provide the ability to calculate many business calculations as simple expressions. Expressions can also refer to programs that calculate a return value, allowing you to create your own functions. For a complete list of functions, see the Oracle OLAP DML Reference online documentation or the OLAP Worksheet Help.
OLAP DML expressions automatically have "dimensionality." If the SALES measure is dimensioned by PROD, GEOG, and TIME, the expression SALES*2 (multiple sales by the constant 2) is likewise dimensioned by PROD, GEOG, and TIME. Oracle OLAP always knows the dimensionality of an expression and will loop over the dimensionality in most situations, so no explicit FOR loop is necessary.
Using QDRs to Change the Dimensionality of an OLAP Expression If you wish to work with only a single dimension value, you can limit your expression using syntax called a qualified data reference (QDR). This syntax removes a dimension from the dimensionality of an expression. For example, the expression SALES (GEOGRAPHY ’US') qualifies the GEOGRAPHY dimension to the single value 'US'. The expression is dimensioned by the remaining dimensions of SALES, namely PRODUCT and TIME.
QDRs are often used in expressions when calculating a share. For example, the expression SALES / SALES(GEOGRAPHY ’US') calculates the sales share of the United States. Because the numerator (SALES) is not qualified in any way, this will work for any geography value.
The expression SALES(GEOGRAPHY ’US') specifies a single geography dimension value. The literal expression 'US' can be replaced with another variable, such as TOPCOUNTRY, which can be set to different values depending on the region of interest.
If instead of a single scalar value, you use another dimensioned expression, the dimensions of the reference are added to the other dimensions of the qualified expression. For example, suppose the top region needed to depend on the sales channel of interest. TOPCOUNTRY could be dimensioned by CHANNEL and have a different value for each CHANNEL. In that case, the expression SALES (GEOGRAPHY TOPCOUNTRY) would be dimensioned by PRODUCT, TIME, and CHANNEL. The qualification of the GEOGRAPHY dimension removes that dimension from the expression; the reference to TOPCOUNTRY, which is dimensioned by CHANNEL, adds the dimension CHANNEL to the overall expression.
This ability to remove and add dimensions using QDRs is used often to transform and manipulate Oracle OLAP data. Note that this expression language allows for combining data from multiple measures, cubes, and dimensions.
Manipulating OLAP Data with Commands If you wish to store the result of an expression permanently in a database, you can use an assignment statement to assign this expression to a variable. Usually, this can be calculated on the fly, eliminating the need to use disk space to store results permanently. Other commands are specifically designed to manipulate data, such as the following:
- AGGREGATE and ALLOCATE, which aggregate data up and down a series of hierarchies, respectively
- FORECAST and REGRESS and commands beginning with FC for advanced forecasting
- FILEREAD and OUTFILE and other file reading and writing commands
- EXPORT and IMPORT for moving data between workspaces or database instances
- Commands to directly access the Oracle relational data
- Commands to create or execute Models
- And many more commands
The OLAP DML language also contains standard control structures such as FOR loops, while loops, switch statements, and if-then-else statements. You can also trap for error conditions, and transfer control to an error procedure to handle anticipated and unanticipated error conditions (such as division by zero). Options are available to control behavior of certain conditions.
One option in particular can be very handy: If you set DIVIDEBYZERO to yes, Oracle OLAP will return an NA when you attempt to divide a number by zero, instead of producing an error.
The OLAP DML language has more than 100 commands. Some of the more common include:
- DEFINE, to create new objects
- DESCRIBE, to obtain the definition of an object
- LISTNAMES, to list all of the objects in an analytic workspace
- SHOW, to display the contents of an expression
- REPORT, to produce a formatted report of a series of expressions
- LIMIT, to control the current status of dimensions
- AW, to attach or detach an analytic workspace, or to list the attached analytic workspaces
See the Oracle OLAP DML Reference for more information about the OLAP DML commands.
Persistent Status Within a Session Dimensions in Oracle OLAP sessions have a "status" associated with them. This is persistent throughout a given user session. You can think of dimension status as a sort of persistent SQL WHERE clause. Session status allows you to focus a series of OLAP DML statements on certain dimension values. All subsequent statements that are executed apply to only those values in current status until the status is reset to something else. For example, suppose you execute the following commands:
limit prod to all
limit geog to 'US'
limit time to '2007' to '2009'
limit time add descendants
sales = na
limit geog to 'CANADA'
sales = 100
Sales will be set to the special value na (not available) for all products, for just the geographic region US and for the time periods 2007 to 2009, and all values that are descendants of those time periods (typically quarters and months in 2007, 2008, and 2009). Sales for Canada for the same time periods will be set to 100. Note that no FOR loop is necessary—Oracle OLAP loops over all dimension values for sales currently in status. This status is automatically reset to ALL at the beginning of every session.
Wrong assumptions about the status of key dimensions are a common source of bugs and performance problems in Oracle OLAP applications. When operating on the data, be sure to limit to the minimum status necessary to perform a given task. For example, if you need to calculate only the average price for certain products, limit the Product dimension to only the products in question.
A powerful DML is at the core of the Oracle OLAP environment. This language expands Oracle OLAP to be an analysis environment, instead of simply a reporting environment. This section only introduces the concept of the Oracle OLAP DML. To become more familiar with all the features of the language, download and read the Oracle OLAP DML Reference.
Working with Cube-Organized Materialized Views
With query rewrite, Oracle can automatically convert a SQL query into a more efficient SQL query. The Oracle query engine can direct a query to take advantage of materialized views or, as in Oracle OLAP 11 g, a cube-organized materialized view from an analytic workspace cube. Oracle's optimizer is intelligent enough to take advantage of preexisting summaries instead of reading all the detail records and summarizing them again. The end users do not know a transformation has occurred. The end users would only notice the improvement in performance.
Oracle's optimizer examines the query to determine if the SQL query can be rewritten for improved performance. If Oracle's optimizer determines it cannot improve the results, it performs the original plan by accessing the detail table rows and summarizing them.
Originally, Oracle had a rules-based optimizer that used a heuristic approach. In other words, the optimizer followed a series of rules. For example, one of the top rules was for the optimizer to utilize a unique index over any other access method. The rules-based optimizer would not care about other particulars of the situation, such as the number of rows in the table or whether the index was fragmented; it would simply follow the rule, even if the change actually resulted in poorer performance.
For Oracle8Z, Oracle created a cost-based optimizer. A cost-based optimizer uses statistics on the objects, such as tables and indexes, to determine the best execution plan to return the results of the query. For example, if a table has very few rows in it, the cost-based optimizer may choose to perform a full table scan as opposed to an index scan. Normally, a full table scan would be more expensive; however, the expense of a full table scan depends on the size of the table.
In Oracle OLAP 11 g, the cost-based optimizer takes into account cube- organized materialized views for query rewrite. If the cost-based optimizer thinks that the query will perform faster using OLAP cubes, the query will be rewritten to use OLAP cubes.
Setting Up for Cube-Organized Materialized Views
Each Oracle database has a set of parameters used to configure the database. For example, these parameters establish how much memory is allocated or whether to allow query rewrite. The following sections describe the two parameters that need to be set— QUERY_REWRITE_ENABLED and QUERY_REWRITE_INTERGRITY— and how to enable a cube for query rewriting.
Setting the QuERY_REWRITE_ENABLED Parameter The QUERY_REWRITE_ ENABLED database parameter controls the query rewrite feature for the database. It has three settings: TRUE, FALSE, and FORCE. Query rewrite is enabled on TRUE or FORCE, and disabled on FALSE. If set to FORCE, this parameter forces the optimizer to utilize the rewritten queries, even when the cost of the original query is lower. The following example enables the query rewrite feature on the database:
query_rewrite_enabled = true;
You can override the database setting with the ALTER SESSION command:
alter session set query_rewrite_enabled = true;
Enabling Query Rewrite for Cube With Oracle OLAP 11 g, it is very easy to enable query rewrite on the cube-organized materialized view. You just need to check a box in AWM, as follows:
- In AWM, select the cube.
- Select the Materialized View tab.
- Select Enable Materialized View Refresh on the Cube.
4. If desired, set the Refresh Method and Refresh Mode options.
Enabling materialized view refresh of the cube also enables the dimensions associated with the cube. Once this refresh is enabled, it is not possible to make changes to the cube or dimension structures, such as adding levels or hierarchies. If changes are necessary, you must first disable materialized views. Then you can make the changes and re-enable materialized views.
Setting the QUERY_REWRITE_INTEGRITY Parameter The QUERY_REWRITE_ INTEGRITY parameter has three settings:
- ENFORCED This is the default value. The optimizer will utilize the query rewrite feature only if the summary data represents the current detail values. For example, if the source fact table has been changed since the cube was loaded, the optimizer will not rewrite the query. The materialized view cube must be current, and no changes can have occurred to the detail tables since the last refresh of the materialized view cube.
- TRUSTED With the TRUSTED value, the optimizer assumes that the relationships with foreign keys constraints are correct. It also trusts that declared, but not enabled, primary keys or unique keys are valid. With this setting, data integrity is assumed (not checked with the database constraints); therefore, there may be some invalid data.
- STALE_TOLERATED With STALE_TOLERATED, the optimizer can rewrite queries even though the summary data may not be current. This setting has the risk that the summary data does not reflect the detailed data in the database. For example, detail data has been loaded with new rows; however, the materialized view cube has not yet been refreshed to reflect the current changes.
To set the parameter at the database level, use the following syntax:
query_rewrite_integrity = enforced;
In addition to setting query_rewrite_integrity at the database level, you can set the level of query rewrite for a session. This allows users to set this parameter to meet their needs, as in the following example:
alter session set query_rewrite_integrity = stale_tolerated;
Verifying Query Rewrite Occurred
If the query rewrite happens automatically, how do you check to see if the query rewrite worked? You can run an explain plan to see the execution plan.
The easiest way to run an explain plan is to run the query in SQL Developer, press f6, and examine the results in the Explain tab. Figure 4-28 shows the SQL Developer results
Figure 4-28. Sample execution plan in SQL Developer
Managing Security of Cubes and Dimensions
With Oracle OLAP 11 g, Oracle has now brought an additional security option to OLAP. In the past, access control needed to be provided through the use of OLAP DML programs, was difficult to set up, and was not user-friendly. AWM 11 g includes the capability to control access to OLAP data. This is fully compatible with the virtual private database (VPD) features in the Oracle Database. Oracle VPD enforces security, to a fine level of granularity, directly on database tables, views, or synonyms. Because you attach security policies directly to these database objects, and the policies are applied automatically whenever a user accesses data, there is no way to bypass security. With VPDs, you are able to control users' access to data related to them only, and prevent them from accessing data they are not authorized to see.
Setting Basic Security
You can grant access to users and roles on a particular analytic workspace, just as you can grant access to other database objects such as tables and views. These privileges are SELECT, ALTER, INSERT, and UPDATE. Granting these accesses to
the analytic workspace does not grant these privileges to dimensions or cubes within an analytic workspace; these permissions must be granted separated. It is possible to grant basic access using SQL or using AWM.
The following shows an example of granting privileges. These SQL commands enable user Scott to query the Units cube. They give Scott SELECT privileges on the Global analytic workspace, the cube, and all of its dimensions. Scott also gets privileges on the dimension views, so that he can query the dimension attributes for formatted reports. Notice that we are using standard database GRANTS, which are familiar to all DBAs and users of the Oracle Database.
/* Grant privileges on the analytic workspace */ grant select on olaptrain.aw$salestrack to scott;
/* Grant privileges on the cube */
grant select on olaptrain.sales_cube to scott;
/* Grant privileges on the dimensions */ grant select on olaptrain.channel to scott; grant select on olaptrain.geography to scott; grant select on olaptrain.product to scott; grant select on olaptrain.time to scott;
/* Grant privileges on the cube, dimension, and hierarchy views */
grant select on olaptrain.sales_cube_view to scott;
grant select on olaptrain.channel_view to scott;
grant select on olaptrain.channel_primary_view to scott;
grant select on olaptrain.geography_view to scott;
grant select on olaptrain.customer_shipments_view to scott;
grant select on olaptrain.customer_segments_view to scott;
grant select on olaptrain.product_view to scott;
grant select on olaptrain.product_primary_view to scott;
grant select on olaptrain.time_view to scott;
grant select on olaptrain.time_calendar_view to scott;
/* grant privileges to materialized views using query rewrite */ grant global query rewrite to scott;
Different types of privileges can be granted to users and roles individually for dimensions and cubes. For example, you may want a user to see only the Sales cube and not the Forecast cube. This is considered object-level security, which is set using the Object Security wizard in AWM, as shown in Figure 4-29.
Setting Finer-Grain Security
Data security policies enable you to grant users and roles privileges on specific dimension members. For example, you might restrict district sales managers to the data for just their own districts instead of all geographic areas. While it is possible to set security at the dimension and cube levels, it is not recommended to set this policy only at the cube level. Setting security on the cube level can cause confusion
Figure 4-29. Setting object-level security
for end users; they will be able to choose a dimension member that they are not permitted to see, but the cube will not report that data.
The data security policy on a dimension extends to all cubes within that dimension. You do not need to re-create the policy for each cube. When you create a data security policy on a cube, you select the members for each dimension of the cube. The policy applies to only that cube. When you create data security policies on both dimensions and cubes, users have privileges on the most narrowly defined portion of the data, where the policies overlap.
As soon as you create a data security policy, all other users are automatically denied access. AWM creates a default policy that grants all privileges to the owner. Otherwise, the owner would not be able to access any data.
Set these policies using AWM by selecting the Data Security option under the dimension or cube that will have the policy. Name the policy, and then add a user
Figure 4-30. Creating a data security policy
or role. In the example in Figure 4-30, we have added the user Scott to a new policy called NA_ONLY (NorthAmericaOnly).
Next, use the Member Selection tab to choose the dimension members that will be visible to this user. In this case, we want to limit the Geography to only North America and the members below it. Using the selector, choose Geography and then select North America to appear in the selected panel, as shown in Figure 4-31. Since we do not want to restrict access on any of the other dimensions, we leave them alone. If you look at the condition expression, you will note that the statement WHERE 1 = 1 for all other dimensions. Clicking Create will create this policy, and it will be applied. The next time Scott logs on, he will have access to view the data in the Sales cube, but only for data from North America. This applies to looking at the data via AWM as well as using the SQL views, assuming Scott has access to the views.
Creating Advanced Cubes for Typical Business Purposes
Oracle OLAP has the flexibility and power to handle a wide range of analytical needs. The previous sections have addressed how to work with typical star and snowflake multidimensional models. This section addresses how some more complex problems can be handled using Oracle OLAP. As an example, we show how you can use advanced forecasting techniques to compute additional cube- based measures.
Figure 4-31. Selecting data security members
A close examination of the OLAP DML guide shows that several forecasting and statistical analysis functions are provided in the OLAP engine. The forecasting engine supports basic forecasting, as well as advanced forecasting using the Geneva Forecasting engine, acquired from Roadmap Technologies. OLAP DML supports simple linear regressions, several nonlinear regression methods, single exponential smoothing, double exponential smoothing, and the Holt-Winters method. The OLAP engine also can decide the best fit for your data based on past performance.
For our forecasting program example, the following commands are used to calculate a forecast:
- FCOPEN, to open a forecast full table scan
- FCSET, to specify the options of the forecast
- FCEXEC, to execute the forecast
- FCQUERY, to retrieve information and characteristics about the forecast
- FCCLOSE, to close a forecast
The following is an example forecast:
"Set forecast parameters for 'best fit'
fcset _handle approach 'APPAUTO' periodicity 12 histperiods 36
"Execute the forecast
"save seasonal and seasonal smoothed into the variables just defined
fcexec _handle time time into forecast_best -
seasonal forecast_seasonal -
smseasonal forecast_smseasonal backcast -
We can use this in a program written in OLAP DML to calculate the forecast and store the results back to the cube so that it can be viewed like any other stored variable.
OLAP DML programs are very similar to PL/SQL scripts. The program Load_ Forecast, shown in Figure 4-32, defines variables and then calls the FCEXEC function to compute the forecast and place it in the variable MyForecast.
Figure 4-32. Forecast program
The execution of the forecast takes less than a minute to run for the entire cube. This program can be executed from a SQL or PL/SQL call, or can be incorporated into the maintenance routine of the Forecast cube, using cube scripts.
The full code for the forecasting program is available as part of the OLAPTRAIN demo.
As mentioned earlier in the article, it is possible to have custom steps in Oracle OLAP 11 g cube scripts. In this case, we can add an OLAP DML command step to the LOAD_AND_AGGREGATE cube script for the Forecast cube by doing the following:
- Select the LOAD_AND_AGGREGATE script.
- Delete the existing Load step.
- Add a new step (OLAP Command).
- Save the step and reorder the steps so that this new step is the first step.
Now the Forecast cube can be maintained along with the Sales cube. However, when the maintenance is performed, this cube must be maintained after the Sales cube has been maintained. This is because the data needed for the forecast is in the Sales cube, and it must be present before the forecasting program is run.
Using SQL with OLAP
One of the most useful features introduced with Oracle OLAP is the ability to use SQL queries to access the multidimensional calculation engine and multidimensional data. This single feature dramatically increases the reach and applicability of OLAP to a vast range of BI query and reporting tools, as well as SQL-based custom applications. Many more applications and reporting tools can now benefit from the superior performance, scalability, and functionality of a first-class multidimensional server contained within the Oracle Database. The bottom line is that applications that can connect to an Oracle database instance and execute simple SQL can benefit from analytic workspaces.
Follow these recommendations to gain the maximum benefit from OLAP SQL views:
- Always build your analytic workspaces to Oracle OLAP standard form. This happens automatically if you build them with AWM, OWB, or the supplied API.
- If you are using Oracle OLAP 10g, use the View Generator plug-in for AWM 10g to build your 10g views. If you are using Oracle OLAP 11 g, leverage the automatically generated views.
If you follow this advice, you will save much of time on your project and increase your ability to support the application in the future.
The plug-in for AWM 10g Release 2 is free shareware and can be downloaded from the Oracle Technology Network (OTN) web site. The plug-in adds a simple wizard within AWM. As you follow the steps in the wizard, you choose the measures and other items you need, and then the wizard creates the views for you. The wizard stores the biggest lump of syntax—the limitmap parameter, which describes which analytic workspace objects show up in what columns in your view—inside the analytic workspace itself, in a multiline text variable/measure.
In Oracle Database 11 g, OLAP_TABLE() is still available for you to use. It is sometimes suitable for your needs, as it has many very clever hooks by which you can trigger various OLAP actions whenever a user selects from the view. For most cases, however, the new CUBE_TABLE() function added in Oracle Database 11 g is recommended. CUBE_TABLE() views are what AWM 11 g automatically creates for you when defining the objects inside the analytic workspace. Assuming that you have a valid standard form analytic workspace, such as you might build in AWM 11 g, CUBE_TABLE() is much easier to use than OLAP_TABLE(). For example, the entire syntax required to create a dimension view for a specified hierarchy of that dimension in an analytic workspace is as follows:
create or replace force view mydim_myhier_view as
select * from table( cube_table('MYSCHEMA.MYDIM;MYHIER') );
Remember that AWM 11 g already does this for you. All you need to know about your analytic workspace is the name of the hierarchy (MYHIER), dimension (MYDIM), and schema that the analytic workspace is built in (MYSCHEMA). All the object mappings that you need to tell OLAP_TABLE about, in the limitmap parameter, are automatically done as a result of improvements in the Oracle Database 11 g data dictionary. The data dictionary is now fully aware of all the OLAP objects created by AWM, OWB, and the OLAP API.
AWM 11 g creates the necessary view for each dimension and cube in the analytic workspace. Figure 4-33 shows the view information (from AWM) for the Sales cube view.
Notice that the dimensions and measures are shown in the view. The dimensions have only one column for each dimension, which represents the dimension member or key. There is one row for each member in the cube that has data, for all levels of all dimensions of the cube. This data can either be stored or not stored. The data that is not stored is aggregated on the fly when the data is retrieved from the view. The cube view can be joined with the dimension views in queries to produce data at any
Figure 4-33. Sample cube view
Figure 4-34. Example of selecting data from a cube
desired aggregation level. The data can be used in any SQL-based query tool, just like any other relational data.
A simple SELECT statement such as the one shown in Figure 4-34 on the preceding page yields results from the cube rapidly. The performance of the SELECT statement is not significantly affected by the complexity of the calculations, unlike a similar SELECT from relational tables. Remember that many of these measures are calculated on the fly, and yet the SELECT performance is still excellent.
Oracle OLAP is a logical extension of the Oracle Database that increases the power of the database. The Oracle OLAP environment provides a great deal of flexibility in building multidimensional applications. We have shown how easy it is to take relational data and build simple but powerful OLAP cubes that can greatly improve the ability to provide rapid answers to end-user questions. By using the built-in features of AWM, or by extending it with the functionality of OLAP DML, you can take those simple solutions and extend them into complex applications that can solve just about any BI problem.
If you keep in mind Oracle OLAP's strengths and take into account what it is going through to service your requests, you can create more efficient applications. As is true with any computer environment, having an understanding of what the software is doing at a lower level helps to enhance database design and performance.