Let’s start by taking a brief look at the history of database use before introducing the architecture of the future. As you can see in Figure 1-1, we will refer not to dates but to Oracle version numbers, going back to the time when the Oracle Database became predominant.
When Oracle Database 8i and 9i were on the market, midrange computers became prevalent in data centers. We were moving from the mainframe era to a client/server era, and the Oracle architecture was very well suited for that. Written in the C programming language, it was available on multiple platforms, with all user management contained within its data dictionary. It was ready for the client/server architecture, using the OS only to listen on a TCP/IP port and to store files. Furthermore, the architecture was scalable on minicomputers, thanks to the parallel server feature, which would later become RAC (Real Application Clusters).
FIGURE 1-1. From IT consolidation to the cloud
These databases were growing along with the overall number of servers. It was common at that time to have a lot of physical servers with direct attached disks (DAS), each running one or two Oracle Database 8i or 9i instances.
With the number of databases increasing, it became a nightmare to maintain all those servers and disks. Having internal disks made capacity planning tremendously difficult when facing the exponential increase of data. By the time Oracle Database 10g was introduced, we needed to consolidate storage, so we put the database datafiles into a storage array, shared by all servers through a storage area network (SAN); that was storage consolidation.
By the time Oracle Database 11g rolled around, the prevailing thought was to do with servers what had been done with disks earlier. Instead of sizing and maintaining multiple servers, virtualization software brought us the possibility of putting our physical servers in a farm and provisioning virtual machines on top of them. That was the way in this previous era: application servers, SAN or network-attached storage (NAS), and virtual machines.
And now Oracle Database 12c accompanies a new day. Even organizations with consolidated storage and servers now realize that operating this infrastructure is not their core business; instead, they demand IT as a service, which is both scalable and flexible. Small companies want to provision their IT from a public cloud, and larger companies build their own private clouds. In both cases, virtualization can provide Infrastructure as a Service (IaaS). But we also want Application as a Service (AaaS) and need Database as a Service (DBaaS). This is a significant change in the technology ecosystem, similar in scale and importance to the earlier move from client/server to application servers. This new direction will not be immediate; it will take time. But it is safe to predict that over the next ten years, the hybrid model (on-premises/cloud) will start strong but be slowly supplanted by the cloud.
As is expected, a new era has different requirements, and the future of databases seems bound up with consolidation, agile development, and rapid provisioning. For Oracle, some such features came progressively from Oracle Database 9i to 11g, such as easy transport of data, cloning, and thin provisioning. But two core architectural foundations came from the previous era and were not ready to accommodate consolidation: the need to run one instance per database, and having one data dictionary per database. Oracle Database 12c provides the answer: multitenancy. Retaining its portability philosophy, Oracle has designed this architecture to enable you to run your application on the same database, with the same software running from small server to large cloud.
The Road to Multitenant
This new era is about consolidation. Some people can imagine it as a centralized system and centralized administration, recalling the time of mainframes. But there is another challenge that comes with it: we need more and more agility. Provisioning a database is not an easy operation today, and we cannot make it worse.
Consider an example. You are an Oracle DBA. A developer comes to your desk and asks for a new database; she is under the impression that this is a simple demand, merely requiring a few clicks of an administration interface. You look at her, wide-eyed, and tell her she has to fill out a request with specifics related to storage, memory, CPU, and availability requirements. Furthermore, you explain, the request will have to be approved by management, and then it will take a few days or a week to set up. And here begins a pattern of misunderstanding between dev and ops.
The developer probably hasn’t worked with Oracle databases before, so she has some notion of a database as a lightweight container for her application tables—and in many other non-Oracle RDBMSs, this is actually what is referred to as a “database.”
In Oracle, however, we have lightweight containers—schemas at logical level and tablespaces at physical level—but a database is an entity that comprises much more than that. An Oracle database is a set of schemas and tablespaces, plus all the metadata required to manage them (the data dictionary), along with a significant amount of PL/SQL code to implement the features (the DBMS packages). Each database must have its own instance, including a number of background processes and shared memory. And each database also has a structure to protect the transactions, comprising undo tablespaces and redo logs.
For these reasons, provisioning a new database is not a trivial operation. To do so, you must interact with the system administrators and the storage teams, because you need server and disk resources for it. You don’t want to put too many instances on the same server, but you can’t have a dedicated physical server for each database. Because of this, today we often virtualize and have a dedicated virtual machine (VM) for each instance, but this is not possible for every application, for every environment, in any agile sort of way—there are just too many of them. Furthermore, you end up wasting a lot of resources when you have to allocate server, storage, and instance for each database.
Prior to Oracle Database 12c, the answer to the developer in this scenario was probably to create a new schema for her in an existing database. But this solution is not always possible or feasible. Let’s explain why.
Schema consolidation was exactly the objective prior to Oracle 12c. Each application had one schema owner, or a set of schemas if you wanted to separate tables from procedures. The applications were logically independent of one another, and security was controlled by grants.
Physically, you dedicated tablespaces to each application. This meant that, in case of a datafile loss, only one application was offline during the restore, which would also be the case if you wanted to relocate the tablespace to another filesystem. However, everything else was shared to optimize resource usage: instance processes and memory, and the SYSTEM and SYSAUX tablespaces with the data dictionary.
The backup strategy is common, and the high availability (HA) policy is common. One DBA administers one database, and several applications run on it. This is exactly what the Oracle Database was designed for from its first versions.
A large number of operations in the Oracle Database can be performed at the tablespace level. This is especially true since the introduction of the transportable tablespaces feature, which enables you to physically copy your application datafiles to another database, and even to a newer version. Transportable tablespaces are significant because they are a direct ancestor of multitenant. The Oracle Corporation patent for transportable tablespaces, published in 1997, was entitled “Pluggable tablespaces for database systems.” And the multitenant architecture is the foundation for pluggable databases.
In this context, pluggable means that you can directly plug a physical structure (datafile) into a database and have it become part of that database. The transportable tablespaces feature enabled user tablespace datafiles to be plugged into the database. Then only the metadata (dictionary entries) had to be imported so that the logical object definitions matched what was stored physically in the datafiles.
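As a sketch of that classic workflow (the tablespace names APP_DATA and APP_IDX, the directory object DP_DIR, and the file paths are hypothetical examples, not names from any particular system):

```sql
-- Source database: verify that the tablespace set is self-contained
EXEC DBMS_TTS.TRANSPORT_SET_CHECK('APP_DATA,APP_IDX', TRUE);
SELECT * FROM transport_set_violations;

-- Make the tablespaces read-only before copying their datafiles
ALTER TABLESPACE app_data READ ONLY;
ALTER TABLESPACE app_idx READ ONLY;

-- Export only the metadata with Data Pump (run at the OS prompt):
--   expdp system DIRECTORY=dp_dir DUMPFILE=app_tts.dmp
--         TRANSPORT_TABLESPACES=app_data,app_idx
-- Then copy the datafiles to the target server and plug them in:
--   impdp system DIRECTORY=dp_dir DUMPFILE=app_tts.dmp
--         TRANSPORT_DATAFILES='/u01/oradata/app_data01.dbf'
```

The datafiles themselves are never converted or reloaded; only the small metadata export travels through Data Pump, which is what makes the operation fast for large volumes of data.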
In 12c you can transport a whole database, which is nothing less than transporting all user tablespaces: a full transportable export (FULL=Y TRANSPORTABLE=ALWAYS). But the metadata still has to be transferred logically, and that operation can be lengthy if you have thousands of tables, even if those tables are empty. For example, if you want to migrate a PeopleSoft database, which has more than 20,000 tables, the metadata import alone can take hours to create all those empty tables.
As you will see, with the superior implementation in multitenant, the transport of pluggable databases is actually the transport of all datafiles, including SYSTEM and SYSAUX, which store the data dictionary, and perhaps even the UNDO tablespace. This means that all metadata is also moved physically and, as such, the operation is relatively quick.
Schema Name Collision
Schema consolidation is in fact difficult to achieve in real life. You want to consolidate multiple applications into the same database, along with multiple test environments of the same application, but you are faced with a number of application constraints.
What do you do if the schema owner is hard-coded into the application and you cannot change it? We were once involved in installing a telco billing application that had to be deployed in a schema called PB, and we wanted to consolidate multiple environments into the test database, but that was forbidden. The reason was that the schema name was hard-coded into the application, and in packages, and so on. We better understood that strange schema name when we hosted a consultant from the application vendor. You may be able to guess what his initials were.
If the application design is under your control, you can avoid this problem, and needless to say, you should never hard-code the schema name. You can connect as any user and simply issue ALTER SESSION SET CURRENT_SCHEMA so that all unqualified object references resolve to the application schema owner. And what if you have multiple schemas? It’s not a bad idea to have multiple schemas for your application. For example, you can separate data (tables) from code (PL/SQL packages); that makes for good isolation and encapsulation of data. But even in that case, you don’t need to hard-code the table schema name in the packages. Just create synonyms for the tables in the package schema, which will reference the objects in the table schemas. You reference them from your PL/SQL code without a schema name (synonyms are resolved in the current schema), and they point to the other schema. If a name changes, you have to re-create only those synonyms, which can be done easily and automatically.
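A minimal sketch of this approach (the schema names APP_OWNER, APP_CODE, and APP_DATA and the table ORDERS are hypothetical):

```sql
-- Resolve unqualified names to the application owner without
-- connecting as that user
ALTER SESSION SET CURRENT_SCHEMA = app_owner;
SELECT COUNT(*) FROM orders;  -- resolves to APP_OWNER.ORDERS

-- Two-schema design: the code schema reaches the data schema
-- through a private synonym, so packages stay schema-neutral
CREATE SYNONYM app_code.orders FOR app_data.orders;

-- If the data schema is renamed, only the synonym is re-created;
-- no package code changes
CREATE OR REPLACE SYNONYM app_code.orders FOR app_data_v2.orders;
```

Because the synonym is private to the package schema, several environments of the same application can coexist in one database, each with its own pair of schemas and its own synonyms.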
Public Synonyms and Database Links
With the above-mentioned synonyms, we were talking about private synonyms, of course. Don’t use public synonyms: they monopolize the whole namespace. When an application creates public synonyms, you cannot consolidate anything else into that database. That’s a limitation for schema consolidation: objects that do not belong to a specific schema can collide with other applications, or with other versions or environments of the same application.
Roles, Tablespace Names, and Directories
An application can define and reference other objects that live in the database public namespace; such is the case for roles, directories, and tablespace names. An application for which several environments are to be consolidated into the same database must have parameters in its Data Definition Language (DDL) scripts so that those database object names can be customized for each environment. If this is not the case, schema consolidation will be difficult.
Those public objects that do not pertain to a schema also make data movement more complex. For example, when you use Data Pump to import a schema, those objects may need to be created beforehand.
Even with an application that is designed for schema consolidation, you may encounter performance issues when consolidating everything into the same database. We once had a database with 3000 identical schemas. They were data marts: same structure, different data.
And, obviously, the application code was the same. The user connected to one data mart and ran the queries that were coded into the application. This meant that the same queries—exactly the same SQL text—were run on different schemas. If you know how cursor sharing works in Oracle, you can immediately see the problem: one cursor has thousands of child cursors. A parent cursor is shared by all statements with identical SQL text, and child cursors are created when the underlying objects differ, which is the case when you are not in the same schema. Parsing has to walk a long chained list of child cursors, holding library cache latches during the search, and that means huge library cache contention.
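You can observe this situation on a live system from the shared pool views; a sketch (the threshold of 100 child cursors is an arbitrary example):

```sql
-- Statements with many child cursors under one parent
SELECT sql_id, version_count, executions, sql_text
FROM   v$sqlarea
WHERE  version_count > 100
ORDER  BY version_count DESC;

-- Why the children could not be shared; with identical SQL run
-- against many schemas, the reason is typically a translation
-- (object resolution) mismatch
SELECT sql_id, child_number, translation_mismatch
FROM   v$sql_shared_cursor
WHERE  sql_id = '&sql_id';
```

A high VERSION_COUNT on frequently parsed statements is the signature of the contention described above.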
In multitenant, the parent cursors are shared for consolidation purposes, and enhancements to the child cursor search can alleviate this problem.
When you want to consolidate the data for multiple environments of the same application and same version of an application, which means that the tables have exactly the same structure, you can put everything into the same table. This is usually done by adding an environment identifier (company, country, market, and so on) into each primary key. The advantage of this is that you can manage everything at once. For example, when you want to add an index, you can add it for all environments.
For performance and maintenance reasons, you can separate the data physically by partitioning those tables on the environment ID and put the partitions into different tablespaces. However, the level of isolation is very low, and that affects performance, security, and availability.
In practice, most of the applications designed like this store only one environment. In most cases, the ID added in front of each primary key has only one value, and this is why Oracle introduced the index skip scan optimization. You can build a Virtual Private Database (VPD) policy to manage access to those environments. You can manage the partitions independently, even at the physical level, with partition exchange. If you want to see an example of this, look at the RMAN repository: information for all registered databases is stored in the same tables. However, the isolation is not sufficient to store different environments (test, development, production) or different versions (where the data model differs).
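The environment-ID design discussed above can be sketched as follows (the table ORDERS, the column ENV_ID, and the tablespace names are hypothetical):

```sql
-- One table holds all environments; ENV_ID leads the primary key,
-- and list partitioning separates the data physically
CREATE TABLE orders (
  env_id   VARCHAR2(10) NOT NULL,
  order_id NUMBER       NOT NULL,
  amount   NUMBER,
  CONSTRAINT orders_pk PRIMARY KEY (env_id, order_id)
)
PARTITION BY LIST (env_id) (
  PARTITION p_dev  VALUES ('DEV')  TABLESPACE ts_dev,
  PARTITION p_test VALUES ('TEST') TABLESPACE ts_test,
  PARTITION p_prod VALUES ('PROD') TABLESPACE ts_prod
);

-- Each environment's partition can be managed independently,
-- for example swapped out via partition exchange with a staging
-- table of identical structure
ALTER TABLE orders EXCHANGE PARTITION p_test WITH TABLE orders_stage;
```

An index on (ENV_ID, …) with only one ENV_ID value is exactly the situation the index skip scan optimization was designed for.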
If you want several independent databases but don’t want to manage one server for each, you can consolidate several instances on the same server. If you go to Oracle’s Ask Tom site (asktom.oracle.com/) for questions about the recommended number of instances per server, Tom Kyte’s answer is this: “We recommend no more than ONE instance per host—a host can be a virtual machine, a real machine, we don’t care—but you want ONE HOST = ONE INSTANCE.” In real life, however, most database servers we have seen run several instances. You can install multiple versions of Oracle (multiple ORACLE_HOMEs), and you can have a lot of instances running on one server; you often have to. We have seen servers running as many as 70 instances.
There are a few ways to isolate resources between instances. For memory, you can divide the physical memory among instances by capping the shared memory of each with SGA_MAX_SIZE, and in 12c you can even limit process memory with PGA_AGGREGATE_LIMIT. You can also limit the CPU used by each instance with instance caging, setting for each instance the maximum number of CPUs its processes can use concurrently. And with the latest license, Standard Edition 2, you don’t even need Enterprise Edition to do instance caging.
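As a sketch, these limits are ordinary initialization parameters (the values shown are arbitrary examples, not recommendations):

```sql
-- Cap shared and process memory for this instance
ALTER SYSTEM SET sga_max_size = 8G SCOPE=SPFILE;  -- takes effect at restart
ALTER SYSTEM SET pga_aggregate_limit = 4G;        -- 12c and later

-- Instance caging: cap the CPUs this instance may use; a Resource
-- Manager plan must be active for the cap to be enforced
ALTER SYSTEM SET cpu_count = 4;
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN';
```

With CPU_COUNT set and a Resource Manager plan active, the instance schedules no more runnable processes than the cap allows, which keeps one busy instance from starving its neighbors on the same server.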
Running a lot of instances on one server is still a problem, however. For example, when you reboot the server, you will have a lot of processes to start and memory to allocate. A server outage, planned or not, will penalize a lot of applications. And you waste a lot of resources by multiplying System Global Areas (SGAs) and data dictionaries.
Today, virtualization is a good way to run only one instance per server without having to manage a lot of physical servers. You get good isolation of environments; you can allocate CPU, memory, and I/O bandwidth within limits; and you can even isolate the machines on different networks. However, even if those servers are virtual machines, you don’t solve the resource wastage of multiple operating systems, Oracle software installations, memory areas, and dictionaries. And you still have multiple databases to manage: to back up, to protect with high availability and Data Guard, and so on. And you have multiple OSs to patch and monitor.
In addition to that, virtualization can be a licensing nightmare. When Oracle software is licensed by the processors where the software is installed, Oracle considers that, on some virtualization technologies, the software is installed everywhere the VM can run. The rules depend on the hypervisor vendor and on the version of this hypervisor.
Multiple Databases Managed by One Instance
The idea, then, is to find the consolidation level that fits both the isolation of the environment and the consolidation of resources. This is at a higher level than schema consolidation, but at a lower level than the instance and the database as we know it today. It means you can have several databases managed by the same instance on the same server.
This did not exist in versions of Oracle Database prior to 12c, but it is now possible with the multitenant architecture: one container database can manage multiple pluggable databases. In addition to providing a new level of consolidation that can be seen as an independent database, the pluggable database architecture brings agility in provisioning, moving, and upgrading.
Summary of Consolidation Strategies
Table 1-1 briefly summarizes the different consolidation alternatives prior to multitenant.
TABLE 1-1. Consolidation Strategies Pros and Cons