The following sections describe common features you can use with Data Pump. Many of these features are standard with Data Pump, such as creating a consistent export and taking action when imported objects already exist in the database. Other features, such as compression and encryption, require the Enterprise Edition of Oracle or an extra license, or both. I’ll point out these requirements (where relevant) for the Data Pump feature being covered.
Estimating the Size of Export Jobs
If you’re about to export a large amount of data, you can estimate the size of the file that Data Pump creates before you run the export. You may want to do this because you’re concerned about the amount of space an export job needs.
To estimate the size, use the ESTIMATE_ONLY parameter. This example estimates the size of the export file for an entire database:
$ expdp mv_maint/foo estimate_only=y full=y nologfile=y
Here is a snippet of the output:
Estimate in progress using BLOCKS method...
Total estimation using BLOCKS method: 6.75 GB
Similarly, you can specify a schema name to get an estimate of the size required to export a user:
$ expdp mv_maint/foo estimate_only=y schemas=star2 nologfile=y
Here is an example of estimating the size required for two tables:
$ expdp mv_maint/foo estimate_only=y tables=star2.f_configs,star2.f_installations \
nologfile=y
Listing the Contents of Dump Files
Data Pump has a very robust method of creating a file that contains all the SQL that’s executed when an import job runs. Data Pump uses the DBMS_METADATA package to create the DDL that you can use to recreate objects in the Data Pump dump file.
Use the SQLFILE option of Data Pump import to list the contents of a Data Pump export file. This example creates a file named expfull.sql, containing the SQL statements that the import process calls (the file is placed in the directory defined by the DPUMP_DIR2 directory object):
$ impdp hr/hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp \
SQLFILE=dpump_dir2:expfull.sql
If you don’t specify a separate directory (such as dpump_dir2, in the previous example), then the SQL file is written to the location specified in the DIRECTORY option.
Tip You must run the previous command as a user with DBA privileges or the schema that performed the Data Pump export. Otherwise, you get an empty SQL file without the expected SQL statements in it.
When you use the SQLFILE option with an import, the impdp process doesn’t import any data; it only creates a file that contains the SQL commands that would be run by the import process. It’s sometimes handy to generate an SQL file for the following reasons:
- Preview and verify the SQL statements before running the import
- Run the SQL manually to precreate database objects
- Capture the SQL that would be required to recreate database objects (users, tables, indexes, and so on)
In regard to the last bulleted item, sometimes what’s checked into the source code control repository doesn’t match what’s really been applied to the production database. This procedure can be handy for troubleshooting or documenting the state of the database at a point in time.
Cloning a User
Suppose you need to move a user’s objects and data to a new database. As part of the migration, you want to rename the user. First, create a schema-level export file that contains the user you want to clone. In this example the user name is INV:
$ expdp mv_maint/foo directory=dp_dir schemas=inv dumpfile=inv.dmp
Now, you can use Data Pump import to clone the user. If you want to move the user to a different database, copy the dump file to the remote database, and use the REMAP_SCHEMA parameter to create a copy of a user. In this example the INV user is cloned to the INV_DW user:
$ impdp mv_maint/foo directory=dp_dir remap_schema=inv:inv_dw dumpfile=inv.dmp
This command copies all structures and data in the INV user to the INV_DW user. The resulting INV_DW user is identical, in terms of objects, to the INV user. The duplicated schema also contains the same password as the schema from which it was copied.
If you just want to duplicate the metadata from one schema to another, use the CONTENT parameter with the METADATA_ONLY option:
$ impdp mv_maint/foo directory=dp_dir remap_schema=inv:inv_dw \
content=metadata_only dumpfile=inv.dmp
The REMAP_SCHEMA parameter provides an efficient way to duplicate a schema, with or without the data. During a schema duplication operation, if you want to change the tablespace in which the objects reside, also use the REMAP_TABLESPACE parameter. This allows you to duplicate a schema and also place the objects in a tablespace different from that of the source objects.
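As an example, the following parameter file clones the INV user into INV_DW and relocates the objects from the USERS tablespace to a DW_DATA tablespace (the tablespace names here are hypothetical):

```
userid=mv_maint/foo
directory=dp_dir
dumpfile=inv.dmp
remap_schema=inv:inv_dw
remap_tablespace=users:dw_data
```

If the source objects span several tablespaces, you can specify REMAP_TABLESPACE multiple times, once per source/target pair.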
You can also duplicate a user from one database to another without first creating a dump file. To do this, use the NETWORK_LINK parameter. See the section “Exporting and Importing Directly Across the Network,” earlier in this chapter, for details on copying data directly from one database to another.
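For instance, assuming a database link named OLD_DB has been created in the destination database and points back at the source (the link name here is hypothetical), a parameter file for a network-based clone might look like this:

```
userid=mv_maint/foo
directory=dp_dir
network_link=old_db
schemas=inv
remap_schema=inv:inv_dw
```

Running impdp with this parameter file pulls the INV schema across the link and creates it as INV_DW, with no intermediate dump file.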
Creating a Consistent Export
A consistent export means that all data in the export file are consistent as of a point in time or a system change number (SCN). When you’re exporting an active database with many parent-child tables, you should ensure that you get a consistent snapshot of the data.
Tip If you’re using Oracle 11g Release 2 or higher, you can take a consistent export by using the legacy mode parameter CONSISTENT=Y. See the section “Data Pump Legacy Mode,” later in this chapter, for details.
You create a consistent export by using either the FLASHBACK_SCN or FLASHBACK_TIME parameter. This example uses the FLASHBACK_SCN parameter to take an export. To determine the current value of the SCN of your data set, issue this query:
SQL> select current_scn from v$database;
Here is some typical output:
CURRENT_SCN
-----------
5715397
The following command takes a consistent full export of the database, using the FLASHBACK_SCN parameter:
$ expdp mv_maint/foo directory=dp_dir full=y flashback_scn=5715397 \
dumpfile=full.dmp
The previous export command ensures that all exported data are consistent as of the specified SCN: any transactions committed after that SCN aren’t included in the export file.
Note If you use the NETWORK_LINK parameter in conjunction with FLASHBACK_SCN, then the export is taken with the SCN consistent with the database referenced in the database link.
You can also use FLASHBACK_TIME to specify that the export file should be created with consistent committed transactions as of a specified time. When using FLASHBACK_TIME, Oracle determines the SCN that most closely matches the time specified and uses that to produce an export consistent with that SCN. The syntax for using FLASHBACK_TIME is as follows:
FLASHBACK_TIME="TO_TIMESTAMP(<value>)"
For some OSs, double quotation marks appearing directly on the command line must be escaped by a backslash (\), because the OS treats them as special characters. For this reason, it’s much more straightforward to use a parameter file. Here are the contents of a parameter file that uses FLASHBACK_TIME:
directory=dp_dir
content=metadata_only
dumpfile=inv.dmp
flashback_time="to_timestamp('24-jan-2014 07:03:00','dd-mon-yyyy hh24:mi:ss')"
Depending on your OS, the command line version of the previous example must be specified as follows:
flashback_time=\"to_timestamp\(\'24-jan-2014 07:03:00\',
\'dd-mon-yyyy hh24:mi:ss\'\)\"
This line of code should be specified on one line. Here, the code has been placed on two lines in order to fit on the page.
You can’t specify both FLASHBACK_SCN and FLASHBACK_TIME when taking an export; these two parameters are mutually exclusive. If you attempt to use both parameters at the same time, Data Pump throws the following error message and halts the export job:
ORA-39050: parameter FLASHBACK_TIME is incompatible with parameter FLASHBACK_SCN
Importing When Objects Already Exist
When exporting and importing data, you often import into schemas in which the objects (tables, indexes, and so on) have already been created. In this situation, you want Data Pump to load the data but not attempt to create objects that already exist.
You can achieve this with the TABLE_EXISTS_ACTION and CONTENT parameters. The next example instructs Data Pump to append data in any tables that already exist via the TABLE_EXISTS_ACTION=APPEND option. Also used is the CONTENT=DATA_ONLY option, which instructs Data Pump not to run any DDL to create objects (only to load data):
$ impdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp \
table_exists_action=append content=data_only
Existing objects aren’t modified in any way, and any new data in the dump file are inserted into the existing tables.
You may wonder what happens if you just use the TABLE_EXISTS_ACTION option and don’t combine it with the CONTENT option:
$ impdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp \
table_exists_action=append
The only difference is that Data Pump also attempts to run the DDL commands to create the objects. This doesn’t stop the job from running; you simply see a message in the output indicating that the object already exists. Here is a snippet of the output for the previous command:
Table "MV_MAINT"."INV" exists. Data will be appended ...
The default for the TABLE_EXISTS_ACTION parameter is SKIP, unless you also specify the parameter CONTENT=DATA_ONLY. If you use CONTENT=DATA_ONLY, then the default for TABLE_EXISTS_ACTION is APPEND.
The TABLE_EXISTS_ACTION parameter takes the following options:
- SKIP (default if not combined with CONTENT=DATA_ONLY)
- APPEND (default if combined with CONTENT=DATA_ONLY)
- REPLACE
- TRUNCATE
The SKIP option tells Data Pump not to process the object if it exists. The APPEND option instructs Data Pump not to delete existing data, but rather, to add data to the table without modifying any existing data. The REPLACE option instructs Data Pump to drop and recreate objects; this option isn’t valid when the CONTENT parameter is used with the DATA_ONLY option. The TRUNCATE option tells Data Pump to remove the existing rows via a TRUNCATE statement and then load the data.
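For example, to truncate existing tables and reload them with the data from the dump file, without running any DDL, a parameter file might look like this (the dump file name is assumed):

```
userid=mv_maint/foo
directory=dp_dir
dumpfile=inv.dmp
content=data_only
table_exists_action=truncate
```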
The CONTENT parameter takes the following options:
- ALL (default)
- DATA_ONLY
- METADATA_ONLY
The ALL option instructs Data Pump to load both data and metadata contained in the dump file; this is the default behavior. The DATA_ONLY option tells Data Pump to load only table data into existing tables; no database objects are created. The METADATA_ONLY option only creates objects; no data are loaded.
Renaming a Table
Starting with Oracle 11g, you have the option of renaming a table during import operations. There are many reasons you may want to rename a table when importing it. For instance, you may have a table in the target schema that has the same name as the table you want to import. You can rename a table when importing by using the REMAP_TABLE parameter. This example imports the INV table, owned by the HEERA user, into a table named INVEN in the same schema:
$ impdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp tables=heera.inv \
remap_table=heera.inv:inven
Here is the general syntax for renaming a table:
REMAP_TABLE=[schema.]old_tablename[.partition]:new_tablename
Note that this syntax doesn’t allow you to rename a table into a different schema. If you’re not careful, you may attempt to do the following (thinking that you’re moving a table and renaming it in one operation):
$ impdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp tables=heera.inv \
remap_table=heera.inv:scott.inven
In the prior example, you end up with a table in the HEERA schema named SCOTT.INVEN, because everything after the colon is treated as the new table name. That can be confusing.
Note The process of renaming a table wasn’t entirely bug free in Oracle 11g Release 1 but has been corrected in Oracle 11g Release 2. See MOS Note 886762.1 for more details.
Remapping Data
Starting with Oracle 11g, when either exporting or importing, you can apply a PL/SQL function to alter a column value. For example, you may have an auditor who needs to look at the data, and one requirement is that you apply a simple obfuscation function to sensitive columns. The data don’t need to be encrypted; they just need to be changed enough that the auditor can’t readily determine the value of the LAST_NAME column in the CUSTOMERS table.
This example first creates a simple package that is used to obfuscate the data:
create or replace package obfus is
  function obf(clear_string varchar2) return varchar2;
  function unobf(obs_string varchar2) return varchar2;
end obfus;
/
--
create or replace package body obfus is
  fromstr varchar2(62) := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' ||
                          'abcdefghijklmnopqrstuvwxyz';
  tostr   varchar2(62) := 'defghijklmnopqrstuvwxyzabc3456789012' ||
                          'KLMNOPQRSTUVWXYZABCDEFGHIJ';
  --
  function obf(clear_string varchar2) return varchar2 is
  begin
    return translate(clear_string, fromstr, tostr);
  end obf;
  --
  function unobf(obs_string varchar2) return varchar2 is
  begin
    return translate(obs_string, tostr, fromstr);
  end unobf;
end obfus;
/
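Because TRANSLATE performs a simple character-for-character substitution, you can sanity-check the mapping outside the database with the Unix tr command; this sketch is purely illustrative:

```shell
#!/bin/bash
# The same 62-character source and target strings used in the obfus package
from='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
to='defghijklmnopqrstuvwxyzabc3456789012KLMNOPQRSTUVWXYZABCDEFGHIJ'

# Obfuscate a value, then apply the reverse mapping to recover it
obf=$(printf '%s' 'Lopuz' | tr "$from" "$to")
unobf=$(printf '%s' "$obf" | tr "$to" "$from")
echo "$obf"    # prints yYZEJ
echo "$unobf"  # prints Lopuz
```

Because every character in the target string is distinct, the mapping is reversible, which is what makes the UNOBF function work.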
Now, when you import the data into the database, you apply the obfuscation function to the LAST_NAME column of the CUSTOMERS table:
$ impdp mv_maint/foo directory=dp_dir dumpfile=cust.dmp tables=customers \
remap_data=customers.last_name:obfus.obf
Selecting LAST_NAME from CUSTOMERS shows that it has been imported in an obfuscated manner:
SQL> select last_name from customers;
LAST_NAME
------------------
yYZEJ
tOXXSMU
xERX
You can manually apply the package’s UNOBF function to see the real values of the column:
SQL> select obfus.unobf(last_name) from customers;
OBFUS.UNOBF(LAST_NAME)
-------------------------
Lopuz
Gennick
Kuhn
Suppressing a Log File
By default, Data Pump creates a log file when generating an export or an import. If you know that you don’t want a log file generated, you can suppress it by specifying the NOLOGFILE parameter. Here is an example:
$ expdp mv_maint/foo directory=dp_dir tables=inv nologfile=y
If you choose not to create a log file, Data Pump still displays status messages on the output device. In general, I recommend that you create a log file with every Data Pump operation. This gives you an audit trail of your actions.
Using Parallelism
Use the PARALLEL parameter to parallelize a Data Pump job. For instance, if you know you have four CPUs on a box, and you want to set the degree of parallelism to 4, use PARALLEL as follows:
$ expdp mv_maint/foo parallel=4 dumpfile=exp.dmp directory=dp_dir full=y
To take full advantage of the parallel feature, ensure that you specify multiple files when exporting. The following example creates one file for each thread of parallelism:
$ expdp mv_maint/foo parallel=4 dumpfile=exp1.dmp,exp2.dmp,exp3.dmp,exp4.dmp
You can also use the %U substitution variable to instruct Data Pump to create dump files automatically to match the degree of parallelism. The %U variable starts at the value 01 and increments as additional dump files are allocated. This example uses the %U variable:
$ expdp mv_maint/foo parallel=4 dumpfile=exp%U.dmp
Now, say you need to import from the dump files created from an export. You can either individually specify the dump files or, if the dump files were created with the %U variable, use that on import:
$ impdp mv_maint/foo parallel=4 dumpfile=exp%U.dmp
In the prior example the import process starts by looking for a file with the name exp01.dmp, then exp02.dmp, and so on.
Tip Oracle recommends that the degree of parallelism not be set to more than two times the number of CPUs available on the server.
You can also modify the degree of parallelism while the job is running. First, attach in the interactive command mode to the job (see the section “Interactive Command Mode,” later in this chapter) for which you want to modify the degree of parallelism. Then, use the PARALLEL option. In this example the job attached to is SYS_IMPORT_TABLE_01:
$ impdp mv_maint/foo attach=sys_import_table_01
Import> parallel=6
You can check the degree of parallelism via the STATUS command:
Import> status
Here is some sample output:
Job: SYS_IMPORT_TABLE_01
Operation: IMPORT
Mode: TABLE
State: EXECUTING
Bytes Processed: 0
Current Parallelism: 6
Note The PARALLEL feature is only available in the Enterprise Edition of Oracle.
Specifying Additional Dump Files
If you run out of space in the primary dump file location, you can specify additional dump files (and locations) on the fly. Use the ADD_FILE command from the interactive command prompt. Here is the basic syntax for adding files:
ADD_FILE=[directory_object:]file_name [,...]
This example adds another output file to an already existing Data Pump export job:
Export> add_file=alt2.dmp
You can also specify a separate database directory object:
Export> add_file=alt_dir:alt3.dmp
Reusing Output File Names
By default, Data Pump doesn’t overwrite an existing dump file. For example, the first time you run this job, it will run fine because there is no dump file named inv.dmp in the directory being used:
$ expdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp
If you attempt to run the previous command again with the same directory and the same dump file name, this error is thrown:
ORA-31641: unable to create dump file "/oradump/inv.dmp"
You can either specify a new dump file name for the export job or use the REUSE_DUMPFILES parameter to direct Data Pump to overwrite an existing dump file; for example,
$ expdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp reuse_dumpfiles=y
You should now be able to run the Data Pump export regardless of an existing dump file with the same name in the output directory. When you set REUSE_DUMPFILES to a value of y, if Data Pump finds a dump file with the same name, it overwrites the file.
Note The default value for REUSE_DUMPFILES is n. The REUSE_DUMPFILES parameter is available only in Oracle 11g and higher.
Creating a Daily DDL File
Sometimes, in database environments, changes occur to database objects in unexpected ways. You may have a developer who somehow obtains the production user passwords and decides to make a change on the fly, without telling anybody. Or a DBA may decide not to follow the standard release process and make a change to an object while troubleshooting an issue. These scenarios can be frustrating for production-support DBAs. Whenever there is an issue, the first question raised is, “What changed?”
When you use Data Pump, it’s fairly simple to create a file that contains all the DDL to recreate every object in your database. You can instruct Data Pump to export or import just the metadata via the CONTENT=METADATA_ONLY option.
For instance, in a production environment, you can set up a daily job to capture this DDL. If there is ever a question about what changed and when, you can go back and compare the DDL in the daily dump files.
Listed next is a simple shell script that first exports the metadata content from the database and then uses Data Pump import to create a DDL file from that export:
#!/bin/bash
export ORACLE_SID=O12C
export ORACLE_HOME=/orahome/app/oracle/product/12.1.0.1/db_1
export PATH=$PATH:$ORACLE_HOME/bin
#
DAY=$(date +%Y_%m_%d)
SID=DWREP
#---------------------------------------------------
# First create export dump file with metadata only
expdp mv_maint/foo dumpfile=${SID}.${DAY}.dmp content=metadata_only \
directory=dp_dir full=y logfile=${SID}.${DAY}.log
#---------------------------------------------------
# Now create DDL file from the export dump file.
impdp mv_maint/foo directory=dp_dir dumpfile=${SID}.${DAY}.dmp \
SQLFILE=${SID}.${DAY}.sql logfile=${SID}.${DAY}.sql.log
#
exit 0
This code listing depends on a database directory object’s being created that points to where you want the daily dump file to be written. You may also want to set up another job that periodically deletes any files older than a certain amount of time.
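For the cleanup, a find command in a scheduled job is usually sufficient. In this sketch, the /oradump path is an assumption; substitute the directory your database directory object points to:

```shell
#!/bin/bash
# Remove dump files and logs more than 7 days old from the Data Pump directory.
# The /oradump path is an assumption; adjust it to your environment.
DUMP_DIR=/oradump
find "$DUMP_DIR" -maxdepth 1 -type f \
  \( -name '*.dmp' -o -name '*.log' \) -mtime +7 -exec rm -f {} \;
```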
Compressing Output
When you use Data Pump to create large files, you should consider compressing the output. As of Oracle 11g, the COMPRESSION parameter can be one of the following values: ALL, DATA_ONLY, METADATA_ONLY, or NONE. If you specify ALL, then both data and metadata are compressed in the output. This example exports one table and compresses both the data and metadata in the output file:
$ expdp dbauser/foo tables=locations directory=datapump \
dumpfile=compress.dmp compression=all
If you’re using Oracle 10g, then the COMPRESSION parameter only has the METADATA_ONLY and NONE values.
Note The ALL and DATA_ONLY options of the COMPRESSION parameter require a license for the Oracle Advanced Compression option.
New with Oracle 12c, you can specify a compression algorithm. The choices are BASIC, LOW, MEDIUM, and HIGH. Here is an example of using MEDIUM compression:
$ expdp mv_maint/foo dumpfile=full.dmp directory=dp_dir full=y \
compression=all compression_algorithm=MEDIUM
Using the COMPRESSION_ALGORITHM parameter can be especially useful if you’re running low on disk space or exporting over a network connection (as it reduces the number of bytes that need to be transferred).
Note The COMPRESSION_ALGORITHM parameter requires a license for the Oracle Advanced Compression option.
Changing Table Compression Characteristics on Import
Starting with Oracle 12c, you can change a table’s compression characteristics when importing the table. This example changes the compression characteristics for all tables imported in the job to COMPRESS FOR OLTP. Because the command in this example requires quotation marks, it’s placed in a parameter file, as shown:
userid=mv_maint/foo
dumpfile=inv.dmp
directory=dp_dir
transform=table_compression_clause:"COMPRESS FOR OLTP"
Assume that the parameter file is named imp.par. It can now be invoked as follows:
$ impdp parfile=imp.par
All tables included in the import job are created as COMPRESS FOR OLTP, and the data are compressed as they’re loaded.
Note Table-level compression (for OLTP) requires a license for the Oracle Advanced Compression option.
Encrypting Data
One potential security issue with Data Pump dump files is that anybody with OS access to the output file can search for strings in the file. On Linux/Unix systems, you can do this with the strings command:
$ strings inv.dmp | grep -i secret
Here is the output for this particular dump file:
Secret Data<
top secret data<
corporate secret data<
This command allows you to view the contents of the dump file because the data are in regular text and not encrypted. If you require that the data be secured, you can use Data Pump’s encryption features.
This example uses the ENCRYPTION parameter to secure all data and metadata in the output:
$ expdp mv_maint/foo encryption=all directory=dp_dir dumpfile=inv.dmp
For this command to work, your database must have an encryption wallet in place and open. See the Oracle Advanced Security Administrator’s Guide, available for download from the Technology Network area of the Oracle web site (http://otn.oracle.com), for more details on how to create and open a wallet.
Note The Data Pump ENCRYPTION parameter requires that you use the Enterprise Edition of Oracle 11g or higher and also requires a license for the Oracle Advanced Security option.
The ENCRYPTION parameter takes the following options:
- ALL
- DATA_ONLY
- ENCRYPTED_COLUMNS_ONLY
- METADATA_ONLY
- NONE
The ALL option enables encryption for both data and metadata. The DATA_ONLY option encrypts just the data. The ENCRYPTED_COLUMNS_ONLY option specifies that only columns encrypted in the database are written to the dump file in an encrypted format. The METADATA_ONLY option encrypts just the metadata in the export file. The NONE option specifies that nothing is encrypted.
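For instance, to encrypt just the table data while leaving the metadata in clear text, a parameter file might look like this (the dump file name is assumed, and the encryption wallet must be open):

```
userid=mv_maint/foo
directory=dp_dir
dumpfile=inv.dmp
encryption=data_only
```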
Exporting Views As Tables
Starting with Oracle 12c, you can export a view and later import it as a table. You may want to do this if you need to replicate the data contained in a view to a historical reporting database.
Use the VIEWS_AS_TABLES parameter to export a view into a table structure. This parameter has the following syntax:
VIEWS_AS_TABLES=[schema_name.]view_name[:template_table_name]
Here is an example:
$ expdp mv_maint/foo directory=dp_dir dumpfile=v.dmp \
views_as_tables=sales_rockies
The dump file can now be used to import a table named SALES_ROCKIES into a different schema or database:
$ impdp mv_maint/foo directory=dp_dir dumpfile=v.dmp
If you just want to import the table (which was created from a view during the export), you can do so as follows:
$ impdp mv_maint/foo directory=dp_dir dumpfile=v.dmp tables=sales_rockies
The table will have the same columns and data types as per the view definition. The table will additionally contain rows of data that match what would have been selected from the view at the time of the export.
Disabling Logging of Redo on Import
Starting with Oracle 12c, you can specify that objects be loaded with nologging of redo. This is achieved via the DISABLE_ARCHIVE_LOGGING parameter:
$ impdp mv_maint/foo directory=dp_dir dumpfile=inv.dmp \
transform=disable_archive_logging:Y
While performing the import, the logging attributes for objects are set to NOLOGGING; after the import, the logging attributes are set back to their original values. For operations that Data Pump can perform with direct path (such as inserting into a table), this can reduce the amount of redo generated during an import.