How to bulk load data into a database using Oracle Data Pump (impdp)
Many times we will need to keep the table data and their metadata (packages, procedures, functions, etc.) so that we can then import them into the same or a different system.
In a previous article we saw how to use Oracle Data Pump Export (expdp) to export this data to a dump file. In this article we will analyze the tool that imports this data into the database, Oracle Data Pump Import (impdp).
How Oracle Data Pump Import (impdp) works
When we invoke Data Pump Import (impdp), a job is created and starts running; its task is to read the dump files and load their contents into the database.
We can control what will be imported and how by using parameters. These parameters can be declared either on the command line or in a separate parameter file.
The preliminary work
For starters we will need a user that has the DATAPUMP_IMP_FULL_DATABASE role. Alternatively we can use a user connected as SYSDBA, although that is not recommended.
grant DATAPUMP_IMP_FULL_DATABASE to stratos;
We will also need a logical directory that points to a physical directory on the server where the dump files are located.
create directory fakelos as '/home/oracle/Documents';
To see the existing directories:
select owner,directory_name,directory_path from all_directories;
To see the existing tablespaces and the tables they contain:
select tablespace_name,status,max_size from dba_tablespaces;
select owner,segment_name,tablespace_name,bytes from dba_segments where tablespace_name = 'USERS' and segment_type = 'TABLE';
Optionally
Since the table we import may be referenced by another table, we may need to disable the relevant constraints first.
SELECT * FROM user_cons_columns WHERE owner = 'SCOTT' AND table_name = 'DEPT';
ALTER TABLE SCOTT.DEPT DISABLE CONSTRAINT fk_dname;
We may want to import an entire schema, in which case the process will be easier if we drop the existing one first.
drop user SCOTT cascade;
We may also want to drop the tablespace so that we can recreate it during the import, e.g. with a different name.
DROP TABLESPACE USERS INCLUDING CONTENTS AND DATAFILES;
The example
The simplest example we can try is to import a small table. The parameters needed are the logical directory we created earlier, the name of the dump file and the name of the table (we have set metrics and logfile so that a log is also written to the same directory). But since the table already exists, we can import it under another name with the REMAP_TABLE parameter.
[opc@dp-gr ~]$ impdp stratos/stratos directory=FAKELOS dumpfile=pinakas_dept.DMP logfile=import.log metrics=y TABLES=SCOTT.DEPT REMAP_TABLE=SCOTT.DEPT:DEPT_TEST
Alternatively, if we would like to use a parameter file, we would create a file named e.g. params.par containing all the parameters, one per line, and pass it to impdp with the parfile parameter.
directory=FAKELOS
dumpfile=pinakas_dept.DMP
logfile=import.log
metrics=y
TABLES=SCOTT.DEPT
REMAP_TABLE=SCOTT.DEPT:DEPT_TEST
[opc@dp-gr ~]$ impdp stratos/stratos parfile=/home/oracle/Documents/params.par
[opc@dp-gr ~]$ impdp stratos/stratos directory=FAKELOS dumpfile=pinakas_dept.DMP logfile=import.log REMAP_TABLE=SCOTT.DEPT:DEPT_TEST metrics=y
...
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Processing object type TABLE_EXPORT/TABLE/TABLE
Completed 1 TABLE objects in 1 seconds
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "SCOTT"."DEPT_TEST" 6.023 KB 4 rows in 1 seconds
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Completed 1 INDEX objects in 0 seconds
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
ORA-31684: Object type CONSTRAINT:"SCOTT"."PK_DEPT" already exists
Completed 1 CONSTRAINT objects in 1 seconds
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Completed 1 INDEX_STATISTICS objects in 0 seconds
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Completed 1 TABLE_STATISTICS objects in 0 seconds
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Completed 1 MARKER objects in 43 seconds
Completed 1 TABLE_EXPORT/TABLE/TABLE_DATA objects in 1 seconds
Job "STRATOS"."SYS_IMPORT_FULL_01" completed with 1 error(s)
To see the Data Pump jobs in progress:
select * from dba_datapump_jobs;
Once we have the name of the job (using the previous query), we can also attach to it from another window and see a lot of useful information, such as the completion percentage and the bytes written.
[opc@dp-gr ~]$ impdp stratos/stratos attach=SYS_IMPORT_TABLE_01
Alternatively, if we use the SYSDBA account:
[oracle@dp-gr opc]$ impdp "'/as sysdba'" attach=SYS_IMPORT_TABLE_01
Import> status
Job: SYS_IMPORT_TABLE_01
Owner: STRATOS
Operation: IMPORT
Mode: TABLE
Instance: ORCL
Max Parallelism: 1
NLS character set: AL32UTF8
NLS NCHAR character set: AL16UTF16
Dump File: /home/oracle/Documents/pinakas_dept.DMP
Worker 1 Status:
Instance ID: 1
Instance name: ORCL
Host name: dp-gr
Process Name: DW00
State: UNDEFINED
Object Schema: SYS
Object Type: TABLE_EXPORT/TABLE/STATISTICS/MARKER
Total Objects: 1
Completed Bytes: 18,160
Worker Parallelism: 1
This information is refreshed each time we type the status command.
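Apart from status, the prompt accepts the other standard Data Pump interactive commands. For example, we can pause the job with stop_job and, after re-attaching, resume it with start_job:
Import> stop_job=immediate
[opc@dp-gr ~]$ impdp stratos/stratos attach=SYS_IMPORT_TABLE_01
Import> start_job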
If for some reason the job stops, e.g. because there is not enough space, a message will also appear in the following dynamic view. If we fix the problem within a reasonable time, the job will continue automatically.
select name,status,timeout,error_number,error_msg from DBA_RESUMABLE;
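As a reference point, the wait time for resumable statements in general can be set instance-wide with the resumable_timeout parameter (in seconds); note that Data Pump may also apply its own default wait. For example, to allow a two-hour wait:
alter system set resumable_timeout=7200;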
The result
After the task is completed we will see that the table SCOTT.DEPT_TEST has been created.
select count(*) from SCOTT.DEPT_TEST;
4
Useful parameters
Let's also look at some useful parameters that we can set in Data Pump Import so that we have better control over the data we import.
REMAP_TABLE
With this parameter we can import a table under a different name, since the original name may already exist.
REMAP_TABLE=SCOTT.DEPT:DEPT_NEW
REMAP_SCHEMA
With this parameter we can import the objects of one schema into a different schema.
REMAP_SCHEMA=SCOTT:STRATOS
REMAP_TABLESPACE
With this parameter we can import the data into a different tablespace.
REMAP_TABLESPACE=USERS:TABLESPACE_ARCHIVE
TABLE_EXISTS_ACTION
We use this parameter when we want to import tables that already exist. Besides the default skip, it accepts three options:
With append, the new rows are written after the existing ones.
TABLE_EXISTS_ACTION=APPEND
With replace, the table is dropped and recreated, overwriting its metadata as well.
TABLE_EXISTS_ACTION=REPLACE
With truncate, the existing rows are deleted without disturbing the table's metadata; a key's sequential numbering, if one existed, is kept as it is.
TABLE_EXISTS_ACTION=TRUNCATE
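For example, a variation of the earlier example that appends the exported rows to the existing table (same directory, dump file and table names as before):
[opc@dp-gr ~]$ impdp stratos/stratos directory=FAKELOS dumpfile=pinakas_dept.DMP logfile=import.log TABLES=SCOTT.DEPT TABLE_EXISTS_ACTION=APPEND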
CONTENT
We use this parameter when we want to import only the data or only the metadata:
content=data_only
content=metadata_only
TRANSFORM
If we do not need the import to be recorded in the redo log and force_logging is not enabled, we can skip logging the import in order to speed up the process.
TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y
TABLESPACES
With this parameter we can import selected tablespaces.
TABLESPACES=USERS
TABLES
With this parameter we can import selected tables.
TABLES=SCOTT.DEPT,SCOTT.SALGRADE
SCHEMAS
With this parameter we can import the entire schema of a specific user.
schemas=SCOTT
DUMPFILE
With this parameter we specify the name of the dump file(s) that Data Pump Import will read. With the characters "%u", a set of up to 99 sequentially numbered files can be referenced.
dumpfile=export_pinaka%u.DMP
EXCLUDE
With this parameter we can exclude specific objects such as schema, tables, etc.
EXCLUDE=SCHEMA:"=SYS"
INCLUDE
Accordingly, with this parameter we can add specific objects such as schema, tables, etc.
INCLUDE=SCHEMA:"=SCOTT"
INCLUDE=TABLE:"IN (select table_name from dba_tables where table_name like 'DEP%')"
QUERY
With this parameter we can apply a filter so that only specific rows of a table are imported.
query=SCOTT.SALGRADE:"where HISAL > '2000'"
PARALLEL
With this parameter we define the number of parallel worker processes that will carry out the task; a common rule of thumb is twice the number of CPU cores.
parallel=8
LOGFILE
With this parameter we define the name of the log file that will contain information about the process.
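For example, the same name used in the example above:
logfile=import.log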
METRICS
With this parameter, additional information is displayed in the log file.
metrics=y
SQLFILE
If we want to see what the dump files (.dmp) contain without actually importing, we can use the SQLFILE parameter to export only the metadata, as DDL (Data Definition Language) statements, to a file.
SQLFILE=expfull.sql
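For example, to extract the DDL from the dump file of the earlier example without importing anything (same directory and dump file as above):
[opc@dp-gr ~]$ impdp stratos/stratos directory=FAKELOS dumpfile=pinakas_dept.DMP sqlfile=expfull.sql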