Saturday 7 April 2012

What are External Tables ?

For an introduction to the table types and list visit : What are Index Organized Tables?

"External Tables" are probably the types of tables used in Oracle only (other database systems might use some other mechanism, and I request users of those systems may share their knowledge through posting comments / links), as an intermediate staging area during bulk data load operations, particularly in Data Warehouses. They were introduced in Oracle version 9i and the limitation of being only "read only" has been removed since version 10g. The "External Tables" feature is a complement to existing SQL*Loader functionality. It enables the user to access the data in external sources as if it were in a table in the database, and also allows data transformations.

What are External Tables? -
Contrary to what their name indicates, "external tables" are not really the tables. They are only the objects in the form of a definition stored in the data dictionary (catalog). They are capable of providing the data in the normal table format while the data is actually stored in operating system files "external" to the database (hence the name "external tables"). So dropping of the external tables only removes its definition from the data dictionary while data still remains safe in the files. These tables allow all sorts of operations like normal tables do - queries, joins, and DML (from Oracle 10g onwards). So external tables are objects in the form of a definition which do not store any data in themselves but only enable you to view externally stored data as if it were inside a table in the database. The trick is actually the data fields in the external file are merely "mapped" to the external table columns, not actually loaded into them.

External tables are said to imitate functionality of a bulk data load utility called "SQL*Loader" and is considered to be equivalent to storing "SQL*Loader Control File" in the form of a definition of table and hence the same plain SQL (and also PL/SQL, a server side programming language in Oracle) is also allowed to be used against the External Tables. However data transformation and filtering capabilities are considered to be much more flexible than SQL*Loader.


A working example of External Tables -

Step 1 - creating a directory object
A directory object is object defined in Oracle database to point to a file system location where our source flat files are stored. It is the way of protecting the source files from unauthorized access as there are also privileges associated with the directory objects which are granted from the database with "grant" command -

create directory extdir as 'c:\oracle\dfile'; -- directory creation


grant read, write on directory extdir to scott; --read & write privilege granted

Step 2 - Creating external table

create table empxt (
   empno  number(4),
   ename  varchar2(15),
   job    varchar2(15),
      .....
   )
organization external
(
type oracle_loader
default directory extdir
access parameters
(
   records delimited by newline
   badfile empxt.bad
   logfile empxt.log
   discardfile empxt.dis
   fields terminated by ','
   missing field values are null
)
location ('emp.dat')
)
reject limit unlimited;

Explanation : The part of definition above "organization external" is plain create table with definitions of colulmns, "organization external" clause tells oracle that the table is to be created as external table, "type" clause specifies the engine - oracle_loader is used for loading data from file to the internal table, "default directory" points to the directory where all of our source data files and rejection & logfiles will be housed, "location" provides the name of the input data file.

Step 3 - the input data file (external file) 
The input data files may contain the records in two formats - (1) fields delimited by a character for example comma as is used here (2) or fields with fixed width format [in which case the access parameters of this example will change accordingly]. For this example the data may in the following way -

7788,scott,analyst,....,20
7839,king,president,....,10
     .......
Step 4 - we can now query the external table by simple SQL

select * from empxt;

Step 5 - We can populate our EMP table (internal) 

insert into EMP select * from empxt;

This was a very simple example but it is possible to perform a lot of transformation operations, use oracle's built-in functions, in-line formulas etc. for a more exiting run of the bulk data load operation.

The most important of the advantages of using the external tables is that you can use the common SQL to query the data from external tables which mask the source data files with its structure and almost any operation that we normally do in query may be performed in this query too against the external tables.

4 comments:

  1. NEED THE PROFESSIONAL CONCEPTS. NOT THE BOOKISH ONE.......

    ReplyDelete
    Replies
    1. Hi Rango,
      Thank u so much for your interest in these posts. I was just wondering if you can explain this same concept in the professional concept way for an example so that next time I could pick up that skills. The post is guaranteed that it is neither a copy from any book nor contents from any other source on net. I have been creating it and wording it on my own from my own experience, but they are sounding like bookish, so definitely would like to change the style if you can indicate with an example.

      Delete
  2. IBM Informix Dynamic Server also supports that concept and implementation of external tables as of version 11.50.xC6 (circa 2009). They are very similar supporting the mapping of external files in any of three formats (delimited, fixed field length, native binary) to database tables allowing inserts and selects. The external table can map to one or many files to allow for processing data from multiple sources. External table definition is similar to the Oracle DDL or can be defines SAMEAS an existing native Informix table. In addition, the external table can be defined to interact with native tables in 'deluxe' mode (the default) where any data selected from the external table and inserted into a native table is inserted normally being processed through the transaction logging systems or it can interact in 'express' mode where logging is bypassed to improve data loading speed. Obviously after loading data in 'express' mode a full archive is required for safety so that the data is preserved if the systems experiences a hard crash and has to be restored.

    ReplyDelete
    Replies
    1. Hi Art,
      Thanks for sharing that invaluable information about Informix external tables. Hope the readers of this post will like this very much. Best Regards.

      Delete