Thursday 5 April 2012

What are temporary tables ?

For an introduction to the table types and list visit : What are Index Organized Tables?

Temporary Tables -
"Temporary Tables" are a great way to increase the performance by materializing (say aggregating and holding) the intermediate data required to solve an extremely complex query. The intermediate data may come from the results of some inline queries (also called in-line views) in the complex query but when temporary tables are used they may represent these inline queries' data. Traditionally views are used to store the complex queries as objects in the database - "view" being a select statement (query) stored in the database in compiled form which is simply invoked when the view is referenced. However, this approach requires that the intermediate data processing (fetch / aggregation) is carried out every time the view is invoked, as the views do not store any data, they are only in definition form. Views only simplify the call to the complex query which is stored as its definition but does not help optimize it, that temporary tables do.


The generalized concept of temporary tables as described above as a role to store the sub results of 3 or 4 subqueries (which make the complex query) and then joining such temporary tables to get the final result of the original complex query is not believed to be an appropriate application of temporary tables in Oracle, unlike other database systems (Tom Kyte, Expert one-on-one Oracle, wrox publication). However for a simplified understanding of the concept, using this as an example is not wrong either.


Oracle Temporary Tables -
In Oracle, they are called "Global Temporary Tables" (GTT) rather than just "Temporary Tables". A GTT is just really a template for the table itself. They are "statically defined" to exist permanently in the database (data dictionary) as objects; and as the term "global" indicates in their type name they are created only once per database and not once per stored procedure. Temporary table will allocate storage from the currently logged user's temporary tablespace, or if they are accessed from a "definer's right procedure" then temporary tablespace of the owner of the procedure. Actually every user referring to "same" GTT actually creates a temporary segment in his/her temporary tablespace in the background when their sessions put the first data in the GTT. This arrangement avoids interference and visibility of data and transactions across concurrent sessions. Also there are no possibilities of one session blocking another session in concurrent transaction and extremely small amount of redo is generated as compared to the transactions made against the normal "Heap Organized Tables" The GTT may be either created to hold the data for the duration of a transaction or session as per the requirement and application design.


GTT can have many of the attributes of a normal "Heap Organized Table" like triggers, check constraints, indexes, use same SQL for all database query and manipulation; but may not have referential integrity constraint (Foreign key), neither can be a target of a foreign key, no varray or nested table columns, not be part of cluster or partitioned and analyzed for optimization.


GTT Example -
With that little background let us have an example with explanation to understand the use of GTT. 

There are two tables "store" and "sales" and we have to answer "show all stores with above average sales". There are three aggregate operations needed -
  • Total number of stores.
  • Total sale for each store.
  • Total sale for all the stores.
Here is the normal query -
SELECT
store_name, sum(quantity) AS store_sales,
(select sum(quantity) from sales)/(select count(*) from store) AS avg_sales
FROM store s, sales sl
WHERE
s.store_key = sl.store_key
HAVING
sum(quantity) > (SELECT sum(quantity) FROM sales)/(SELECT count(*) FROM store)
GROUP BY store_name;
 
Here we create GTT to have the aggregation stored (for the highlighted queries in the above) -
 
create global temporary table store_qty
on commit preserve rows
as
select sum(quantity) as all_sales from sales;
 
create global temporary table store_cnt
on commit preserve rows
as
select count(*) as nbr_stores from store;
 
create global temporary table store_sales
on commit preserve rows
as
select store_name, sum(quantity) as store_sales
from store natural join sales
group by store_name;

These GTT as above will be created with the data pulled from their respective tables. The "preserve rows" option of the "on commit" clause will retain the pulled data until the session ends. The other option is "delete rows" which empties the global tables when users commits the transactions.
 
Anyway in the above example if the session is closed the data in the GTT will be deleted and we may require to re-populate those tables with the data on opening of the new session, so in such every subsequent opening of the session we may use following simple insert statements -
 
insert into store_qty select sum(quantity) from sales;
insert into store_cnt select count(*) from store;
insert into store_sales select store_name, sum(quantity)
                        from store natural join sales
                        group by store_name;
 
And of course our final solution producing query -
 
select store_name, store_sales, all_sales/nbr_stores as avg_sales
from store_qty, store_cnt, store_sales
where store_sales > (all_sales/nbr_stores);

No comments:

Post a Comment