Monday, 14 May 2012

Materialized Views

What are Materialized Views? -

Let us start with an example - A company has a database with a table "sales" maintaining millions of rows pertaining to the orders the company has received from its customers over a couple of years. The company managers are periodically interested to know breakdown of region wise total sales, or average value of region wise orders, or breakdown of customer wise total sales etc. for which you would write the queries, such as -

select region, sum(OrderValue) from sales group by region;
select region, avg(OrderValue) from sales group by region;
select CustName, sum(OrderValue) from sales natural join customer group by CustName;

When each of such query is fired, then every time the millions of rows of data shall be read and loaded in the memory, the aggregation / calculations are performed and then the results are returned to the user.

Now answer this simple question to yourself - why you used those log tables, or trigonometric tables in your school when you knew the method and formulas for calculating logs or trigonometric ratios? Used them as "ready reckoner", right? To save time!!

Then Materialized Views are such "ready reckoner" in the database, which when created with such queries, have captured and stored those summaries in them, for to be quickly referenced to, when such aggregating queries are fired. And because those Materialized Views store only the summaries, the number of records in them are obviously small. For example, if the territory over which the company carries out the business has 10 regions then the first and second of the examples above will have only 10 rows each. So not only much much less I/Os and scanning efforts are required to be performed to satisfy such queries but there will be much much less memory requirements to cache the data too; to conserve the server resources. Materialized Views are a data warehousing / decision support system tools that can decrease by many orders of magnitude the response time taken for queries involving hundreds of thousands to millions of records in calculating aggregations and performing complex joins and other operations, that too transparently. Due to the transparent use of Materialized Views, the SQL query statements still are fired in the same form & syntax at tables, so without any need for users to change them.

Another purpose for creating materialized view(s) may be to provide a local copy of the data (replication) in order to reduce network loads or enable data subsetting (based on rows and/or columns) for local sites or enable disconnected computing in the distributed database environment.

View vs. Materialized View - 

The typical view is a stored query written on table(s), which implicitly gets invoked, each time when the view is referenced in the SQL query. It is only a definition and is stored as such, there is no data storage, as such view is called as virtual table. The view is created to provide an abstraction layer, hide the source (table), implement a constraint or provide a predefined visibility to the data to different users. The views provide the fresh data, but incur the overheads of executing the query (stored in the view) each time on reference.

Materialized View stores the data, in most cases precomputed summaries. You can not refer to the materialized view in a query; the queries are same fired at tables, but if they involve the same expressions based on which there exist the materialized views created on the table(s), then those queries are internally directed to the materialized views to satisfy them. Materialized Views may provide the stale data (if non-updatable or less update frequency), and since do not execute frequently, do not incur any execution overheads, but must use space on the hard disk to store the precomputed summaries unlike the views.

Materialized view updatability -
Materialized Views as such are used in Oracle, while other DBMSs use an implementation of equivalent functionality through a slightly different concept and arrangement. For example - SQL Server uses "Indexed Views" or MySQL uses "FlexiViews". The forthcoming discussion is with reference to Oracle as a model database.

Materialized views come in 3 possibilities with respect to their updatability at the remote site :
  • Read-Only materialized Views
  • Updatable materialized Views
  • Writable materialized views
Read-Only materialized view as the name suggests, do not allow updates to be performed at the remote materialized view site and they are simple created by omitting "FOR UPDATE" clause from their definition syntax. 
Updatable materialized view allows the changes to be made at the remote site. The materialized view must belong to a "materialized view group" to push those changes back to the master site during refresh. The views use "FOR UPDATE" clause in their definition syntax.
Writable materialized view are same as the updatable, but being not made part of "materialized view group", it can't push the changes back to the master site (during refresh) rather changes are lost in the materialized view itself during refresh.

Other topics of interest - A short challenging quiz , Evolution of DB systems

No comments:

Post a Comment