Friday, 23 March 2012

Distributed Database System - Principles

Please spend a moment to read the previous post "Distributed Database System - Introduction" as the text below may contain some references to it.

Continuing from the last post where we underlined the need for the Distributed Database Management Systems (DDBMS) for the today's world of universal dependence on information systems in which all sorts of people; the company's management, employees, customers, shareholders, franchisees, channel partners, including general public (potential customers / investors) need to have a quick access to the company's data; we now need to understand the technology that works behind the DDBMS.

However, since this blog is entitled "Cool Database Fundas" and the target audience includes the curious apart from the professionals, I have decided to drive through using an example which I hope will make the various technical terms understand easily for some even technically challenged readers.

Our example Distributed Database -
We have to implement a distributed database system for a fictitious company having its offices at Los Angeles, Singapore, New York (HQ), Paris and Tokyo. We have six large tables in our database namely A, B, C, D, E, and F and the motivation behind our client asking us to implement DDBMS solution is the response time has to be reasonably small apart from ensuring maximum availability, reducing the network traffic and providing the site autonomy.

Centralized Database approach -
Though our aim here is to build a DDB, we start our approach with a non-distributed centralized database at just one site say New York, it being the Head Quarter of the company and everybody else may access this data over the net. Though there are lot of problems arising out of concurrency implementation, making joins on the tables, security implementation etc. NOT showing up in this setup, the major issue being that if this single site fails then the entire system will come to stand still and of course there will be huge network traffic involved from other locations to the centralized database and no site autonomy for data is applicable in this approach.

Distribution based on data access frequency -
In our next approach, it makes sense to locate the tables at the sites where they are more frequently accessed than the other sites. Accordingly we are locating our tables - A & B at New York, C at Singapore, D & E at Tokyo, F at Paris and no tables at Los Angeles. This distribution has brought about "local autonomy" over the data, each site taking responsibility for its security, backup and recovery and concurrency control (for local transactions). However, the problems of centralized database don't seem to have been solved, rather escalated.

The system is still prone to single point failure - for example if the Paris site is down the table F is not available to anybody (so also same case with the other tables). Secondly, the problem may arise with joins if the query requires tables located at different sites. With the dispersed data the security is also vulnerable to a little extent as compared to when all the tables were in centralized database.

Distribution with replication -
"Replication" is the name given to the practice of maintaining synchronized copies of an object (say table) at multiple locations (say sites). We can have two considerations while replicating the tables -
  1. A table may be best replicated to another site where it is frequently required to minimize the network traffic.
  2. A table may be replicated to a site where it is convenient to join it with the table on that site when users use join queries again not only to minimize the network traffic than otherwise if the tables were to be stored on different sites, but also remove the complexity in join when tables are dispersed.
Of course now we get advantage of replication in availability; if a table is replicated at two or more sites and one of those sites goes down, still nobody (users) is affected since the table is accessible from other site(s).

The replication however had introduced another problem; the problem of concurrency control and data consistency across multiple copies of the tables. The security risk is also higher than the preceding case of "distribution based on data access frequency".
If we decide to have only one another copy of each table at a different site with the entire six tables at New York (being HQ), then the security and concurrency exposures are limited (as compared to replicating a table at more than one location). So here we have all tables at New York (A, B, C, D, E). Then we have B & F at Paris, D at Singapore, A & E at Los Angeles, and C at Tokyo. There are two bottlenecks in this arrangement -
  1. If a table is accessed heavily at more than two sites then we violate our own premise. (security and concurrency control exposure will escalate).
  2. The centralized place where all the tables are residing (New York) will be heavily loaded for all join requests will have to be handled there along with many of the other access requests.
So going by the considerations of locating the replications (earlier two points) and avoid the bottlenecks (the just preceding two points) we may arrive at a solution - We may have tables A, B & F at New York, D & E at Singapore, C & D at Los Angeles, A & F at Tokyo and B, D & E at Paris. Here we are considering table C is only single (not replicated as we think the table is not very important and infrequently used but only mostly accessed at Los Angeles. Also this is a case to show that there is a flexibility that not each of the tables must be replicated).

Concurrency Control with Two Phase Commit -
"Concurrency" is the name given to allowing two more users to perform transactions concurrently on the database rows without violating the integrity and consistency of the data. Databases use transaction locks to limit the concurrent operations by many different users from causing inconsistency in the data. The implementation of the concurrency is much more difficult in the replicated environment such as above than dealing with all the tables in a single server and only accessed from there. For example if a user at New York updates a particular value in a record of table B while at the same time another user updates the very same value in the very same record in Paris may result into an inconsistency according to the definition of transaction.

It is possible to implement the transmission of update on a table to its copies in different ways, particularly if the nature of the data and of the applications that use it can tolerate delay (backlash) in such update. Those methods are said to be "asynchronous". It could be either done by sending the message from site of actual update to the remote site where copy of the table exists and apply it there right away, or alternatively choosing one of the sites to accumulate all the updates (for all the tables) and transmission of changes may be scheduled. In a third "asynchronous" approach, each table may have one of the sites declared as "dominant" (for that table) to which all of the updates may be sent and from there transmit it to other copies of the table on scheduled basis.

However some popular commercial DBMS systems like Oracle use Two Phase Commit (2PC) process which is a "synchronous" algorithm. The "2PC" consists of two phases namely "Prepare Phase" and "Commit" phase. The mechanism uses a special "log" file to hold the update temporarily at all the sites. When an update is initialized at one site in a certain table (say D at Singapore) then the actual update is first written temporarily to the log file at every site (say Los Angeles & Paris) and NOT the copies of the table anywhere at this stage. The "Prepare Phase" first confirms with the remote locations (LA & Paris) having the copies of a table (D) under update at one site (update initializing site, Singapore) if they are in an operating condition and if they could lock the respective data to be updated (Oracle implements row level locking) in their copy of the table (D at LA & Paris) and if any of this condition is not met then either the transaction waits until favorable conditions are reached or transaction is aborted or otherwise (if conditions are met) the data makes way to the copies of the tables at all sites from the respective log files; which is part of the "commit" phase. So in this way all the copies of the table either stay at the original state if the update proceeds to fail at any location or get updated "synchronously".

Distributed Joins -
"Distributed Joins" involve tables at discreet locations required to be joined to satisfy a "join query" (query that required data being combined from two or more tables in a join condition). Various permutations and combinations may emerge for transmitting the data from one location to another or even third depending upon -
  1. if or not the location from where the request for join originated has copy of any of the tables involved in the join.
  2. the location where the join may be actually performed, transmitting the final or intermediate result.
A DDBMS system comes with a query optimizer which works hand-in-hand with relational query optimizer to determine the optimal way of performing the join operation in consideration to the following -
  1. The number and size of the records from each table involved in the join.
  2. The distances and costs of transmitting the records from one location to another to execute the join.
  3. The distance and cost of shipping the result of the join back to the location that initialized the join query.
Partitioning or Fragmentation -
When a table is divided then it is called as "partitioning" or "Fragmenting". There are two basic ways in which the tables in the DDBMS could be fragmented - Horizontally and Vertically. There may be third type that is hybrid of these both.
In the "horizontal partitioning" a table may be split in such a way that each site may contain some records. Essentially the structure of the table at each site is exactly the same. For example, for our fictitious company we may have an employee table horizontally at each location to store the data of employees working in the vicinity of that location. So the structure of the "employee" table will be identical at every site but only the records. This arrangement makes sense in such case when a site has to deal more with particularly their employees and say location like New York (HQ) requires to access the data of all the employees of the enterprise and so it has to be fetched from all the different locations and had to be put together (Union).

In the "vertical partitioning" we divide the columns of a table among several sites. However each such fragment of table containing a subset of columns of the full large table must include a primary key column of the full large table and rows in each such fragment are coordinated through the common value of primary key. This arrangement can make sense when different sites are responsible for processing different columns (attribute functions) of same entity. However in such case a multi-site join will have to processed if some site requires all the data of each entity.

Distributed Directory -
The notion of "location transparency"  in DDBMS comes from that when a query is fired then the system simply knows the location of different tables, their partitions, columns (or objects in general and their privilege sets etc.) and the other metadata required to satisfy the query. The set of system tables (data dictionary, directory or catalog) provides this knowledge to the system. However the question of location of the directory is as complex as the distributed data itself in the DDBMS and there are n number of possibilities. However in most cases maintaining a "synchronously updatabase" copy of the directory at each location is considered to be the best option.
So here we have walked through the "principles" of the DDBMS with a example and practical approach, however there is a systematic mathematical treatment to each of the concept discussed here.

Next time we meet with a new topic and it will be equally interesting one. Keep coming back.

No comments:

Post a Comment