Sunday, 18 March 2012

Relational Database Model - Codd's Rules - Part 2

This post continues from the previous post, "Relational Database Model - Codd's Rules - Part 1".

The following explains the remaining Codd's rules, 7 to 12.

Codd's Rule #7 (High-Level Insert, Update and Delete Rule) -
This rule requires support for "set-based" operations, that is, relational algebra. A relational database must support the basic relational algebra operations (projection, selection, join, product, union, intersection, division and minus). It must allow data to be retrieved as sets built from multiple rows and/or multiple tables (through join or product operations, for instance). Insert, update and delete operations must likewise work on a whole set of rows selected this way, not just on a single row in a single table. For example, when a user issues a "delete" statement, all rows that satisfy the specified criteria should be deleted together as a set.
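
The set-based delete described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the "policy" table, its columns and the sample rows are assumptions made up for the example, not part of Codd's rule.

```python
import sqlite3


def delete_lapsed_policies():
    """Delete every row matching a criterion with one statement (no row-by-row loop)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and data, used only for this illustration.
    conn.execute(
        "CREATE TABLE policy (id INTEGER PRIMARY KEY, holder TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO policy (holder, status) VALUES (?, ?)",
        [("Asha", "lapsed"), ("Ben", "active"), ("Chen", "lapsed"), ("Dina", "active")],
    )
    # A single DELETE removes the whole set of qualifying rows.
    cursor = conn.execute("DELETE FROM policy WHERE status = 'lapsed'")
    deleted = cursor.rowcount
    remaining = [row[0] for row in conn.execute("SELECT holder FROM policy ORDER BY id")]
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(delete_lapsed_policies())
```

The application never iterates over the rows itself; it states the criterion and the database removes the matching set.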

Codd's Rule #8 (Physical data independence) -
Programs or applications that manipulate the data in a relational database must be unaffected by changes in the way the data is physically stored. For example, in a file-based (non-RDBMS) system, the data is stored as records in SDF (System / Standard Data Format) or delimited format. Programs accessing such data had to know the position of each field within the record, so any change to those positions meant changing every program that accessed or manipulated the data. A well-known instance is the "Y2K (year 2000) problem", which also affected databases: the century digits 19 or 20 had to be introduced before two-digit years, so an insurance start date of 051275 (ddmmyy) had to be converted to 05121975 (ddmmyyyy). This increased the field length by 2, and the programs had to be changed to cope with the new structure. The task was not only colossal but also had a hard deadline: it had to be finished before the year 2000 actually dawned.
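
By contrast, a relational system lets the physical layout change underneath an unchanged query. The following is a minimal sketch using Python's sqlite3 module, where adding an index stands in for a physical storage change; the "policy" table, the index name and the sample data are assumptions for illustration only.

```python
import sqlite3

# The application's query: it names columns, never byte positions or access paths.
QUERY = "SELECT holder FROM policy WHERE start_date = ? ORDER BY holder"


def run_query(conn):
    return [row[0] for row in conn.execute(QUERY, ("1975-12-05",))]


def demo_physical_independence():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and rows, used only for this illustration.
    conn.execute("CREATE TABLE policy (holder TEXT, start_date TEXT)")
    conn.executemany(
        "INSERT INTO policy VALUES (?, ?)",
        [("Asha", "1975-12-05"), ("Ben", "1980-01-01"), ("Chen", "1975-12-05")],
    )
    before = run_query(conn)
    # Physical change: add an index. The query text stays exactly the same.
    conn.execute("CREATE INDEX idx_policy_start ON policy (start_date)")
    after = run_query(conn)
    conn.close()
    return before, after


if __name__ == "__main__":
    print(demo_physical_independence())
```

Both runs return the same rows; only the way the database locates them may differ.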

Codd's Rule #9 (Logical data independence) -
It should be possible to modify the relationships among tables and the structure of tables, rows or columns without impairing the function of applications and ad hoc queries. Logical data independence is more difficult to achieve than physical data independence. Not only should a change in the database schema, table structures or relationships leave applications unaffected, it should also not require the database to be recreated.
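
One common way to approximate this is a view that preserves the old table's name and shape after a restructuring. The sketch below uses Python's sqlite3 module; the "customer" table, the split into "customer_core" and "customer_address", and the sample data are all assumptions for illustration. It covers read queries only: in SQLite, writing through such a view would additionally need INSTEAD OF triggers.

```python
import sqlite3

# The application's query, written against the original "customer" table.
APP_QUERY = "SELECT name, city FROM customer ORDER BY name"


def demo_logical_independence():
    conn = sqlite3.connect(":memory:")
    # Hypothetical original schema and data.
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customer (name, city) VALUES (?, ?)",
        [("Asha", "Pune"), ("Ben", "Leeds")],
    )
    before = conn.execute(APP_QUERY).fetchall()

    # Logical change: split the table in two, in place, then expose a view
    # under the old name so that existing queries keep working.
    conn.executescript(
        """
        CREATE TABLE customer_core (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE customer_address (
            customer_id INTEGER REFERENCES customer_core (id),
            city TEXT
        );
        INSERT INTO customer_core SELECT id, name FROM customer;
        INSERT INTO customer_address SELECT id, city FROM customer;
        DROP TABLE customer;
        CREATE VIEW customer AS
            SELECT c.id, c.name, a.city
            FROM customer_core AS c
            JOIN customer_address AS a ON a.customer_id = c.id;
        """
    )
    after = conn.execute(APP_QUERY).fetchall()
    conn.close()
    return before, after


if __name__ == "__main__":
    print(demo_logical_independence())
```

The query text is identical before and after the split, and the database was restructured without being recreated.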

Codd's Rule #10 (Data integrity independence) -
Data integrity is a function of the DBMS. Consistency and accuracy of the data are very important, and "integrity constraints" are the set of rules used to enforce them. Integrity in the database takes 3 forms: Entity Integrity, Referential Integrity and Domain Integrity [expect their explanation and implementation in some future post]. An RDBMS allows these integrity rules to be implemented at the database level, either as "database constraints" definable as objects (possibly as part of the table definitions) or as programs (called triggers) written within the database. Their existence must be recorded as metadata in the "data dictionary" (also called the catalog). It must also be possible to change such constraints as and when appropriate without affecting existing applications.
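
As a rough preview of the three forms, the sketch below declares one constraint of each kind in SQLite through Python's sqlite3 module and checks that the database, not the application, rejects violating rows. The "department" and "employee" tables, the age range and the sample values are assumptions for illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def violates(conn, sql):
    """Return True if the database itself rejects the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


def demo_integrity():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    # Hypothetical schema: the constraints live in the table definitions.
    conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """
        CREATE TABLE employee (
            id INTEGER PRIMARY KEY,                              -- entity integrity
            dept_id INTEGER NOT NULL REFERENCES department (id), -- referential integrity
            age INTEGER CHECK (age BETWEEN 18 AND 70)            -- domain integrity
        )
        """
    )
    conn.execute("INSERT INTO department VALUES (1, 'Claims')")
    conn.execute("INSERT INTO employee VALUES (1, 1, 30)")
    results = {
        "entity": violates(conn, "INSERT INTO employee VALUES (1, 1, 40)"),        # duplicate key
        "referential": violates(conn, "INSERT INTO employee VALUES (2, 99, 40)"),  # no such department
        "domain": violates(conn, "INSERT INTO employee VALUES (3, 1, 12)"),        # age out of range
    }
    conn.close()
    return results


if __name__ == "__main__":
    print(demo_integrity())
```

Because the rules are stored with the schema, every application that touches these tables is subject to them without carrying its own copy of the checks.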

Codd's Rule #11 (Distribution independence rule) -
This rule requires support for distributed operations. Users can join data from tables that reside physically on different servers (and/or in different databases). They should be able to manipulate the data in those tables seamlessly within one transaction, without having to know where the data actually resides (i.e. in which database and on which server). Integrity must also be maintained across the whole operation. Moreover, existing applications should continue to operate successfully, without requiring any change, when existing distributed data are redistributed around the system.
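
A loose, single-process analogy of this location transparency can be sketched with SQLite's ATTACH DATABASE through Python's sqlite3 module. This is not a real distributed system: the second in-memory database merely stands in for a remote one, and the "branch" alias, table names and data are assumptions for illustration.

```python
import sqlite3


def demo_location_transparency():
    conn = sqlite3.connect(":memory:")
    # A second, separate database standing in for another server.
    conn.execute("ATTACH DATABASE ':memory:' AS branch")
    conn.execute("CREATE TABLE main.customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE branch.orders (customer_id INTEGER, amount INTEGER)")

    # One transaction spans both databases; the statements do not say
    # which database holds which table.
    with conn:
        conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
        conn.execute("INSERT INTO orders VALUES (1, 250)")

    # The join is written as if both tables lived in the same place.
    rows = conn.execute(
        "SELECT c.name, o.amount FROM customer AS c "
        "JOIN orders AS o ON o.customer_id = c.id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_location_transparency())
```

If the "orders" table were later moved into the main database, the insert and join statements above would not need to change, which is the property the rule asks for.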

Codd's Rule #12 (Non-subversion rule) -
Data integrity and security cannot be subverted. Users must not be able to circumvent the integrity (or security) rules by reaching and manipulating the data through some alternative path. If a relational system provides a low-level, single-record-at-a-time interface, that interface cannot be used to bypass the integrity and security constraints defined in the higher-level language (which deals with sets of records at a time).

I hope you enjoyed reading this pair of posts on Codd's Rules. Please visit again for more interesting material, and don't forget to leave your comments or technical additions to the posts.
