Cool Database Fundas: Integrity Constraints

Please Click Here for the explanation and background on the Data Integrity Constraints.

Note : The set of commands / statements used here and the explanations thereof are with respect to Oracle as a model database. Readers may experience minor differences in the other database systems.

Constraint creation with the table -
Constraints may be created along with the table. For example -

create table employee (empid    number(5) constraint pk_employee primary key,
ename varchar2(15),
job    varchar2(15),
salary number(5) not null,
cardno    number(5),
deptno    number(3) constraint fk_employee references
                                                                                    department (deptno),
constraint chk_employee check (salary between 1500 and
                                            12000),
constraint uk_employee unique(cardno)
   );

Explanation : The constraint definitions in red are called column level definitions and those in blue are called table level definitions. Here the word level should not be construed as any hierarchy but only places where those definitions are integrated. The constraints ultimately apply to the columns and their behavior is uniformly same irrespective of their level (read place) of definition. Only one constraint definition may be integrated with column (but a column may have multiple constraints), whereas there is no limitation to the number of constraints definition that can be included after all the columns have been defined (table level). So if a column is supposed to have multiple constraints then one of the constraints' definition may be integrated with column and remaining at table level. Or table level constraint definitions must be used if the constraint has reference to multiple columns, for example a composite primary key (primary key consisting of multiple columns). But some developers are of the opinion to define all the constraints at the table level, just to keep the table definition neat and clean. However remember that table level definitions of constraints are not possible for NOT NULL constraints. A constraint definition contains optional constraint clause, for example "constraint pk_employee", which is used to provide user defined name for the constraint, used conveniently for fetching the constraint metadata from catalog. In absence of "constraint <name>" clause, for example "not null" in above case, the system will implicitly provide the name for the constraint, and will have to be every time known from the metadata as such names are very difficult to remember. The check constraint in above example has been defined as table level because the column had to have two constraints and "not null" could not be defined at table level. The example of unique constraint as table level indicates that the constraint could be defined at table level too, and it may be a matter of convenience as aforesaid; we could do this with other constraints as well.

Adding constraints to an existing table -

It is possible to add the constraint to an existing table, if a certain constraint was not created with the table, or some developers also prefer to create the tables without constraint definitions and add the constraints afterwards.

Add Primary Key : alter table employee add constraint pk_employee primary key(empid);

Add Not Null : alter table employee modify salary not null;

Add Check : alter table employee add constraint chk_employee check (salary between
1500 and 12000);

Add Unique : alter table employee add constraint uk_employee unique(cardno);

Add Foreign key : alter table employee add constraint fk_employee foreign key (deptno)
references department (deptno);

Explanation : Observe that for "Not Null" constraint we are using "modify" instead of "add"; for actually not null constraint is defined as column specification as explained above. Also observe that "foreign key (deptno)" clause had to be added to the definition, because we have to tell here which column has to become FK, that was not necessary in the previous example as the constraint was being defined as column level. Here also "constraint <name>" part is optional in all the above statements.

Adding constraints when table has rows -

The ambit of constraints is not to allow any invalid data to remain in the column(s) on which they are defined. Hence, if the data is present then, it will be validated by the constraint while getting created, for the default status of the constraint is "enabled". The constraint creation fails if the data does not conform to the rule of the constraint.

Some database systems like Oracle allow to use "novalidate" key word to ignore the existing invalid data. However, this keyword is not allowed to be used with primary key constraint or unique constraint. This is because, PK or UK may be referenced by the FKs, and it may result into identity crisis as the referential integrity may be affected by it, with the possibility that the child row may result into referring multiple parent rows (important for job seekers, a possible question in the interview!!!) which has to be strictly avoided in any case. The use of "novalidate" is demonstrated for a check constraint in the example below -

alter table employee add constraint uk_employee
check(salary between 1500 and 12000) novalidate;

Disabling/enabling a constraint -

A constraint, when created, is in the enabled state by default. However, sometimes like, during bulk data load operation, or for other reasons the constraint may be required to be disabled. Disabling action is associated with declaring the data as "NOT VALIDATED" in the data dictionary, and obviously while enabling, the data will be verified before the constraint state is changed to enabled.

alter table <TableName> disable constraint <ConstraintName> [cascade];

alter table<TableName> enable [novalidate] constraint <ConstraintName>;

"Novalidate" option does not apply to primary key or unique constraint for the reasons already explained, but it may be optionally used for other constraints if you want the constraint to ignore any invalid data that was put in the column while the constraint was in the disabled state. "cascade" option may be only used while disabling the primary key. It is used/useful when there are foreign keys referencing to it, which are also supposed to be disabled. The primary key constraint is not allowed be disabled with an enabled foreign key constraint referencing it. However, "cascade" option is not available with enable action on primary key, which means that Foreign Key constraint will have to be enabled explicitly after enabling the primary key.

Dropping a constraint -
A constraint may be dropped using the following statement :

alter table <TableName> drop constraint <ConstraintName> [cascade];

Here the "cascade" option may only be used while dropping the primary key constraint. It is used/useful when there are enabled foreign key constraints referencing to it and they (FKs) are also required to be dropped. The primary key constraint is not allowed to be dropped with an enabled foreign key constraint referencing it.

Dropping a Parent Table -
A table that has a primary key constraint being referenced by an enabled foreign key constraint set on another table (child table), is called as Parent Table. It is not possible to drop a parent table, irrespective of whether the tables contain any data. However, this default behavior may be overruled by using "cascade constraints" clause in "drop table" statement as given below -

drop table <ParentTableName> [cascade constraints];

The "cascade constraints" clause not only allows to drop the parent table but also drops implicitly the referencing foreign key constraints enabled on the child tables, while retaining the child tables and its data.

Foreign key constraint options -
Foreign key constraints enforce the referential integrity. The default behavior enforces that, the parent row(s) (in parent table) can not be deleted if there are any child row(s) dependent on it (in child table). This behavior may be altered by using "on delete cascade" or "on delete set null" options in the definitions of foreign key constraints as given below -

...constraint fk_employee [foreign key (deptno)] references department (deptno)
on delete ....;

The options are mutually exclusive, as former option cascades the delete of parent row(s) on related child row(s), while the later option will set (update) the corresponding foreign key value to null without affecting the remaining part of the child row. The "foreign key (deptno)" option (without square brackets) is required when the constraint is defined as table level or while adding the constraint on an existing table.

Deferrable Constraints -
By default the constraint rules are implemented at statement level, i.e. the data is validated immediately after the statement finishes its action. Such constraints are classified as "immediate" with reference to their data validation implementation. However, some situations require the validation of the data against the rule of constraint be delayed until the end of the transaction rather than checked at statement level. Such constraints are called as "deferrable". Following examples demonstrates such a situation -

create table employee (empid     number(4)     constraint pk_employee primary key,
ename    varchar2(15) not null,
   job    varchar2(15),
   salary number(5)     not null,
   mgrid number(4) constraint fk_employee
                                                                                  references employee (empid) deferrable,
   .......
);

Explanation : In the above partial definition of an employee table, we have a column "mgrid" as foreign key referencing to "empid" column of the same table which is a primary key. This is because the "mgrid" is the "empid" of the employee whom the corresponding employee reports. This depicts an organizational hierarchy of the employees within the same table (because managers and subordinates are all employees). This case may require multiple insert statements until the subordinate and manager relationship is established among them. So the constraint checking will have to be delayed until all such rows are inserted (end of the transaction). "deferrable" constraints however are implemented as "immediate" by default and they require explicit statement to "defer" as under -

alter session set constraints = deferred ;

This above statement will "defer" (for every transaction) all the deferrable constraints in the particular session. If it is required to create the constraints as "deferred" only to start with, then replace the key word "deferrable" with "initially deferred".

Constraints are most convenient way of implementing the data integrity as we have seen from the above, but they are not without limitations. There are situations when the constraints may not be able to implement some custom business rules and we have to resort to other techniques like triggers or views. Click Here for data integrity implementation using triggers. Don't forget to leave your comments and ask your doubts in the comments section please.

Cool Database Fundas

Wednesday, 2 May 2012

Integrity Constraints - Creation

No comments:

Post a Comment

Followers