In a previous post we discussed about why we need our databases to allow for concurrent transactions and the related problems and different transaction isolation levels, their outcome with respect to data integrity/consistency, as part of basics of "Transaction Concurrency". This post is supposed to take it further by considering how databases use data locks for providing such concurrency, and the implications of such locks, if any, for writing multi-user applications, multi-version read consistency and granularity to which locks may be implemented. Again wherever required, this discussion will refer to implementation in Oracle DBMS as an example.
Locking methods prevent unserializable schedules by keeping away more than one transaction from accessing the same data elements. Time-stamping methods assign timestamps to each transaction and enforce serializability by ensuring that the transaction timestamps match the schedule for transactions. Validation methods maintain a record of transaction activity and the changed data is "validated" against the changed items of all currently active transactions to eliminate any unserialized schedules, before committing a transaction. Most of popular commercial databases like Oracle use a combination of available methods.
There are three possible outcomes of this sequence of operations in "Pessimistic Locking" -Concurrency Control -
There are different methods database systems may use in combination to implement the concurrency; they include locking mechanisms to guarantee exclusive use of a table by a transaction, time-stamping methods that enable serialization of transactions, and the validation-based scheduling of transactions.Locking methods prevent unserializable schedules by keeping away more than one transaction from accessing the same data elements. Time-stamping methods assign timestamps to each transaction and enforce serializability by ensuring that the transaction timestamps match the schedule for transactions. Validation methods maintain a record of transaction activity and the changed data is "validated" against the changed items of all currently active transactions to eliminate any unserialized schedules, before committing a transaction. Most of popular commercial databases like Oracle use a combination of available methods.
What are Locks ? -
Simply put, a "lock" is a mechanism used to regulate concurrent access to a "shared resource". The term "shared resource" has a very wide meaning - it not only includes the table rows but everything that should be required to be accessed concurrently at any level. There are two methods of locking - Pessimistic Locking and Optimistic Locking - depending upon how the applications would like to go about changing the data.Pessimistic Locking -
"Pessimistic Locking" method allows for declaring an intention to "update" a row(s) from a set of preselected rows fetched without locking. The sequence of operations is thus -- Application simply queries a set of rows from a table (a plain select statement).
- Then user picks a row(s) which he/she wants to update and then declares intent for update by some means (may be hitting some relevant button in the form interface).
- The system re-queries the picked row(s) from the database with a difference this time that locks are requested on the row(s) for protection against update by other sessions.
- An update statement actually updates the row(s).
- The re-query (step 3) as above is successful and now our user locks the rows (to prevent others from updating).
- The re-query is blocked for other session has already started with a transaction on the same row(s) and system makes us wait for other session to finish.
- If the row(s) have been changed (and committed) by other session meanwhile which our user re-queried, then no rows will show up (the data queried in step 1 is stale) and our user has to start over again from step 1 as listed above.
Thus using "Pessimistic Locking" our user could safely change the row(s), without any possibility of overwriting some other user's changes as it has been already verified in the session of our user that the data did not change meanwhile before requesting for the lock (in step 3 of sequence of operations).
Optimistic Locking -
"Optimistic Locking" method is based on the belief that data in the rows is not changing / changed by other session while our user is trying to update. The chances are that our user may or may not have luck to update the row(s) - in later of the cases our user will have to re-key the transaction.
Oracle DOES NOT use lock escalation. However it does use "lock conversion" or "lock promotion", the terms which are considered synonyms. This allows Oracle to make a "less restrictive" lock to "more restrictive". A case of "lock conversion" has been described below as an example -
Pessimistic v/s Optimistic -
The question is difficult to answer and will be answered by DBMS vendor as each have different approach. However wherever possible, to use "Pessimistic" locking mode would work better. It provides the users with the confidence that the data they are modifying are on their screen and hence currently "owned" by them. However the chances in this case may be a user may not further update and finish the transaction immediately after the locks have been obtained, which has to be taken care by killing such idle (non-performing) session after a reasonable time. Oracle favors the applications to use this approach as the locking of a row does not prevent any normal activity like reading of the same rows etc. The other databases are known to have NO favor with Pessimistic, but this needs to be authenticated by long term users of other databases in the comments to this post or posting a link to a similar blog post explaining the same.Lock Escalation -
Concept of "Lock Escalation" relates to "granularity". Granularity is the level at which the data is locked in the database / tables. For example Oracle implements "row level" locking, which means that locks are placed on rows, NOT on page (block) or table. So in this case granularity is "row". "Lock Escalation" is the name given to decrease in the granularity. In some databases locks are considered to be scarce resource and hence considered as overhead to be avoided if they happen to be many in number. For example if for users operation say 100 rows are locked then the system may prefer to lock the entire table instead of each of 100 rows. So this is the case of "lock escalation".Oracle DOES NOT use lock escalation. However it does use "lock conversion" or "lock promotion", the terms which are considered synonyms. This allows Oracle to make a "less restrictive" lock to "more restrictive". A case of "lock conversion" has been described below as an example -
- You select a row(s) from a table with "FOR UPDATE" clause.
- One "exclusive lock" is placed on the row while "row share table" lock is placed on the table to allow to perform any operation on the table except changing its structure (which will lock the entire table itself).
- When the actual command to update the locked row(s) is issued Oracle will convert "row share table" lock to more restrictive "row exclusive table" lock.
In this post we have discussed about general theory behind the "locking" mechanism. Visit again for explanation on some more database related concepts. Coming back shortly.
Very informative post. You have indeed covered topic in good detail. By the way I have also shared my experience as How to manage transaction in database you may like.
ReplyDeleteyes I have gone through your post. It's excellent too. thanks for subscribing to this blog.
DeleteWell described overall. Just one detail between Pessimistic Locking and Optimistic Locking which you mention in passing but don't make explicit is the timing of the locking in each case.
ReplyDeleteIn the Pessimistic case the lock is taken as soon as the user selects a row to work on from those offered. This means that the row is locked for a relatively long time, and as you note, if the users dawdles over the changes, gets distracted by other priorities, or just goes to lunch or home before committing or rolling back the transaction, other users will be locked out of the row (or in databases that lock at the page level, perhaps many rows) for an extended and perhaps unacceptable period of time.
In the Optimistic case, the lock is only taken after the user has already modified the row's data in memory and elected to commit the transaction. At that point a lock is taken only long enough to refetch the original data and verify that the row was not modified by another user in the interim.
Thanks for the post. I am planning a post in my own Blog (http://informix-myview.blogspot.com) in April on how IBM Informix Dynamic Server features enable simple, fast, low overhead verification of changes during Optimistic Locking procedures in applications. So this is a subject near and dear to my heart.