Concurrency issues in Distributed Databases
In a previous module you studied the problem of transaction processing in a
centralised database. Ensuring the consistency of transactions in a distributed system is even more difficult. The aim of concurrency control mechanisms is to find a
suitable compromise between maintaining the consistency of the database and maintaining a high level of concurrency.

Remember the key idea of serialisability - concurrent transactions must be processed in a way that gives the same result as if they had been executed according to some serial schedule.

There are many concurrency control mechanisms but they can be grouped into optimistic and pessimistic classes.

Optimistic algorithms are based on the view that not many transactions will conflict with one another, and so they delay the synchronisation of transactions until they terminate.

Pessimistic algorithms are based on the view that many transactions will conflict with one another and therefore the concurrent execution of transactions is
synchronized early in the execution cycle.

The two main concurrency control mechanisms - locking and time-stamping - can
belong to either the optimistic or the pessimistic class depending upon how they are carried out.

Locking and deadlock

 

Time-stamping
Time-stamping algorithms do not attempt to maintain serialisability by mutual exclusion, as locking does. Instead, they select a serialisation order in advance and execute
transactions accordingly. The transaction manager assigns a unique timestamp to each transaction when it is initiated.

 

The timestamp is not only unique; it must also come from an ordered series, and this permits the ordering of the transactions. Ideally, timestamps would be generated by a global counter, but such counters are not easy to maintain in a distributed system. Consequently, each site usually assigns a timestamp based on its own local counter and appends a site identifier, so that it can be distinguished from a timestamp from another site which might have been assigned the same counter value. Local system clock times could be used instead of a counter.
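As a minimal sketch of this scheme, assuming each site keeps a simple local counter and a numeric site identifier (the class and variable names below are invented for illustration), a timestamp can be represented as a (counter, site id) pair and compared lexicographically:

from dataclasses import dataclass
import itertools

@dataclass(frozen=True, order=True)
class Timestamp:
    # Lexicographic ordering: counter first, then site id as a tie-breaker,
    # so equal counter values from different sites are still totally ordered.
    counter: int
    site_id: int

class TimestampGenerator:
    # One generator per site: a local counter plus the site's identifier.
    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)

    def next(self):
        return Timestamp(next(self._counter), self.site_id)

# Two sites produce distinct, comparable timestamps even with equal counter values.
site2, site7 = TimestampGenerator(2), TimestampGenerator(7)
t1, t2 = site2.next(), site7.next()
assert t1 < t2 and t1 != t2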

Since each transaction has a unique timestamp, the transactions can be ordered by a simple rule: if two operations from two transactions conflict, the operation belonging to the older transaction is processed before that of the younger. The older transaction is the one with the earlier timestamp. When a new operation arrives, it is accepted if its transaction's timestamp is younger than those of all the conflicting transactions whose operations have already been scheduled. If it is not, the entire transaction is rejected and must start again with a new timestamp. Although time-stamping means there will never be deadlock, it may result in a transaction being restarted several times.
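Purely as an illustration of this rule (the function and argument names are hypothetical, and timestamps are shown as plain integers), the scheduler's decision for a newly arrived operation reduces to a comparison against the timestamps of the conflicting transactions already scheduled:

def can_accept(new_txn_ts, conflicting_scheduled_ts):
    # The new operation is accepted only if its transaction is younger
    # (has a later timestamp) than every conflicting transaction whose
    # operations are already scheduled; otherwise the whole transaction
    # is aborted and restarted with a fresh, younger timestamp.
    return all(new_txn_ts > ts for ts in conflicting_scheduled_ts)

print(can_accept(5, [1, 3]))   # True: accept the operation
print(can_accept(2, [4]))      # False: reject and restart with a new timestamp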

A timestamp ordering scheduler operating in this way is guaranteed to generate a serialisable schedule. However, so far we have only considered the situation in which all the operations to be scheduled have already been received by the scheduler and their timestamps compared. In practice, it is more likely that operations will reach the scheduler one at a time, so it must be able to detect when an operation has arrived out of sequence.

This is done by assigning each data item both a read timestamp and a write timestamp: these are the largest (i.e. latest) timestamps of the transactions which have read or updated the item respectively. The timestamp of a transaction can then be compared with the read and write timestamps of the data items it wishes to access. If the latter are later than the former, the transaction will have to restart, since it cannot be allowed to read or alter a data item which has already been altered (or, in the case of a write, already been read) by a transaction with a later timestamp.
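A minimal sketch of these checks, assuming integer timestamps at a single site (the class and function names below are invented, not taken from any particular DBMS): each data item carries the latest read and write timestamps, and an operation that arrives "too late" forces its transaction to restart.

class DataItem:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0    # latest timestamp of any transaction that has read the item
        self.write_ts = 0   # latest timestamp of any transaction that has written the item

class Restart(Exception):
    # Signals that the transaction must abort and restart with a new timestamp.
    pass

def read_item(txn_ts, item):
    # Reject a read of a value already overwritten by a younger transaction.
    if txn_ts < item.write_ts:
        raise Restart
    item.read_ts = max(item.read_ts, txn_ts)
    return item.value

def write_item(txn_ts, item, value):
    # Reject a write to an item already read or written by a younger transaction.
    if txn_ts < item.read_ts or txn_ts < item.write_ts:
        raise Restart
    item.value = value
    item.write_ts = txn_ts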

So far, we have considered pessimistic time-stamping - the validation is done before the transaction is allowed to start and the sequence is:

Validation-----------> Read-----------> Compute-----------> Write

However, if we assume that conflicts between transactions are not frequent, this order can be changed:

Read-----------> Compute-----------> Validate-----------> Write

 

The read, compute and write operations are processed without updating the actual database - only local copies are updated. The validation phase then checks whether these changes would affect the consistency of the database and, if not, the changes are made global (i.e. the actual database is updated). If the changes would impair the consistency of the database, the transaction is aborted and has to restart. So far, optimistic algorithms have not been implemented in a commercial DBMS.
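A minimal single-item sketch of this optimistic sequence, assuming validation is done with a simple version number and that the validate-and-write step runs atomically (all names below are invented for illustration):

database = {"x": {"value": 10, "version": 0}}

def run_optimistic(key, compute):
    while True:
        # Read phase: take a private copy; the shared database is not touched.
        local = dict(database[key])
        # Compute phase: updates are applied only to the local copy.
        new_value = compute(local["value"])
        # Validation phase: the change is made global only if no other
        # transaction has committed an update to this item in the meantime.
        if database[key]["version"] == local["version"]:
            database[key] = {"value": new_value, "version": local["version"] + 1}
            return new_value
        # Validation failed: abort and restart the whole transaction.

print(run_optimistic("x", lambda v: v + 5))   # 15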

 

 
