Database Systems: The Complete Book: Solutions for Chapter 19

Database Systems: The Complete Book

Solutions for Chapter 19

Solutions for Section 19.1
Solutions for Section 19.2
Solutions for Section 19.3
Solutions for Section 19.4
Solutions for Section 19.5
Solutions for Section 19.6
Solutions for Section 19.7

Solutions for Section 19.1

Exercise 19.1.1

The difference between strict and nonstrict locking for this example is only in the matter of whether the lock on A is released prior to the write of B. That is, the lock on A must be taken before r₁(A). The lock on B can occur in any of the three positions prior to r₁(B) (3 choices).

If locking is strict, then the two unlocks must occur, in either order, after the second write action. Thus, there are 6 orders for strict, 2-phase locking.

If locking is not strict, the unlock of B must occur after w₁(B), but then the unlock of A can occur in any of the three positions after w₁(A). Thus, there are 9 orders if strictness is not required, and 3 of these must be nonstrict.

Exercise 19.1.2(a)

T₁ wrote only B, but that value was later read by T₃. Thus, T₃ must be rolled back. T₃ wrote D, but no transaction has read D, so no further rollbacks are needed.

Exercise 19.1.4(a)

Revised 5/3/04.

Each transaction reads only one element, so the question is, does it read from the other? If so, then in order for the schedule to be recoverable, it must commit after the other as well. That is, the two constraints on the schedule can be expressed:

If r₁(C) follows w₂(C) then c₁ must follow c₂.
If r₂(B) follows w₁(B) then c₂ must follow c₁.

The partial orders that violate (1) look like:

o->o->o->o
   |      ^
   v      |
o->o->o->o

With a little effort, we can find that there are 12 orders for the eight nodes involved.

The orders that violate (2) are similar; they have the constraints:

o->o->o->o
      |   ^
      v   |
o->o->o->o

There are 10 of these. Since there are {8 choose 4} = 70 orders in all, 48 of them are recoverable.

Return to Top

Solutions for Section 19.2

Exercise 19.2.1(a)

                _____________________
                |                   |
           A    |                   v    B
     T_0 ----> T₁       T₂ ----> T₃ ----> T_f
     | |          A       ^         ^
     | |__________________|         |
     |              A               |
     |______________________________|

Above is the polygraph. We have labeled with A the arcs due to the reading of A, and likewise for B. The two unlabeled arcs are second-phase requirements that the other writers of B --- T₁ and T₂ --- not interfere with the reading of B. by the final ``transaction'' T_f.

Clearly, T₃ must come last in an view-serializable schedule, but there is no required ordering of T₁ and T₂. Notice that conflict-serializability would require T₁ to precede T₂, because their writes of B cannot swap places.

Exercise 19.2.2(a)

For conflict-equivalent schedules, the three writes must remain in order, but the three reads may be moved anywhere, as long as they precede the write by the same transaction.

Start with the three writes. We can only place r₁(A) prior to w₁(B). Next, we can place r₂(A) in any of the three positions prior to w₂(B). At the last step, there are now five places where we can put r₃(A) prior to w₃(B). Thus, there are 15 conflict-equivalent schedules.

For view-serializable orders, there are the above 15, and another 15 in which w₁(B) is placed after w₂(B). Intuitively, we can do so because neither T₁ nor T₂ have their value of B read, so the order in which they write doesn't matter.

Return to Top

Solutions for Section 19.3

Exercise 19.3.1(a)

All locks are granted until w₃(B). Now, T₃ has to wait for T₂, which has a shared lock on B, and the waits-for graph is

     T₃ ----> T₂

At the next step, T₂ must wait for T₁, which has a lock on C. The waits-for graph is:

     T₃ ----> T₂ ----> T₁

Next, w₄(A) causes T₄ to wait for T₁. There is still no cycle, and the waits-for graph is:

     T₃ ----> T₂ ----> T₁ <---- T₄

Finally, T₁ tries to write D. It cannot get the exclusive lock on D that it needs, because T₃ holds a shared lock on D. If we allowed T₁ to wait, the graph would be:

     T₃ ----> T₂ ----> T₁ <---- T₄
      ^                   |
      |___________________|

Thus, T₁ must abort and relinquish its locks. At that time, T₂ and T₄ can each get the locks they need. When T₂ finishes, T₃ can proceed.

Exercise 19.3.7

Suppose T₁ is a transaction that tries to lock elements A and B in that order. There are also an indefinitely long sequence of transactions T₂, T₃,... that lock B and then A.

Initially, T₁ locks A and T₂ locks B and then requests a lock on A. T₂ therefore waits for T₁. Meanwhile, T₃ starts and requests a lock on B. Thus, T₃ waits for T₂, and the waits-for graph is:

     T₃ ----> T₂ ----> T₁

When T₁ requests a lock on A, it would cause a cycle, so T₁ is rolled back. Now, T₂ completes and releases its locks. The lock on A is given to T₁ and the lock on B is given to T₃. Next, T₃ requests a lock on A, but has to wait for T₁. Also, T₄ starts up and requests a lock on B, thus being forced to wait for T₃. The waits-for graph is now:

     T₄ ----> T₃ ----> T₁

The above graph is just like the first, except T₄ and T₃ have replaced T₃ and T₂, respectively. As long as there is a supply of transactions like T₂, T₃, and T₄, they can prevent T₁ from finishing ever.

Notice that the transactions in the example above request locks on A and B in different orders. Perhaps forcing transactions to request locks in a fixed order will solve the problem. Indeed, if we follow the policy that when a lock becomes available, it is given to a waiting transaction on a first-come-first-served basis, then any transaction that is waiting for a lock will eventually become the transaction that has been waiting for the lock the longest. At that point, the next time the lock becomes available, it will be granted to that transaction, which thus makes some progress. The above observation is the germ of an inductive proof that requesting locks in order plus first-come-first-served will allow every transaction eventually to complete.

However, if the scheduler is able to give locks to any transaction it wishes, then it is easy to make a transaction starve. Suppose T₁ wants a lock on A when some other transaction has that lock. As long as there is always another transaction besides T₁ waiting for a lock on A, and the scheduler always chooses to give the lock to some transaction other than T₁ when the lock becomes available, T₁ never makes any progress.

Last, let us consider deadlock prevention by timeout. If the duration before a timeout can vary even slightly, then the example given initially for deadlock prevention by avoiding cycles will work. That is, even though T₂ has been waiting slightly longer for a lock than T₁, if we allow that T₁ might timeout first, then the sequence of events described above could also occur with timeout-based deadlock prevention.

If timeouts occur for any transaction at exactly t seconds from when it first started to wait, then we need a simple modification of the above example. Assume that T₁ is faster at requesting its second lock than any of the other transactions. Then, after T₁ gets a lock on A and T₂ gets a lock on B, T₁ next requests its lock on B and starts to wait. Slightly later, T₂ requests a lock on A. T₃ requests its lock on B just before T₁ times out, so when that timeout occurs, T₂ completes, gives its A-lock to T₁ and its B-lock to T₃. When T₂ next requests a lock on B and starts to wait, this cycle can repeat with T₃ and a new T₄ in place of T₂ and T₃.

Return to Top

Solutions for Section 19.4

Exercise 19.4.1

Define U to be the sum over all sites i of 10du_i. That is, U is the cost of sending one second's worth of updates to one site other than the site at which the update originated.

Now, consider the value of creating a copy of R at site i. On the positive side, the queries generated at site i that used to have to be answered remotely, at a cost of 10cq_i, can now be answered for cq_i, a positive benefit of 9cq_i.

However, the additional copy of R has to be updated. The cost is du_i for the updates generated at site i and U - 10du_i for the updates generated at all other sites. Recall that U represents the cost of a remote update for all generated updates, so if we subtract 10du_i we get just the cost of the updates that do not originate at i.

In order for the benefit of creating a copy at i to outweigh the additional update cost, we must have

9cq_i + 9du_i >= U

Thus, the optimal selection of sites at which to maintain copies of R is all sites for which the above inequality holds. Roughly, we favor those sites at which there is a lot of activity, but the larger U is (i.e., the greater the update rate), the less likely we are to want to create many copies.

There is one special case: what if the inequality fails for all i; that is, it is not cost effective to create any copies. As long as all the q_i's are zero, that's OK. The proverbial ``write-only database'' needs no copies! However, in realistic cases with a heavy update rate, we shall be forced to create one copy. The correct choice is whichever i has the largest value of cq_i + du_i.

Return to Top

Solutions for Section 19.5

Exercise 19.5.1(a)

There is a component, call it A at the home computer, and components at each of the banks --- components B and C. Component B receives the directive from A. B checks that there is adequate money in the account, and aborts if not. Otherwise, B subtracts $10,000 from the account at B and signals C to deposit $10,000 into the designated account at C. Component C checks that the account exists, and aborts if not. Otherwise, C adds $10,000 to the account and informs A of a successful conclusion.

Exercise 19.5.2(a)

(0,1,P), (0,2,P), (1,0,R), (2,0,D), (0,1,A), (0,2,A)

Exercise 19.5.2(b)

In the first phase, the coordinator exchanges the messages (0,1,P), (1,0,R) with site 1 and the messages (0,2,P), (2,0,R) with site 2. These messages may occur in any of (4 choose 2) = 6 possible interleavings. That is, there are four positions in which these messages occur, and we can pick any two of them to devote to the messages involving site 1.

In the second phase, the coordinator can send commit messages to the other two sites in either order. There are thus 6 * 2 = 12 possible message sequences.

Return to Top

Solutions for Section 19.6

Exercise 19.6.1(a)

Suppose that s, x, and i are the numbers of local shared, exclusive, and intcrement locks that a transaction needs to have a global lock of that type. Since no two transactions can hold an exclusive lock on the same element, we need:

2x > n

just as for the shared + increment system discussed in the section. We also cannot allow a shared and exclusive lock at the same time, which tells us:

x + s > n

Likewise, we cannot have an exclusive and increment lock at the same time, so:

x + i > n

Finally, we cannot have a shared and increment lock at the same time, so:

s + i > n

For example, we could require that any type of lock requires a majority. We cannot allow both s and i to be n/2 or less, because of the last inequality. But we could let one of s and i be 1, while the other, and x, are both n.

Exercise 19.6.2(a)

The 90% of the accesses that are read-only require no messages, since there is a lock table and a copy at each site. The remaining 10% of the accesses require exclusive locks. Thus, each requires three messages between the originating site and each of the four other sites, a total of 12 messages. The average number of messages is 1.2.

Return to Top

Solutions for Section 19.7

Exercise 19.7.1

The task of the compensating transaction is first to determine whether the file f is still the present one with that name, or whether it has been replaced with a later version.

If f is still in place, then the compensating transaction must
1. Restore f' if the latter existed, or
2. Delete f if not.
If f has been overwritten, then there is no need to change the file with that name.

To see that this compensation works, we can consider all the possible cases above, and check that the file system, after compensation, is the same as would be the case had f never been written. If f was overwritten before compensation, then we surely leave the file system in the same state as if f were never written. If f is not overwritten, then we have correctly left the file with that name being either f', if it existed, or left the file system with no file of that name.

There are two interesting issues. First, in order for there to be a compensating transaction, it seems we must both remember the time at which the installation occurred (so we can tell whether the file f has been overwritten when the uninstallation occurs), and we must also remember the file f', at least until such time as the file is overwritten, and the new value is itself guaranteed never to be uninstalled (i.e., the saga that wrote it has finished). These features are not commonly available.

Second, note that file systems do not support serializability. Thus, it is entirely possible that the uninstalled file f has been read, and its contents have influenced other files or the behavior of the system.

Return to Top