# CS145 Final Exam Solutions

### Index

 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33

### Problem 1: (d)

I threatened to put this on the exam, and here it is problem #1. In general, ALL is a more stringent requirement than ANY, so we expect that Q1 is contained in Q2. However, if SELECT c FROM S is empty, then it is impossible to satisfy ANY, while ALL is trivially satisfied. Then, Q2 is contained in Q1. Since both containments could be proper, the queries are different.

### Problem 2: (d)

Notice that the FROM- and WHERE-clauses of the queries are the same when you swap r1 and r3. Thus, they will produce the same answers but with the columns and column-headers swapped. To see that the relations are different, we have only to find an example where the result is not the same when columns are swapped. For example, if we take R(a,b) = {(0,1), (2,1), (2,3)}, then the answer to Q1 is {(0,2), (2,2), (2,2)} and the answer to Q2 is {(2,0), (2,2), (2,2)}. When computing the answer, don't forget to consider the possibility that some of r1, r2, and r3 may refer to the same tuple.

Note: several people answered (a) but pointed out that the results differed in the order of columns. While this difference is significant (try substituting one of these queries for the other in an INSERT statement, for example) I accepted that answer, since it indicated that you had understood what the queries were doing.

### Problem 3: (b)

Q1 produces those (x,y) pairs from R such that x is not in S and y is not in T. Q2 produces those (x,y) pairs from R such that it is not true that both x is in S and y is in T; i.e., either x is not in S or y is not in T. Thus, everything produced by Q1 is also produced by Q2.

### Problem 4: (c)

Q1 produces a-values that appear in some tuple of R and some tuple of S. Q2 produces a-values that appear in the same tuple in R and S. Thus, everything in Q2 is surely in Q1.

### Problem 5: (a)

These both produce the set of C's that are related to a B that is related to an A named Sue.

### Problem 6: (d)

This one is so tricky that I actually had the wrong answer, and didn't discover my error until I surveyed a random sample of answers and saw that people were preferring another answer. Q1 produces the set of C's that are related to a B that is related to an A named Sue. However, because of the way it is stated, a C that is related two n B's that are each related to an A named Sue will appear in the output bag n times. Notice that aName is not a key for A, so there is nothing wrong with there being many A's named Sue.

Q2 is more liberal than Q1 in that it will produce a C in the answer as long as there is any B related to an A named Sue. Thus, as sets, Q1 is contained in Q2. However, OQL produces bags, and Q2 can never produce the same C object more than once. Q1, however, can produce it many times. To see that Q1 and Q2 are different, consider the following objects:

1. There are A objects a1 and a2.
2. a1 is named Sue; a2 is not.
3. There are B objects b1, b2, and b3, and C objects c1 and c2.
4. b1 is related to a1 and c1; so is b2. b3 is related to a2 and c2.

Then Q1 produces c2 not at all, but produces c1 twice --- once for the combination cc = c1 and bb = b1, the second time for the combination cc = c1, bb = b2. On the other hand, Q2 produces c1 once and c2 once.

### Problem 7: (c)

Another subtle one. Roughly, each query produces the bag of integers that are the counts of the sets of B's that are related to a single A. The difference is that Q1 also may produce some 0's if there are A's that are related to empty sets of B's. However, Q2 never looks at those A's and only produces the counts of the nonempty groups of B's.

### Problem 8: (a)

These queries each produce the set of pairs of nodes between which a path exist. It is more traditional to write the recursive rules with the order of subgoals reversed, e.g.,

```     Path(x,y) < Arc(x,z) & Path(z,y)
```

in Q1.

### Problem 9: (a)

Each of these modifications has the effect of replacing every tuple of the form (a,3) for some a by the tuple (a,2). If such a tuple appears more than once in R, the count is preserved as well.

### Problem 10: (c)

Q1 finds tuples where the a-component has the substring NULL. However, Q2 only finds tuples where that substring appears at either the beginning or the end.

### Problem 11: (c)

Q1 produces the largest element of R as many times as it appears in R. Q2 produces the same element, but only once.

### Problem 12: (a)

These are the same. Each groups by a and produces the set of all sizes of these groups. A count is produced only once, even if it is the size of several groups. Notice that it doesn't matter whether we count tuples, as in Q1, or count the values of a component of those tuples, as in Q2. Even if a group has duplicate values of b, each such value is counted. If we wanted duplicate b's counted only once, we would have to say

```     SELECT DISTINCT COUNT(DISTINCT b)
```

### Problem 13: (d)

In the E/R approach, the relation for B has only the attributes associated with B itself plus the key attribute(s) for the isa-hierarchy, which must be found at the root.

### Problem 14: (b)

Answer (a) is the schema for objects of type B; (c) is the schema for objects that are both B's and C's, and (d) is the schema for objects of type A that are neither B's nor C's.

### Problem 15: (b)

The relation for B has only the attributes of B, that is, B(c,d). The relation for A has the key for A, including the attribute it ``borrows'' from B, plus the other attributes for A, yielding A(a,b,c). There is no relation constructed for the supporting relationship R.

### Problem 16: (d)

It is entirely possible that (0,1) pops up in the middle of T2's execution, because although the world has to look serializable from T1's point of view, the same is not true of T2. In fact, since T2 does not modify the database, the world will look serializable to T1 no matter when the two transactions execute.

However, REPEATABLE READ guarantees that if T2 has seen a tuple once, then it will always see it. In particular, having seen (2,3) in answer to the first query of T2, that tuple must be there in response to the second query (which it also satisfies), even if T1 has deleted it and committed. Answer (d) is the only one where the second response is not a superset of the first.

### Problem 17: (a)

An easy one: when the privilege is revoked from B, the CASCADE requires that it be revoked from C as well, because the only way C got that privilege was through B.

### Problem 18: (c)

Answer (a) is wrong because an attribute-based check cannot prevent deletions. Answer (b) is wrong because the foreign-key constraint is in the wrong direction. However, (c) is exactly what is needed, since the required constraint is a statement of what this foreign-key constraint means.

### Problem 19: (d)

Answer (a) doesn't make sense; you cannot apply SUM in the WHERE-clause (except in subqueries). Answer (b) doesn't even address the limit of \$1,000,000. Answer (c) is not quite right. If an employee is the manager of two departments, then in the group for the department of that employee in Emps, that employee's salary counts twice. Answer (d) is exactly what is needed. It groups employees by department, sums their salary by departments, and checks for groups with a sum over \$1,000,000.

### Problem 20: (a)

Nothing much to this question. UNKNOWN is the truth-value of any comparison involving NULL.

### Problem 21: (c)

Notice that the INSERT statement is executed 11 times, for i = 1,2,...,11. The exit occurs after the insertion on the 11th round. Thus, (a) is wrong and (c) is a correct statement. Clearly (b) is wrong, since only teams with scores at most 11 are ever extracted from Teams or put into NewTeams. Answer (d) is wrong because the effects of committed modifications on the database are permanent. Note that any successful PL/SQL statement is committed at the end, even if you don't say COMMIT anywhere.

### Problem 22: (c)

While it is helpful if score is a key for teams, all that is really needed is that there be exactly one team for each of the first 11 scores (1 through 11). We could, for example, have several teams with a score of 12 and yet the PL/SQL would execute successfully.

### Problem 23: (a)

The five (a,b,c,d) tuples in the natural join are (0,1,2,3), (4,5,2,3), (4,5,6,7), (4,5,10,11), and (4,5,10,3).

### Problem 24: (b)

First, let us compute the outer join of R and S. The natural join has tuples (0,1,2), (4,5,2), (4,5,6), and (4,5,10). Now, (8,9) is dangling in R, and (13,10) is dangling in S, so we add (8,9,NULL) and (NULL,13,10).

Then, we must take the outer join of this relation of six tuples with relation T. We get the five tuples in the answer to Question 23, of course. However, tuple (NULL,13,10) joins with (10,11) and (10,3) of T to give us tuples (NULL,13,10,11) and (NULL,13,10,3). In addition, (8,9,NULL) is dangling, since the NULL cannot match any value in T, even another NULL. Thus, we add (8,9,NULL,NULL), a total of eight tuples.

### Problem 25: (d)

Answers (a) and (b) are wrong, since REF cannot be applied to an attribute name by itself. Rather, it must be applied to a value, which is a component of a specific tuple. Answer (c) is wrong because THE only applies to values of table type. Answer (d) works; I tried it.

### Problem 26: (c)

Think about the five attributes arranged in a circle, with A following E. Then the five FD's say that any two adjacent attributes functionally determine the next around the circle (clockwise, say). That tells us that any three attributes must have two adjacent ones on the circle, those determine the attribute clockwise, which (together with one of the original three) determines the next, and so on. That is, any three or more attributes is surely a superkey.

However, the five pairs of attributes that are not adjacent on the circle (the five pairs that are not left sides of given FD's) are not keys; in fact they don't determine any attributes except themselves. Surely the empty set and the five singletons are not superkeys either. Thus, there are 11 subsets of {A,B,C,D,E} that are not superkeys, and the remaining 32 - 11 = 21 are superkeys.

### Problem 27: (b)

Here's the trick. No FD derived from the given FD's can have a singleton left side. Thus, any projected FD will have to have at least two attributes on the left, and in order to be nontrivial it will have to have a third attribute on the right. Thus, in any three-attribute schema, the left side of a nontrivial FD is surely a superkey, since it determines three attributes.

### Problem 28: (d)

Answer (a) isn't even true for sets. In fact, the left side is the complement of the right side. Answer (b) is true for both sets and bags. For bags, both sides contain an element x as many times as it is in S plus twice the number of times it is in R.

Answer (c) likewise holds for both sets and bags. For bags, each side contains x the minimum of the number of times it appears in R and S. That count is true for the left side by definition of bag intersection. To see it holds for the right side, let x appear r times in R and s times in S. If r <= s, then x appears r - 0 = r times on the right. If r > s, then x appears r - (r - s) = s times on the right. In either case, the number of appearances of x is the smaller of r and s.

However, (d) is true for sets but not bags. It is easy to check for sets using Venn diagrams or similar reasoning. However, for bags, let x appear once in R and once in S. Then x appears twice on the left and only once on the right.

### Problem 29: (b)

If two tuples that agree on A and disagree on B remain in the relation, then there is sure to be a violation of the MVD. To see why, not that if we swap different B's, we change by one the number of 1's in each tuple. Thus, since all tuples of R have an even number of 1's, the tuples with B's swapped cannot be in R. Therefore, there can be only two of the four tuples with A = 0 remaining and only two of the four tuples with A = 1. Should there be three from either group, then two would have to differ in B. We conclude that 4 tuples is the maximum we can leave.

### Problem 30: (b)

If we put one attribute on the left side of a FD, say A -> B, then we can surely find two tuples in R that agree on A and disagree on B. All we need to know is that after fixing A, we can choose C to be either 0 or 1 so that the sum of the number of 1's among the A, B, and C components is even. Then choosing D = 0, we can guarantee that the resulting tuple is in R. Note that this argument applies equally well to any three attributes, so we see that given any attribute on the left and any other on the right, we can find in R two tuples that agree on the left and disagree on the right. That is, this FD is not satisfied.

Similarly, if we pick any two attributes on the left and a third on the right, we can fix the two on the left and find two tuples that disagree on the right. For each tuple, pick the fourth attribute to be 0 or 1, whichever makes the number of 1's in the tuple even.

However, if there are three attributes on the left, then R has only one tuple with any given combination of values for those attributes, and therefore the FD must be satisfied. Thus, only the four FD's with three attributes on the left are satisfied.

### Problem 31: (d)

You don't use the colon in the WHEN clause, but you must in the body of the trigger.

### Problem 32: (d)

This turned out to be surprisingly hard. Answer (a) is incorrect, because it is not a model. If you set x = 3 in the second rule, you violate that rule. That is, the subgoals of the body, B(3) and NOT P(3), are both true, yet the head, Q(3), is false. Similarly, (b) is incorrect. If you set x = 2, you violate the first rule. Answer (c) is the stratified model, so that is not correct.

However, (d) is a model other than the stratified model. Even though Q(2) makes no ``sense,'' it doesn't violate anything. The second rule, with x = 2, has a true head, and thus cannot be false. The third rule with x = 2 has a false body, and thus also cannot be false. You may check that all the rules are rendered true by each possible value of x.

### Problem 33: (a)

The last problem also fooled a lot of people. I suspect you were calculating averages so fiercely that you forgot to check for a primary-key violation. The sequence of events is as follows:

1. The insert of Fred succeeds, because his salary is less than the current average of 20,000. The new average salary is 18,000, and the total is 72,000.

2. The update of Sue's salary is rejected because it is higher than the average.

3. The insertion of a tuple for Sally is rejected because there is already a tuple with the key value 'Sally'.

4. The deletion of Joe's tuple succeeds; neither of the constraints on Foo affects a deletion. The total salary is decreased by Joe's 20,000 salary, to 52,000.