# CS145 Final Exam Solutions

### Problem 1

Here is an E/R diagram. The error codes used were as follows:

A: No roles on the relationship ``connected continents.'' (-1) This was an amazingly common error.

B: Creating entity sets for ``oceanic islands'' and ``islands near a strait.'' (-1) These concepts don't explicate anything and can be considered unnecessary complexity.

C: Use of rounded arrows in situations where they are not supported by the problem statement. (-1)

### Problem 2

The following penalty applies to all queries:

A: Query not succinct -1

```(a)
FROM Drinkers AS D, Frequents AS F, Likes AS L, Sells AS S
WHERE D.name = L.drinker AND
D.name = F.drinker AND
S.beer = L.beer AND
S.bar = F.bar AND
S.bar = 'Joe''s Bar'

This got full credit, but a more succinct query would look like

FROM Drinkers AS D,
((Frequents NATURAL JOIN Likes) NATURAL JOIN Sells) AS F
WHERE D.name = F.drinker AND
F.bar = 'Joe''s Bar'

(b)
DELETE FROM Drinkers
WHERE phone LIKE '(650) ___-____'

phone LIKE '(650)%' also got full credit

(c)
SELECT price, COUNT(DISTINCT bar)
FROM Sells
GROUPBY price

B: Using SELECT in the FROM clause -2
C: Not using DISTINCT -1
D: Unnecessary join with an irrelevant relation -2

(d)
CREATE ASSERTION DoesnotDrinkAtHome CHECK
(NOT EXISTS
(SELECT *
FROM Drinkers AS D, Frequents AS F, Bars AS B
WHERE D.name = F.drinker AND
F.bar = B.name AND

(e)
CREATE TABLE Bars (
name VARCHAR(20) PRIMARY KEY,
license CHAR(4) CHECK (licence IN ('full', 'beer'))
)

B: Using something other than CHAR(4) for license -1
C: Not marking PRIMARY KEY or UNIQUE -2
D: Not putting the constraint for license -2

(f)
INSERT INTO Bars(name)
SELECT DISTINCT bar
FROM Frequents
WHERE bar NOT IN (SELECT name FROM Bars)

B: Not using DISTINCT for bar -1
C: Adding own default values -1
```

### Problem 3

NOTE: Due to the html format, we use 'SIGMA_{C}(R)' to express select from R on condition C. Similarly, 'PI_{A}(R)' projects R on A. 'R JOIN S' returns the natural join of R and S. 'R INTERSECT S' returns the intersect of R and S. The correct symbols are in the text book.

(a) [5pt]
Ans(name1, name2) := SIGMA_{name1<name2}(D1 JOIN D2)

(b) [5pt]

BBars(bar) := PI_{name}(Bars)
Ans(bar) := PI_{bar}(Frequents) INTERSECT PI_{bar}(Sells) - BBars

(c) [5pt]

SallyLikes(beer) := PI_{beer}(SIGMA_{drinker = 'Sally'}(Beers JOIN Likes))
NotSallyLikes(beer) := PI_{beer}(Sells) - SallyLikes
Ans(bar) := PI_{bar}(Sells) - PI_{bar}(Sells JOIN NotSallyLikes)

Error Codes:

A (-1): For question (b), no renaming for PI_{name}(Bars)
B (-4): For question (c), using 'Sell JOIN SallyLike'. It returns "bars that sell some beer that Sally likes".
C (-1): For question (c), using 'PI_{name}(Bars) - PI_{bar}(Sells JOIN SallyLike)'. It returns "bars that don't sell any beer that is not what Sally likes", and may include bars that doesn't sell beers at all)
D (-5): Logic confusion
E (-1): For question (b), using 'UNION' instead of 'INTERSECT'.
F (-3): For question (c), using theta join on Sells.beer <> SallyLike.beer. This returns the right answer only when sally only likes one beer.
G (-3): For question (c), querying for bars selling beers that only Sally (no other)likes.
H (-5): Writing SQL instead of relational algebra
I (-2): For question (b), using '...Bars... - ... Frequent....' It returns "bars mentioned in Bars, not in Sells and Frequents".
J (-1): For question (a), using 'name1 <> names or name1 <= name2'
K (-1): Unnecessary complicated and inefficient answer (e.g., include multiway join when not necessary)
M (-4): For question (b), using theta join on Frequents.bar <> Bars.name
N (-3): For question (c), using 'Sells - Sells JOIN SallyLikes' to express "bars selling beers other than Sally like". It actually returns "bars that doesn't sell any beer that Sally likes".

### Problem 4

(a) [5pt]
Ans(n1, n2) <- Drinkers(n1, a, p1) AND Drinkers(n2, a, p2) AND n1 < n2.

(b) [5pt]

BBars(b) <- Bars(b, _, _).
Ans(b) <- Frequents(_, b) AND Sells(b) AND NOT BBars(b).

(c) [5pt]

SellOtherBeer(bar) <- Sells(bar, beer, _) AND NOT Likes('Sally', beer).
Ans(bar) <- Sells(bar, _, _) AND NOT SellOtherBeer(bar).

Error Codes:

A (-1): For question (b), using 'NOT Bars(bar, _, _)' or 'NOT Bars(bar, v1, v2)'. It is a violation of rule safty.
B (-4): For question (c), writing a query for "bars that sell some beer that Sally likes".
C (-1): For question (c), writing a query for "bars that don't sell any beer that is not what Sally likes". It include bars that doesn't sell beers at all.
D (-5): Logic confusion
E (-2): In correct logic for negation in datalog, even if it is safe.
F (-3): For question (c), using 'Sells(_, beer1) AND Likes('Sally', beer2) AND beer1 <> beer2'. It returns the right answer only when sally only likes one beer.
G (-3): For question (c), querying for bars selling beers that only Sally (no other) likes.
J (-1): For question (a), using condition on 'name1 <> name2 or name1 <= name2'
M (-4): For question (b), using 'Ans(bar1) <- Frequents(bar1, _) AND Sells(bar1, _) AND Bars(bar2, _, _) AND bar1 <> bar2'. It returns the wrong answer as long as Bars mentions more than one bar.

### Problem 5

(a) AC, AD, and AE are the keys.

(b) They all do!

(c) Only A->B.

(d) E->C and AC->E. Error code A (-3) was for forgetting the latter, which is the only hard part of the problem.

### Problem 6

The key is NYYYN.

### Problem 7

I had intended that people would observe A->->B and, with A->->C and the union and complementation rules discover that A->->X, where X is any subset of BCD. Of course the empty set and X=BCD are trivial, and A->->C was already given, leaving five more multivalued dependencies to discover.

However, some people were too smart for me. They realized that the problem statement allowed left sides that had more than one attribute and came up with dependencies like AB->->C or AC->->B. Those answers were accepted, of course. However, some people offered multivalued dependencies that did not have A on the left, such as B->->C. These are not correct. In proof, here is a relation that satisfies the given dependencies A->B and A->->C, but no multivalued dependency whose left side is B. A similar example can refute any other claim of a nontrivial MVD that doesn't have A on the left.

ABCD
a1bc1d1
a1bc1d2
a1bc2d1
a1bc2d2
a3bc3d3

### Problem 8

```(a)

(b)

(c)
ManyMany(x,y) <- SomeCon(x,y) AND NOT ManyOne(x,y)
AND NOT ManyOne(y,x)

(d)
ManyMany: 1, SomeCon: 0, ManyOne: 0
```
Since the strata are all finite, my answer to (c) is stratified.

Error Codes:

A: Using LRT as a subgoal in rules (-1). Link already guarantees that its first and third arguments are LRT's, so the appearance of the LRT predicate on the exam was an intensional red-herring.

B: Using OR as an operator in rules (-2).

C: In part (b), forgetting to include reverse links (-1 each occurrence).

D: Only allowing in part (b) paths that had all forward or all backward links (-3).

E: In part (c), omitting the second AND NOT (-2).

F: In part (b), allowing only paths that consist of forward links followed by only backward links (-3). In some cases, the limitation was all backward followed by all forward. Most people who did this probably won't believe they made a mistake until they try it out. Intuitively, if none of your recursive rules have a variable that is shared (like z in my solution) and appears in the same position (first or second) in two subgoals, then you have this limitation. Test out your rules on a Link relation that consists of the following two tuples: (a,l1,b) and (c,l2,b). Do you get SomeCon(a,c) and SomeCon(c,a)? You should. If that succeeds, try the same test on the Link relation with just the two tuples (a,l3,b) and (a,l4,c).

### Problem 9

Here is the Stratum graph. The error codes:

A: Missing minus sign (-2). The most common error was not realizing that SQ3 depends nonmonotonically on SQ2. To see why, suppose that some new tuples were somehow added to the result of SQ2 (line 6). Then the test x<ALL... on line 5, which might have succeeded for a given x, no longer succeeds, because one of the new values from SQ2 is less than x.

B: Interchanging SQ2 and SQ3 (-1).

### Problem 10

EXEC SQL BEGIN DECLARE SECTION; int a, b; EXEC SQL END DECLARE SECTION; EXEC SQL DECLARE c CURSOR FOR SELECT * FROM R; EXEC SQL OPEN CURSOR c; while (1) { EXEC SQL FETCH FROM c INTO :a, :b; if (NO_MORE_TUPLES) break; printf("%d\n", (a>b) ? a : b); } EXEC SQL CLOSE CURSOR c;

Common errors and key for codes:

• A: Forgot to close the cursor. (-2)
• B: Errors in declaring variables, such as not putting a and b between EXEC SQL BEGIN/END. This is used for variables that are accessible to both SQL and C. (-3)
• C: FROM is missing in the FETCH line. (-2)
• D: Forgot to open the cursor. (-2)
• Forgot CURSOR in the OPEN or CLOSE cursor line. (-1)

### Problem 11

(a)
SELECT DISTINCT a.ownedBy.name, a.ownedBy.addr FROM Autos a WHERE a.brand = "Chevrolet"
(b)
SELECT DISTINCT a.brand From Owners o, o.owns a WHERE o.name = "Sally"
(c)
SELECT a1.serial#, a2.serial# FROM Owners o, o.owns a1, o.owns a2 WHERE a1.serial# < a2.serial#

Common errors and key for codes:

• A: Not a collection in FROM or quantifier clause. For instance, in "FROM Autos a, a.ownedBy o" a.ownedBy is not a collection. (-2)
• B: Can be simpler / more efficent. (-1)
• C: Is a collection. For instance, in "SELECT a.owns.name", where a is an Autos object, a.owns is a set of Owners, so a.owns.name doesn't make sense. (-2)
• D: Use " instead of ' for quotes in OQL. (-1)
• E: Forgot object and just used attribute. For instance, "SELECT name FROM OWNERS o". (-2)
• F: Not legal OQL (-3)
• G: Misnamed an attribute or relationship. (-1)
• H: used != instead of < or > in part c. (-1)
• I: Forgot DISTINCT keyword in part b. (-2)