Question 4 Solutions

(a) The table for FinalAnswer should look like:
idNumbertypecageNumbername
1 Zebra 10 Ann
3 Monkey 10 Bob
4 Kangaroo 30 Chris
6 Monkey 40 Bob

(b) The query in part (a) asks for all animals (their id, type, and cageNumber) such that the animal has the same person as both its type keeper and its cage keeper (include that person's name in the tuple for the animal in the result).

Notes: Many people answered this along the lines of "All people who..." but this is not quite correct -- a query asking for all people that fulfill a certain condition would have each person appearing at most once in the result. This query could have the same person appear many times, but each animal could only appear at most once. Thus it was looking for all animals that fulfilled a certain condition.

(c) One possible correct sequence of relational algebra expressions is:

Animals2(idNum2, type2, cageNumber) := Animals
AllPairs := Animals2 JOIN Animals
Answer := SELECT_{type2 < type} (AllPairs)

Notes: Alternatively, you can use a theta-join, but then you have to be sure to project onto the correct attributes because otherwise you will get cageNumber twice in your result. Also, the select condition can alternatively be {type2 != type and idNum2 < idNumber}. A common mistake was to only have {type2 != type}, which will not prevent duplicate pairs from appearing in the results, of the form:

idNumbertypeidNum2type2cageNumber
1 Zebra 2 Monkey 10
2 Monkey 1 Zebra 10

The delta operator does not help in this case since the tuples are different syntactically. The problem is that they give the same semantic information.

(d) One possible correct sequence of relational algebra expressions is:

Counts(cageNumber, numAnimals) := GAMMA_{cageNumber,
                                         COUNT(idNumber)->numAnimals}
                                        (Animals)
Temp1 := Counts JOIN Cages
Temp2 := SELECT_{numAnimals > maxAnimals} (Temp1)
Answer := PROJECT_{cageNumber} (Temp2)

Notes: Like in part (c), you can use various alternatives involving a theta-join instead of a natural join. The important thing was the grouping and the join with cages. A few people did the join before the grouping, which only works if you then including maxAnimals as one of your grouping attributes. This is ok because each unique cageNumber will be associated with exactly one value for maxAnimals, so you get the same groups in your result. However, semantically this approach is a little odd, and it was easy to mess up by forgetting to include maxAnimals and then trying to use it later.

Error Codes:

Note: Parts (c) and (d) were supposed to be written in relational algebra. There were several clues to this in the problem statement. At the beginning of problem 4, it stated that answers could be written as a sequence of assignment statements or a complex expression. These terms do not apply to SQL. Furthermore, part (a) being in relational algebra should have been a large clue. Finally, in part (d) it stated that you could use the extended relational algebra. Clearly we would not have said that unless we expected the answer to be relational algebra. We took off 3 points if the answer was in SQL instead of relational algebra, but if both (c) and (d) were in SQL, we did not deduct twice.

Part a:
4A -1 Mistake in single row
4B -2 Missing entire row
4C -3 Missing two rows
4U -5 Only has schema
4V -4 WAY more tuples than should be in result

Part b:
4D -1 Partially incorrect answer
4E -2 Wrong answer

Parts c and d:
4F -3 Wrote SQL. Only took off once if did SQL on both.
4M -5 Hardly anything there (just a join or just a project, etc)
4I -1 Incorrect schema for result
4S -1 Mixes a bit of SQL syntax in with the relational algebra, or just incorrect relational algebra syntax

Part c:
4G -1 No renaming
4H -2 Doesn't ensure different types
4J -2 Doesn't ensure no duplicate pairs
4K -3 If has 4H and 4J (b/c related - if solve one could solve other)
4L -2 Doesn't ensure same cage

Part d:
4N -2 No join between Cages and Animals
4P -1 Have less than max instead of greater than
4Q -1 Incorrect application of gamma/count/select
4R -2 No attempt to select for correct condition
4T -2 No use of gamma (or subquery if in SQL)