# Assignment #8 Due Wednesday December 4

This assignment will be graded out of 40 points.

Important note: We plan to distribute a sample solution for this assignment as soon as possible in order to help students prepare for the final exam. Therefore, for on-campus students, the late deadline for this assignment is Thursday at 5:00 PM. For SITN students, there is no late deadline for this assignment--SITN assignments must be sent with the Thursday courier at the latest.

1. (2 points)
Consider a relation R(A,B,C). Suppose R contains the following four tuples:
 A B C 1 2 3 1 2 4 5 2 3 5 2 6
List all nontrivial multivalued dependencies that hold on R.

2. (3 points: 2 for part (a) and 1 for part (b))
Consider a database for a university that may include information about students, courses, professors, etc.

(a)
Specify an example relation for this database and a set of functional and multivalued dependencies over the relation such that the relation is in Boyce-Codd Normal Form (BCNF) but is not in Fourth Normal Form (4NF). Your relation need not be extensive (i.e., it need not include many attributes and it may represent only a very small part of the complete university information), but the schema and the dependencies should be realistic in their modeling of the real world. You should make no assumptions besides those captured by the schema and the dependencies.

(b)
Decompose your relation from part (a) so that the new schema is in 4NF.

3. (3 points)
The course reader erroneously states that the combining rule for functional dependencies does not hold for multivalued dependencies. In fact it does hold, and in this problem you are to formally prove it. The combining rule for multivalued dependencies is stated as follows:
If A1,...,An -->> B1,...,Bm and A1,...,An -->> D1,...,Dj hold for R
then A1,...,An -->> B1,...,Bm,D1,...,Dj holds for R.
You are to provide a formal proof of this rule based on the formal definition of multivalued dependency, roughly in the spirit of the proof you gave for Problem 5 on Assignment #7. The formal definition of a multivalued dependency is repeated here for your convenience:
A1,...,An -->> B1,...,Bm holds for R iff:
for all tuples t and u in R such that t[A1,...,An] = u[A1,...,An]
there exists a v in R such that:
(1)
v[A1,...,An] = t[A1,...,An],
(2)
v[B1,...,Bm] = t[B1,...,Bm], and
(3)
v[C1,...,Ck] = u[C1,...,Ck]
where C1,...,Ck are all attributes in R except those in {A1,...,An} UNION {B1,...,Bm}.
4. (1 point)
There is a rule (page 117 of the course reader) that states ``every functional dependency is a multivalued dependency.'' Prove that the converse is not true - it is not true that every multivalued dependency is a functional dependency. Your proof should follow roughly the same form as your proof for Problem 6 on Assignment #7: you should provide the simplest counterexample you can find for which the statement does not hold. Be sure to give a relation schema, a multivalued dependency A1,...,An -->> B1,...,Bm, and a relation instance (set of tuples) for which the multivalued dependency holds but the corresponding functional dependency A1,...,An --> B1,...,Bm does not hold. As an additional requirement, the multivalued dependency in your example should be a nontrivial one.

5. (10 points: 2 per part)
Consider the following ODL schema, which is a modified version of the schema used in Problem 5 on Assignment #2.
```  interface MedStudent {
extent MedStuds;
key ID;
attribute integer ID;
attribute Struct<string first, string last> name;
relationship Set<Internship> applied-to
inverse Internships::applied-for; }

interface Internship {
extent Positions;
key (hospital-name, city);
attribute string hospital-name;
attribute string city;
relationship Set<MedStudent> applied-for
inverse MedStudent::applied-to; }```
Write OQL queries for each of the following.

(a)
Find the ID of all medical students whose first name is Mary.

(b)
Find the ID and last name of all medical students who have applied to an internship at a hospital in San Francisco. Do not repeat (ID,last-name) pairs in the result, even if the student has applied to many internships in San Francisco.

(c)
If you used distinct in your answer for part (b), rewrite the query so you don't need to use distinct. Conversely, if you didn't use distinct in your answer for part (b), rewrite the query so you do need to use distinct in order to guarantee that duplicates are eliminated.

(d)
Find the name of all hospitals in San Francisco such that a medical student with ID=25 has applied for an internship at that hospital, and all internships the student has applied for are in San Francisco or San Jose.

(e)
Recall that the result of an OQL query or subquery is a set or a bag. OQL allows two sets (bags) to be compared using =, where two sets (bags) are equal if they contain exactly the same objects. Find all pairs of medical student ID's such that the two students have applied to internships at the same set of hospitals in San Francisco. (The students may have applied to different internships at hospitals in other cities.) Return each pair of ID's exactly once.

6. (8 points: 2 per part) Consider for one last time our favorite relational schema, almost unrecognizable now because it's been converted to use SQL3 row types.
```   create row type AddressType (street string, city string)
create row type CompanyType (name string, city string)

create table lives of type PersonType
create table located of type CompanyType
create table works (person ref(PersonType), company ref(CompanyType),
salary integer)
create table manages (employee ref(PersonType), manager ref(PersonType))```
Write SQL3 queries for each of the following. You may assume that person and company names are unique, that all people have exactly one address, one job, and one manager, and that all companies are located in exactly one city.

(a)
Find the names of all people who live in Palo Alto.

(b)
Find the names of all people who work for IBM.

(c)
Find the names of all people who live in the same city as the company they work for and earn a salary of at least \$50,000.

(d)
Find the names of all IBM employees who live in the same city and on the same street as their manager.

7. PDA: Written (4 points: 2 each for parts (b) and (c))

(a)
Remind us of the relational schema S you're using for your PDA.

(b)
For each relation R in S: Are there any nontrivial multivalued dependencies that hold on R? If so, decompose R into smaller relations so that each relation is in 4NF.

(c)
Modify your schema to use SQL3 row types. Use at least one instance of a row type in order to create a structure-valued attribute, and at least two instances of references to row types.

8. PDA: Programming (9 points: 3 for part (a) and 6 for part(b))

This week you'll experiment with using stored procedures for interaction with your personal database.

(a)
Pick your favorite SQL query or modification over your PDA database from Problem 4 on Assignment #4. Using sqsh, create a parameterized stored procedure for the SQL statement and invoke it several times with different parameters. For guidance, please see Handout #23: ``Embedded SQL, Library SQL, and Stored Procedures in Sybase,'' the section ``Using sqsh'' under ``Stored Procedures.'' Turn in a script showing that you have done the required work.

(b)
Convert your application program from Problem 8(b) on Assignment #7 to use stored procedures to interact with the database server, as described in the handout ``Embedded SQL, Library SQL, and Stored Procedures in Sybase,'' the section ``Using DBLib'' under ``Stored Procedures.'' Convert at least one SQL query and one SQL modification in your code to a stored procedure. Turn in your C code along with a script showing an interaction with your program in which stored procedures are invoked.