CS145 - Spring 2002
Introduction to Databases

Assignment #6 -- Due Wednesday May 22

The procedure for turning in this assignment, the late policy, and the grading is exactly the same as for Assignment #1.

Exercises

This problem is a restatement and extension of Exercise 8.6.3 in the textbook (page 409). Consider the four operations in parts (a), (b), (c), and (d) of Exercise 8.6.1 in the textbook (page 409). For this problem you need only consider the English descriptions of the operations - you need not rewrite them in SQL unless it helps you.
(a) Suppose a transaction T executes the operation in part (a) of Exercise 8.6.1, and any number of other transactions may at the same time be executing any number of any of the four operations (a), (b), (c), and (d). Briefly describe in English a behavior of T's operation that can occur if all transactions are running with isolation level READ UNCOMMITTED that cannot occur if all transactions are running with isolation level SERIALIZABLE. Note that in coming up with such a behavior, if it helps you may specify a scenario in which some transactions commit and others rollback.
(b) Same as part (a) except suppose transaction T executes the operation in part (b) of Exercise 8.6.1.
(c) Same as part (a) except suppose transaction T executes the operation in part (c) of Exercise 8.6.1.
(d) Same as part (a) except suppose transaction T executes the operation in part (d) of Exercise 8.6.1.
(e) through (h) Same as parts (a) through (d) except suppose the transactions are running with isolation level READ COMMITTED instead of isolation level READ UNCOMMITTED.
Answer any number of the subparts (including zero) of Exercise 8.7.1 in the textbook (page 421). These problems are very easy; the hardest part is finding the figures or exercises referred to in the problem.
Give an example of two simple relation schemas R1 and R2 where the PRIMARY KEY of R1 is also a FOREIGN KEY referencing the PRIMARY KEY of R2, and the PRIMARY KEY of R2 is also a FOREIGN KEY referencing the PRIMARY KEY of R1. You may choose any application domain you like, but the relations you choose and the pair of referential integrity constraints should make sense as representations of the real world. Do you foresee any problems loading your relations? How about updating them?
Recall that materialized views are an alternative to the standard virtual views supported by most database systems. In a materialized view, the contents of a view are computed and the result is stored in a database table, so that a reference to the view in a query can simply access the stored table. However, when contents of a base relation referenced in the view change, then the contents of the materialized view must be modified accordingly. For example, consider a base relation R(A,B), where A and B are of type integer and A is a key for R. A materialized view V that contains those tuples of R satisfying R.A > 5 can be created as follows:
```
  CREATE TABLE V (A int PRIMARY KEY, B int)
```
Initially, V is populated using the following SQL statement:
```
  INSERT INTO V
    SELECT * FROM R WHERE A > 5
```
Now when we refer to view/table V in queries we obtain the desired result.
If an INSERT statement is executed on R, then V must be modified accordingly. This behavior can be implemented using a trigger:
```
  CREATE TRIGGER VinsR
  AFTER INSERT ON R
  REFERENCING NEW TABLE AS NT
  INSERT INTO V
    SELECT * FROM NT WHERE A > 5
```
(a) Write another trigger VdelR to modify V after tuples are deleted from R.
(b) Write another trigger VupdR (or multiple triggers, if appropriate) to modify V after tuples in R are updated.
(c) Write SQL statements and triggers to create, populate, and maintain a materialized view V that projects columns A and B from a base relation R(A,B,C). You may assume that all three attributes are of type integer and that A is a key for R.
(d) (Optional - answer is long) Write SQL statements and triggers to create, populate, and maintain a materialized view V that is the natural join of base relations R(A,B) and S(B,C). You may assume that all four attributes are of type integer and that R.A and S.C are keys for R and S respectively.
Write a realistic row-level trigger that uses all four REFERENCES options - OLD ROW, NEW ROW, OLD TABLE, and NEW TABLE - to achieve its desired effect. Can the same effect be achieved using a row-level trigger with OLD ROW, and NEW ROW only? Using a row-level trigger with OLD TABLE and NEW TABLE only? Using a statement-level trigger?

Challenge Problems

This assignment is a bit long, so feel free to pick and choose from the challenge problems. Finding a good solution to 2 of the 3 challenge problems is easily worthy of a plus.

There's a type of database security that was not covered in class or in the textbook called "statistical authorization." With statistical authorization, some users may be permitted to retrieve only aggregate information from the database, e.g., average salaries but not individual salaries. Furthermore, to prevent users from applying aggregates to very small numbers of tuples (such as the average of one salary!), the system requires that a certain minimum number of tuples contribute to each aggregate result. Finally, to prevent the user from using intersection properties to deduce a single value (e.g., the user could ask for X=sum(A1,A2,A3,A4,A5), then ask for Y=sum(A2,A3,A4,A5), then compute X-Y to deduce the value of A1), the system may require that the tuples participating in multiple queries issued by the same user have a small intersection. In this problem you will explore how, even with such security measures, specific information can still be deduced from the database.
Here's the problem. Consider the simple relation Student(ID,GPA), where ID is a key. Suppose that, for security reasons, the following restrictions are made on user U's set of queries against this relation:
- The result of every query must be a single aggregate value - a SQL aggregate function applied to one of the attributes of the relation.
- At least 4 different tuples must be used in the aggregate to produce each query's result.
- For any two queries issued by user U, the sets of tuples used to produce the two query results must have an intersection no larger than 2.
Give a set of queries that satisfies the above restrictions, and from whose answers you can determine the GPA of the student with a given ID I. You may assume that I is in the range 1-10, that there are at least 50 students with IDs in the range 1 to 50, and that attribute GPA is of type float. Write the queries in SQL, then show the computation that produces the GPA for the student with ID=I from the query results. Use as few queries as you can.
In Project Part 5 you are using Oracle triggers to enforce static constraints on the database (i.e., constraints over individual database states, rather than constraints over how the data evolves over time). Triggers often are used in this way, although they also can be used to enforce dynamic constraints. In this open-ended problem, you are to give a method for inferring, from the WHEN clause of a trigger, what static constraint the trigger is enforcing. You may assume the action of the trigger is simply to raise an error.
More specifically, give a generic method for taking a trigger WHEN clause and translating it into a SQL-99 constraint that specifies the constraint being enforced by the trigger. A few notes:
- This problem is about SQL-99 triggers as covered in class and the textbook, not about Oracle triggers. Nevertheless, you may find it helpful to make a good start at the triggers portion of your project work in this assignment before attempting this problem.
- Your method need not work on every possible WHEN clause. In fact, there's a certain "style" to trigger WHEN clauses that are being used to enforce a constraint, and we will be looking for algorithms that effectively identify and work on WHEN clauses satisfying this style.
- The constraint specifications produced by your algorithm can be any type of SQL-99 constraint covered in class or in the textbook, including arbitrary CHECK constraints and general assertions.
- While this problem may appear simple at first, the complexity comes from the presence of OLD ROW, NEW ROW, OLD TABLE, and NEW TABLE variables used in the WHEN clause, the difference between statement-level and row-level triggers, the presence of subqueries, and so on.
- As you will discover in your project, often multiple triggers are required to correctly enforce a constraint through all possible database update types. If it is helpful for you to consider groups of triggers rather than single triggers in this problem, you may do so.

In class on Monday May 13 we discussed a simple implementation protocol to help motivate the various transaction isolation levels. Specifically, modifying and filling in the "Transactions" lecture notes for May 13 under "Implementation mechanisms," we have:

  Weaker isolation levels can be tricky.  Sometimes it helps to think
  about implementation mechanisms.  That's a serious topic for CS245,
  but briefly:

    READ LOCK on tuple t:   Records that a transaction reads tuple t
    WRITE LOCK on tuple t:  Records that a transaction updates/deletes tuple t
    INSERT LOCK on table T: Records that a transaction inserts into table T

  Assume each transaction obtains locks as it needs them and releases them
  all after commit.

  Serializable transactions:
    To perform any operation on a table: no other INSERT lock
    To read a tuple: no other WRITE lock, set READ lock
    To update or delete a tuple: no other READ or WRITE lock, set WRITE lock
    To insert a tuple: set INSERT lock

  "Read uncommitted" transactions:
    No global checking for INSERT locks
    To read a tuple: no checking or setting
    To update or delete a tuple: no other READ or WRITE lock, set WRITE lock
    To insert a tuple: set INSERT lock

  "Read committed" transactions:
    No global checking for INSERT locks
    To read a tuple: no other WRITE lock
    To update or delete a tuple: no other READ or WRITE lock, set WRITE lock
    To insert a tuple: set INSERT lock

  "Repeatable read" transactions:
    No global checking for INSERT locks
    To read a tuple: no other WRITE lock, set READ lock
    To update or delete a tuple: no other READ or WRITE lock, set WRITE lock
    To insert a tuple: set INSERT lock

Is this implementation protocol pretty much correct? If so, for each isolation level argue in brief why the protocol works. If not, for each mistake in the protocol show one or more counterexamples: Show specific transactions performing specific operations on a specific database such that the protocol is followed but the corresponding isolation level is not guaranteed.

Project Part 5

Warning: This is the first project part in which significant numbers of students will be putting significant load onto the Oracle system at the same time. Unfortunately it is likely that the system will become much, much slower as the deadline approaches. We strongly suggest that you start early for your own benefit, and please be aware that we will not extend the on-time deadline because the system is slow.

Part A: Current Time

The original auction data that we provided for you in XML, which you translated into relations and loaded into your AuctionBase database in Project Part 3, represents a single point in time, specifically one second after midnight on December 20th, 2001 ("Dec-20-01 00:00:01"). In the final part of the project - Part 6, outlined below - you will develop full auction functionality: users will be able to browse items, enter and retrieve bids, create new auctions, run statistics, etc.

To fully test your functionality, and to simulate the true operation of an online auction system in which auctions close as time passes, we suggest that you maintain a fictitious "current time" in your database. Add a new one-attribute table to your AuctionBase schema. This table should at all times contain a single row (i.e., a single value) representing the "current time," which can be updated to represent time passing. (It's up to you whether you also want to permit backward time-travel.) Initialize the table by inserting the current time for the initial state of your database: Dec-20-01 00:00:01.

Note: If you want to input and output dates in Oracle using the same format as in the original XML auction data files, issue the command:

  ALTER SESSION SET NLS_DATE_FORMAT='MON-DD-YY HH24:MI:SS';

For more information see the Oracle Dates and Times help document.

Part B: Constraints and Triggers

You will want to read the Oracle help document Constraints and Triggers for this project part. Two additional Oracle documents that may help you with triggers are:

Please remember that constraints and triggers in Oracle do not conform to the SQL-99 standard, the lecture notes, or the textbook.

If the data in your AuctionBase system at a given point in time represents a correct state of the real world, a number of constraints are expected to hold. To get you started, here are a few possible examples, some of which depend on a particular schema:

In every auction the number-of-bids field (if included) corresponds to the actual number of bids.
In every auction and every bid the quantity (if present) must be greater than 0.
The item-id in every bid corresponds to an actual item.
No auction may have a bid before its start time or after its end time.
There are no bids after the current time.
The quantity in a bid must not exceed the quantity available.
A user may not bid on an item he or she is offering. (This one is a judgment call.)
All sellers and bidders must exist as users. (Whether this one makes sense depends on your relational schema.)
Etc.

Your job is to:

Specify in English every constraint you believe should hold in a correct AuctionBase database, i.e., in a database that represents a viable state of the real world. Consider only constraints that go beyond what is already enforced in the schema, e.g., you do not need to specify that ratings are integers, if indeed you implemented them as such. However, other than simple schema-enforced type constraints, you should list every constraint you can think of.
For each constraint in (1), implement constraint-checking using a key constraint, referential integrity, Oracle's attribute- or tuple-based CHECK constraints, or Oracle triggers. Use the simplest possible mechanism to enforce each constraint, e.g., do not use triggers to enforce a unique key constraint or a simple attribute-level check.
For all cases other than triggers and referential integrity with an ON DELETE option, an error is raised automatically whenever a constraint is violated. For triggers, you may as part of the trigger(s) program a way of "repairing" a violated constraint automatically, or you may simply raise an error with the function RAISE_APPLICATION_ERROR as described in the Constraints and Triggers help document.
Add the constraints and triggers to your AuctionBase database. The three steps we suggest you take are:
1. Use the ALTER TABLE command to add key (UNIQUE), referential integrity, and CHECK constraints. See the Oracle 9i SQL and Constraints and Triggers help documents for examples. The ALTER TABLE command will fail if you try to add a constraint that does not hold on the current database.
2. Unlike constraints, when you add a trigger to the database, the condition part of the trigger is not checked (i.e., there is no "retroactive" firing of the new trigger). Thus, any of your constraints that are enforced by triggers must be checked separately at the time you install your triggers, in order to ensure that the database begins in a consistent state. From that point on, the triggers should ensure that the constraints continue to hold whenever the database is modified. Write and execute a set of queries to check that all constraints being enforced by triggers hold in the current state of the database. Often the easiest way to write constraint-checking queries is to create queries that return an empty result if and only if the corresponding constraint holds.
  At this point, if any constraints do not hold, then either your constraints are inappropriate or your database is in an inconsistent state. Stop now and fix the problem!
3. Create all of your triggers using the CREATE TRIGGER command described in the Constraints and Triggers help document. (If you get an error message "Warning: Trigger created with compilation errors.", type "SHOW ERRORS" to see what's wrong.)
Note that if you recreate your entire database from scratch, you may create your tables with the key, referential integrity, and CHECK constraints already in place, in which case these constraints will be checked on data load. You probably will need to be careful that you load your tables in the correct order for referential integrity checking, or you can use deferred checking as described in the Constraints and Triggers help document. If you take this approach, you still need to follow steps 2 and 3 above for triggers.
For each constraint, demonstrate at least two database modifications relevant to the constraint: at least one that violates the constraint and one that doesn't. Similarly, for each trigger, demonstrate at least two database modifications that activate the trigger (i.e., that match the trigger's ON event): at least one where the trigger condition is satisfied and the trigger action executes, and one where the trigger condition is not satisfied so the trigger action does not execute.
Are there any database modifications that can violate more than one of your constraints simultaneously? If so, demonstrate what happens.

A note about transactions

This is the first part of the project in which you are doing significant updating to a large Oracle database. Assuming you want your database updates to persist, you may need to think a little bit about transactions. Please read the section on Transactions in the Oracle 9i SQL help document, which should provide sufficient information.

What to submit

For this part of the project your submission directory should contain a text README file and nothing else. The README file should describe each constraint you defined and implemented. Choose five representative constraints and for each one list all items (a) through (e) below, carefully labeled. For your remaining constraints, you may list all of (a) through (e), or you may list just (a) and (b).

(a) The English description of the constraint (e.g., "There are no bids after the current time.") and a brief English description of its implementation in terms of your relations (e.g., "Two triggers, one on relation...").

(b) The Oracle constraint definition(s) (CREATE TRIGGER command(s), ALTER TABLE command(s), etc.). If the constraint is specified in the schema definition then include the entire CREATE TABLE command. List the commands in this part, but do not run them.

(c) One or more sample database modifications (SQL statements) that are relevant to the constraint but do not violate it. Include any commands you used to defer constraint checking or to disable/enable triggers. List the commands in this part, but do not run them.

(d) One or more sample database modifications (SQL statements) that are relevant to the constraint and do violate it. Include any commands you used to defer constraint checking or to disable/enable triggers. List the commands in this part, but do not run them.

(e) A transcript showing:

The successful definition of the constraint using your part (b) commands.

That the existing database satisfies the constraint, or that you were able to reload the database with the constraint in place. (For example, for a constraint that all bid prices are positive, you might run the query "SELECT * FROM Bids WHERE Price <= 0" and show that the result is empty.)
The execution of the updates in parts (c) and (d).

Thus, your README file should look like this (although you are free to include (c)-(e) for all constraints if you wish):

[Constraint 1]
(a) [...]
(b) [...]
(c) [...]
(d) [...]
(e) 1. [...]
    2. [...]
    3. [...]

[Constraint 2]
(a) [...]
(b) [...]
(c) [...]
(d) [...]
(e) 1. [...]
    2. [...]
    3. [...]

  ...

[Constraint 5]
(a) [...]
(b) [...]
(c) [...]
(d) [...]
(e) 1. [...]
    2. [...]
    3. [...]

[Constraint 6]
(a) [...]
(b) [...]

[Constraint 7]
(a) [...]
(b) [...]

  ...

Submission instructions are as usual, with only one file in your submission directory. Once again it is crucial to follow our submission file format exactly, and points may be deducted if you do not follow the procedures as specified. Also as usual if you submit more than once, all submissions except your last will be ignored.

Project Part 6 Preview (due May 29)

The final part of the AuctionBase project, Part 6, is not due for two weeks and officially is part of the next assignment. However, it is very open-ended and some students may want to get started early.

In the sixth and last part of the project you are to (finally) create a fully functioning AuctionBase system accessed by users exclusively through a Web interface. You may use your Web-database code for Project Part 2 as a starting point for this project part, or you may start from scratch.

Functionality

The functionality of your AuctionBase system is quite flexible and open-ended. However, we do expect all students to implement some basic capabilities:

Ability to manually change the "current time."
Automatic auction closing. An auction is "open" after its start time and "closed" when its end time is past or its buy price is reached for its entire quantity. Your design may be such that an auction closes implicitly with high enough bids or a time update, or you may have chosen to represent open/closed status with an explicit data field.
Ability for new auction users to provide their information to be entered into the database (name, initial rating if not assigned automatically, optional location and country), if relevant in your schema.
Ability to browse auctions of interest based on a variety of input choices. Possible parameters include open/closed status, category, date, price, substring match in description, etc. Use your imagination.
Ability to see the winner(s) of a closed auction.
Ability for auction users to enter bids on open auctions.
Ability for auction users to add new items up for auction.
Ability to retrieve auction or bidding history for a given user, including current auctions or bids.
Ability to run various statistics over the auctions. Possibilities include average number of bids per user, highest selling price over initial bid, average time to reach buy-price, etc. Use your imagination.

Notes:

All constraints and triggers from Project Part 5 should continue to be active in the "live" AuctionBase system. In particular, constraint violations due to data entry errors or bad input values should be managed gracefully: it should be possible for users to continue interacting with the system after a constraint violation is detected, and the database should not be corrupt. For information on constraint error-handling in your Web interface please see the Constraints and Triggers help document (under Constraint Violations).
You will need to automatically generate unique itemIDs for new auction items. You may do so any way you like.
Don't forget to think a little bit (not much) about transactions, particularly if you're using Pro*C. See the section on Transactions in the Oracle 9i SQL help document.
You don't have to implement user authentication. For example, it's OK to ask the user to enter his username when bidding, without asking for a password.
For Java users: when you generate dynamic HTML pages from your program, please try to use relative paths for URLs, i.e., do not hard-code the machine name and port number into your program (as we may be testing your program on a different machine). If you use relative paths then the browser will automatically find the right machine and port number. This does not apply to C/CGI users, who may continue to hard-code the machine name (cgi-courses) and user name into the program.
For C/CGI users: if you use the cgiparse routines we provided, and have called CleanUp() somewhere, please be sure there is no code that accesses CGI parameters after CleanUp(), since CleanUp() releases memory pointed to by CGI parameters. To be safe, you can simply remove CleanUp() altogether.

User interface

While the functionality of your AuctionBase system is quite open-ended, the interface itself is extremely open-ended. CS145 is not a user interface class and you can certainly get full credit for a solid system with simple input boxes, menus, and/or radio buttons, and simple HTML output tables, similar in style to what you implemented in Project Part 2. (However, as in Project Part 2, you definitely should not be exposing SQL to the end user.) Of course we welcome much snazzier interfaces, with the zenith being a near-replica of eBay itself.

We will select a small number of AuctionBase systems to be demonstrated to the class and the world (via SCPD and Stanford Online) as part of the last class meeting on June 5. Students will not receive extra credit, but they will receive extra recognition and lunch with Prof. Widom. The criteria for selection will be some combination of beyond-the-basics functionality and a good user interface.