CS145 Lecture Notes (8) -- Constraints and Triggers



Integrity Constraints

Integrity constraints impose restrictions on the allowable data in the database, in addition to the simple structure and type restrictions imposed by the basic schema definition.

Question: We've already seen a few kinds of integrity constraints. What are they?

   1.

   2.

   3.

Why use integrity constraints?

  1. To catch data-entry errors
  2. As correctness criteria when writing database updates
  3. To enforce consistency across data in the database
  4. To tell the system about the data - it may choose to store the data or process queries accordingly
-> Many useful constraints cannot be expressed with the kinds of constraints we've discussed so far.

Example schema:

   Student(ID, name, address, GPA, sizeHS)
   Campus(location, enrollment, rank)
   Apply(ID, location, date, major, decision)
Example constraints:

Types of Integrity Constraints for Relational Databases

  1. Non-null
  2. Key
  3. Referential integrity
  4. Attribute-based
  5. Tuple-based
  6. General assertions

Declaring and Enforcing Constraints

Two times at which constraints may be declared:
  1. Declared with original schema. Constraints must hold after bulk loading.
  2. Declared later. Constraints must hold on current database.
After declaration, if a SQL statement causes a constraint to become violated then (in most cases) any effects of the statement are undone and a run-time error is generated.


Non-Null Constraints

Restricts attributes to not allow NULL values
Example:
  CREATE TABLE Student (ID integer NOT NULL,
                        name char(30) NOT NULL,
                        address char(100),
                        GPA float NOT NULL,
                        sizeHS integer)

Key Constraints

Example:
ID is key for Student =>
No two tuples in Student can have the same value for their ID attribute

There are two kinds of keys in SQL:

There are two ways to define keys in SQL: Example:
   CREATE TABLE Student (ID integer PRIMARY KEY,
                         name char(30),
                         address char(100),
                         GPA float,
                         sizeHS integer,
                         UNIQUE (name,address))

Referential Integrity

Very important and common kind of constraint
Examples:








Question: Does the reverse direction in these examples make sense?



Terminology:

Referential Integrity in SQL

Two ways to define referential integrity in SQL: Example:
  CREATE TABLE Apply(ID integer REFERENCES Student(ID),
                     location char(25),
                     date char(10),
                     major char(10),
                     decision char,
                     FOREIGN KEY (location) REFERENCES Campus(location))
Can omit referenced attribute name if it's the same as referencing attribute:
  ID integer REFERENCES Student, ...
Can have multi-attribute referential integrity.

Can have referential integrity within a single table.
Example:

   Dorm(first-name,
        last-name,
        room-number,
        phone-number,
        roommate-first-name,
        roommate-last-name,
        PRIMARY KEY (first-name,last-name),
        FOREIGN KEY (roommate-first-name,roommate-last-name)
          REFERENCES Dorm(first-name,last-name))

Referential Integrity Enforcement

Consider R(P, ...) and S(..., F references R(P), ...)

Question: What modification operations can violate constraint?

   1.

   2.

   3.

   4.
Inserts/updates to S that violate constraint are disallowed. Deletes from R with referencing values:

(figure)









Four options:
  1. Disallow (default)
  2. SET NULL - set all referencing values to NULL
  3. SET DEFAULT - set all referencing values to default for that column (error if default value not in R.P)
  4. CASCADE - delete all tuples with referencing values
Updates to R.P: same four options (CASCADE mirrors update)

See textbook for syntax.

Question: Why is referenced attribute required to be PRIMARY KEY or UNIQUE?








Attribute-Based Constraints

Constraints on values for Relation.Attribute SQL Syntax: Add "CHECK(<condition>)" to attribute.
<condition> is like a SQL Where clause, can refer to attribute itself.
Examples:
   CREATE TABLE Student (...
                         GPA float CHECK(GPA <= 4.0 AND GPA > 0),
                         ...)

   CREATE TABLE Apply (...
                       decision char CHECK(decision in ('Y','N','U'))
                       ...)
Conditions can be more complex, e.g., with subqueries.
Example:
   Apply (...
          major char(10) CHECK(major NOT IN
                                  (SELECT dept-name FROM Dept
                                   WHERE status = 'full'))
          ...)
Question: Can we encode referential integrity constraints this way?



-> Attribute-based constraints are only checked on inserts and updates to that relation, NOT on modifications to relations referenced in subqueries -- significant "hole" in constraint-checking.


Tuple-Based Constraints

Constraints on values for Relation.Tuple

Similar to attribute-based constraints except CHECK applies to entire tuples.
=> Specified separately in table definition
Example:

  CREATE TABLE Campus(location char(25),
                      enrollment integer,
                      rank integer,
                      CHECK(enrollment >= 10,000 OR rank > 5)
(Example encodes enrollment < 10,000 => rank > 5)

As with attribute-based constraints, can have subqueries but constraint is only checked when the relation itself is modified


General Assertions

Constraints on entire relation or entire database

In SQL, stand-alone statement:

  CREATE ASSERTION <name> CHECK(<condition>)
Example: Average GPA is > 3.0 and average sizeHS is < 1000
  CREATE ASSERTION Avgs CHECK(
    3.0 < (SELECT avg(GPA) FROM Student) AND
    1000 > (SELECT avg(sizeHS) FROM Student))
Example: A student with GPA < 3.0 can only apply to campuses with rank > 4.
  CREATE ASSERTION RestrictApps CHECK(
     NOT EXISTS (SELECT * FROM Student, Apply, Campus
                 WHERE Student.ID = Apply.ID
                 AND Apply.location = Campus.location
                 AND Student.GPA < 3.0 AND Campus.rank <= 4))

Assertions checked for each modification that could potentially violate them.

Question: What operations could violate assertion Avgs?






Question: What operations could violate assertion RestrictApps?







Q: Can we encode referential integrity constraints this way?











Triggers

General form:
  CREATE TRIGGER <name>
  BEFORE | AFTER | INSTEAD OF <events>
  <referencing clause>                // optional
  FOR EACH ROW                        // optional 
  WHEN (<condition>)                  // optional
  <action>
<events> can be:
  INSERT ON R
  DELETE ON R
  UPDATE [OF A1, A2, ..., An] ON R
AFTER <events> are the most common and useful. Don't worry too much about BEFORE and INSTEAD OF, they can be messy.

Example: If an application tuple is inserted for a student with GPA > 3.9 and sizeHS > 1500 to Berkeley, set decision to "Y".

  CREATE TRIGGER AutoAccept
  AFTER INSERT ON Apply
  REFERENCING NEW ROW AS NewApp
  FOR EACH ROW
  WHEN (NewApp.location = 'Berkeley' AND
        3.9 < (SELECT GPA FROM Student WHERE ID = NewApp.ID) AND
        1500 < (SELECT sizeHS FROM Student WHERE ID = NewApp.ID))
  UPDATE Apply
    SET decision = 'Y'
    WHERE ID = NewApp.ID
    AND location = NewApp.location
    AND date = NewApp.date
Question: Can we achieve the same effect using assertions?




Example: Same trigger without FOR EACH ROW
  CREATE TRIGGER AutoAccept
  AFTER INSERT ON Apply
  REFERENCING NEW TABLE AS NewApps
  UPDATE Apply
    SET decision = 'Y'
    WHERE (ID,location,date) IN (SELECT ID,location,date FROM NewApps)
    AND location = 'Berkeley'
    AND 3.9 < (SELECT GPA FROM Student WHERE ID = Apply.ID)
    AND 1500 > (SELECT sizeHS FROM Student WHERE ID = Apply.ID)
Example: If campus enrollment increases from below 7,000 to above 7,000, delete all applications to that campus dated after 2/15/05 and set all "Y" decisions for applications before 2/15/05 to "U".
  CREATE TRIGGER TooMany
  AFTER UPDATE OF enrollment ON Campus
  REFERENCING OLD ROW AS OldVal NEW ROW AS NewVal
  FOR EACH ROW
  WHEN (OldVal.enrollment <= 7,000 AND NewVal.enrollment > 7,000)
  DELETE FROM Apply
    WHERE location = NewVal.location AND date > 2/15/05;
  UPDATE Apply
    SET decision = 'U'
    WHERE location = NewVal.location
    AND decision = 'Y'
Question: Can we achieve the same effect using assertions?




Triggers and assertions:


What Oracle Supports