Student(ID, name, address, GPA, SAT) // ID is key Campus(location, enrollment, rank) // location is key Apply(ID, location, date, major, decision) // (ID,location) is key
Relational algebra operators used to combine, filter, etc. the relations:
(Example: Students with GPA > 3.7 and SAT < 1300)
(Example: Applications to S.C. geology major)
** General case: SELECT[C] (R)
where C can include attribute names, constants, comparisons (=, <, etc.), and connectives (AND, OR, NOT)
PROJECT eliminates columns while SELECT eliminates rows.
To do both, combine ("compose") operators:
(Example: name and address of Students with GPA > 3.7 or SAT > 1400)
SELECT produces a relation, PROJECT operates on that relation.
** General case: PROJECT[A1, A2, ..., Am] (E)
where E is any expression producing a relation
Question: Is it ever useful to compose two projection operators?
Example: PROJECT[enrollment] (PROJECT[location, enrollment] (Campus))
Question: Is it ever useful to compose two selection operators?
Example: SELECT[SAT < 1300] (SELECT[GPA > 3.7] (Student))
Schema of result is schema(Campus) union schema(Apply)
Formally: R1 X R2 = {t | t = <t1,t2> and t1 in R1 and t2 in R2}
Question: Looks odd to glue unrelated tuples together. Why use it?
(Example: Names and addresses of all students with GPA > 3.7 who applied to CS major and were rejected)
Question: Can we write it in a different way?
(Show schema of Campus JOIN Apply)
(Example: Names and addresses of all students with GPA > 3.7 who applied to CS major and were rejected)
(Example: Names and addresses of all students with GPA > 3.7 who applied to CS major at campus with enrollment < 15,000 and were rejected)
** General case:
Need to be careful -- suppose we have:
Student(ID, name, address, GPA, SAT) Campus(name, enrollment, rank)"Student JOIN Campus" doesn't make sense.
Student(ID, name, address, GPA, SAT) OutofStateStudent(ID, name, address, GPA, SAT, state) ForeignStudent(ID, name, address, GPA, SAT, country)
(Example: List name and address of all students)
Question: Can we do it with cross-products or joins?
For union operator:
Student(ID, name, address, GPA, SAT) // ID is key Campus(location, enrollment, rank) // location is key Apply(ID, location, date, major, decision) // (ID,location) is key(Example: find ID's of all students who didn't apply anywhere)
Question: What if want name of students?
For difference operator:
Student(ID, name, address) Parent(child-ID, parent-name)(Example: want list of all student and parent names)
(Example: want names of all pairs of students who live at same address)
** General case: RENAME[R(A1, A2, ..., Am)] (E) or RENAME[R] (E)
E ::= R | SELECT[C] (E) | PROJECT[A1, A2, ..., An] (E) | E1 X E2 | E1 U E2 | E1 - E2 | RENAME[R(A1, A2, ..., Am)] (E)Abbreviations:
| E1 JOIN E2 | E1 JOIN[C] E2 | E1 INTERSECT E2