(Diagram of how the SQL standard evolves over time)
We'll cover primarily SQL92 (SQL2), some features of SQL-99 (SQL3).
SQL is a big language, not just queries and updates.
Components of the language:
In lectures:
CREATE TABLE name (att1 type1, att2, type2, ..., attN typeN) DROP TABLE nameSee textbook for details of types.
SELECT, INSERT, DELETE, UPDATESchema for examples:
Student(ID, name, address, GPA, sizeHS) // ID is key Campus(location, enrollment, rank) // location is key Apply(ID, location, date, major, decision) // (ID,location) is key
SELECT A1, A2, ..., An FROM R1, R2, ..., Rm WHERE conditionEquivalent to relational algebra query:
PROJECT[A1, A2, ..., An] (SELECT_[condition] (R1 X R2 X ... X Rm))Returns unnamed relation whose schema is (A1, A2, ..., An)
(Example: names and addresses of all students with GPA > 3.7)
(Example: names and application dates of all students applying to CS at Berkeley with sizeHS < 1000)
(Example: locations plus ranks of all campuses with applicants to "bioinformatics" major)
(modify last query to use DISTINCT)
** This is a significant difference between SQL and relational algebra.
(Example: ID, name, GPA, and application date of students applying to S.C. sorted by application date then descending GPA)
(Example: ID's for all applicants with a major that includes "bio")
(change previous two queries to use "SELECT *")
(Example: Student relation except scale GPA by sizeHS)
Also note the WHERE clause is optional.
(Example: rename attributes in previous query)
AS keyword is optional.
(Example: student ID pairs who live at same address)
(Example: list of all names and majors in the database)
(Example: ID's of students with GPA > 3.7 that did not apply to a campus with rank <= 3)
Question: How does what we've covered so far compare in expressiveness to relational algebra?
(Example: ID's and names of students applying to "CS" major)
Question: Can we write it without a subquery?
Question: Is the result the same?
Question: Why do we care so much about duplicates?
(Example: ID's and names of students who did not apply to Berkeley)
Note: Can use "A NOT IN ..." and "NOT (A IN ...)" interchangeably
Question: Can we write it without a subquery?
(Example: students such that some other student has same name and GPA, using EXISTS)
(Example: rewrite using multi-attribute IN)
(Example: student(s) from the biggest high school, using "NOT EXISTS")
(Example: rewrite using ">= ALL")
Uses of ALL:
SELECT ... WHERE ... att < ALL (subquery) <= ALL > ALL >= ALL = ALL <> ALLCan also precede expression with NOT, e.g.:
SELECT ... WHERE ... NOT (att < ALL ...)Same table for ANY:
SELECT ... WHERE ... att < ANY (subquery) <= ANY > ANY >= ANY = ANY <> ANYCan also precede expression with NOT, e.g.:
SELECT ... WHERE ... NOT (att < ANY ...)
(Example: Find all applications that are not earliest, using "> ANY")
Questions: Are there any redundancies in the above tables including NOT?
(Example: get ID and scaled GPA when the scaled GPA lies outside of some range)
(Example: average GPA of all Students)
(Example: average GPA of students applying to "CS")
Question: Is every student's GPA counted exactly one time?
(Example: amount by which average "CS" applicant GPA exceeds overall average)
(Example: number of applicants to Berkeley)
Question: What if students can apply to campuses multiple times?
SQL aggregate functions are: MIN, MAX, SUM, AVG, COUNT
(Example: number of applicants to each campus)
(show picture)
=> If there's a GROUP BY clause then only grouped attributes and aggregates may appear in the SELECT clause.
(Example: maximum applicant GPA for each major at each location)
(Example: for each student -- ID, name, and number of campuses applied to)
Question: What if also want to include students who have applied to 0 campuses?
(Example: all majors at Berkeley with >500 applicants)
=> WHERE clause performs filtering before the grouping, HAVING performs filtering after the grouping.
Question: Can we write the query above with a subquery instead of HAVING?
Question: Are there HAVING queries that can't be rewritten?
SELECT attributes, aggregates FROM relations WHERE condition GROUP BY grouping attributes HAVING conditions on aggregates ORDER BY attributes, aggregates=> All optional except SELECT and FROM
("Flowchart" of clause evaluation)
Examples:
(Example: all students with "sizeHS > 1000", then "NOT sizeHS > 1000")
(Example: add "AND GPA > 3.7", change to "OR GPA > 3.7")
(Example: all students whose sizeHS is NULL but GPA is not NULL)
INSERT INTO R VALUES (A1, A2, ..., AN)Example: INSERT INTO Campus VALUES ('Irvine', 30,000, 8)
Can omit attribute values, assigned to default or NULL
Second form:
INSERT INTO R subquery
(Example: create relation Visit for students with GPA > 3.9 and sizeHS < 1000)
DELETE FROM R WHERE conditionJust like SELECT except over one relation
(Example: delete all students who did not apply anywhere)
UPDATE R SET A1 = E1, A2 = E2, ..., An = En WHERE condition
(Example: accept everyone applying to Berkeley with GPA > 3.9)
( ... and make them into CS majors)