========================================================================= LECTURE NOTES - SQL: QUERIES AND UPDATES ========================================================================= SQL ("S.Q.L." or "sequel") - Structured Query Language - Query language of all commercial relational DBMS's We'll cover primarily SQL92 (SQL2), some features of SQL:1999 (SQL3) Components of language: - Schema definition - Data retrieval - Data modification - Indexes - Constraints - Views - Triggers - Transactions - Authorization - etc. DDL = Data Definition Language DML = Data Manipulation Language "Interactive SQL" versus "Embedded SQL" - Assume interactive for most lectures Basic DDL commands ------------------ CREATE TABLE name (att1 type1, att2, type2, ..., attN typeN) DROP TABLE name -> See book for details of types DML commands ------------ SELECT, INSERT, DELETE, UPDATE Example schema: Student(ID, name, address, GPA, SAT) // ID is key Campus(location, enrollment, rank) // location is key Apply(ID, location, date, major, decision) // (ID,location) is key SELECT Statement ---------------- SELECT A1, A2, ..., An FROM R1, R2, ..., Rm WHERE condition equivalent to relational algebra query: PROJECT_{A1, A2, ..., An} (SELECT_{condition} (R1 X R2 X ... X Rm)) -> Returns unnamed relation whose schema is (A1, A2, ..., An) 3.7> 1400> Duplicates ---------- Retained unless use keyword DISTINCT -> This is a significant difference between SQL and relational algebra ORDER BY -------- Sorts result LIKE ---- For string matching in the WHERE clause SELECT * -------- Returns all attributes in cross-product SELECT with Arithmetic ---------------------- - Also note that WHERE clause is optional Renaming Attributes ------------------- - AS keyword is optional Tuple Variables --------------- - Can use tuple variables anytime for convenience and readability - AS keyword is optional Union ----- Union eliminates duplicates, use UNION ALL to retain duplicates INTERSECT [ALL] and EXCEPT [ALL] (MINUS in Oracle) are similar 3.7 that did not apply to a campus with rank <= 3> ** Q: How does what we've covered so far compare in expressiveness to relational algebra? Subqueries ---------- SELECT statements appearing in WHERE clause - Treated as value if returns a single tuple result - Otherwise treated as relation and compared to values using IN, EXISTS, ALL, ANY Q: Can we write it without a subquery? Q: Is the result the same? Q: Why do we care so much about duplicates? Note: Can use "A NOT IN ..." and "NOT (A IN ...)" interchangeably Q: Can we write it without a subquery? = ALL> Uses of ALL: SELECT ... WHERE ... att < ALL (subquery) <= ALL > ALL >= ALL = ALL <> ALL Can also precede expression with NOT, e.g.: SELECT ... WHERE ... NOT (att < ALL ...) Same table for ANY: SELECT ... WHERE ... att < ANY (subquery) <= ANY > ANY >= ANY = ANY <> ANY Can also precede expression with NOT, e.g.: SELECT ... WHERE ... NOT (att < ANY ...) ANY> Q: Are there any redundancies in the above tables including NOT? Subqueries in FROM ------------------ Not covered in book Aggregates ---------- - Clearly goes beyond relational algebra in expressiveness Q: Is every student's GPA counted exactly one time? SQL aggregate functions are: MIN, MAX, SUM, AVG, COUNT GROUP BY -------- -> If there's a GROUP BY clause then only grouped attributes and aggregates may appear in the SELECT clause Q: What if students can apply to a campus multiple times? Q: What if also want to include students who have applied to 0 campuses? HAVING Clause ------------- Conditions on aggregates 500 applicants to "CS"> -> WHERE clause performs filtering before the grouping, HAVING performs filtering after the grouping. Q: Can we write the query above with a subquery instead of HAVING? Q: Are there HAVING queries that can't be rewritten? SQL Summary ----------- SELECT attributes, aggregates FROM relations WHERE condition GROUP BY grouping attributes HAVING conditions on aggregates ORDER BY attributes, aggregates -> All optional except SELECT and FROM Null Values ----------- - Every type implicitly includes NULL Ex: - No decision information yet - Student with no SAT scores 1400, then NOT SAT > 1400> 3.7, change to OR GPA > 3.7> Data Modification: INSERT Statement ----------------------------------- INSERT INTO R VALUES (A1, A2, ..., AN) Ex: INSERT INTO Campus VALUES ("Irvine", 30,000, 8) Can omit attribute values, assigned to default or NULL INSERT INTO R subquery 3.7 or SAT > 1400> DELETE Statement ---------------- DELETE FROM R WHERE condition -> Just like SELECT except over one relation UPDATE Statement ---------------- UPDATE R SET A1 = E1, A2 = E2, ..., An = En WHERE condition 3.7 and SAT > 1400>