Problems created by concurrency => need concurrency control
Client 1 - UPDATE Student SET address = CONCAT(address, :zip) WHERE ID = 123
Client 2 - UPDATE Student SET address = CONCAT(address, :phone) WHERE ID = 123
For each client, the DBMS reads the address value, updates it, and writes it back. Possible outcomes without concurrency control: only one client's change survives (the other is overwritten), or both take effect.
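The lost-update race above can be simulated in plain Python without a DBMS. This is a toy sketch, not real DBMS behavior: the address value, the zip/phone suffixes, and the `run` helper are all made up for illustration. The key point is that when both clients read before either writes, one update disappears.

```python
# Toy simulation of the read-modify-write race from Example 1.
# Each "client" reads the stored address, computes its update locally,
# then writes the result back.

def run(schedule):
    """Apply a schedule of (client_id, action) steps; return the final address."""
    db = {"address": "123 Main St"}          # the stored Student.address value
    suffix = {1: " 94301", 2: " 555-0100"}   # client 1 appends a zip, client 2 a phone
    local = {}                               # each client's in-memory copy
    for client, action in schedule:
        if action == "read":
            local[client] = db["address"]
        else:                                # "write": write back local copy + suffix
            db["address"] = local[client] + suffix[client]
    return db["address"]

# Serial schedule: both changes survive.
serial = run([(1, "read"), (1, "write"), (2, "read"), (2, "write")])
# Interleaved schedule: both clients read first, so client 1's change is lost.
interleaved = run([(1, "read"), (2, "read"), (1, "write"), (2, "write")])
```

Running both schedules shows the two outcomes the notes mention: `serial` contains both suffixes, while `interleaved` contains only client 2's.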
Client 1 - UPDATE Apply SET decision = 'Y' WHERE location = 'SB' AND ID IN (SELECT ID FROM Student WHERE GPA > 3.2)
Client 2 - UPDATE Student SET GPA = 1.2 * GPA WHERE HSname = 'Paly'
Possible outcomes without concurrency control: some Paly students get into SB based on their scaled GPA, others don't.
Client 1 - INSERT INTO Archive (SELECT * FROM Apply WHERE decision = 'N')
           DELETE FROM Apply WHERE decision = 'N'
Client 2 - SELECT COUNT(*) FROM Apply
           SELECT COUNT(*) FROM Archive
Possible outcomes without concurrency control: if Client 2's queries run between Client 1's two statements, the moved rows are counted twice.
INSERT INTO Archive (SELECT * FROM Apply WHERE decision = 'N')
DELETE FROM Apply WHERE decision = 'N'
The system crashes in the middle - what now?
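The archive-then-delete pair above can be made atomic by wrapping it in a transaction, as sketched here with Python's sqlite3 module. The table contents and the simulated "crash" (a raised exception) are made up for illustration; the point is that rollback restores the consistent state, so rows are never duplicated or lost.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Apply(ID INTEGER, decision TEXT)")
con.execute("CREATE TABLE Archive(ID INTEGER, decision TEXT)")
con.executemany("INSERT INTO Apply VALUES (?, ?)",
                [(1, "N"), (2, "Y"), (3, "N")])
con.commit()

def archive_denied(con, crash_midway=False):
    """Move denied applications into Archive as one transaction."""
    try:
        con.execute("INSERT INTO Archive SELECT * FROM Apply WHERE decision = 'N'")
        if crash_midway:
            raise RuntimeError("simulated crash between the two statements")
        con.execute("DELETE FROM Apply WHERE decision = 'N'")
        con.commit()
    except RuntimeError:
        con.rollback()          # recovery: undo the partial transaction

archive_denied(con, crash_midway=True)    # partial work is rolled back
counts_after_crash = (
    con.execute("SELECT COUNT(*) FROM Apply").fetchone()[0],
    con.execute("SELECT COUNT(*) FROM Archive").fetchone()[0],
)   # still (3, 0): nothing moved, nothing duplicated

archive_denied(con)                       # runs to completion this time
counts_after_commit = (
    con.execute("SELECT COUNT(*) FROM Apply").fetchone()[0],
    con.execute("SELECT COUNT(*) FROM Archive").fetchone()[0],
)   # (1, 2): both statements took effect together
```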
A transaction's changes become permanent when the "COMMIT" command is issued or the session ends. "AUTOCOMMIT" mode turns each statement into its own transaction.
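The COMMIT vs. autocommit distinction can be seen directly in Python's sqlite3 module (the table `T` and its contents are made up for illustration). Passing `isolation_level=None` puts the connection in autocommit mode, so each statement commits immediately; in the default mode, DML statements open a transaction that lasts until COMMIT or ROLLBACK.

```python
import sqlite3

# Autocommit: with isolation_level=None, each statement is its own transaction.
auto = sqlite3.connect(":memory:", isolation_level=None)
auto.execute("CREATE TABLE T(x INTEGER)")
auto.execute("INSERT INTO T VALUES (1)")
auto.rollback()                   # nothing pending: the INSERT already committed
rows_auto = auto.execute("SELECT COUNT(*) FROM T").fetchone()[0]

# Default mode: a transaction opens implicitly at the INSERT and stays open.
manual = sqlite3.connect(":memory:")
manual.execute("CREATE TABLE T(x INTEGER)")
manual.execute("INSERT INTO T VALUES (1)")
manual.rollback()                 # undoes the uncommitted INSERT
rows_manual = manual.execute("SELECT COUNT(*) FROM T").fetchone()[0]
```

After this runs, `rows_auto` is 1 (the rollback had nothing to undo) while `rows_manual` is 0 (the INSERT was still uncommitted and got rolled back).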
Isolation obtained through serializability: operations within transactions may be interleaved but execution must be equivalent to some sequential (serial) order.
Question: How is this guarantee achieved?
Solves Examples 1, 2, and 3 above
Client 1 - INSERT INTO Archive (SELECT * FROM Apply WHERE decision = 'N')
           DELETE FROM Apply WHERE decision = 'N'
Client 2 - UPDATE Apply SET decision = 'U' WHERE Campus = 'Irvine'
The serialization order can make a big difference. Choosing it is the application's problem to solve, not the DBMS's.
Question: Seems obvious, but all DBMSs manipulate data in memory, so how is this guarantee achieved?
"Transaction rollback" = "transaction abort"
Client-initiated rollback:
BEGIN TRANSACTION;
<get input from user>
<SQL commands based on input>
<confirm results with user>
IF input = confirm-OK THEN COMMIT; ELSE ROLLBACK;
Note: Rollback only undoes database changes, not other changes (e.g., program variables) or side effects (e.g., printing to the screen, delivering cash).
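The limit of rollback can be demonstrated with a small sqlite3 sketch (the `Account` table, the balance, and the `cash_dispensed` variable are invented for illustration). The database update is undone, but the program variable that changed mid-transaction is not.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Account(balance INTEGER)")
con.execute("INSERT INTO Account VALUES (100)")
con.commit()

cash_dispensed = 0                      # program state: NOT under DBMS control
con.execute("UPDATE Account SET balance = balance - 40")
cash_dispensed += 40                    # side effect happens mid-transaction
con.rollback()                          # user declined to confirm

balance = con.execute("SELECT balance FROM Account").fetchone()[0]
# The database change is undone (balance is 100 again), but the side
# effect is not: cash_dispensed is still 40.
```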
Question: No self-respecting database programmer would write the above transaction. Why?
Idea: Assume all constraints hold at the start of every transaction. Clients guarantee, given this assumption plus isolation, that all constraints still hold at the end of every transaction. (Similar to program invariants.)
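The invariant idea can be sketched with a bank-style transfer in sqlite3 (the `Account` table, the balances, and the invariant "total balance is constant" are all invented for illustration). The invariant may be false between the two UPDATEs; it only has to hold again by COMMIT, assuming it held at the start.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Account(id INTEGER PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100), (2, 50)])
con.commit()

def transfer(con, src, dst, amount):
    """Preserves the invariant 'SUM(balance) is constant', assuming it
    held when the transaction started."""
    con.execute("UPDATE Account SET balance = balance - ? WHERE id = ?",
                (amount, src))
    # Mid-transaction the invariant is temporarily false; that is fine,
    # it only has to hold again by COMMIT.
    con.execute("UPDATE Account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
    con.commit()

transfer(con, 1, 2, 30)
total = con.execute("SELECT SUM(balance) FROM Account").fetchone()[0]  # still 150
```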
"SET TRANSACTION READ ONLY"Many, many transactions and applications fall into this category.
Question: If there are five read-only transactions and no other transactions, what does the system need to do to guarantee serializability?
Three weaker isolation levels:
Client 1 - BEGIN TRANSACTION;
           ... UPDATE Student SET GPA = .99 * GPA ...
           COMMIT;
Client 2 - BEGIN TRANSACTION;
           ... SELECT AVG(GPA) FROM Student ...
           COMMIT;
Client 2 may only care about an approximate average - dirty reads are okay. Use:
"SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED"
Note: The isolation level of Client 1's transaction is irrelevant unless another transaction updates GPA.
"SET TRANSACTION ISOLATION LEVEL READ COMMITTED"
SERIALIZABLE
.
Client 1 - BEGIN TRANSACTION;  // serializable
           ... UPDATE Student SET GPA = .99 * GPA ...
           COMMIT;
Client 2 - SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
           BEGIN TRANSACTION;
           ... SELECT AVG(GPA) FROM Student ...  // executes before Client 1
           ... SELECT MAX(GPA) FROM Student ...  // executes after Client 1
           COMMIT;
Client 2's two queries see different committed states - no dirty reads, but not serializable either.
"SET TRANSACTION ISOLATION LEVEL REPEATABLE READ"Modified Example 5: get same average both times
=> But still doesn't guarantee global serializability!
Client 1 - BEGIN TRANSACTION;  // serializable
           ... UPDATE Student SET GPA = .99 * GPA
           UPDATE Student SET SAT = 1.01 * SAT ...
           COMMIT;
Client 2 - SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
           BEGIN TRANSACTION;
           ... SELECT AVG(GPA) FROM Student ...  // executes before Client 1
           ... SELECT AVG(SAT) FROM Student ...  // executes after Client 1
           COMMIT;
The following example is more realistic for repeatable read but still not globally serializable; it relies on the fact that repeatable read does not apply to inserted tuples.
Client 1 - BEGIN TRANSACTION;  // serializable
           ... INSERT INTO Student <100 new students> ...
           COMMIT;
Client 2 - SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
           BEGIN TRANSACTION;
           ... SELECT AVG(GPA) FROM Student ...  // executes before Client 1
           ... SELECT MAX(GPA) FROM Student ...  // executes after Client 1
           COMMIT;
Newly inserted tuples that show up in a later read are called phantoms.
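The two repeatable-read properties - already-read tuples stay stable, but newly inserted tuples (phantoms) still appear - can be sketched with a toy model in Python. `RepeatableReadTxn`, the table contents, and the pin-on-first-read scheme are invented for illustration; real DBMSs implement this quite differently (e.g., with locks or snapshots).

```python
# Toy model of repeatable read: the first read of a tuple pins its value
# for the rest of the transaction, but later scans still pick up tuples
# inserted afterwards (phantoms).

class RepeatableReadTxn:
    def __init__(self, table):
        self.table = table            # live table: list of (id, gpa) rows
        self.snapshot = {}            # id -> gpa, pinned at first read

    def read_all(self):
        rows = []
        for rid, gpa in self.table:
            if rid not in self.snapshot:
                self.snapshot[rid] = gpa   # pin newly seen rows (incl. phantoms)
            rows.append(self.snapshot[rid])
        return rows

table = [(1, 3.9), (2, 3.1)]
txn = RepeatableReadTxn(table)
first = txn.read_all()            # reads both existing tuples

table[0] = (1, 2.0)               # concurrent update to an already-read tuple
table.append((3, 2.0))            # concurrent insert: a phantom
second = txn.read_all()           # re-read is stable, but the phantom appears
```

The second read returns three rows: the pinned values 3.9 and 3.1 (the concurrent update is invisible, so reads are repeatable) plus the phantom 2.0.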
Weaker isolation levels increase performance by eliminating
overhead and increasing concurrency. From weakest to strongest and
the read behaviors they permit:
| isolation level  | dirty reads | nonrepeatable reads | phantoms |
|------------------|-------------|---------------------|----------|
| read uncommitted | Y           | Y                   | Y        |
| read committed   | N           | Y                   | Y        |
| repeatable read  | N           | N                   | Y        |
| serializable     | N           | N                   | N        |
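The table can be encoded directly as a lookup, which makes the "weakest to strongest" ordering checkable: each level's set of permitted anomalies is a subset of the previous one's. The dictionary and helper below are illustrative names, not a standard API.

```python
# The anomaly table above as data: for each isolation level, the set of
# read anomalies a transaction running at that level may observe.
ANOMALIES = {
    "read uncommitted": {"dirty reads", "nonrepeatable reads", "phantoms"},
    "read committed":   {"nonrepeatable reads", "phantoms"},
    "repeatable read":  {"phantoms"},
    "serializable":     set(),
}

def may_observe(level, anomaly):
    """True if a transaction at this level may observe the given anomaly."""
    return anomaly in ANOMALIES[level]

# Weakest to strongest: each level permits a subset of the previous one's anomalies.
levels = ["read uncommitted", "read committed", "repeatable read", "serializable"]
ordered = all(ANOMALIES[b] <= ANOMALIES[a] for a, b in zip(levels, levels[1:]))
```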
Remember that the isolation level is in the eye of the beholding transaction: for true global serializability, every transaction must have isolation level SERIALIZABLE.
=> Much, much more on transactions in CS245 and CS347