CS145 Lecture Notes (10) -- Indexes
- Primary mechanism for users/applications to get improved performance on a database
- Many interesting implementation issues (CS245, CS346)
Singular: "Index"
Plural: "Indexes" or "Indices"
Index on attribute R.A:
- Creates additional persistent data structure stored with the database
- Can dramatically speed up certain operations:
- Find all R tuples where R.A = v
- Find all R and S tuples where R.A = S.B
- Find all R tuples where R.A > v (sometimes, depending on index type)
(picture of unordered relation and indexed attribute)
Example
SELECT *
FROM Student
WHERE name = 'Mary'
- Without index: Scan all
Student
tuples
- With index: Go "directly" to tuples with
name='Mary'
Indexes are built on single attributes or combinations of
attributes.
Question: What data structures are used for indexes?
1.
2.
Example
SELECT *
FROM Student
WHERE name = 'Mary' and GPA > 3.5
Could use:
- Index on
Student.name
- Index on
Student.GPA
- Index on (
Student.name
,Student.GPA
)
Indexes can also speed up joins.
Example
SELECT *
FROM Student, Apply
WHERE Student.ID = Apply.ID
Could use:
- Index on
Apply.ID
- Index on
Student.ID
- Both indexes together (in certain cases)
Question: What are the disadvantages of creating an index?
1.
2.
3.
Index Selection
- Choosing which indexes to create is a difficult and very
important design issue. The decision depends on size of tables, data
distributions, and most importantly query/update load.
- DBMS vendors are introducing "physical design advisors."
- Input: database and workload
- Output: suggested indexes
Generally work very well.
(diagram of how a design advisor works)
SQL Syntax
For one index on R.A
:
CREATE INDEX IndexName ON R(A)
For one index on (R.A1, R.A2, ..., R.An
):
CREATE INDEX IndexName ON R(A1, ..., An)
To destroy an index:
DROP INDEX IndexName
Will enforce R.A
as a key:
CREATE UNIQUE INDEX IndexName ON R(A)