CS145 Lecture Notes (14) -- Security & Data Privacy
SQL Injection Attacks
Say you have gone beyond the required features of the project and implemented password-protected login to your auction site. You store user credentials in the table Users(userid, password).
When a user logs in to your site, they provide a username and password, and you run the following query to check the credentials:
SELECT count(*) FROM Users WHERE userid = '<user-provided-id>' AND password = '<user-provided-password>'
What happens if the user types in the following?
userid: admin
password: blah' OR 'a'='a
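To see the effect concretely, here is a minimal sketch using Python's built-in sqlite3 module. The function name check_login_unsafe, the in-memory database, and the stored password 'secret' are assumptions made for illustration; only the Users(userid, password) table and the query shape come from the notes.

```python
import sqlite3

def check_login_unsafe(conn, userid, password):
    # Vulnerable on purpose: user input is pasted straight into the SQL text.
    query = (
        "SELECT count(*) FROM Users "
        "WHERE userid = '" + userid + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0

# Hypothetical test data in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users(userid TEXT, password TEXT)")
conn.execute("INSERT INTO Users VALUES ('admin', 'secret')")

# A wrong password is rejected as expected.
wrong_password = check_login_unsafe(conn, "admin", "blah")

# The injected string turns the WHERE clause into
#   userid = 'admin' AND password = 'blah' OR 'a'='a'
# AND binds tighter than OR, so the condition is true for every row.
injected = check_login_unsafe(conn, "admin", "blah' OR 'a'='a")

print(wrong_password, injected)
```

The second call succeeds even though the attacker never knew the password, because the quote in the input ends the string literal and the rest is parsed as SQL.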
Other dubious password strings:
These strings can be injected anywhere a SQL query is composed from user-provided data. When the proper security configuration is not in place, it's possible to probe a database to discover its structure, gain unauthorized access, or destroy data.
To secure against SQL injection:
Sanitize user input: reject or escape unexpected characters (such as quotes) and NULL in anything that comes from a user: form input, URL parameters, cookie values, etc.
Set up good access permissions and run your application with minimal privileges.
Suppress error messages that give hints about system configuration and contents.
Remove or disable unused stored procedures.
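A widely used complement to the measures above is to pass user input as query parameters rather than building the SQL string by hand, so the input is never parsed as SQL. The following is a minimal sketch using Python's built-in sqlite3 module and its ? placeholders; the function name check_login and the test data are assumptions for illustration, and passwords are kept in plain text only to match the Users(userid, password) table from these notes.

```python
import sqlite3

def check_login(conn, userid, password):
    # The ? placeholders keep user input as data; it is never parsed as SQL.
    row = conn.execute(
        "SELECT count(*) FROM Users WHERE userid = ? AND password = ?",
        (userid, password),
    ).fetchone()
    return row[0] > 0

# Hypothetical test data in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users(userid TEXT, password TEXT)")
conn.execute("INSERT INTO Users VALUES ('admin', 'secret')")

good_login = check_login(conn, "admin", "secret")
# The injection string is now compared literally against the stored password.
injection_attempt = check_login(conn, "admin", "blah' OR 'a'='a")

print(good_login, injection_attempt)
```

Most database APIs offer an equivalent mechanism (often called prepared statements); the placeholder syntax varies by driver.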
Privacy Protection
Interest in database technology for protecting privacy has grown considerably in recent years.
Some technical approaches to protecting privacy in databases:
Authorization
Encryption
One-way functions (hashes) & Negative Databases
Statistical Databases
Authorization
We spent an entire lecture on this one. Policy is set up within the DB specifying which users are allowed to see and change data. Authentication is almost always by username/password pairs, but more secure (biometric) methods exist too.
What are the benefits?
What are the limitations?
Encryption
Encryption can happen at many points:
on the disk, with keys managed by the DBMS (Oracle supports this)
on its way into and out of the database
at the user level
What are the benefits?
What are the limitations?
One-way functions
There are functions with the property that they are hard to reverse. Given a string, you can compute its hash efficiently, but given a hash, it is intractable to compute the corresponding string.
Instead of storing names or other personally identifying info in a database, store their hashes.
Example: health research database
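As a rough illustration of the idea, the sketch below replaces names with SHA-256 hashes from Python's hashlib. The function name pseudonym and the patient records are invented for this example; the notes do not say which hash function or schema a real health research database would use.

```python
import hashlib

def pseudonym(identifier: str) -> str:
    # One-way step: easy to compute, intended to be hard to invert.
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()

# Hypothetical records: (patient name, diagnosis).
records = [
    ("Alice Smith", "asthma"),
    ("Bob Jones", "diabetes"),
    ("Alice Smith", "flu"),
]

# The research copy stores the hash in place of the name.
hashed = [(pseudonym(name), diagnosis) for name, diagnosis in records]

for row in hashed:
    print(row[0][:12], row[1])
```

Because the same name always maps to the same hash, a researcher can still link the two records belonging to one patient without ever seeing the name.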
Negative databases provide some of the same functionality but rely on NP-hardness for their security rather than on cryptographic assumptions such as the hardness of factoring large numbers. The idea: don't store what you know, store what you don't know. Once created, a negative database can be disclosed publicly. You can check whether a given record is stored in the database, but it is intractable to construct the list of records (the corresponding positive database).
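The membership check can be illustrated with a toy sketch. It assumes records are fixed-length bit strings and that the negative database is a list of patterns over 0, 1, and * (a wildcard) that together cover exactly the strings NOT stored; this representation and all names below are assumptions for illustration. With only 3 bits the positive database is trivially recoverable by enumeration, so the sketch shows the interface only, not the hardness property.

```python
def matches(pattern: str, record: str) -> bool:
    # A pattern covers a record if every non-wildcard position agrees.
    return all(p == "*" or p == r for p, r in zip(pattern, record))

def is_member(negative_db, record: str) -> bool:
    # A record is in the positive database iff no negative pattern covers it.
    return not any(matches(pattern, record) for pattern in negative_db)

# Hypothetical positive database over 3-bit records: {"000", "011"}.
# The negative database covers the other six strings:
#   "1**" covers 100, 101, 110, 111; "001" and "010" cover themselves.
negative_db = ["1**", "001", "010"]

stored = is_member(negative_db, "011")
not_stored = is_member(negative_db, "101")

print(stored, not_stored)
```

Only negative_db would be published; the set {"000", "011"} itself is never written down.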
What are the benefits?
What are the limitations?
Statistical Databases & Privacy-preserving data mining
Provide statistical information (sum, count, average, maximum, minimum, percentiles, etc.) without compromising sensitive information about individuals.
Two classes of techniques:
Query restriction: restrict the size of a query result, control the overlap among successive queries, don't return small values (you have a challenge problem about this)
Data perturbation: add random noise to the database or to query results, or swap field values among different records
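Both classes of techniques can be sketched in a few lines. The example below is an assumption-laden illustration, not a production design: the records, the threshold MIN_QUERY_SET, the noise bound, and the function names are all invented, and uniform noise is chosen only because it is easy to reason about.

```python
import random

# Hypothetical records; departments and salaries are invented.
ROWS = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 120},
    {"dept": "eng", "salary": 140},
    {"dept": "ops", "salary": 90},
]

MIN_QUERY_SET = 3  # assumed threshold for query restriction

def restricted_avg(rows, predicate, min_size=MIN_QUERY_SET):
    # Query restriction: refuse to answer when too few records match.
    matched = [r["salary"] for r in rows if predicate(r)]
    if len(matched) < min_size:
        return None
    return sum(matched) / len(matched)

def perturbed_avg(rows, predicate, max_noise=5.0, rng=random):
    # Data perturbation: add bounded random noise to the true result.
    matched = [r["salary"] for r in rows if predicate(r)]
    if not matched:
        return None
    return sum(matched) / len(matched) + rng.uniform(-max_noise, max_noise)

eng_avg = restricted_avg(ROWS, lambda r: r["dept"] == "eng")
ops_avg = restricted_avg(ROWS, lambda r: r["dept"] == "ops")  # one match: refused
noisy = perturbed_avg(ROWS, lambda r: r["dept"] == "eng", rng=random.Random(0))

print(eng_avg, ops_avg, noisy)
```

Without the restriction, the average salary for "ops" would reveal the single ops employee's salary exactly.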
What are the benefits?
What are the limitations?
Limitations of Technology
No technical solution is completely secure. All have vulnerabilities and limitations.
Even when they work perfectly, these systems are ultimately used to implement policy about access to data. The question of what that policy should be will not be answered by cryptographers or DB researchers. At best, technology can only expand the space of possible policies.
There is vigorous debate going on now about the importance of privacy and how to balance privacy against competing interests like security, commerce, and scientific research.
You now know more about data management technology than 99.9% of the country. You are in a unique position to inform and contribute to this debate.
Some of you will design or administer systems that store personal information. You have a responsibility to be aware of the privacy implications of your work.