CS145 - Spring 2002
Introduction to Databases
Assignment #1   --   Due Wednesday April 17
|
- Written work, on-campus students: Please turn in your
written work during class, or to the course administrator (Sarah Weden
in Gates 419) between 9:00 AM and 4:00 PM. Please put your name and
preferred email address on every page that you turn in - you'd be
surprised how many times assignments mysteriously come apart. The
on-time deadline is 4:00 PM on the due date. If you don't turn in
your assignment during class then Sarah will timestamp it for you when
you give it to her. Written work turned in after the deadline but
less than 24 hours late (i.e., by 4:00 PM on Thursday) will be
accepted but penalized 20%. No written work will be accepted more
than 24 hours late since solutions may be made available at that
time. Please remember that written work will not be accepted
electronically.
- Written work, SCPD students: Written work due on
Wednesday must be timestamped by the Thursday courier at the latest.
Please put your name, preferred email address, and SCPD company site
name on every page that you turn in - you'd be surprised how many
times assignments mysteriously come apart. No late written work is
permitted for SCPD students sending their work by courier, and
consequently no late penalties apply. Please see the Assignments page for a discussion of
courier timestamping issues. Written work not sent or timestamped by
an SCPD courier is subject to the policies for on-campus students
specified in the previous bullet. Please remember that written work
will not be accepted electronically.
- Programming work: Programming work will be submitted
electronically (for both on-campus and SCPD students), and must be
submitted by midnight on the Wednesday that it is due. Programming
work submitted after the deadline but less than 24 hours late (i.e.,
by Thursday midnight) will be accepted but penalized 10%, and
programming work submitted more than 24 hours but less than 48 hours
late (i.e., by Friday midnight) will be penalized 30%. No programming
work will be accepted more than 48 hours late.
- For emergencies: Since emergencies do arise, each
student is allocated four "chits" for turning in work with no penalty.
Please see the Assignments page for a
detailed description of chits.
- Honor Code reminder: For more detailed discussion of
the Stanford Honor Code as it pertains to CS145, please see the Assignments and Project pages under Honor Code. In
summary: You must indicate on your written and programming assignments
any assistance (human or otherwise) that you received. Any
assistance received that is not given proper citation will be
considered a violation of the Honor Code. In any event, you (or you
and your partner in the case of project teams) are responsible for
understanding and being able to explain on your own all material that
you submit.
- Partners reminder: Remember that while partners are
permitted for project work, partners are not permitted for written
problems. Written work must be completed individually.
Assignment Components and Grading
|
- The written problems are worth 15% of your final grade in CS145.
Each assignment contains two types of written problems:
- Exercises are intended primarily to help you learn
the basic material (and prepare for the exams).
- Challenge problems are more open-ended and
difficult, intended to help you think more deeply about the topics.
Due to limited staffing for grading, we will not grade all of
the exercises, but will simply spot-check them. We will grade the
challenge problems. We will provide sample solutions for all
exercises and challenge problems. There are four possible overall
marks for your written work on each assignment:
- Check-plus means you appear to have done many of the
exercises and you did well on the challenge problems.
- Check means you appear to have done many of the
exercises and you may have attempted the challenge problems but you
didn't do well on them.
- Plus means you missed many of the exercises but did
well on the challenge problems.
- Minus means you missed many of the exercises and you may
have attempted the challenge problems but you didn't do well on them.
Check-plus is the best, check and plus are
roughly equivalent although we're more impressed with plus, and
minus is the worst although better than not turning in the
assignment at all.
- The project is worth 35% of your final grade in CS145.
Project parts will be graded out of a varying number of points.
- Do parts (a)-(i) of Exercise 5.2.4 in the textbook (pages
210-212). Write the relational algebra expressions only - you do not
need to show the results on the sample data although you are free to
do so if you wish. Note that the ships in the problem are all
"capital" ships (as stated in the first sentence), and when a problem
asks for a ship your expression need only return the ship name.
- Design an XML document format for encoding data about items
and users in an online auction. You need not be as comprehensive as,
say, eBay, but you
should assume the data includes (at least): items for sale including
itemID, description, minimum bid, and ending time; categories that
items belong to; sellers and buyers (bidders) including userID, name,
and contact information; and bidding histories including time, bidder,
and price. Specify a DTD and give one small example XML document
conforming to that DTD. Your encoding should use at least two levels
of nesting and should include at least one attribute of type ID and at
least one attribute of type IDREF or IDREFS.
- Consider the following relational schema:
Name(ID, name) // ID is a key
GPA(ID, gpa) // ID is a key, gpa is a key
Let's keep things simple by assuming gpa's are unique, i.e., no two
students have the same gpa.
(a) Can you write a relational algebra expression to find
the name of the student with the highest gpa in the database? If so,
write it. If not, explain why not. You may only use the relational
algebra operators covered in lecture.
(b) Can you write a relational algebra expression to find
the name of the student with the median gpa in the database? (Again,
you may only use the relational algebra operators covered in lecture.)
If so, write it. If not, explain why not. (Recall that the median of
a set of numbers is the number in the middle, i.e., there are an equal
number of higher and lower numbers. To keep things simple you may
assume that there is an odd number of gpa's.)
- Your task is to create a schema for storing XML in a relational
database system. You may consider "basic" XML only: elements,
attributes, and text. You don't need to consider DTDs or ID and
IDREF(S) attributes. You may assume that all data to be loaded is in
a single XML document, and you may assume that the relational DBMS can
generate unique values for you to use as identifiers if needed.
(a) Describe a generic relational schema that can be used to
store any XML data. Specify the schema of your relations (including
keys), and briefly explain how data is translated from an XML document
into the relations.
(b) Describe how you might specify a set of relations for an
XML document when you can examine the document itself in order to
create the relational schema. Of course your answer to part (a) would
still work, but the idea is to specify a set of relations whose schema
attempts to capture any regularity that may be present in the XML
document. Warning: Depending on how seriously you take it, this
one can be an extremely open-ended problem. Many research papers have
been published on the topic.
Partners
The CS145 project may be completed individually or in teams of two,
the choice is up to each student. Details of partnership rules can be
found in the Project page. Summarizing:
- An identical amount of work is expected and the same grading
scale is used for individual and team projects.
- If you work in a team, choose your partner carefully.
Teams are permitted to "divorce" at any time during the course, and
individual students may choose to team up as the project progresses,
however students from divorced teams may not form new teams or join
other teams.
- If you work in a team, your work must be turned in jointly, as
one submission. Both partners in a team will receive exactly the same
grade for each project part submitted jointly, and late penalties will
be applied to both partners.
Please make sure to submit your work jointly - do not turn your
project in twice, once for each partner.
Scope
If you find this first part of the project a snap, you're not
overlooking something, and don't worry - it will get much more
interesting as time goes by. The primary purpose of this first
"warm-up" part is for us to provide you with a whole bunch of basic
information, and to get everyone up-to-speed on our computing systems
and the languages and tools we will be using. Those of you who have
done some Web programming before, especially anyone who has used a
DBMS behind a Web site, will find this project part nearly trivial.
Those of you who haven't will find it merely straightforward. With
all that said, please don't start at the last minute - as with
all programming and systems work, everything takes a bit of time, and
unforeseen snafus do crop up.
Part A
In this portion of the project you will experiment with Oracle's
sqlplus command-line interface and its bulk loader. You will
create a database table to contain information about Stanford courses,
load the table with a small amount of (real) data we are providing,
and run a few commands to get familiar with Oracle.
Part B
In this portion of the project you will build a small Web interface
that connects to Oracle, queries the course information database table
you created in Part A, and displays the result in HTML.
First you must decide which programming language you will use. You
may choose either C (with some optional C++) or Java. You should make
the choice carefully, as you will want to stick with it for the entire
project. If you choose C, you will use Pro*C to interact with Oracle
and CGI to implement the Web interface. If you choose Java, you will
use JDBC to interact with Oracle and Servlets to implement the Web
interface. You will experiment with various details of Pro*C and JDBC
later in the project; in this part of the project you will focus
primarily on learning CGI (for C users) or Servlets (for Java users).
If you are interested in using other languages or tools for building
your Web interface and interacting with Oracle, please see the
important discussion and ground-rules on this topic in the Project page under Programming and
System Issues.
- Step 0: Read the document Web-Database
Programming: CGI and Java Servlets (available through the course
Web site Project page) for an overview of CGI or Servlets.
- Step 1: Set up your account/directories.
- C users: If you filled out our course registration form
by April 8 then a CGI account has been created for you. Create a
directory cgi-bin from your home directory and set
the following permission (replace "username" with your actual user
name):
fs setacl ~/cgi-bin username.cgi write
Your CGI script will be run from
cgi-courses.stanford.edu. The URL for your CGI executable
will be:
http://cgi-courses.stanford.edu/~username/cgi-bin/scriptname
- Java users:
Compiling Servlets in Unix requires a few changes to your
PATH and CLASSPATH environment variables. These
changes have been made for you in the source file
/usr/class/cs145/all.env. Just type "source
/usr/class/cs145/all.env". Alternatively, you can add this command
to your .cshrc file.
Within your home directory you must set up the following
directory structure:
servletdir
webpages
WEB-INF
servlets
A shell script to build this hierarchy is provided at
/usr/class/cs145/code/bin/buildServletDirectory. (Note: If
you add "source /usr/class/cs145/all.env" to your .cshrc
file, you can run buildServletDirectory by just typing the
command.)
You should store .html pages within this structure, in the
webpages directory. All Servlets must be located in the
servlets directory.
-
Step 2: Copy sample code. (If you do not want to work off our
sample code, you are free to write your own.)
We are providing sample programs to demonstrate the basic
functionality of CGI and Servlets, located at
/usr/class/cs145/hw1/samples/ under the C or
Java subdirectory as appropriate.
Please note that the instructions, code, and support we are
providing are only tested and "guaranteed" on the leland Solaris
machines.
- C users: Copy the appropriate files to your own directory.
In all .html files, change "yourname" to your own username. Move all
.html files to a place accessible on the Web (such as your WWW
directory).
Type "make cgidemo1" and "make cgidemo2" to compile. When finished,
move the executable files cgidemo1 and cgidemo2 to
your ~/cgi-bin/ directory. You should now be able to open
and view cgidemo1.html and cgidemo2.html in your Web
browser.
- Java users: Copy the appropriate files to your own
directory. All .html files should be placed in the webpages
directory from Step 1. ServletDemo.java should be placed in
the servlets directory.
First you must compile your .java file. Next, once your
directory structure is set up and your Servlets are compiled, you can
run the Java JSDK 2.1 Web server manually on any leland Solaris
machine (e.g., saga's, elaine's, myth's) in
order to provide these documents over the Web. The steps involved in
starting the server are:
-
Choose a random port number in the range 5000-65000. This will bind your
server application to that port for the machine on which you're
running your server. Try to choose a random number and remember it -
you will be the only person on that machine who can use that
port, and you will need it to have access over the Web.
-
Change to the servletdir directory, which is the root of
the directory structure you created in Step 1. From there, start the
server by calling "startserver -port portnum" from the Unix
command line, where "portnum" is the port number you chose in the
previous step. The server will begin in the background, and you can
see it using the "ps -elf | grep username" command. If you do
not enter a port number, the default port number 8080 will be chosen
for you. (You can actually set the default yourself: After you've
run the server once, it will create a configuration file called
"default.cfg" for you; it finds the default port number there.)
-
From your browser, enter the URL of a Web page or Servlet
contained in your servletdir hierarchy. Static HTML
documents placed in the webpages directory are accessible
from the Web at the address
http://machineXX:portnum/page.html, where "machineXX"
refers to the Solaris machine from which you're running the Web server
(e.g., saga22, elaine12, myth7,
etc.), "portnum" is a specific port (see above), and "page.html"
is the name of the HTML page that you are serving.
- To stop the server, issue the command "stopserver".
Warning: If you log into a leland machine and cannot start a server
because one is already running, do NOT type "stopserver" or you will
stop someone else's server! Instead, find a different machine with no
servers running on it.
-
If you want to recompile your Servlets you must stop the server,
recompile, then restart the server. Static HTML pages that you
are hosting from the webpages directory can be changed
at will.
-
Step 3: Understand the sample code.
- C users: Read cgidemo1.c, cgidemo2.c,
cgiparse.h and the .html files. Make sure you understand how
they work. You may also read cgiparse.c if you wish.
- Java users: Read ServletDemo.java and the
.html files provided. Make sure you understand how they
work. For detailed documentation you may want to refer to the
Java Servlets Class Docs from Sun, available through the course
Web site Project page.
-
Step 4: Write your own code to display a database table.
In this step, you will write a Web interface that retrieves and
displays the contents of the course information table stored in
Oracle. Your Web interface should ask for a username and password
through a Web form, retrieve the contents of the table from the
database using that username and password, and display the data in an
HTML table format. We are providing code that interacts with Oracle
and retrieves the data, and you do not need to modify this part of the
code.
- C users: The file readDB.pc contains code to
retrieve all tuples of the table "Courses" from Oracle. The main
function is GetData and can be called by:
data = GetData(username, password, &numRecords);
where "username" and "password" are character strings of an Oracle
account's username and password. "numRecords" is an integer and will
contain the number of retrieved tuples upon return. The return value
"data" is a pointer to a "CourseInfo" structure (defined in
readDB.pc), and can be accessed through data[i].course,
data[i].units, and data[i].title, where i is between 0 and
numRecords-1. All code outside the functions GetData and
sql_error is written in regular C. To compile the program,
type "make readDB".
Your task is to modify readDB.pc (which currently contains an
empty main()) and to write a corresponding .html file
to create a Web interface that asks for username and password through
a Web form, then retrieves and displays the "Courses" table using the
provided username and password. You should format the results page to
display the data in an HTML table.
- Java users: The method getData in the
ServletDemo.java file is responsible for connecting to Oracle
and executing a query. This method is called within the
ServletDemo class by passing in three arguments: the query
string to send to Oracle, the Oracle username, and the password:
queryResults = getData(String query, String user, String passwd);
The queryResults variable is of type ResultSet, and
can be manipulated as shown in the while loop of the
doGet method. Connecting to Oracle requires specifying the
appropriate Oracle driver, then establishing a connection with the
getConnection method. (For further information on JDBC you
can refer to our document Introduction
to JDBC or the JDBC
Tutorial from Sun, both available through the course Web site
Project page.) Executing any valid SQL query is simply a matter of
passing a different query string to the getData method.
In this assignment you need to process the retrieved data and
format the results page (returned by the Servlet) to display the data
in an HTML table. Take a close look at the doGet method and
in particular at the way it extracts tuples one at a time from the
retrieved results.
- C and Java users please note: The special characters
&, <, and > need to be replaced in HTML by the escape
strings "&", "<", and ">" (without the quotes)
respectively so they can be displayed properly in your result table.
You must escape all three characters even if you are testing on a
browser that does not require it.
Browser compatibility
Before submitting your project Part B please ensure that it operates
correctly using the Netscape browser on the Sweet Hall Solaris
workstations (e.g., saga's, elaine's,
myth's). If there is a compelling reason you cannot make
your Web interface work in the Solaris Netscape browser environment
(e.g., you really want to exploit certain features in Internet
Explorer), you must get "preapproval" from the course staff to use a
different browser environment. Send an email message to cs145-staff@cs telling
us precisely what browser environment you wish to use. The message
must be sent by Sunday April 14 so that we have time to work things
out if your browser environment poses a problem for us. You will
receive a reply within 24 hours of your message, and you do need to
receive a positive confirmation message before assuming that your
alternate environment is okay.
When the preapproval process is not followed, projects that have
problems on the Sweet Hall Solaris Netscape browsers may lose points,
possibly all points if we cannot run your Web interface at all.
What to submit
Prepare a "submission directory" that contains a text file called
README and contains your C and HTML files (for C users) or
your entire servletdir directory (for Java
users). Specifically:
- C users should submit their modified readDB.pc and the
.html file itself. Thus the submission directory should
contain README, readDB.pc, and one .html
file.
- Java users should submit their entire servletdir
directory structure, i.e., the directory created by our script
buildServletDirectory, or an equivalent directory created by
you. The servletdir directory should include all static
.html documents in the webpages directory and the
Java Servlet code in the servlets directory. Thus the
submission directory should contain README and a subdirectory
servletdir.
In the README file, copy-and-paste the transcript that you
generated for Part A of this assignment. C users must also specify,
in their README file, a URL for the .html file where
a grader can test the project.
You will submit your entire submission directory. The instructions
for doing so are the same for all project parts (although directory
contents will differ) and are detailed
here. While we strongly prefer that you submit only once, if
you do submit multiple times then all submissions except your last
will be ignored.
Please note: You must follow the directory and file naming
schemes and submission instructions exactly. Because we've had
significant problems in past years with students not following the
submission procedures (causing immeasurable TA trauma), points may be
deducted if you do not follow the submission procedures exactly as
specified.