 Database Systems: The Complete
Book
 Solutions for Chapter 20 

Solutions for Section 20.1
Solutions for Section 20.2
Solutions for Section 20.3
Solutions for Section 20.4
Solutions for Section 20.5
Solutions for Section 20.6
Solutions for Section 20.1
Exercise 20.1.1(b)
One of the problems that needs to be handled is that there is no natural
id for a system composed of a computer and monitor from company
A.
We shall thus leave this field null in Systems, although there
might be some reasonable way to treat integers as strings and
concatenate them to invent system identifiers.
The insert operation is:
INSERT INTO Systems(processor, mem, disk, screenSize)
SELECT speed, memory, hd, screen
FROM Computers CROSS JOIN Monitors;
Remember that Systems.processor is analogous to
Computers.speed, not Computers.proc.
Exercise 20.1.2
The simplest approach is for the global schema to mimic company A's
schema.
The systems sold by company B can be represented by using the same
identifier for both the computer and the monitor.
However, since the attributes proc of Computers and
maxResX and maxResY of Monitors are not known
for systems of company B, we shall have to use NULL as the
value for each of these components.
Exercise 20.1.7(a)
At Dealer 1, the query would be:
SELECT serialNo
FROM Cars
WHERE autoTrans = 'yes';
while at Dealer 2 the query would be:
SELECT serialNo
FROM Options
WHERE option = 'autoTrans';
Return to Top
Solutions for Section 20.2
Exercise 20.2.1
SELECT *
FROM AutosMed
WHERE color = '$c';
=>
SELECT serialNo, model, color, autoTrans, 'dealer1'
FROM Cars, GtoL
WHERE Cars.color = GtoL.globalColor AND
GtoL.localColor = '$c';
Exercise 20.2.2(a)
At Company A we need:
SELECT *
FROM PCMed
WHERE speed = $s;
=>
SELECT 'A', speed, memory, hd, screen
FROM Computers CROSS JOIN Monitors
WHERE speed = $s;
At B we use:
SELECT *
FROM PCMed
WHERE speed = $s;
=>
SELECT 'B', processor, mem, disk, screenSize
FROM Systems
WHERE processor = $s;
Exercise 20.2.3(a)

Use the templates described in the solution to Exercise 20.2.2(a) to
get all tuples for systems with a 400 megahertz speed.

Then, select only those disk attribute has the value 12.0.
Return to Top
Solutions for Section 20.3y
Exercise 20.3.1(a)
u'c[PIV, G4, Athlon]bfu at the source for Computers.
Exercise 20.3.2(a)
We cannot query Monitors first, because a binding for
screen is not enough to match either adornment.
Moreover, we have only bindings for the last two attributes of
Computers, so the only adornment we can use there is
uuubb.
Thus, there is only one plan: query the Computers source for
machines with 128Mb of memory and an 80Gb hard disk, then use the
retrieved system numbers to query Monitors (using the
bfff adornment) for each number, to see if the system has a 19"
screen size.
Return to Top
Solutions for Section 20.4
Exercise 20.4.1
For part (a):
in the usual meanings of the terms, quant and price
would be dependent attributes, and the others would be dimension
attributes.
For part (b):
There is no one right answer.
Here is a reasonable choice.
We have assumed that there is an ID attribute for each of the attributes
of Orders; the ID's appear in the tuples of Orders and
link those tuples to tuples of the dimension tables.
We use the same names for attributes of Orders and their
corresponding dimension tables.
Cust(ID, name address, phone, creditCard)
Date(day, month, year)
Proc(ID, manf, model, name, speed)
Disk(ID, manf, model, cylinders, capacity, surfaces, rotSpeed)
CD(ID, manf, type, speed)
Return to Top
Solutions for Section 20.5
Exercise 20.5.1(a)
The ratio is (11/10)^10, or about 2.59.
Exercise 20.5.2(a)
We would use the tuples of the form
Sales(*, 'blue', *, d, val, cnt)
where d ranges over all dealers.
Exercise 20.5.3
In addition to the suggested processor type, manufacturer, and speed, we
might choose to split the CD dimension into type (CD or DVD, e.g.),
model number, manufacturer, and speed.
The harddisk component might be split into manufacturer, model,
capacity, and (rotational) speed.
Finally, we may wish to analyze customers by including such dimensions
as their zipcode and their method of payment.
Zip codes may be part of a hierarchy of locations, including cities,
states, and perhaps regions.
The dependent attributes will be quantity ordered, and total price of
the order.
Exercise 20.5.8(b)
Regions are placed into the diagram ``in parallel'' with States.
That is, the diagram looks like:
All
/ \
States Regions
\ /
Cities

Dealers
Exercise 20.5.10
Really large.
For example, suppose that there are n independent attributes.
Let F have only one tuple, say (1,1,...,1), with a nonzero
dependent attribute.
Every point in the cube that consists of *'s and 1's only will have a
nonzero dependent attribute.
There are 2^n such points, so the ratio of the size of
CUBE(F) to F can be as high as 2^n.
It is easy to see the ratio can be no higher, since each point in an
ndimensional space contributes to only 2^n aggregations.
Return to Top
Solutions for Section 20.6
Exercise 20.6.1(a)
Beer and juice appear togther only in B_{6} and B_{7}.
Thus, the support for the pair is 2/8 = 25%.
Exercise 20.6.1(c)
Beer appears in 6 of the baskets  all except B_{2} and
B_{4}.
Of these 6 baskets, milk appears in 4 of them: B_{1}, B_{3}, B_{5}, and
B_{6}.
Thus, the confidence of milk given beer is 4/6 = 67%.
Exercise 20.6.1(f)
(Revised 6/4/02)
A pair must appear in 3 out of the 8 baskets to have a support of 35%.
Apriori is of no help in this little example, because each item by
itself appears in at least three baskets.
Thus, we must count all 10 pairs.
When we do, we find that {milk, pepsi}, {milk, beer}, and {beer, pepsi}
are the only pairs that appear at least three times.
Here is a table of all the occurences:
milk  3  
coke  0  1  
beer  3  4  2  
juice  2  2  2  2 
 pepsi  milk  coke  beer 
Return to Top