[an error occurred while processing this directive]
[an error occurred while processing this directive]
Assignment
[an error occurred while processing this directive]
Overview
Topic: SQL Queries
Related Reading: Ch. 3.4, 5.1-5.5, as well as details on the
sqlzoo.net database
Due: 12:45pm Thursday, 25 September 2003
[an error occurred while processing this directive]
For this assignment, you may either work alone or with one other
person. If you do work as a pair, both of you should submit a copy of
the solutions and you should make sure that both names are at the top
of the file.
You are also encouraged to seek help from the instructor, whether you
are working alone or as a pair.
The goal of the assignment is to create SQL queries that
generate each of the desired views of the data. You do not need to
submit the generated output, only the SQL query which can be used. We
will use the tables from sqlzoo.net
and assume you are using the choice of Oracle engine there.
Your answers should be typed in a single file and submitted
electronically in the folder
/home/cs341/Submit/_username_/sqlhw/ on the machine
patel2.
Please follow the instructions precisely as to the desired result and
the format of the user's external view. The most likely mistakes
are those caused by misreading the goal.
Also, your queries must be designed to work independently of any other
knowledge about the data. For example, if we asked you to give a
query which identifies the name of the country with the largest gdp,
the following query happens to generate the proper output on this dataset:
SELECT C.name, C.gdp
FROM cia C
WHERE C.gdp >= 9000000000000
but it would be an incorrect answer for this homework. It is not
general, as it relies on the significance of value 9000000000000.
- Produce an alphabetized list of all regions that contain a
country which includes the letter "z" in the country name (consider
both upper and lower case z's). Make sure a chosen region is not included
more than once.
- Produce a table with columns (name,area) which includes all
countries in either North or South America. Furthermore, make sure
that all North American countries appear earlier in the table then the
South American countries, and within each continent ensure that
countries appear from largest to smallest areas.
-
Generate a table with columns (region,people) where the people column
should be the cumulative population of all countries in that region.
Order your table from most populated region to least populated region.
- Generate a table identifying the name and gdp of the country with the
largest gdp of all countries. (Gee, I wonder what country that will be!)
-
Generate a table with columns (name,density) that contains all
countries which have a population density (i.e. people per square
kilometer) which lies between 300 and 400.
Hint: some countries are listed with an area of 0, wich makes the
density calculation undefined. You'll have to make sure to remove
those to get an executable query. Worse yet, SQL standards do not
mandate that engines "short circuit" boolean expression evaluation.
That is if you type something such as "expr1 AND expr2" both
expressions may get evaluated, even if expr1 is false. So try to form
a query which does not rely on short circuiting.
-
Generate a table listing the titles and years, chronologically, of all
movies which included 'Dan Aykroyd' in the cast.
-
Generate a table listing the titles and years, chronologically, of all
movies which included both 'Dan Aykroyd' and 'Chevy Chase' in the cast.
-
Generate a table which includes the names of all actors who were in
both 'Ghostbusters' as well as 'Ghostbusters II'.
Hint: INTERSECT
-
Generate a table with columns (title, releases)
listing all movie titles which have been used for two or more
independent releases, as well as the total number of such releases for
each title. Please order the titles alphabetically.
-
If you got the previous query to work, you would have noticed
that no movie title has been used more than twice in the database.
For this problem, you may assume this fact, namely that no title has
been released three or more times.
Your goal here is to generate a table with columns (title, original, remake)
that lists the same titles as with the previous problem, together with
the respective years of the two releases.
Hint: self-join
Extra Credit
-
Generate a table with columns (actor1,actor2), for every pair
of actors who have worked together in ten or more different movies.
Note: depending on how your write such a query, it could be very
computationally expensive for the database engine (in fact, sqlzoo may
even timeout your query). If you want to really impress, design a
query which is evaluated more quickly by the engine.
- Generate a query which identifies the specific pair
(actor1,actor2) who have appeared in the greatest number of movies
together (or in case of a tie, all possible such pairs).
(I should note that I have not yet made the effort to try this
myself, so I'm not sure of the difficulty. But in spirit, the data is
there so there should be a way to phrase the query).
[an error occurred while processing this directive]
[an error occurred while processing this directive]