Supplemental Subquery Exercises

CIT225 Subquery Supplemental Exercises

Written by Tanner Crook using Sakila sample data on MySQL


The book subquery exercises are a little awkward so it might be cleary and more fun to try out your knowledge of subqueries against real world scenarios.

Because sakila data is from 2005 and 2006, pretend that the current date (today) is 2006-01-01.

Scenario 1: Monthly Segments

It’s the end of the year and we want to see how the video store trended. A good way to see how performance is trending over time is to look at monthly averages in segments:

  • L12M Avg Monthly Revenue
  • L6M Avg Monthly Revenue
  • L3M Avg Monthly Revenue
  • L1M Avg Monthly Revenue

We hope to see that our avg revenue goes up in each segment, which means that on avg we are trending upwards.

FYI Tip: Using the Avg Monthly by segment vs Monthly Total with smooth out any outliers.

Step 1: Write Single Query for Segment


Write the query to pull the Monthly Total Revenue for the Last 12 Months (L12M).

You can use the DATE_SUB function to find payments in that period:

WHERE payment_date >= DATE_SUB('2006-01-01', INTERVAL 12 MONTH)

Your results should look like this:

+----------------+----------+----------+
| period_segment | mnth     | revenue  |
+----------------+----------+----------+
| L12M           | May      |  4824.43 |
| L12M           | June     |  9631.88 |
| L12M           | July     | 28373.89 |
| L12M           | August   | 24072.13 |
| L12M           | February |   514.18 |
+----------------+----------+----------+
5 rows in set (0.04 sec)

Step 2: Write the Report Query to show Avg Monthly for all L12M


Write the outer query using the query from step 1 as a subquery in the FROM clause.

Your results should look like the following:

+----------------+-----------------+
| period_segment | avg_monthly_rev |
+----------------+-----------------+
| L12M           |    13483.302000 |
+----------------+-----------------+
1 row in set (0.06 sec)

We can now see that in the last twelve months what the average monthly revenue is.

Step 3: Repeat Step 1 for All Segments and put in Report Query


Using Step 1 as a template, create a query for each segment listed in the scenario and integrate into the reporting query.

You can use a UNION ALL to link your queries into a single table:

(
SELECT
'L12M' AS period_segment,
...
FROM
...
UNION ALL
SELECT
'L6M' AS period_segment,
...
FROM
...
)

Final Results


Your final results should show the following:

+----------------+-----------------+
| period_segment | avg_monthly_rev |
+----------------+-----------------+
| L12M           |    13483.302000 |
| L6M            |    17653.400000 |
| L3M            |      514.180000 |
| L1M            |      514.180000 |
+----------------+-----------------+
4 rows in set (0.06 sec)

Uh Oh! Looks like maybe we should have invested in Netflix instead of starting a brick-and-mortar video rental.

Scenario 2: Avg Monthly Rentals by Rating and Quarter

Challenge Scenario: This exercise won’t be as detailed or contain detailed step-by-step walkthroughs, but use the previous scenario for hints.

We rearrange our shelves on a quarterly basis. We want to analyze how rentals by ratings are trending by quarters so we can organize accordingly if there are any noticable trends.

Your Mission should you choose to accept it.

Get a dataset containing:

Dimensions: year, quarter, rating 
Metrics: average monthly rentals

The result set should look like the following:

+-------------+----------------+--------+---------------------+
| rental_year | rental_quarter | rating | avg_monthly_rentals |
+-------------+----------------+--------+---------------------+
|        2005 |              2 | G      |            302.5000 |
|        2005 |              2 | PG     |            337.5000 |
|        2005 |              2 | PG-13  |            391.0000 |
|        2005 |              2 | R      |            353.0000 |
|        2005 |              2 | NC-17  |            349.5000 |
|        2005 |              3 | G      |           1067.5000 |
|        2005 |              3 | PG     |           1251.5000 |
|        2005 |              3 | PG-13  |           1380.5000 |
|        2005 |              3 | R      |           1223.0000 |
|        2005 |              3 | NC-17  |           1275.0000 |
|        2006 |              1 | G      |             33.0000 |
|        2006 |              1 | PG     |             34.0000 |
|        2006 |              1 | PG-13  |             42.0000 |
|        2006 |              1 | R      |             29.0000 |
|        2006 |              1 | NC-17  |             44.0000 |
+-------------+----------------+--------+---------------------+
15 rows in set (0.06 sec)

Tips


Start by creating a subquery to get the monthly, rating total rentals.

We need the rental_month_start in date form so we can use the QUARTER() function when we aggregate in the main query. The YEAR() and MONTH() functions can be used with CONCAT() to fabricate this value.

CONCAT(YEAR(rental_date),'-',MONTH(rental_date),'-01') AS rental_month_start,

Subquery Data:

+-------------+--------------+--------------------+--------+-----------------+
| rental_year | rental_month | rental_month_start | rating | monthly_rentals |
+-------------+--------------+--------------------+--------+-----------------+
|        2005 | July         | 2005-7-01          | PG     |            1348 |
|        2005 | August       | 2005-8-01          | PG     |            1155 |
|        2005 | May          | 2005-5-01          | PG     |             216 |
|        2005 | June         | 2005-6-01          | PG     |             459 |
|        2005 | August       | 2005-8-01          | G      |             946 |
...

You will not need to leverage UNION ALL in this scenario.

Solutions

Don’t read further if you you don’t want to see the solutions!

Scenario 1


SELECT 
period_segment,
AVG(revenue) AS avg_monthly_rev
FROM
(
    SELECT 
    'L12M' AS period_segment,
    MONTHNAME(payment_date) AS mnth,
    SUM(amount) AS revenue 
    FROM payment 
    WHERE payment_date >= DATE_SUB('2006-01-01', INTERVAL 12 MONTH)
    GROUP BY period_segment, mnth
    UNION ALL
    SELECT 
    'L6M' AS period_segment,
    MONTHNAME(payment_date) AS mnth,
    SUM(amount) AS revenue 
    FROM payment 
    WHERE payment_date >= DATE_SUB('2006-01-01', INTERVAL 6 MONTH)
    GROUP BY period_segment, mnth
    UNION ALL
    SELECT 
    'L3M' AS period_segment,
    MONTH(payment_date) AS mnth,
    SUM(amount) AS revenue 
    FROM payment 
    WHERE payment_date >= DATE_SUB('2006-01-01', INTERVAL 3 MONTH)
    GROUP BY period_segment, mnth
    UNION ALL
    SELECT 
    'L1M' AS period_segment,
    MONTH(payment_date) AS mnth,
    SUM(amount) AS revenue 
    FROM payment 
    WHERE payment_date >= DATE_SUB('2006-01-01', INTERVAL 1 MONTH)
    GROUP BY period_segment, mnth
) am
GROUP BY period_segment;

Scenario 2


SELECT 
rental_year,
QUARTER(rental_month_start) AS rental_quarter,
rating,
AVG(monthly_rentals) AS avg_monthly_rentals
FROM
(
    SELECT 
    YEAR(rental_date) AS rental_year,
    MONTHNAME(rental_date) AS rental_month,
    CONCAT(YEAR(rental_date),'-',MONTH(rental_date),'-01') AS rental_month_start,
    f.rating AS rating,
    COUNT(rental_id) AS monthly_rentals
    FROM rental r 
    INNER JOIN inventory i 
    ON r.inventory_id = i.inventory_id
    INNER JOIN film f 
    ON i.film_id = f.film_id
    GROUP BY rental_year, rental_month, rental_month_start, rating
) mr
GROUP BY rental_year, rental_quarter, rating
ORDER BY 1, 2;

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>