Rockbuster Stealth LLC

1. Project Goals

Motivation
Rockbuster Stealth LLC is a fictional movie rental company that used to have stores around the world. Facing stiff competition from streaming services, the Rockbuster Stealth management team is planning to use its existing movie licenses to launch an online video rental service to stay competitive.

Objective
Extract key insights from RockBuster’s operational database to help with the launch strategy for the new online video service.

Zoomed view of Entity Relationship Diagram
Zoomed view of ERD with fact tables, payment and rental

2. Key Questions

RockBuster Stealth LLC management had some questions that wanted answers for:

Besides these, and after some exploratory data analysis, I raised a set of extra questions:

3. Process

Rockbuster provided the Dataset with their operational data. Therefore, with exception to sourcing data, I developed a full life cycle of data analysis, from data cleaning to final presentation to stakeholders, including documenting the database itself, producing a data dictionary (link).
Using SQL for this process, I started from scratch installing Postrgesql - a relational database management system - and pgadmin4 which acts as user interface to deal with the database itself. There, I made all actions related with cleaning and summarizing data, and on a later stage developed the queries that were needed to answer the relevant business questions, using techniques such as table joins, subqueries, common table expressions and window functions.

Pgadmin4 web interface
Pgadmin4 view with a Query using CTEs and Window functions

All the queries and corresponding outputs were compiled in an excel file (insert link), to be shared with technical peers.

4. Key Insights from Data

Distribution of revenue across the film catalog

Distribution curve films vs revenue
This is probably one of the most interesting insights of the project!

Rental rates performance

.

Revenue Distribution per rental rate Rental rates legend On the left it's shown the revenue distribution for each of the rental rates defined in the catalog.
Rate in this context is simply the price at which the films are rented for the agreed rental duration. Since the number of rentals on each rate are well balanced, the average revenue per rental follows a very similar distribution. But the immediate question was:

With a similar number of rentals, shouldn’t a rate of $4.99 produce a revenue approximately 5 times better than $0.99?

Normalized Average Revenue In fact, if we normalize the average revenue per rental, i.e., if we divide the average revenue per rental by the value of the rental rate itself, we can quickly see that 0.99$ is the one that performs better.

But why does this happen?

The secret lies in the penalty policy. The customers keep their films in completely different patterns from what would be expectable just by looking at what is defined in the catalog, and they incur in penalties in case of delays. Looking deeper into the data, I’ve discovered that the penalties are a fixed amount (1$) per day of delay, which in turn influences more the lowest rental rate.

Payments versus delays
How amounts paid evolve with the days in delay

Ocupancy Ratio

Another metric developed was the Occupancy Ratio. It was measured the timespan between the first rental date and the last return date on the database, serving as the denominator for the ratio (100% of time). Then, I measured for each existing copy of the films, how much time they were effectively on rental (or on the other hand they were catching dust in our stores shelves). Finally, I've computed an average time of the several copies of each film. The division of the latter by the denominator gave us the film's Occupancy Ratio.

Occupancy Ratio distribution
The movies could be rented way more time than they are

Geographical Distributions

Several distribution maps were created to show:
Customers Map

Customers Distribution Map

Revenue Map

Revenue Distribution Map

Revenue per customer Map

Revenue per Customer Distribution Map

Top5% spenders Map

Top 5% spenders Distribution Map

5. Results

Recommendations to the online video service launch strategy

Project Final Presentation (web view)

The full project documentation can be found on Github Repository, including both pdf and ppt versions of the presentation.