A Complete 15 Week Curriculum to Master SQL for Data Science
ntroduction
As I work more and more in the corporate world as a data scientist, I am increasingly convinced that mastering SQL is essential to have a successful career. That’s why, if you’ve been following my articles, I’ve been writing a lot about SQL recently.
SQL is not a hard skill to learn (i.e. SELECT FROM WHERE), but it is certainly a hard skill to perfect. One of the reasons I found it so difficult to master SQL was that I had to learn everything on my own — I didn’t know what I didn’t know, so it was hard to be proactive about potential mistakes when writing SQL code.
That’s why I wanted to write this curriculum. This curriculum aims to cover not just the basics, but other very important things that are not talked about as much.
By completing this curriculum, you will have learned the following things:
- How to write simple queries and also advanced queries
- How to apply your SQL skills to real-life business case studies
- Best practices for writing professional SQL code
- Several resources to help you with interview prep
The general structure of this curriculum is as follows:
- Basic SQL (Weeks 1–3)
- Intermediate and Advanced SQL (Weeks 4–9)
- Writing Professional SQL code (Week 10)
- SQL Practice Problems (Week 11)
- Business Case Studies (Weeks 12–14)
- Interview Prep (Week 15)
If you are already familiar with the basics of SQL, feel free to skip weeks 1–3, but make sure you start at Week 4: SQL Query Order of Execution.
Is SQL that Important?
YES.
SQL is arguably the most important skill to learn for any data professional, not just data scientists.
SQL is used to extract specific data from a database, so that you can do things like analyze data, visualize data, model data, etc. Therefore, developing strong SQL skills will allow you to take your analyses, visualizations, and modeling to the next level because you will be able to extract and manipulate the data in advanced ways.
In order to be successful in this curriculum, you should first familiarize yourself with using SQL in Mode and then you’ll be able to go through the topics below.
With that said, let’s dive into it!
Week 1: Basic SQL
In the first week, you’ll learn all of the building blocks of a query so that you can write the most fundamental SQL queries.
Week 2: LOGICAL and COMPARISON Operators
Now that you’ve learned the basics of SQL, we’re going to learn intermediate to advanced concepts over the next few weeks so that you can beef up your queries.
This week, we’re going to cover logical operators and comparison operators, which are used to filter data:
Week 3: AGGREGATES
In week 3, you’ll learn about aggregate functions, which are operations that are performed across rows of data to return a single value.
- Aggregate Functions (COUNT, SUM, MIN/MAX, AVG)
- GROUP BY clause
- HAVING clause
Week 4: SQL Query Order of Execution
This is a very important topic that most SQL guides and courses glance over. The order of execution of a SQL query refers to the order in which the clauses of a query are conducted. By understanding this, you’ll be able to debug a lot more problems and write more efficient queries.
To learn about the SQL order of execution, check out this page from SQLBolt.
Week 5: Conditional Expressions
Similar to IF/THEN/ELSE statements in Python or JavaScript, SQL has its own version of conditional expressions, which we dive into in week 5:
Week 6: JOINS and UNIONS
Now that you’ve learned all of the building blocks for writing basic queries, filtering data, aggregating data, and writing conditional expressions, you’ll learn how to combine different tables with each other:
Week 7: Subqueries and Common Table Expressions
This is one of the most important weeks in my opinion. These two concepts allow you to write complex queries and manipulate data in profound ways. Make sure you have a strong understanding of these topics before moving on!
Week 8: String Manipulations
What makes a good SQL coder is the ability to manipulate data however he/she likes. In order to do this, you must have a strong understanding of string functions, which we’ll cover this week:
Week 9: Date-time manipulation
Similarly to the previous week, it’s essential that you learn how to manipulate date-time data, which we’ll cover this week:
- EXTRACT
- DATE_ADD()
- DATE_SUB()
- DATE_DIFF()
- See here for more functions (on the left of the webpage)
Week 10: Window Functions
Lastly, you’re going to learn a more advanced topic called window functions (or analytics functions). Window functions are similar to aggregate functions except that they return the same number of rows as what was inputted.
Check out the links below to learn more about them:
- Windows Functions (ROW_NUMBER(), RANK(), DENSE_RANK(), LAG, LEAD, SUM, COUNT, AVG)
- See here for advanced window functions.
Week 11: Writing Professional SQL Code
It’s one thing to write SQL code, but it’s another to write good SQL code. Writing clean and professional SQL code is one of the biggest differences between senior data analysts and junior data analysts. Thus, it’s important that you learn some basic rules to write good code. Check out the article below:
Week 12: SQL Practice Problems
Now it’s time to put your SQL knowledge to the test.
Leetcode and Hackerrank are great resources that host an array of practice problems for various programming languages, including SQL! These will be great resources for you to practice what you’ve learned so far.
Here are some good questions that you can try out below:
- Find Duplicate Emails
- Rank Scores
- Employees Earning More Than Their Managers
- Rising Temperature
- Trips and Users
Week 13–15: Case Studies
Case studies are the best way to simulate real-life problems as a data scientist. Below are three SQL case studies that are representative of problems that you would have to solve in a corporate setting:
To open Mode’s SQL editor, go to this link and click on the hyperlink where it says ‘Open another window to Mode’.
Week 11: Case Study 1 — Investigating a Drop in User Engagement
For this case study, your goal is to find the cause for a drop in user engagement for Yammer’s project. First, you should read the overview of what Yammer does here.
Check out how I approached this case study here if you’d like guidance.
Week 12: Case Study 2 — Understanding Search Functionality
This case is more focused on product analytics. Your goal is to determine whether the user experience is good or bad. What makes this case interesting is that it’s up to you to determine what ‘good’ and ‘bad’ mean and how the user experience will be evaluated.
Week 13: Case Study 3 — Validating A/B Test Results
One of the most widely applied data science applications in the business world is performing A/B tests. In this case study, your goal is to validate or invalidate the results of an A/B test where there was a 50% difference between the control and treatment groups.
What’s Next?
If you made it to the end, congrats! Committing to learning a new skill for 15 weeks is no easy feat. Don’t undermine your accomplishment, you should feel proud of yourself and confident in your SQL ability!
Next, take a look at the following two articles and make sure that you have a strong understanding of these concepts:
If you still want more resources to learn SQL, I’ve consolidated 6 incredible resources to learn advanced SQL:
Thanks for Reading!
If you enjoyed this, I would greatly appreciate it if you gave this a follow! As always, I wish you the best in your learning endeavors :)
Comments