A Handy List of Data Science Related Resources for Learning, Interviewing, and Professional Development
Hi everyone, I’ve compiled a list of useful websites and information to help you at whatever stage of the process you’re in, whether you’re starting your journey in data science, interviewing with companies, negotiating an offer, or just looking to continue to learn more.
This includes interview prep resources, post-interview compensation research, and more. I’ll also continue to update the list as I find more resources along my journey.
Here’s a table of contents of what you will find below:
1. For Beginners Interested in Getting Into Data Science
a) Free Resources for Learning Python, R, and SQL
b) Free Resources for Stats
c) Paid Channels for Learning Data Science
d) Machine Learning and Other Resources
e) Where to Find Datasets
2. Interview Prep
a) Individual Code Practice
b) Live Code Practice
c) Cheatsheets
3. On the Job Hunt
4. Post Interview
a) Compensation Information
b) Negotiation
5. Professional Development
a) Make Great Visualizations
b) Learn Tips & Tricks/Other Cool Stuff
For Beginners Interested in Getting Into Data Science
If you think this is the field you want to pursue, as a start, check out this really cool infographic called “Become a data scientist in 8 easy steps”.
Your journey will take up a lot of time and personal investment, so if you’re ready for it, here are some resources for you to build your skills as a Data Scientist. Coding, SQL, machine learning algorithms, and statistics are the basic skills that you’ll need to learn.
Free Resources for Learning Python, R, and SQL
There are a lot of free resources out there that will teach beginners how to code in Python and R. A Google search will return lots of results. The ones I’ve listed below are the websites that I think are really great at walking you through step by step.
- Python, R, and SQL https://www.datacamp.com/
- Python, R, and SQL https://www.udacity.com/
- Python, R, and SQL https://www.dataschool.io/
- R https://swirlstats.com/
- SQL https://www.w3schools.com/sql/
As you get started on doing projects, it might be helpful for you to check out my article called An Easy Beginner’s Guide to Git. Git is version control software for your projects.
Free Resources for Stats
- Basic statistics if you want to start at the very beginning https://www.udacity.com/course/intro-to-statistics--st101
- Probability with lots of examples https://www.intmath.com/counting-probability/counting-probability-intro.php
- Highly recommend this book (An Introduction to Statistical Learning with Applications in R) even if you don’t use R. It’s available for free. http://faculty.marshall.usc.edu/gareth-james/
Paid Channels for Learning Data Science
If practicing on your own doesn’t quite work for your learning style, there are other options. You can check out bootcamps or get your master’s.
I actually went to a data science bootcamp called Galvanize in San Francisco, which was a classroom environment, and I really enjoyed the in-person interaction. There are online bootcamps as well. Do your research and find what’s right for you, as there are pros and cons to individual bootcamps. Some offer job guarantees. Some may offer mentoring. For example, I am currently a mentor in Springboard’s Analytics Bootcamp, and I meet weekly with a student for a half-hour phone call to answer questions and go over the curriculum. If you want to learn more, check out Springboard’s programs.
If you have the time and money, a master’s program might be an option if you would like to get more in depth with the material. Bootcamps tend to provide more breadth than depth. I am currently working on my master’s online at Georgia Tech, and it’s an excellent option that is cheaper than most other programs.
Feel free to comment below if you have questions about bootcamps or my master’s program.
One other cool thing I want to point out is something new that Coursera started. Once you’ve gotten a good foundation in coding, you can practice by trying a guided project, in which a subject matter expert teaches you a job-relevant skill in under two hours. Note that this is not free. https://www.coursera.org/courses?query=guided%20projects
Machine Learning and Other Resources
Here are some helpful websites for those interested in getting started with machine learning.
- Machine Learning taught in a clear manner https://machinelearningmastery.com/
- Andrew Ng’s courses on Machine Learning and Deep Learning https://www.coursera.org/learn/machine-learning https://www.coursera.org/specializations/deep-learning
- Deep Learning https://www.fast.ai/
- Explained Visually — “an experiment in making hard ideas intuitive” https://setosa.io/ev/
- Natural Language Processing http://www.nltk.org/book/ch01.html
Once you’ve learned about machine learning models, you can start participating in competitions. Kaggle is a great way to practice what you learn on real problems. https://www.kaggle.com/competitions
Where to Find Datasets
If you’re looking for data to work on a capstone project, here’s a list of places that are a good starting point.
- Kaggle is a good resource for data, but make sure to look for datasets that have a high usability score, which indicates that it is a relatively clean and good dataset. https://www.kaggle.com
- UCI has a machine learning repository, which is a popular place to find datasets. It even provides the type of task the dataset is for (classification, regression, or clustering) https://archive.ics.uci.edu/ml/datasets.php
- The Census can provide data about people and the economy. https://www.census.gov/data.html
- Covid-19 dataset https://github.com/owid/covid-19-data/tree/master/public/data/
Interview Prep
Interviewing for a data science position is quite different from interviewing for other positions and is going to take a good bit of prep work.
In general, there are going to be multiple rounds, which may involve a take-home assignment, a recruiter interview, a technical interview, and an onsite coding interview.
Below are some websites where you can practice SQL, Python, or other programming languages. I’ve also included some cheatsheets that may be handy while prepping or on the job.
As a side note, I’ve heard that the book “Cracking the Coding Interview” by Gayle Laakmann McDowell is really helpful. It contains nearly 200 programming questions and answers of the kind asked by the large tech companies.
Individual Code Practice (these are all free by the way)
- Leetcode has over 800 questions that you can work on, at difficulty levels ranging from easy to medium to hard. https://leetcode.com/problemset/all
- Codewars is kind of like a structured program in that you progress through different ranks as you complete challenges. https://www.codewars.com/
- HackerRank is a competitive kind of platform in which your solution will be scored on accuracy and you will be ranked against other users. https://www.hackerrank.com/
Live Code Practice
- I recommend that you try out Pramp and practice live mock interviews and coding problems with your peers. This makes it feel more like an actual interview environment, and it will be really good practice. Pramp is free to use. https://www.pramp.com/#/
- Alternatively, you can also practice mock interviews with someone at a company that you’re applying to. (Normally free, but there is currently a charge due to COVID-19.) https://interviewing.io/
- And of course, don’t forget to practice just coding a problem on a whiteboard. Grab a friend to pretend to be an interviewer!
Cheatsheets
To speed up your work efficiency, it’s always nice to have cheatsheets to refer back to. Here are some that I like.
SQL
- For the basics, see https://learnsql.com/blog/sql-basics-cheat-sheet/
- Once you’ve learned the basics, I would check out window functions. https://learnsql.com/blog/sql-window-functions-cheat-sheet/
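If window functions are new to you, here is a minimal sketch of what they do. It is not taken from the cheatsheet above: the `salaries` table, its columns, and the sample rows are all made up for illustration, and it assumes your Python ships with SQLite 3.25 or newer (the first version with window function support). Unlike `GROUP BY`, a window function keeps every row and adds a value computed over a "window" of related rows, here all rows in the same department.

```python
import sqlite3

# Hypothetical sample data: (employee, department, salary in thousands)
rows = [
    ("Ana", "Analytics", 120),
    ("Ben", "Analytics", 100),
    ("Cy", "Engineering", 150),
    ("Di", "Engineering", 130),
    ("Eve", "Engineering", 130),
]

# An in-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries (employee TEXT, department TEXT, salary INTEGER)"
)
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", rows)

# RANK() orders employees within each department by salary;
# AVG() OVER attaches the department average to every row.
query = """
SELECT employee,
       department,
       salary,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
       AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM salaries
ORDER BY department, dept_rank, employee
"""
ranked = conn.execute(query).fetchall()
conn.close()

for row in ranked:
    print(row)
```

Note that Di and Eve tie on salary, so `RANK()` gives them the same rank. Questions like "rank within a group" or "compare each row to its group average" come up often in SQL interviews, which is why this topic is worth practicing.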
R
- Check out RStudio for lots of great cheatsheets https://www.rstudio.com/resources/cheatsheets/
Python
- I’m a fan of the cheatsheets that Datacamp makes. See cheatsheets for Numpy, Pandas, Seaborn, Scipy, and more. https://www.datacamp.com/community/data-science-cheatsheets
On the Job Hunt
If you’re ready to start applying and interviewing, you can start out by going to the popular sites such as Indeed, ZipRecruiter, and Google, but also check out these resources here.
- Get referrals to top tech companies https://repher.me/
- Find startup jobs here https://angel.co/
- Easily apply to hundreds of companies using LinkedIn Jobs https://www.linkedin.com/jobs/linkedin-jobs/
- And don’t forget, your best bet is to ask your friends and family for referrals!
Lastly, I’ve heard of this new website called Blind in which verified employees can ask and answer questions anonymously. You’ll be able to do your research on the culture and perhaps learn about salary as well. https://www.teamblind.com/
One tip I’ve been given is to start practicing first with companies that you’re not as interested in, and save the companies that you are really excited about for after you’ve had practice.
Post Interview
Now that you’ve gotten an offer, are you prepared to negotiate salary?
Compensation Information
How much is the company paying for this kind of position? Do your research so that you know what the market is paying.
I usually start off with Glassdoor.com to see if I can find salary information. A couple of other websites that offer estimations include https://www.salarylist.com/ and https://www.payscale.com/
Here are a few other websites that I think are good as well:
- Find data straight from the Bureau of Labor Statistics https://www.bls.gov/ooh/
- Compensation data https://www.levels.fyi/comp.html?track=Data%20Scientist#
- Find actual salary info from H-1B hires https://h1bdata.info/
Negotiation
It always pays to negotiate! Just do it. It really doesn’t hurt to ask.
- Check out this excellent article on negotiation for women http://womenforhire.com/negotiating_salary_benefits/negotiating_salary_101_tactics_for_better_compensation/
Professional Development
Learning is a continuous journey. To become better at creating visualizations, to gain new knowledge, or just learn tips and tricks, check out some of the content below.
Make Great Visualizations
- I really enjoy the visualizations from the Wall Street Journal’s Graphics Team. Check them out for inspiration for your own visuals. https://graphics.wsj.com/
- There are tons of creative dashboards from Tableau Public. Check out https://public.tableau.com/en-us/gallery/?tab=viz-of-the-day&type=viz-of-the-day
- This article here gives some really nice tips on how to create better visualizations with your data. https://depictdatastudio.com/how-to-create-a-data-visualization-style-guide-to-tell-great-stories/
- Here’s an iconic talk about storytelling with visuals from Hans Rosling. https://www.ted.com/talks/hans_rosling_the_best_stats_you_ve_ever_seen#t-297730
- Color is an important piece of visualizations, one that people don’t always pay attention to. Colorbrewer is a tool that gives advice on selecting good color schemes for your graphics. https://colorbrewer2.org/
Learn Tips & Tricks/Other Cool Stuff
- A weekly Python podcast with interviews, coding tips, and conversation with guests from the Python community https://realpython.com/podcasts/rpp/
- A collection of blogs from bloggers who use the R software https://www.r-bloggers.com/
- Generate dummy JSON data https://www.mockaroo.com/
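If you just need a handful of fake records and want to stay in Python, here is a minimal sketch using only the standard library. To be clear, this is not Mockaroo's API; the function name `make_dummy_records`, the field names, and the value ranges are all hypothetical and only meant to show the idea of generating dummy JSON for testing.

```python
import json
import random


def make_dummy_records(n, seed=0):
    """Return n fake employee records as a list of dicts.

    A fixed seed makes the output reproducible, which is handy in tests.
    """
    rng = random.Random(seed)
    # Made-up value pools, purely for illustration
    first_names = ["Ana", "Ben", "Cy", "Di", "Eve"]
    departments = ["Analytics", "Engineering", "Marketing"]

    records = []
    for i in range(1, n + 1):
        records.append(
            {
                "id": i,
                "name": rng.choice(first_names),
                "department": rng.choice(departments),
                # Salary between 60,000 and 150,000 in steps of 5,000
                "salary": rng.randrange(60000, 150001, 5000),
                "remote": rng.random() < 0.5,
            }
        )
    return records


dummy = make_dummy_records(5)
dummy_json = json.dumps(dummy, indent=2)
print(dummy_json)
```

For anything beyond a quick test, a dedicated tool like the one linked above will give you more realistic names, addresses, and other field types.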