A Full-Length Machine Learning Course in Python for Free

One of the most popular machine learning courses is Andrew Ng's machine learning course on Coursera, offered by Stanford University. I tried a few other machine learning courses before, but I think he is the best at breaking the concepts into pieces and making them very understandable.

But there is just one problem: all the assignments and instructions are in Matlab. I am a Python user and did not want to learn Matlab. So, I learned the concepts from the lectures and developed all the algorithms in Python.

I explained all the algorithms in my own way (as simply as I could) and demonstrated the development of almost all of them in separate articles before. I thought I should summarize them all on one page so that anyone who wants to follow along can do so more easily. Sometimes a little help goes a long way.

If you want to take Andrew Ng’s Machine Learning course, you can audit the complete course for free as many times as you want.

Let’s dive in!

Linear Regression

This is the most basic machine learning algorithm. It is based on the straight-line formula we all learned in school:

Y = AX + B

Remember? If not, no problem; it is a very simple formula. Here is the complete article that explains how this formula can be used to make predictions.
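To give a taste of the idea, here is a minimal sketch (not the article's exact code) of fitting y = ax + b to toy data with gradient descent; the learning rate and iteration count are arbitrary choices:

```python
import numpy as np

# Toy data: y is roughly 2x + 1 plus noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 1, size=x.shape)

a, b = 0.0, 0.0   # initial slope and intercept
alpha = 0.01      # learning rate (arbitrary choice)
m = len(x)

for _ in range(2000):
    error = (a * x + b) - y
    # Gradient of the mean squared error cost with respect to a and b
    a -= alpha * (error @ x) / m
    b -= alpha * error.sum() / m

print(f"fitted line: y = {a:.2f}x + {b:.2f}")
```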

The article above works only on datasets with a single variable. But in real life, most datasets have multiple variables. Using the same simple formula, you can develop the algorithm with multiple variables:
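The extension is mostly a matter of treating the parameters as a vector. Below is a rough vectorized sketch on synthetic data (my own illustration, not the linked article's code):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))              # 100 samples, 3 features
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 3.0 + rng.normal(0, 0.1, size=100)

Xb = np.hstack([np.ones((len(X), 1)), X])  # column of 1s for the intercept
w = np.zeros(Xb.shape[1])
alpha, m = 0.05, len(y)

for _ in range(3000):
    grad = Xb.T @ (Xb @ w - y) / m         # vectorized gradient of the cost
    w -= alpha * grad

print("intercept and weights:", np.round(w, 2))
```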

Polynomial Regression

This one is also a sibling of linear regression. But polynomial regression is able to find the relationship between the input variables and the output variable more precisely, even when the relationship between them is not linear:
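The trick is that you keep the same linear machinery and only add powers of the input as extra features. A minimal sketch, assuming a quadratic relationship in toy data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 60)
y = 0.5 * x**2 - x + 2 + rng.normal(0, 0.3, size=x.shape)  # quadratic data

# Add powers of x as extra features; the model stays linear in the weights
X = np.column_stack([np.ones_like(x), x, x**2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # closed-form least-squares fit

print("fitted coefficients (b, a1, a2):", np.round(w, 2))
```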

Logistic Regression

Logistic regression is built on linear regression and uses the same simple straight-line formula. It is a widely used, powerful, and popular machine learning algorithm for predicting a categorical variable. The following article explains the development of logistic regression step by step for binary classification:
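For a flavor of the approach, here is a hedged sketch of binary logistic regression with gradient descent on synthetic data; the sigmoid squashes the straight-line output into a probability (the learning rate and data here are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # toy labels from a linear rule

Xb = np.hstack([np.ones((len(X), 1)), X])   # intercept column
w = np.zeros(Xb.shape[1])
alpha, m = 0.1, len(y)

for _ in range(1000):
    p = sigmoid(Xb @ w)                     # predicted probabilities
    w -= alpha * Xb.T @ (p - y) / m         # gradient of the cross-entropy cost

preds = (sigmoid(Xb @ w) >= 0.5).astype(float)
print("training accuracy:", (preds == y).mean())
```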

Based on the concept of binary classification, it is possible to develop a logistic regression for multiclass classification. At the same time, Python has some optimization functions that help do the calculation a lot faster. In the following article, I worked on both methods to perform a multiclass classification task on a digit recognition dataset:
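The one-vs-all idea is simple: train one binary classifier per class and pick the class with the highest probability. A rough sketch of that logic (the synthetic dataset and class count are placeholders, not the digit data from the article):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_one_vs_all(Xb, y, num_classes, alpha=0.1, iters=1000):
    """Train one binary logistic classifier per class."""
    m, n = Xb.shape
    all_w = np.zeros((num_classes, n))
    for c in range(num_classes):
        target = (y == c).astype(float)     # 1 for class c, 0 otherwise
        w = np.zeros(n)
        for _ in range(iters):
            w -= alpha * Xb.T @ (sigmoid(Xb @ w) - target) / m
        all_w[c] = w
    return all_w

def predict(Xb, all_w):
    # Pick the class whose classifier gives the highest probability
    return np.argmax(sigmoid(Xb @ all_w.T), axis=1)

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 2))
y = np.argmax(X @ rng.normal(size=(2, 3)), axis=1)  # 3 synthetic classes
Xb = np.hstack([np.ones((len(X), 1)), X])
weights = train_one_vs_all(Xb, y, num_classes=3)
print("training accuracy:", (predict(Xb, weights) == y).mean())
```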

Neural Network

Neural networks have been getting more and more popular nowadays. If you are reading this article, I guess you have heard of neural networks.

A neural network works much faster and more efficiently on more complex datasets. It also involves the same straight-line formula, but the development of the algorithm is a bit more complicated than the previous ones. If you have taken Andrew Ng's course, you probably know the concepts already. Otherwise, I tried to break them down as much as I could. Hopefully, it is helpful:
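As a small illustration, here is a toy two-layer network trained with backpropagation on the XOR problem, which no single straight line can solve. This is my own minimal sketch, not the article's code; the layer size, learning rate, and iteration count are arbitrary, and a different random seed may need more iterations:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# XOR: a problem no single straight line can separate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(5)
W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))
alpha = 0.5

for _ in range(5000):
    # Forward pass: each layer is still "weights times input plus bias"
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: backpropagate the squared-error gradient
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= alpha * h.T @ d_out
    b2 -= alpha * d_out.sum(axis=0, keepdims=True)
    W1 -= alpha * X.T @ d_h
    b1 -= alpha * d_h.sum(axis=0, keepdims=True)

print("predictions:", out.ravel().round(2))  # should approach 0, 1, 1, 0
```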

Learning Curve

What if you spend all that time developing an algorithm and then it does not work the way you wanted? How do you fix it? You first need to figure out where the problem is. Is your algorithm faulty, do you need more data to train the model, or do you need more features? So many questions, right? But if you do not figure out the problem first and just keep moving in some direction, you may waste a lot of time unnecessarily. Here is how you can find the problem:
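One common diagnostic is a learning curve: train on progressively larger subsets and compare the training error with the validation error. If both errors are high and close together, the model has high bias and more data will not help; if there is a large gap between them, it has high variance and more data may help. A minimal sketch on synthetic data, using an intentionally underfitting straight-line model:

```python
import numpy as np

def fit(Xb, y):
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def mse(w, Xb, y):
    return ((Xb @ w - y) ** 2).mean() / 2

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=(120, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.5, size=120)   # curved data

Xb = np.hstack([np.ones((120, 1)), X])   # straight-line model: will underfit
X_train, y_train = Xb[:80], y[:80]
X_val, y_val = Xb[80:], y[80:]

for m in range(10, 81, 10):
    w = fit(X_train[:m], y_train[:m])
    print(f"m={m:3d}  train error={mse(w, X_train[:m], y_train[:m]):.3f}  "
          f"val error={mse(w, X_val, y_val):.3f}")
```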

On the other hand, a heavily skewed dataset is another type of challenge. For example, suppose you are working on a classification problem where 95% of the cases are positive and only 5% are negative. In that case, if you just label every output as positive, you are 95% correct. So if a machine learning algorithm turns out to be 90% accurate, it is still not useful, right? Because without any machine learning algorithm, you can already predict with 95% accuracy. Here are some ideas for dealing with these types of situations:
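This is why metrics like precision, recall, and F1 score matter more than accuracy on skewed data. A small sketch computing them by hand; the 95/5 split below mirrors the example above, and the label flip treats the rare negative class as the class of interest:

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Metrics that expose what plain accuracy hides on skewed data."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 95% positive, 5% negative: predicting "positive" for everything
y_true = np.array([1] * 95 + [0] * 5)
y_pred = np.ones(100, dtype=int)

print("accuracy:", (y_pred == y_true).mean())          # 0.95, looks great
# Flip the labels so the rare negative class is the class of interest
print("rare-class precision/recall/F1:",
      precision_recall_f1(1 - y_true, 1 - y_pred))     # all zero
```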

K-Means Clustering

This is one of the oldest and most popular unsupervised learning algorithms. It does not make predictions like the previous algorithms; instead, it makes clusters based on the similarities among the data. It is more about understanding the current data effectively. Then, whenever the algorithm sees new data, it decides which cluster the data belongs to based on its characteristics. This algorithm has other uses as well: it can be used for the dimensionality reduction of images.

Why do we need dimensionality reduction of an image?

Think about when we need to feed a lot of images to an algorithm to train an image classification model. Very high-resolution images can be too heavy, and the training process can be too slow. In that case, a lower-dimensional picture will do the job in less time. This is just one example; you can probably imagine that there are many more uses along the same lines.

This article is a complete tutorial on how to develop a k-means clustering algorithm and how to use that algorithm for dimensionality reduction of an image:
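For orientation, here is a bare-bones sketch of the two pieces: a plain k-means loop, and using the learned centroids to reduce an image to k colors. The random "image" and the value of k are placeholders; the article works through a real image:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign points to the nearest centroid, then recompute."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every centroid
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    return centroids, labels

# Reduce a (height, width, 3) RGB image to k colors
rng = np.random.default_rng(7)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(float)  # stand-in image
pixels = image.reshape(-1, 3)
centroids, labels = kmeans(pixels, k=8)
compressed = centroids[labels].reshape(image.shape)  # each pixel -> its cluster color
```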

Anomaly Detection

Another core machine learning task, used in credit card fraud detection, detecting faulty manufacturing, and even rare disease or cancer cell detection. It can be done using the Gaussian distribution (or normal distribution) method, or even more simply, a probability formula. Here is a complete step-by-step guide to developing an anomaly detection algorithm using Gaussian distribution concepts:
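The core idea fits in a few lines: fit a Gaussian to each feature of the (mostly normal) data, compute the probability of each example, and flag anything whose probability falls below a threshold. A minimal sketch on synthetic data; in practice, the threshold epsilon is tuned on a labeled validation set:

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(loc=5.0, scale=1.0, size=(500, 2))   # mostly normal data
X = np.vstack([X, [[12.0, 12.0]]])                  # one obvious anomaly

# Fit an independent Gaussian to each feature
mu = X.mean(axis=0)
var = X.var(axis=0)

def gaussian_p(x, mu, var):
    """Product of per-feature Gaussian densities."""
    coef = 1 / np.sqrt(2 * np.pi * var)
    return np.prod(coef * np.exp(-((x - mu) ** 2) / (2 * var)), axis=1)

p = gaussian_p(X, mu, var)
epsilon = 1e-4   # threshold; normally tuned on a labeled validation set
print("flagged anomalies:", np.where(p < epsilon)[0])
```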

If you need a refresher on the Gaussian distribution method, please check this one:

Recommender System

Recommender systems are everywhere. If you buy something on Amazon, it recommends more products you may like; YouTube recommends videos you may like; Facebook recommends people you may know. We see them everywhere.

Andrew Ng’s course teaches how to develop a recommender system using the same formula we used in linear regression. Here is the step-by-step process of developing a movie recommendation algorithm:
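The collaborative filtering idea from the course is to learn both movie features and user preferences at the same time, by gradient descent on the rating error, counting only the ratings that actually exist. A toy sketch with random data standing in for real ratings:

```python
import numpy as np

rng = np.random.default_rng(9)
num_movies, num_users, num_features = 6, 5, 3

# Ratings matrix (0 means "not rated"); real systems use actual user data
R = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)
rated = R > 0

# Learn movie features X and user preferences Theta so X @ Theta.T fits R
X = rng.normal(scale=0.1, size=(num_movies, num_features))
Theta = rng.normal(scale=0.1, size=(num_users, num_features))
alpha, lam = 0.01, 0.1   # learning rate and regularization (arbitrary)

for _ in range(2000):
    error = (X @ Theta.T - R) * rated    # count only ratings that exist
    X_grad = error @ Theta + lam * X
    Theta_grad = error.T @ X + lam * Theta
    X -= alpha * X_grad
    Theta -= alpha * Theta_grad

predictions = X @ Theta.T                # includes predicted unrated cells
```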

Conclusion

Hopefully, this article will help some people start with machine learning. The best way to learn is by doing. If you notice, most of the algorithms are based on a very simple basic formula. There is a notion that machine learning or artificial intelligence requires very heavy programming knowledge and very difficult math. That's not always true. With simple code and basic math and statistics knowledge, you can go a long way. At the same time, keep improving your programming skills to do more complex tasks.

If you are interested in machine learning, just take some time and start working on it.

Feel free to follow me on Twitter and like my Facebook page.
