Logistic Regression Algorithm from Scratch

Opeyemi Osakuade
3 min read · Apr 22, 2021


Hello people🖐

I was working on a project earlier this week that required a basic understanding of Logistic Regression, and it reminded me of the Algorithm Challenge by Data Science Nigeria in April 2020. I came in 3rd place and got some goodies 😉

The competition remains one of the things I’m most proud of in my Data Science career; it gave me the ginger to launch out 🚀 (I’m still launching oo 😉).

Let me not bore you with too much story. Let’s get down to business (it’s been a year now) 👇

The challenge: Design and develop the Logistic Regression algorithm from scratch using Python

Logistic regression is a classification algorithm used to model the probability of a certain class or event. It squashes the model’s raw output into a probability value (i.e. a number between 0 and 1) using the logistic sigmoid function.

For a binary classifier, we want the output to be a value between 0 and 1, i.e. 0 ≤ h_θ(x) ≤ 1.

Hypothesis Representation

The term “logistic regression” comes from the logit function, which is the log of the odds. The odds of an event are the ratio of the probability that it occurs to the probability that it does not. Setting the log odds equal to a linear function of the inputs and solving for the probability gives the equation of the sigmoid function.

(Figure: sigmoid function derivation)
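To sketch that derivation in one line: if the log odds are linear in the inputs, log(p / (1 − p)) = θᵀx, then solving for p gives p = 1 / (1 + e^(−θᵀx)), which is exactly the sigmoid. Here is a minimal NumPy version; the names `sigmoid` and `hypothesis`, and the convention that X carries a leading column of ones for the bias term, are my own illustrative choices, not necessarily what’s in the original notebook:

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = sigmoid(theta^T x) for every row of X.

    X is assumed to carry a leading column of ones for the bias term.
    """
    return sigmoid(X @ theta)
```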

Cost function

Because the logistic regression function (the sigmoid) is non-linear, plugging it into the usual squared-error cost would produce a non-convex surface with many local minima. To get a convex function, i.e. a bowl-shaped function that lets gradient descent converge to the optimal minimum point, a dedicated logistic regression cost function is derived.

(Illustration of a convex function: https://www.math24.net/convex-functions)

But wait, what is a cost function in the first place? I agree with how Swayam’s Blog explained it:

A cost function is basically a continuous and differentiable function that tells how well an algorithm is performing by returning the amount of error as output. The smaller the error, the better the algorithm is doing; that’s why we randomly generate the parameters and then keep changing them in order to reach the minimum of that cost function.
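For logistic regression, that cost is the binary cross-entropy (log loss): J(θ) = −(1/m) Σ [y log h_θ(x) + (1 − y) log(1 − h_θ(x))]. A small sketch building on the helpers above; the tiny epsilon is my own guard to keep log() away from zero, not part of the textbook formula:

```python
def cost(theta, X, y):
    """Binary cross-entropy cost J(theta), averaged over m examples."""
    m = len(y)
    h = hypothesis(theta, X)
    eps = 1e-15  # numerical guard so log() never sees exactly 0
    return -(1.0 / m) * np.sum(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))
```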

Gradient descent

To choose weight values that fit the data well, we need to reach the global minimum of the convex cost function:

(Illustration of local vs. global extrema: https://en.wikipedia.org/wiki/File:Extrema_example_original.svg)

That is, we want the prediction h_θ(x) to end up as close as possible to the actual y, so we minimize the cost function using gradient descent.

Repeat until convergence, updating all weights simultaneously: θ_j := θ_j − α · ∂J(θ)/∂θ_j, where α is the learning rate.
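Putting it together: for the cross-entropy cost, the gradient works out to ∂J/∂θ = (1/m) Xᵀ(h_θ(x) − y), so the loop below is one way to implement the update rule. The learning rate, iteration count, zero initialization, and the toy data are illustrative choices of mine, not the competition’s settings:

```python
def gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Batch gradient descent: theta := theta - lr * dJ/dtheta, repeated."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        h = hypothesis(theta, X)
        grad = (1.0 / m) * (X.T @ (h - y))  # gradient of the cross-entropy cost
        theta = theta - lr * grad           # simultaneous update of all weights
    return theta

# Toy usage: one feature plus a bias column of ones
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.5]])
y = np.array([0, 0, 1, 1])
theta = gradient_descent(X, y, lr=0.5, n_iters=5000)
preds = (hypothesis(theta, X) >= 0.5).astype(int)  # threshold at 0.5
```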

Find the dataset and notebook on my GitHub here.

Link directly to the notebook

References

https://twitter.com/DataScienceNIG/status/1254718078841245696?s=20

https://www.coursera.org/learn/logistic-regression-numpy-python/home/welcome

Originally published at https://github.com.
