Image source: Author

*“Logistic regression is not a regression but a classification algorithm”.*

You may have seen this in the latest popular machine learning textbooks and blogs, or you may have heard *Data Science Gurus* say the same thing on their highly subscribed YouTube channels.

Machine learning has usurped and appropriated many statistical techniques, often to the extent that its practitioners now disbelieve and reject their statistical origin. A case in point is “logistic regression is not regression”.

However, nothing could be further from the truth than this claim. We live in a meme culture; memes have even become cryptocurrencies.

So why don’t we use a meme to drive home that logistic regression is indeed regression?

This meme would have made things clearer to most of you. However, I want to unpack it for clarity.

*Logistic regression models the continuous outcome (probabilities). The probabilities range between 0 and 1.*

Binary logistic regression is used as a classification algorithm when we want the response variable to be binary (Churned / Not Churned, Pass / Fail, Spam / Not Spam, etc.).

In general, we turn logistic regression into a classification algorithm by setting an appropriate probability threshold, or cutoff (0.4, 0.5, 0.6, etc.).
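To make the distinction concrete, here is a minimal sketch of the two separate steps: the model produces a continuous probability, and the threshold is what turns it into a class. The intercept and coefficient are hypothetical values chosen purely for illustration; they are not fitted to any data.

```python
import math


def predicted_probability(intercept, coefficient, x):
    """Logistic regression output: a probability strictly between 0 and 1."""
    linear_predictor = intercept + coefficient * x
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classify(probability, threshold=0.5):
    """Turn the continuous probability into a 0/1 label using a chosen threshold."""
    return 1 if probability >= threshold else 0


# Hypothetical coefficients, for illustration only (not estimated from data).
intercept, coefficient = -1.0, 0.8

for x in (0.0, 1.0, 2.0, 3.0):
    p = predicted_probability(intercept, coefficient, x)
    print(f"x={x:.1f}  probability={p:.3f}  "
          f"label@0.5={classify(p)}  label@0.6={classify(p, 0.6)}")
```

The regression part ends at `predicted_probability`; `classify` is an extra decision layered on top, and changing the threshold changes the labels without changing the model.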

**Classification problem using a threshold**

Setting the probability threshold is purely a business call, not a statistical one. The threshold may vary by domain.

Frank Harrell aptly puts it in his blog¹: “classification is a forced choice.”

Now consider this example, with 0.5 as the threshold. Suppose the ML algorithm gives the probability of default (1 = default, 0 = no default) for 4 clients as 0.51, 0.49, 0.23, and 0.92. Based on the threshold, 2 clients are classified as default and 2 as no default. However, ask yourself: aren't the customers with probabilities of 0.51 and 0.49 too close to call? 0.51 is certainly closer to 0.49 (classified as no default) than to 0.92 (classified as default).
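The four clients above can be run through the threshold in a few lines. This is only a sketch: the 0.05 "close call" margin is my own assumption to flag borderline cases, not a standard value.

```python
THRESHOLD = 0.5
CLOSE_CALL_MARGIN = 0.05  # assumed margin for flagging borderline cases

probabilities = [0.51, 0.49, 0.23, 0.92]

# Forced choice: every client becomes a 1 (default) or a 0 (no default).
labels = [1 if p >= THRESHOLD else 0 for p in probabilities]

# Distance from the threshold shows how "forced" each choice was.
margins = [abs(p - THRESHOLD) for p in probabilities]
close_calls = [p for p, m in zip(probabilities, margins) if m < CLOSE_CALL_MARGIN]

for p, label, margin in zip(probabilities, labels, margins):
    note = "  <- too close to call" if margin < CLOSE_CALL_MARGIN else ""
    print(f"p={p:.2f}  label={label}{note}")
```

The labels alone hide the fact that 0.51 and 0.49 sit almost on top of each other while landing in opposite classes.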

Some of the popular machine learning packages and low-code tools do not explicitly expose the predicted probabilities to the user. The user is thus unaware of what predicted probabilities were produced and simply gets a decision: default or no default (1 or 0). In the 0.49 and 0.51 cases, the user is happily told that the person will not default and will default, respectively. But a peek at the predicted probabilities reveals that it was too close to call!

Another problem with thresholds is that an improper scoring rule, such as classification accuracy, can easily be gamed. For example, suppose that out of a hundred people, 95 default on the loan and five do not. If the classifier labels everyone as a defaulter, its accuracy would be 95%!
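The 95% figure is easy to reproduce. The sketch below builds the hundred borrowers from the example and scores a "classifier" that labels everyone as a defaulter; the variable names are mine.

```python
# 95 borrowers default (1) and 5 do not (0), as in the example above.
actual = [1] * 95 + [0] * 5

# A "classifier" that ignores its inputs and predicts default for everyone.
predicted = [1] * len(actual)

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# How many of the non-defaulters did it identify correctly?
non_defaulters_found = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

print(f"Accuracy: {accuracy:.0%}")
print(f"Non-defaulters correctly identified: {non_defaulters_found} of 5")
```

The accuracy looks excellent even though the classifier never identifies a single non-defaulter.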

**So, is there a better way to use logistic regression?**

The answer is yes.

Industries such as finance and marketing make more appropriate use of logistic regression in credit risk modeling and targeting of marketing campaigns.

**Actual use case**

Say you are responsible for selling and marketing a product in your organization. You are going to launch a marketing campaign to increase sales of that product, and you have been given a fixed budget for it. You want the best possible return on investment, that is, to spend the fixed budget (or even less) and get the highest possible sales. Here is what you have.

You have data on 10,000 customers who either bought or did not buy a similar product in the past.

You would like to understand which customers should be targeted to increase the likelihood of buying this time.

Of course, you want to target the people who are more likely to buy your product, because you have a fixed budget for your campaign. How do you proceed?

**What is decile analysis?**

Decile analysis was once a commonly used technique, but because machine learning is now taught by bucketing every problem as either classification or regression, people have forgotten about decile-type analyses. I’m pretty sure most newly minted data scientists wouldn’t even have heard of decile analysis. So let’s revisit it.

Decile analysis is used to rank a data set from highest to lowest value, or vice versa, based on predicted probabilities.

As the name implies, the analysis involves dividing the data into ten equal groups. Each group should have the same number of observations / customers.

It ranks customers from most likely to respond to least likely to respond.

**How to perform decile analysis?**

**Step 1:** Build a logistic regression model. In this case, the dependent variable is whether the product was purchased, where 1 indicates purchased and 0 indicates not purchased. Relevant independent variables are also selected.

**Step 2:** Obtain the predicted probabilities from the logistic regression model. Arrange the probabilities in descending order.

**Step 3:** Divide the entire data set into 10 groups, each with the same number of observations. So, if there are 10,000 records, each group would have 1,000 records / customers.

**Step 4:** Calculate the percentage of respondents for each decile

**Step 5:** Calculate the response rate for each decile

**Step 6:** Calculate the lift for each decile

Decile 1 contains the customers most likely to respond, Decile 2 contains the next most likely, and so on.
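Steps 2 to 6 can be sketched in plain Python as follows. Step 1 (fitting the model) is deliberately replaced here by synthetic probabilities and outcomes generated with a fixed random seed, so the numbers it prints are made up and will not match the table discussed in this article. The function name `decile_table` and its column names are my own.

```python
import random


def decile_table(probabilities, outcomes, n_groups=10):
    """Build a decile table from predicted probabilities and observed 0/1 outcomes."""
    # Step 2: rank customers by predicted probability, highest first.
    ranked = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0], reverse=True)

    # Step 3: equal-sized groups (assumes the record count divides evenly).
    size = len(ranked) // n_groups
    total_respondents = sum(outcomes)

    rows, cumulative = [], 0
    for i in range(n_groups):
        group = ranked[i * size:(i + 1) * size]
        respondents = sum(outcome for _, outcome in group)
        cumulative += respondents
        rows.append({
            "decile": i + 1,
            "customers": len(group),
            "respondents": respondents,
            # Step 4: share of all respondents that fall in this decile.
            "pct_respondents": respondents / total_respondents,
            # Step 5: share of this decile's customers who responded.
            "response_rate": respondents / len(group),
            # Step 6: cumulative % of respondents / cumulative % of customers.
            "lift": (cumulative / total_respondents) / ((i + 1) / n_groups),
        })
    return rows


# Synthetic stand-in for Step 1: made-up probabilities and simulated outcomes.
random.seed(0)
probabilities = [random.random() * 0.3 for _ in range(10_000)]
outcomes = [1 if random.random() < p else 0 for p in probabilities]

rows = decile_table(probabilities, outcomes)
for row in rows:
    print(f"Decile {row['decile']:>2}: respondents={row['respondents']:>4}  "
          f"pct={row['pct_respondents']:.1%}  rate={row['response_rate']:.1%}  "
          f"lift={row['lift']:.2f}")
```

With a real model, the two synthetic lists would be replaced by the model's predicted probabilities and the observed purchase flags for the same customers.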

One significant advantage of decile analysis is that probabilities and probability ranges are their own error measures. In other words, if the probability range of the top decile is 0.75 to 0.81, then the probability that a person in that decile will not buy the product is 1 − [0.75, 0.81], that is, between 0.19 and 0.25. (Here ‘[]’ denotes the range of values from 0.75 to 0.81, inclusive of both endpoints.)

**What does the decile analysis output look like?**

The table below describes a typical decile analysis result.

Let’s unpack the result.

As noted earlier, each decile has the same number of customers (1000 in each decile).

**% of respondents in each decile** = number of respondents in that decile / total number of respondents in all 10 deciles

See Table 1:

Here, % of respondents in Decile 1 = 224/984 = 22.8%

984 is the total number of respondents across all ten deciles.

Correspondingly, % of respondents in Decile 2 = 162/984 = 16.5%

And the cumulative % of respondents for the top 2 deciles = 39.2%
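The arithmetic above can be checked with a short sketch. It uses only the counts quoted in the text (224 and 162 respondents in the first two deciles, 984 overall); the remaining deciles are left out because their counts are not given here.

```python
def pct_of_respondents(decile_respondents, total_respondents):
    """Share of all respondents that fall in a given decile (or set of deciles)."""
    return decile_respondents / total_respondents


TOTAL_RESPONDENTS = 984
respondents_by_decile = {1: 224, 2: 162}  # only the deciles quoted in the text

cumulative = 0
for decile, respondents in respondents_by_decile.items():
    cumulative += respondents
    pct = pct_of_respondents(respondents, TOTAL_RESPONDENTS)
    cumulative_pct = pct_of_respondents(cumulative, TOTAL_RESPONDENTS)
    print(f"Decile {decile}: {pct:.1%} of respondents, cumulative {cumulative_pct:.1%}")
```

The cumulative column is what the gain chart plots against the share of customers targeted.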

**Gain and gain chart**

In Table 1, Decile 1 contains the 10% of customers who are most likely to buy. Decile 1 has the highest number of respondents (224).

So, of all respondents, 22.8% are in Decile 1. In other words, the top 10% of customers account for 22.8% of the responses.

Similarly, the top 20% of the customer base (Deciles 1 and 2) accounts for 39.2% of the responses.

The gain chart below explains this better.

Image source: Author

**Takeaway from the gain chart:** The gain chart can be used to estimate how many customers will respond in each decile. So instead of targeting customers from the lower deciles, we can target only customers from the top deciles.

The dashed line is the baseline. The baseline tells you how many customers would respond if we randomly target customers without a model.

**Response rate**

The response rate tells you what percentage of customers have responded in each decile. The response rate is highest in Decile 1, followed by Decile 2, and so on.

**Response rate for each decile** = Number of respondents in that decile / number of customers in that decile

See Table 1:

Here, the response rate for Decile 1 = 224/1000 = 22.4%
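As a quick check, the sketch below computes the Decile 1 response rate and, for comparison, the overall rate implied by 984 respondents across ten deciles of 1,000 customers each. The variable names are mine.

```python
CUSTOMERS_PER_DECILE = 1000
N_DECILES = 10
TOTAL_RESPONDENTS = 984
DECILE_1_RESPONDENTS = 224

# Response rate: respondents in the decile / customers in the decile.
decile_1_rate = DECILE_1_RESPONDENTS / CUSTOMERS_PER_DECILE

# Overall rate: all respondents / all customers, the yardstick for each decile.
average_rate = TOTAL_RESPONDENTS / (CUSTOMERS_PER_DECILE * N_DECILES)

print(f"Decile 1 response rate: {decile_1_rate:.1%}")
print(f"Average response rate:  {average_rate:.1%}")
```

Note the different denominators: the response rate divides by customers in the decile, whereas the % of respondents divides by all respondents.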

**Takeaway from the response rate comparison:** The response rate for each decile is shown in the diagram below. The average response rate across all deciles is 9.8%. Customers in Decile 1 through Decile 4 are above the average response rate and should be targeted in the campaign.

Image source: Author

**Lift and lift chart**

**Lift** = cumulative % of respondents / cumulative % of customers up to that decile

See Table 1:

Lift for Decile 1 = 22.8% / 10% = 2.28

Lift for Decile 2 = 39.2% / 20% = 1.96

**How to interpret:** If we target the top two deciles, we target 20% of our customers. Those same deciles contain a cumulative 39.2% of respondents. Therefore, the lift is 1.96.

A lift of 1 means there is no gain compared to targeting customers at random. A lift greater than 1 means the model-based approach is better than choosing customers at random.
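The lift figures can be reproduced from the raw counts. This is a minimal sketch; the helper `lift` and its parameter names are my own, and it uses only the counts quoted in this article.

```python
def lift(cumulative_respondents, total_respondents, deciles_targeted, n_deciles=10):
    """Cumulative % of respondents divided by cumulative % of customers targeted."""
    cumulative_pct_respondents = cumulative_respondents / total_respondents
    cumulative_pct_customers = deciles_targeted / n_deciles
    return cumulative_pct_respondents / cumulative_pct_customers


print(f"Lift at Decile 1:  {lift(224, 984, 1):.2f}")
print(f"Lift at Decile 2:  {lift(224 + 162, 984, 2):.2f}")
# Targeting everyone captures every respondent, so the lift falls back to 1.
print(f"Lift at Decile 10: {lift(984, 984, 10):.2f}")
```

The last line shows why the lift curve always ends at 1: once all customers are targeted, the model offers no advantage over random selection.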

**Takeaway from the lift chart:** It can be used to identify the deciles with higher lift.

Image source: Author

**Using Decile Analysis in Business Decisions:**

Now that we’ve built the decile analysis, the next pertinent question is: how do we use it to make effective business decisions?

**Review the Decile analysis table again**

Image source: Author

Based on the above results, we decide to target customers in the top 4 deciles because they are more likely to buy the product.

From a business perspective, the ROI of targeting the top 4 deciles is higher. As we move down the deciles, the return on investment decreases rapidly, and it is not profitable to reach those customers.

So this sums up this decile analysis article. Your comments are welcome.

References:

.