Confusion Matrix Explained Simply
A confusion matrix helps us understand where a classification model is right and where it is wrong.
It shows correct predictions, incorrect predictions and the types of mistakes the model makes.
What is a Confusion Matrix?
A confusion matrix is a table used to evaluate classification models. It compares the actual values with the model's predicted values.
Confusion matrix = actual results compared with predicted results
Example
If a model predicts whether a customer will purchase or not, the confusion matrix shows how many predictions were correct and how many were wrong.
Why Do We Need a Confusion Matrix?
Accuracy alone can hide important mistakes. A confusion matrix gives the full picture by breaking the results down into the different types of correct and incorrect predictions.
- Shows where the model is correct
- Shows where the model makes mistakes
- Helps calculate accuracy, precision and recall
- Helps understand business impact of errors
- Works for binary and multi-class classification
Key idea: Accuracy tells you how often the model is right; the confusion matrix tells you where it is wrong.
Basic Confusion Matrix Layout
In binary classification, the matrix usually has two actual classes and two predicted classes.
|             | Predicted: No       | Predicted: Yes      |
|-------------|---------------------|---------------------|
| Actual: No  | True Negative (TN)  | False Positive (FP) |
| Actual: Yes | False Negative (FN) | True Positive (TP)  |
Simple rule: Correct predictions are on the main diagonal.
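Scikit-learn's `confusion_matrix` follows this same layout for binary labels, so the four cells can be unpacked directly with `ravel()`. A minimal sketch with made-up labels:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 0 = No, 1 = Yes
y_actual    = [0, 0, 1, 1, 1, 0, 1, 0]
y_predicted = [0, 1, 1, 0, 1, 0, 1, 0]

# ravel() flattens the 2x2 matrix in row order: TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```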
Rows and Columns
A confusion matrix is read by comparing actual values with predicted values.
| Part               | Meaning             |
|--------------------|---------------------|
| Rows               | Actual classes      |
| Columns            | Predicted classes   |
| Diagonal cells     | Correct predictions |
| Off-diagonal cells | Wrong predictions   |
True Positive (TP)
True Positive means the model predicted the positive class correctly.
Customer Purchase Example
The model predicted that a customer would purchase, and the customer actually purchased.
TP = predicted Yes and actual Yes
True Negative (TN)
True Negative means the model predicted the negative class correctly.
Customer Purchase Example
The model predicted that a customer would not purchase, and the customer did not purchase.
TN = predicted No and actual No
False Positive (FP)
False Positive means the model predicted positive, but the actual result was negative.
Customer Purchase Example
The model predicted that a customer would purchase, but the customer did not purchase.
FP = predicted Yes but actual No
Business meaning: False positives may waste time, money or resources.
False Negative (FN)
False Negative means the model predicted negative, but the actual result was positive.
Customer Purchase Example
The model predicted that a customer would not purchase, but the customer actually purchased.
FN = predicted No but actual Yes
Business meaning: False negatives may mean missed opportunities or missed risks.
Example Confusion Matrix
Suppose a model predicts customer purchases for 100 customers.
|             | Predicted: No | Predicted: Yes |
|-------------|---------------|----------------|
| Actual: No  | 50            | 5              |
| Actual: Yes | 10            | 35             |
- True Negatives = 50
- False Positives = 5
- False Negatives = 10
- True Positives = 35
How to Interpret the Example
| Result             | Meaning                                          |
|--------------------|--------------------------------------------------|
| 50 True Negatives  | Correctly predicted 50 customers would not buy   |
| 35 True Positives  | Correctly predicted 35 customers would buy       |
| 5 False Positives  | Incorrectly targeted 5 customers who did not buy |
| 10 False Negatives | Missed 10 customers who actually bought          |
Confusion Matrix and Accuracy
Accuracy can be calculated from the confusion matrix.
Accuracy = (TP + TN) ÷ Total Predictions
Using the Example
Correct predictions = 35 + 50 = 85
Total predictions = 100
Accuracy = 85%
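As a quick check, here is the same calculation in Python, using the counts from the example above:

```python
tp, tn, fp, fn = 35, 50, 5, 10

# Accuracy = correct predictions / all predictions
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.85
```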
Confusion Matrix and Precision
Precision tells us how many positive predictions were actually correct.
Precision = TP ÷ (TP + FP)
Using the Example
Precision = 35 ÷ (35 + 5) = 35 ÷ 40 = 87.5%
Simple meaning: When the model predicts “Yes”, how often is it correct?
Confusion Matrix and Recall
Recall tells us how many actual positive cases the model found.
Recall = TP ÷ (TP + FN)
Using the Example
Recall = 35 ÷ (35 + 10) = 35 ÷ 45 = 77.8%
Simple meaning: Out of all real “Yes” cases, how many did the model find?
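Both formulas translate directly into a few lines of Python. Continuing with the same example counts:

```python
tp, fp, fn = 35, 5, 10

# Precision: of all "Yes" predictions, how many were right?
precision = tp / (tp + fp)

# Recall: of all actual "Yes" cases, how many were found?
recall = tp / (tp + fn)

print(f"Precision: {precision:.1%}")  # Precision: 87.5%
print(f"Recall: {recall:.1%}")        # Recall: 77.8%
```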
Confusion Matrix in Python
Scikit-learn provides a simple function to create a confusion matrix.
```python
from sklearn.metrics import confusion_matrix

# y_test holds the actual labels, predictions holds the model's output
cm = confusion_matrix(y_test, predictions)
print(cm)
```
For binary classification, the output is a 2 × 2 NumPy array laid out exactly like the table above: rows are actual classes, columns are predicted classes.
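To see the function in action without training a model, here is a minimal sketch that rebuilds the 100-customer example from earlier; the label arrays are hypothetical, constructed to produce exactly 50 TN, 5 FP, 10 FN and 35 TP:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels matching the earlier example (0 = No, 1 = Yes)
y_actual    = np.array([0] * 55 + [1] * 45)
y_predicted = np.array([0] * 50 + [1] * 5 + [0] * 10 + [1] * 35)

print(confusion_matrix(y_actual, y_predicted))
# [[50  5]
#  [10 35]]
```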
Visualising a Confusion Matrix
A heatmap makes the confusion matrix easier to understand visually.
```python
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Compute the matrix, then draw it as an annotated heatmap
cm = confusion_matrix(y_test, predictions)
sns.heatmap(cm, annot=True, fmt="d")  # fmt="d" shows counts as integers
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
```
Tip: Large values on the diagonal are good. Large values outside the diagonal show mistakes.
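If you prefer to stay within scikit-learn, its built-in `ConfusionMatrixDisplay` helper produces a similar plot without the seaborn dependency. A minimal sketch, reusing the example matrix from earlier:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# The example matrix from earlier (rows = actual, columns = predicted)
cm = np.array([[50, 5], [10, 35]])

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["No", "Yes"])
disp.plot()
plt.show()
```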
Business Examples
Fraud Detection
False negatives are risky because real fraud may be missed.
Marketing Campaigns
False positives may waste marketing budget on customers unlikely to buy.
Multi-Class Confusion Matrix
Confusion matrices can also be used for multi-class classification.
Example
A news classification model may classify articles into business, sport, politics, tech and entertainment.
In a multi-class confusion matrix, each row represents the actual class and each column represents the predicted class.
The diagonal still shows correct predictions.
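The same scikit-learn function handles this case: the matrix simply grows to one row and one column per class. A small sketch with made-up article labels:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical article labels for a 3-class example
y_actual    = ["sport", "business", "tech", "sport", "tech", "business"]
y_predicted = ["sport", "business", "sport", "sport", "tech", "tech"]

# Rows and columns follow the order given in labels=
print(confusion_matrix(y_actual, y_predicted,
                       labels=["business", "sport", "tech"]))
# [[1 0 1]
#  [0 2 0]
#  [0 1 1]]
```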
Common Beginner Mistakes
- Looking only at accuracy and ignoring the confusion matrix
- Confusing false positives with false negatives
- Ignoring which type of mistake is more costly
- Not checking class imbalance
- Assuming all errors have the same business impact
Remember: Different mistakes can have different business costs.
Quick Practice
A model produces the following results:
- True Positives = 40
- True Negatives = 50
- False Positives = 5
- False Negatives = 5
Question: How many predictions were correct?
Answer: Correct predictions = TP + TN = 40 + 50 = 90.
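You can verify the answer with a couple of lines of Python (and go one step further to accuracy):

```python
tp, tn, fp, fn = 40, 50, 5, 5

correct = tp + tn                 # 90
total = tp + tn + fp + fn         # 100
print(correct, correct / total)   # 90 0.9  -> accuracy of 90%
```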
Key Takeaway
A confusion matrix shows the detailed performance of a classification model.
It helps us understand correct predictions, mistakes and the business impact of different types of errors.
Simple rule: The diagonal shows correct predictions; off-diagonal values show errors.
Want to Learn More?
Explore our practical courses in Data Analysis, Machine Learning and AI to apply confusion matrices in real-world projects.