Overfitting and Underfitting Explained Simply
Overfitting and underfitting are two of the most important concepts in machine learning.
They help us understand whether a model has learned patterns that generalise to new data, or has simply memorised or missed the training data.
Why This Topic Matters
A machine learning model should not simply memorise training data.
It should learn patterns that also work on new, unseen data.
Good machine learning = good performance on unseen data
What is Underfitting?
Underfitting happens when a model is too simple and fails to learn important patterns from the data.
Simple Analogy
A student studies only one page for an exam and cannot answer most questions.
| Signs of Underfitting | Meaning |
|---|---|
| Poor training performance | The model cannot even learn the training data properly |
| Poor testing performance | The model also fails on new data |
| Very simple model | The model lacks learning capacity |
Simple rule: Underfitting means the model learned too little.
What is Overfitting?
Overfitting happens when a model memorises the training data too closely, including noise and small details.
Simple Analogy
A student memorises answers to practice questions but struggles when the real exam questions change slightly.
| Signs of Overfitting | Meaning |
|---|---|
| Excellent training performance | The model memorised the training data |
| Poor testing performance | The model cannot generalise well |
| Very complex model | The model learned unnecessary details |
Simple rule: Overfitting means the model learned too much detail.
What is a Good Fit?
A good model learns the important patterns without memorising unnecessary details.
| Model Type | Training Performance | Testing Performance |
|---|---|---|
| Underfitting | Poor | Poor |
| Good Fit | Good | Good |
| Overfitting | Excellent | Poor |
Goal = balance between learning and generalisation
Visual Understanding
Underfitting
A straight line trying to fit highly curved data.
Good Fit
A smooth curve capturing the main trend.
Overfitting
A very complicated curve trying to pass through every single point.
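The three pictures above can be reproduced numerically. The sketch below (a minimal illustration using NumPy and synthetic sine data, which is an assumption, not data from this article) fits polynomials of increasing degree to the same noisy points and prints the training error:

```python
import numpy as np
import warnings

warnings.simplefilter("ignore")  # high-degree polyfit warns about conditioning

# Synthetic curved data: a sine wave plus noise (an illustrative assumption)
rng = np.random.default_rng(42)
x = np.linspace(0, 2 * np.pi, 30)
y = np.sin(x) + rng.normal(0, 0.2, size=x.size)

mse = {}
for degree in (1, 3, 15):
    coeffs = np.polyfit(x, y, degree)   # fit a polynomial of this degree
    y_hat = np.polyval(coeffs, x)       # predictions on the training points
    mse[degree] = float(np.mean((y - y_hat) ** 2))
    print(f"degree {degree:2d} -> training MSE {mse[degree]:.4f}")
```

Degree 1 is the straight line (underfitting), degree 3 the smooth curve (good fit), and degree 15 chases individual noisy points (overfitting). Note that the training error keeps falling as the degree grows, which is exactly why training error alone cannot detect overfitting.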
Why Overfitting Happens
- Model is too complex
- Training for too many iterations
- Too many unnecessary features
- Very small training dataset
- Model memorises noise in data
Why Underfitting Happens
- Model is too simple
- Not enough training time
- Important features are missing
- Insufficient learning capacity
Overfitting in Real Business Scenarios
Customer Purchase Prediction
A model learns the exact behaviour of historical customers but performs poorly when predicting future customers.
Fraud Detection
A model memorises old fraud cases but cannot identify new fraud patterns.
How Train-Test Split Helps
Train-test split helps detect overfitting and underfitting by evaluating the model on unseen data.
| Situation | Possible Issue |
|---|---|
| High training accuracy + low testing accuracy | Overfitting |
| Low training accuracy + low testing accuracy | Underfitting |
| Similar good performance on both | Good generalisation |
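The first row of the table can be checked directly in code. The sketch below (assuming scikit-learn is available and using a synthetic dataset with some label noise) trains an unrestricted decision tree and compares training and testing accuracy:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% flipped labels, so pure memorisation cannot generalise
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# No depth limit: the tree is free to memorise the training data
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"training accuracy: {train_acc:.2f}, testing accuracy: {test_acc:.2f}")
```

High training accuracy combined with clearly lower testing accuracy is the overfitting pattern from the table, and it only becomes visible because the model is evaluated on data it never saw.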
Reducing Overfitting
| Technique | Purpose |
|---|---|
| More training data | Helps model learn broader patterns |
| Simpler model | Reduces unnecessary complexity |
| Feature selection | Removes irrelevant features |
| Regularisation | Controls model complexity |
| Dropout (deep learning) | Prevents memorisation |
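As one concrete example of the regularisation row, the sketch below (assuming scikit-learn and a small synthetic dataset) compares ordinary linear regression with Ridge regression on many polynomial features. The penalty keeps the coefficients small, which limits how wildly the model can bend:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures

# Small noisy dataset expanded into 10 polynomial features (an assumption)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(30, 1))
y = 2 * x[:, 0] + rng.normal(0, 0.3, size=30)
X = PolynomialFeatures(degree=10, include_bias=False).fit_transform(x)

plain = LinearRegression().fit(X, y)        # no penalty on coefficient size
ridge = Ridge(alpha=1.0).fit(X, y)          # penalises large coefficients

plain_norm = float(np.linalg.norm(plain.coef_))
ridge_norm = float(np.linalg.norm(ridge.coef_))
print(f"unregularised coefficient norm: {plain_norm:.2f}")
print(f"ridge coefficient norm:         {ridge_norm:.2f}")
```

Smaller coefficients mean a smoother function, which is why regularisation reduces overfitting without changing the model type.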
Reducing Underfitting
| Technique | Purpose |
|---|---|
| More training time | Allows model to learn better |
| More useful features | Provides more information |
| More complex model | Increases learning ability |
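The third row (a more complex model) can be seen in a short sketch. Assuming scikit-learn and synthetic sine data, a plain straight line underfits the curve, while adding polynomial features gives the same linear model enough capacity to follow it:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data (an illustrative assumption)
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)
y = np.sin(x[:, 0]) + rng.normal(0, 0.1, size=100)

# Too simple: a straight line cannot follow a sine wave
line_r2 = LinearRegression().fit(x, y).score(x, y)

# More capacity: cubic features let the same model bend with the data
X3 = PolynomialFeatures(degree=3).fit_transform(x)
cubic_r2 = LinearRegression().fit(X3, y).score(X3, y)

print(f"straight line R^2: {line_r2:.2f}, cubic R^2: {cubic_r2:.2f}")
```

The more flexible model fits the curved trend much better, which is exactly the cure for underfitting, provided the extra capacity is not pushed so far that it starts memorising noise.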
Simple Python Example
```python
from sklearn.tree import DecisionTreeClassifier

# max_depth limits how many levels the tree can grow,
# controlling how complex the learned rules can become.
# X_train and y_train are assumed to come from an earlier train-test split.
model = DecisionTreeClassifier(max_depth=2)
model.fit(X_train, y_train)
```

A very small depth may underfit, while a very large depth may overfit.
Deep Learning Example
Neural networks can also overfit if they are too complex or trained too long.
```python
from tensorflow.keras.layers import Dropout

# Randomly disables 30% of this layer's neurons during each training step.
# `model` is assumed to be an existing Keras Sequential model.
model.add(Dropout(0.3))
```
Dropout helps reduce overfitting by randomly disabling some neurons during training.
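Conceptually, dropout just multiplies activations by a random keep/drop mask each training step. The plain-NumPy sketch below illustrates the common "inverted dropout" formulation; it is an illustration of the idea, not Keras's actual implementation:

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = keep, False = drop
    return activations * mask / keep_prob             # rescaling keeps the mean unchanged

rng = np.random.default_rng(0)
acts = np.ones(10_000)
dropped = dropout(acts, rate=0.3, rng=rng)
zero_frac = float((dropped == 0).mean())
print("fraction zeroed:", zero_frac)  # close to 0.3
```

Because no single neuron can be relied on every step, the network is pushed to spread what it learns across many neurons instead of memorising with a few. At prediction time dropout is switched off, which Keras handles automatically.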
Quick Practice
A model achieves 99% training accuracy but only 60% testing accuracy.
Question: Is this likely overfitting or underfitting?
Answer: Likely overfitting because the model performs very well on training data but poorly on unseen data.
Common Beginner Mistake
Many beginners focus only on training accuracy. A model is only useful if it performs well on new data.
Remember: High training accuracy alone does not mean the model is good.
Key Takeaway
Underfitting happens when a model learns too little. Overfitting happens when a model memorises too much.
The goal is to build a model that learns useful patterns and generalises well to unseen data.
Simple rule: Good machine learning balances learning and generalisation.
Want to Learn More?
Explore our practical courses in Data Analysis, Machine Learning and AI to understand how real models are built and improved.