
Overfitting and Underfitting Explained Simply

Overfitting and underfitting are two of the most important concepts in machine learning. They help us understand whether a model has learned useful, general patterns or has learned the data in a way that will not carry over to new examples.

Why This Topic Matters

A machine learning model should not simply memorise training data. It should learn patterns that also work on new, unseen data.

Good machine learning = good performance on unseen data

What is Underfitting?

Underfitting happens when a model is too simple and fails to learn important patterns from the data.

Simple Analogy

A student studies only one page for an exam and cannot answer most questions.

Signs of Underfitting

  • Poor training performance: the model cannot even learn the training data properly
  • Poor testing performance: the model also fails on new data
  • Very simple model: the model lacks learning capacity

Simple rule: Underfitting means the model learned too little.

What is Overfitting?

Overfitting happens when a model memorises the training data too closely, including noise and small details.

Simple Analogy

A student memorises answers to practice questions but struggles when the real exam questions change slightly.

Signs of Overfitting

  • Excellent training performance: the model memorised the training data
  • Poor testing performance: the model cannot generalise well
  • Very complex model: the model learned unnecessary details

Simple rule: Overfitting means the model learned too much detail.

What is a Good Fit?

A good model learns the important patterns without memorising unnecessary details.

How each model type typically performs:

  • Underfitting: poor training performance, poor testing performance
  • Good Fit: good training performance, good testing performance
  • Overfitting: excellent training performance, poor testing performance

Goal = balance between learning and generalisation

Visual Understanding

Underfitting

A straight line trying to fit highly curved data.

Good Fit

A smooth curve capturing the main trend.

Overfitting

A very complicated curve trying to pass through every single point.
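These three pictures can be reproduced in a few lines of code. The following is a minimal sketch using numpy's polyfit on noisy sine-shaped data; the sample size, noise level, and polynomial degrees are illustrative choices, not fixed rules:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.shape)  # curved data + noise

# Fit polynomials of increasing flexibility to the same points.
underfit = np.polyfit(x, y, deg=1)    # straight line: too simple
good_fit = np.polyfit(x, y, deg=3)    # smooth curve: captures the main trend
overfit = np.polyfit(x, y, deg=15)    # wiggly curve: chases every single point

def train_error(coeffs):
    """Mean squared error on the training points."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Training error keeps falling as the model gets more flexible,
# but the degree-15 fit is memorising noise, not learning the trend.
print(train_error(underfit), train_error(good_fit), train_error(overfit))
```

Notice that a lower training error alone does not tell us which fit is best; the degree-15 polynomial wins on training error precisely because it is overfitting.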

Why Overfitting Happens

  • Model is too complex
  • Training for too many iterations
  • Too many unnecessary features
  • Very small training dataset
  • Model memorises noise in data

Why Underfitting Happens

  • Model is too simple
  • Not enough training time
  • Important features are missing
  • Insufficient learning capacity

Overfitting in Real Business Scenarios

Customer Purchase Prediction

A model learns the exact behaviour of historical customers but performs poorly when predicting future customers.

Fraud Detection

A model memorises old fraud cases but cannot identify new fraud patterns.

How Train-Test Split Helps

Train-test split helps detect overfitting and underfitting by evaluating the model on unseen data.

Situation and the likely issue:

  • High training accuracy + low testing accuracy: likely overfitting
  • Low training accuracy + low testing accuracy: likely underfitting
  • Similar good performance on both: good generalisation
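This diagnostic can be sketched in a few lines with scikit-learn. The synthetic dataset, model settings, and random seeds below are illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data; a real project would load its own.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# An unrestricted tree can keep splitting until it memorises the training set.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)

# A large gap between the two scores is the classic overfitting signal.
print(f"train: {train_acc:.2f}, test: {test_acc:.2f}")
```

Here the unrestricted tree scores perfectly on the training set while losing accuracy on the held-out test set, which is exactly the first row of the table above.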

Reducing Overfitting

  • More training data: helps the model learn broader patterns
  • Simpler model: reduces unnecessary complexity
  • Feature selection: removes irrelevant features
  • Regularisation: controls model complexity
  • Dropout (deep learning): prevents memorisation
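As one concrete illustration of regularisation, this minimal sketch uses scikit-learn's Ridge regression, where the penalty strength alpha controls model complexity. The synthetic data and the two alpha values are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))                     # many features, few samples
y = 3.0 * X[:, 0] + rng.normal(0, 0.5, size=50)   # only the first feature matters

# A higher alpha applies a stronger penalty, shrinking the coefficients
# and discouraging the model from fitting noise in the irrelevant features.
weak = Ridge(alpha=0.01).fit(X, y)
strong = Ridge(alpha=100.0).fit(X, y)

print(np.abs(weak.coef_).sum(), np.abs(strong.coef_).sum())
```

With the stronger penalty, the total size of the coefficients drops, which is what "Regularisation: controls model complexity" means in practice.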

Reducing Underfitting

  • More training time: allows the model to learn better
  • More useful features: provides more information
  • More complex model: increases learning ability

Simple Python Example

from sklearn.tree import DecisionTreeClassifier

# max_depth caps how many levels of splits the tree can grow.
# X_train and y_train are assumed to come from an earlier train-test split.
model = DecisionTreeClassifier(max_depth=2)
model.fit(X_train, y_train)

A very small depth may underfit. A very large depth may overfit.
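The depth trade-off can be seen directly with a small sweep over max_depth. This is a minimal sketch using scikit-learn's make_moons helper; the dataset, noise level, depths, and seeds are illustrative choices:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy, non-linear data makes the depth trade-off visible.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (1, 4, None):  # too shallow, moderate, unlimited
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(depth,
          round(tree.score(X_train, y_train), 2),   # training accuracy
          round(tree.score(X_test, y_test), 2))     # testing accuracy
```

The depth-1 tree scores poorly on both sets (underfitting), while the unlimited tree scores perfectly on training data but worse on test data (overfitting).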

Deep Learning Example

Neural networks can also overfit if they are too complex or trained too long.

from tensorflow.keras.layers import Dropout

# Randomly disables 30% of the previous layer's outputs on each training step.
model.add(Dropout(0.3))

Dropout helps reduce overfitting by randomly disabling some neurons during training.

Quick Practice

A model achieves 99% training accuracy but only 60% testing accuracy.

Question: Is this likely overfitting or underfitting?

Answer: Likely overfitting because the model performs very well on training data but poorly on unseen data.

Common Beginner Mistake

Many beginners focus only on training accuracy. A model is only useful if it performs well on new data.

Remember: High training accuracy alone does not mean the model is good.

Key Takeaway

Underfitting happens when a model learns too little. Overfitting happens when a model memorises too much. The goal is to build a model that learns useful patterns and generalises well to unseen data.

Simple rule: Good machine learning balances learning and generalisation.

Want to Learn More?

Explore our practical courses in Data Analysis, Machine Learning and AI to understand how real models are built and improved.



2012 - 2026 © London Academy of IT Limited. All Rights Reserved.
UKPRN: 10045491. Registered in England & Wales with company no. 07923992.