Unsupervised Learning (Clustering-based Customer Segmentation)

PCA (Principal Component Analysis)

Learning Outcome

Recognize the mandatory pre-processing step (Scaling) and the loss of interpretability.

Determine how many components to keep using a Scree Plot.

Visualize how PCA finds the axes of maximum variance.

Explain the difference between Feature Selection and Feature Extraction (PCA).

Understand the "Curse of Dimensionality" and why too many columns destroy models.

The Curse of Dimensionality

The Problem

Humans cannot visualize more than 3 dimensions.
Distance-based algorithms (like KNN or DBSCAN) break down in high dimensions because "distance" loses its meaning: the nearest and the farthest neighbours of a point end up almost equally far away.
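
The claim that "distance" loses its meaning can be checked with a small experiment. The sketch below is an illustration only, assuming points drawn uniformly at random from a unit cube; the function name distance_contrast and the sample sizes are made up for this example. It measures how much farther the farthest point is than the nearest one, relative to the nearest.

```python
import math
import random


def distance_contrast(n_points, n_dims, seed=0):
    """Return (farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    # Random points in the unit cube [0, 1]^n_dims (an assumption for this demo).
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query = [rng.random() for _ in range(n_dims)]
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)


# The contrast shrinks as columns are added: every point looks "equally far".
for d in (2, 10, 100, 1000):
    print(f"{d:>5} dimensions -> contrast {distance_contrast(500, d):.2f}")
```

A large contrast means "nearest neighbour" is a meaningful idea; a contrast close to zero means KNN or DBSCAN can barely tell neighbours from strangers.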

The Story So Far

We've been looking at datasets with 5 to 10 features. But real-world data (like genetics or image processing) can have 10,000 columns!

The Solution

We need to shrink the dataset from 10,000 columns down to 2 or 3 columns, while losing as little of the underlying information as possible.

Imagine you are holding a 3D teapot, and you need to take a 2D photograph of it that captures its true shape.

PCA is the master photographer. It mathematically rotates your high-dimensional data to find the "perfect angle" that captures the absolute maximum amount of spread (variance) in a flat, 2D or 3D view.

Bad Angle

If you take the photo looking straight down the spout, the picture is just a circle.

You lost almost all the information about its shape.

Perfect Angle

If you take the photo from the side, you can clearly see the handle, the lid, and the spout.

You captured maximum detail.
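
The two camera angles can be compared numerically. The following is a minimal sketch, assuming a made-up 3D point cloud that is long along one axis and thin along another (it stands in for the teapot and is axis-aligned only to keep the example easy to read). It compares the share of total variance kept by a "bad angle" (looking straight down the longest axis) with the share kept by the top two directions that PCA finds from the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3D "object": long along x, medium along y, thin along z.
X = rng.normal(size=(1000, 3)) * np.array([5.0, 2.0, 0.5])
X = X - X.mean(axis=0)  # centre the data, as PCA does

total_var = X.var(axis=0, ddof=1).sum()

# Bad angle: look straight down the x-axis, so only y and z survive.
bad_view = X[:, [1, 2]]
bad_share = bad_view.var(axis=0, ddof=1).sum() / total_var

# PCA angle: the eigenvectors of the covariance matrix are the axes of
# maximum variance; keep the two with the largest eigenvalues.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
order = np.argsort(eigvals)[::-1]
top2 = eigvecs[:, order[:2]]
pca_view = X @ top2
pca_share = pca_view.var(axis=0, ddof=1).sum() / total_var

print(f"bad angle keeps {bad_share:.1%} of the variance")
print(f"PCA angle keeps {pca_share:.1%} of the variance")
```

Both views are flat 2D pictures of the same 3D data; the only difference is the direction the "camera" points.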

Transition from Analogy to Technical Concept (Slide 5)

Core Concepts (Slide 6)

Core Concepts (Slide 7)

Core Concepts (.....Slide N-3)

Summary

The trade-off for compressing your data is that you lose most of the interpretability of your original features, because each component is a blend of every original column.

You must always scale your data before applying PCA; otherwise the features with the largest numeric ranges dominate the components.

PC1 captures the maximum variance (the "perfect camera angle").

PCA uses feature extraction, blending the original columns into new, synthetic principal components.

PCA is an unsupervised dimensionality reduction technique.
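
The summary points can be tied together in one short sketch. It is an illustration under stated assumptions, not a reference implementation: the customer table (income, age, visits) is invented, and the helper pca_explained_variance is a hypothetical name that runs PCA through NumPy's SVD. In practice scikit-learn's StandardScaler and PCA classes are commonly used for the same steps. The sketch shows why scaling matters, gives the numbers a scree plot would display, and prints the loadings that blend the old columns into each component.

```python
import numpy as np


def pca_explained_variance(X, scale=True):
    """Return (explained variance ratios, component loadings) via SVD."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)              # centring is always required
    if scale:
        X = X / X.std(axis=0, ddof=1)   # put every column on the same footing
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    var = s**2 / (X.shape[0] - 1)       # variance captured by each component
    return var / var.sum(), vt          # rows of vt are the component loadings


rng = np.random.default_rng(42)
n = 500
# Hypothetical customer features on very different scales.
income = rng.normal(50_000, 15_000, n)    # currency units
age = rng.normal(40, 12, n)               # years
visits = 0.1 * age + rng.normal(0, 1, n)  # store visits, correlated with age
X = np.column_stack([income, age, visits])

ratio_raw, _ = pca_explained_variance(X, scale=False)
ratio_scaled, components = pca_explained_variance(X, scale=True)

# Without scaling, PC1 is essentially just "income" because of its huge range.
print("unscaled ratios:", np.round(ratio_raw, 4))
# With scaling, the ratios are the heights of the bars in a scree plot.
print("scaled ratios:  ", np.round(ratio_scaled, 4))
print("cumulative:     ", np.round(np.cumsum(ratio_scaled), 4))
# Each row mixes all three original columns: this is feature extraction.
print("loadings:\n", np.round(components, 3))
```

A common rule of thumb is to keep components up to the "elbow" of the scree plot, or until the cumulative ratio passes a chosen threshold; the threshold itself is a judgment call.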

Quiz

What’s the key difference between Feature Selection and Feature Extraction (PCA)?

A. Selection needs scaling; extraction doesn’t

B. Selection removes original columns; extraction creates new synthetic features from them

C. Selection is unsupervised; extraction is supervised

D. No mathematical difference

Quiz-Answer

What’s the key difference between Feature Selection and Feature Extraction (PCA)?

A. Selection needs scaling; extraction doesn’t

B. Selection removes original columns; extraction creates new synthetic features from them (correct answer)

C. Selection is unsupervised; extraction is supervised

D. No mathematical difference