PCA vs. t-SNE

Jan 14, 2025

PCA vs. t-SNE is a popular data science interview question, so today let’s understand how they differ.

This table summarizes the differences between them:

Let’s discuss the differences below.

PCA is primarily a dimensionality reduction algorithm. It is NOT inherently designed to create 2D visualizations of high-dimensional datasets.
t-SNE, however, is a data visualization algorithm. We use it to project high-dimensional data to low dimensions (primarily 2D).

PCA is a deterministic algorithm: running it twice on the same dataset will produce the same result.
t-SNE is a stochastic algorithm: running it twice on the same dataset can produce entirely different results. Can you explain why? We covered it pretty recently.
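
Here is a minimal sketch of that difference, assuming scikit-learn is installed. The Iris dataset, the seeds, and `init="random"` are illustrative choices of mine (they are not from the comparison above); random initialization is used so that the seed actually changes the starting layout.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = load_iris().data  # 150 samples, 4 features (illustrative dataset)

# PCA: two independent fits on the same data give the same projection.
pca_a = PCA(n_components=2).fit_transform(X)
pca_b = PCA(n_components=2).fit_transform(X)
print("PCA runs identical:", np.allclose(pca_a, pca_b))

# t-SNE: random initialization with different seeds gives different embeddings.
tsne_a = TSNE(n_components=2, init="random", random_state=0).fit_transform(X)
tsne_b = TSNE(n_components=2, init="random", random_state=1).fit_transform(X)
print("t-SNE runs identical:", np.allclose(tsne_a, tsne_b))
```

Fixing `random_state` makes a t-SNE run reproducible, but that only pins down one of many possible embeddings.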

As far as uniqueness and interpretation of results are concerned…

PCA has a unique solution for the projection of data points (up to the sign of each component, and assuming the eigenvalues are distinct).
t-SNE, as discussed above, can provide entirely different results, and its interpretation is subjective in nature.

PCA is a linear dimensionality reduction approach. It can only find a linear subspace onto which to project the given dataset. KernelPCA addresses this:

PCA retains as much of the global variance of the data as possible. Thus, local relationships (such as clusters) are often lost after projection, as shown below:
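
To see what "retaining variance" means in practice, here is a small sketch, assuming scikit-learn. The digits dataset and `svd_solver="full"` (an exact solver) are illustrative choices of mine. It checks that the variance of the projected points matches what PCA reports, and prints how much of the total variance two components keep.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data  # 1797 samples, 64 features (illustrative dataset)

# Exact solver, so the reported variances match the projection exactly.
pca = PCA(n_components=2, svd_solver="full").fit(X)
Z = pca.transform(X)

# Share of the total variance captured by each component.
ratio = pca.explained_variance_ratio_
# Sample variance of the projected data along each component.
projected_var = Z.var(axis=0, ddof=1)

print("Variance ratio per component:", ratio)
print("Total variance kept in 2D:   ", ratio.sum())
print("Matches reported variance:   ", np.allclose(projected_var, pca.explained_variance_))
```

Nothing in this objective looks at which points are neighbors, which is why cluster structure can get squashed in the projection.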

t-SNE preserves local relationships. Thus, data points in a cluster in the high-dimensional space are much more likely to lie together in the low-dimensional space.
- Note: The t-SNE objective does not explicitly enforce global structure preservation. But it typically does create well-separated clusters.
- That said, the distance between two clusters in low-dimensional space is NOT a reliable indicator of cluster separation in high-dimensional space, as depicted below:
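
One way to put a number on local structure preservation is scikit-learn's `trustworthiness` score, which measures the extent to which each point's nearest neighbors in the embedding were also its neighbors in the original space. The sketch below is only an illustration under my own assumptions: a 500-sample slice of the digits dataset, `n_neighbors=5`, and a fixed seed.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness

# A 500-sample subset keeps the t-SNE run short (illustrative choice).
X = load_digits().data[:500]

X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)

# Score in [0, 1]; higher means the 2D neighbors are more faithful
# to the neighbors in the original 64-dimensional space.
pca_trust = trustworthiness(X, X_pca, n_neighbors=5)
tsne_trust = trustworthiness(X, X_tsne, n_neighbors=5)

print("Trustworthiness (PCA):  ", pca_trust)
print("Trustworthiness (t-SNE):", tsne_trust)
```

This score says nothing about distances between clusters, so it does not contradict the caveat above.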

Those are the key differences between PCA and t-SNE.

👉 Over to you: What other differences between t-SNE and PCA did I miss?

Thanks for reading!

Daily Dose of Data Science