GroundX tops DocBench leaderboard for RAG accuracy
No powerful LLM can ever make up for poor retrieval.
To solve this, we finally have an AI that can understand documents better than humans.
GroundX is an AI system designed to understand large volumes of complex documents. These documents can contain images, tables, and flowcharts along with regular text, and GroundX provides that information to LLMs via RAG and Agentic tools.
Recent performance metrics of GroundX on DocBench (an AI document understanding benchmark) suggest that GroundX is better than humans at several knowledge-intensive tasks.
Read the detailed report here →
Thanks to EyeLevel for partnering today!
Why is the Kernel Trick Called a "Trick"?
So many ML algorithms use kernels for robust modeling, like SVM, KernelPCA, etc.
In a nutshell, a kernel function lets us compute the dot product of two vectors in a high-dimensional space without transforming the vectors to that space.
But how does that even happen?
Let’s understand today!
The objective
To recap, a kernel lets us compute the dot product between two vectors, X and Y, in some high-dimensional space without projecting the vectors to that space.
Thus, we need a kernel function (k) whose output is the same as the dot product between the projected vectors (had they been projected):
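Writing φ for that projection (notation assumed here), the requirement is:

$$k(X, Y) = \varphi(X)^{\top} \varphi(Y)$$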
If that is a bit confusing, let me give an example.
A motivating example
Let’s assume the following polynomial kernel function:
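A standard degree-2 form, which I'll assume for this walkthrough, is:

$$k(X, Y) = (X^{\top} Y + 1)^2$$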
For simplicity, let's say both X and Y are two-dimensional vectors:
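Concretely, denote their components as:

$$X = (x_1, x_2), \qquad Y = (y_1, y_2)$$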
Simplifying the kernel expression above, we get the following:
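Substituting these components into the assumed kernel above:

$$k(X, Y) = (x_1 y_1 + x_2 y_2 + 1)^2$$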
Expanding the square term, we get:
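Under the same assumed kernel, the expansion is:

$$k(X, Y) = x_1^2 y_1^2 + x_2^2 y_2^2 + 2 x_1 x_2 y_1 y_2 + 2 x_1 y_1 + 2 x_2 y_2 + 1$$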
Now, notice that the final expression above is precisely the dot product between the following 6-dimensional vectors:
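One consistent choice (the √2 scaling is one of several equivalent options) is:

$$\varphi(X) = \big(x_1^2,\; x_2^2,\; \sqrt{2}\, x_1 x_2,\; \sqrt{2}\, x_1,\; \sqrt{2}\, x_2,\; 1\big)$$

$$\varphi(Y) = \big(y_1^2,\; y_2^2,\; \sqrt{2}\, y_1 y_2,\; \sqrt{2}\, y_1,\; \sqrt{2}\, y_2,\; 1\big)$$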
Thus, our projection function comes out to be:
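With the same (assumed) scaling, the projection is:

$$\varphi(x_1, x_2) = \big(x_1^2,\; x_2^2,\; \sqrt{2}\, x_1 x_2,\; \sqrt{2}\, x_1,\; \sqrt{2}\, x_2,\; 1\big)$$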
This shows that the kernel function we chose earlier computes the dot product in a 6-dimensional space without explicitly visiting that space.
And that is the primary reason why we also call it the “kernel trick.”
More specifically, it's framed as a "trick" since it allows us to operate in a high-dimensional space without explicitly projecting the data to that space.
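To make this concrete, here is a minimal sketch (assuming the degree-2 polynomial kernel and the √2-scaled projection from above) that checks numerically that the kernel and the explicit 6-dimensional dot product agree:

```python
import numpy as np

def poly_kernel(x, y):
    # Assumed degree-2 polynomial kernel: k(x, y) = (x·y + 1)^2.
    # Works directly on the 2-dimensional inputs.
    return (np.dot(x, y) + 1) ** 2

def project(v):
    # Explicit 6-dimensional projection derived above (assumed √2 scaling).
    x1, x2 = v
    return np.array([
        x1**2, x2**2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2) * x1,
        np.sqrt(2) * x2,
        1.0,
    ])

X = np.array([1.0, 2.0])
Y = np.array([3.0, 4.0])

print(poly_kernel(X, Y))        # 144.0
print(project(X) @ project(Y))  # ~144.0 (same value, up to floating-point error)
assert np.isclose(poly_kernel(X, Y), project(X) @ project(Y))
```

Both values match, yet the first call never materializes the 6-dimensional vectors.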
The one we discussed above is the polynomial kernel, but there are many more kernel functions we typically use:
Linear kernel
Gaussian (RBF) kernel
Sigmoid kernel, etc.
I intend to cover them in detail soon in another issue.
👉 Until then, it's over to you: Can you name a major pain point of algorithms that rely on the kernel trick?
Thanks for reading, and we’ll see you next week!
P.S. For those wanting to develop “Industry ML” expertise:
At the end of the day, all businesses care about impact. That’s it!
Can you reduce costs?
Drive revenue?
Can you scale ML models?
Predict trends before they happen?
We have discussed several other topics (with implementations) that align with these objectives.
Here are some of them:
Learn how to build Agentic systems in an ongoing crash course with 11 parts.
Learn how to build real-world RAG apps and evaluate and scale them in this crash course.
Learn sophisticated graph architectures and how to train them on graph data.
So many real-world NLP systems rely on pairwise context scoring. Learn scalable approaches here.
Learn how to run large models on small devices using Quantization techniques.
Learn how to generate prediction intervals or sets with strong statistical guarantees to increase trust, using Conformal Predictions.
Learn how to identify causal relationships and answer business questions using causal inference in this crash course.
Learn how to scale and implement ML model training in this practical guide.
Learn techniques to reliably test new models in production.
Learn how to build privacy-first ML systems using Federated Learning.
Learn 6 techniques (with implementations) to compress ML models.
All these resources will help you cultivate key skills that businesses and companies care about the most.