A Popular Interview Question: Discriminative vs. Generative Models
A simplified guide to generative and discriminative models, along with a quiz.
Based on the data modeling approach, ML models can be classified into two categories:
Generative
Discriminative
I prepared the following visual which depicts how they differ:
I have seen this topic come up in several interviews, so today, let’s understand their details.
#1) Discriminative models
Discriminative models, as the name suggests, are primarily centered around learning decision boundaries that separate different classes.
Mathematically speaking, they learn the conditional probability P(Y|X) directly, which is read as follows: “Given an input X, what is the probability of label Y?”
As a result, these models are built specifically for prediction tasks such as classification.
Popular examples include:
Logistic regression
Random Forest
Decision Trees, etc.
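To make the P(Y|X) idea concrete, here is a minimal sketch of logistic regression on one feature, written from scratch with the standard library only. The toy data, the helper names (`sigmoid`, `fit_logistic`), the learning rate, and the epoch count are all assumptions made for illustration. Notice that nothing in it describes how X itself is distributed.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=1000):
    """Fit P(Y=1 | X=x) = sigmoid(w*x + b) by gradient ascent on the log-likelihood."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        w += lr * sum(e * x for e, x in zip(errors, xs)) / n
        b += lr * sum(errors) / n
    return w, b

# Toy data (made up): class 0 clusters near x = -2, class 1 near x = +2.
rng = random.Random(0)
xs = [rng.gauss(-2, 1) for _ in range(100)] + [rng.gauss(2, 1) for _ in range(100)]
ys = [0] * 100 + [1] * 100

w, b = fit_logistic(xs, ys)
p_left = sigmoid(w * -3 + b)   # P(Y=1 | X=-3)
p_right = sigmoid(w * 3 + b)   # P(Y=1 | X=+3)
boundary = -b / w              # the x at which P(Y=1 | X=x) = 0.5

print(f"P(Y=1 | X=-3) = {p_left:.3f}")
print(f"P(Y=1 | X=+3) = {p_right:.3f}")
print(f"decision boundary at x = {boundary:.3f}")
```

The only things the model ends up with are `w` and `b`, i.e., the location and sharpness of the decision boundary.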
#2) Generative models
Generative models, on the other hand, are primarily centered around learning the class-conditional distribution, as shown in the figure above.
Thus, they model the joint probability P(X, Y) by learning the class-conditional distribution P(X|Y) together with the class prior P(Y).
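Written out, the relationship uses only the product rule and Bayes' rule. The symbol y' for a candidate label is notation introduced here, not something from the original figure:

```latex
P(X, Y) = P(X \mid Y)\, P(Y)

P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{\sum_{y'} P(X \mid Y = y')\, P(Y = y')}
```

The second line is why a model that has learned P(X|Y) and P(Y) can still be used to classify: the posterior P(Y|X) falls out of the joint.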
Popular examples include:
Naive Bayes
Linear Discriminant Analysis (LDA)
Gaussian Mixture Models, etc.
We formulated Gaussian Mixture Models and implemented them from scratch here: Gaussian Mixture Models (GMMs).
Since generative models learn the underlying distribution, they can generate new samples.
For instance, consider a GAN:
Once the model has been trained, you can throw away the discriminator and just use the generator to generate real-looking images:
However, this is not possible with discriminative models.
Furthermore, generative models possess discriminative properties, i.e., they can be used for classification tasks (if needed).
However, discriminative models do not possess generative properties.
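Both properties can be seen in a small sketch. It assumes one feature and a Gaussian P(X|Y) per class; the toy data and the helper names (`fit_generative`, `posterior`, `sample`) are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

def fit_generative(xs, ys):
    """Estimate the prior P(Y) and a Gaussian P(X | Y) for each class."""
    model = {}
    for label in sorted(set(ys)):
        points = [x for x, y in zip(xs, ys) if y == label]
        model[label] = {
            "prior": len(points) / len(xs),
            "mean": statistics.fmean(points),
            "std": statistics.stdev(points),
        }
    return model

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def posterior(model, x):
    """Discriminative use: Bayes' rule turns P(X | Y) * P(Y) into P(Y | X)."""
    joint = {y: p["prior"] * gaussian_pdf(x, p["mean"], p["std"]) for y, p in model.items()}
    total = sum(joint.values())
    return {y: v / total for y, v in joint.items()}

def sample(model, rng):
    """Generative use: draw y from P(Y), then x from P(X | Y)."""
    labels = list(model)
    y = rng.choices(labels, weights=[model[label]["prior"] for label in labels])[0]
    return rng.gauss(model[y]["mean"], model[y]["std"]), y

# Toy data (made up): class 0 clusters near x = -2, class 1 near x = +2.
rng = random.Random(0)
xs = [rng.gauss(-2, 1) for _ in range(100)] + [rng.gauss(2, 1) for _ in range(100)]
ys = [0] * 100 + [1] * 100

model = fit_generative(xs, ys)
post = posterior(model, 3.0)
new_points = [sample(model, rng) for _ in range(5)]

print("P(Y | X=3):", post)
print("new samples:", new_points)
```

The logistic-regression-style models from the previous section have no counterpart to `sample`, because they never estimate how X is distributed.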
Discriminative vs. Generative Quiz
Let’s consider an example to better understand them.
Imagine you are a language classification system.
There are two ways you can classify languages.
Learn every language and then classify a new language based on acquired knowledge.
Understand some distinctive patterns in each language without truly learning the language. Once you do that, classify a new language based on the learned patterns.
Can you figure out which of the above is generative and which is discriminative?
Answer
The first approach is generative. This is because you learned the underlying distribution of each language.
In other words, you learned the joint distribution P(Words, Language).
Moreover, since you understand the underlying distribution, you can now generate new sentences, can’t you?
The second approach is discriminative. This is because you only learned specific distinctive patterns of each language.
It is like:
If so and so words appear, it is likely “Language A.”
If this specific set of words appears, it is likely “Language B.”
and so on.
In other words, you learned the conditional distribution P(Language|Words).
Can you generate new sentences here?
No, right?
This is the difference between generative and discriminative models.
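The generative side of the quiz can be sketched with a tiny unigram model. Everything here is an assumption for illustration: the two made-up corpora, the uniform prior over languages, add-one smoothing, and the helper names (`word_prob`, `log_joint`, `classify`, `generate`).

```python
import math
import random
from collections import Counter

# Made-up toy corpora; a real system would need far more text.
corpus = {
    "English": "the cat sat on the mat the dog ate the bone".split(),
    "Spanish": "el gato come el pescado y el perro come el hueso".split(),
}
vocab = sorted({w for words in corpus.values() for w in words})
counts = {lang: Counter(words) for lang, words in corpus.items()}

def word_prob(word, lang):
    """P(word | Language) with add-one smoothing over the shared vocabulary."""
    return (counts[lang][word] + 1) / (len(corpus[lang]) + len(vocab))

def log_joint(words, lang):
    """log P(Words, Language) = log P(Language) + sum of log P(word | Language)."""
    log_prior = math.log(1 / len(corpus))  # uniform prior (assumption)
    return log_prior + sum(math.log(word_prob(w, lang)) for w in words)

def classify(words):
    # Pick the language with the highest joint probability.
    return max(corpus, key=lambda lang: log_joint(words, lang))

def generate(lang, n_words, rng):
    # Sample words from P(word | Language); smoothing means words from the
    # other language can occasionally appear.
    weights = [word_prob(w, lang) for w in vocab]
    return rng.choices(vocab, weights=weights, k=n_words)

print(classify("the dog sat".split()))
print(classify("el perro come".split()))
print(" ".join(generate("Spanish", 5, random.Random(0))))
```

A discriminative system would keep only a rule that maps words to a language label, so it would have no equivalent of `generate`.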
Also, the above description might persuade you that generative models are more generally useful, but that is not true.
This is because generative models have their own modeling complications.
For instance, generative models typically require more data than discriminative models.
Relate it to the language classification example again.
Imagine the amount of data you would need to learn all languages (generative approach) vs. the amount of data you would need to understand some distinctive patterns (discriminative approach).
Typically, discriminative models outperform generative models in classification tasks.
👉 Over to you: What are some other problems while training generative models?