In my opinion, federated learning is one of those very powerful ML techniques that does not get the attention it deserves.
Here’s a visual that depicts how it works:
Today, let’s understand how it works and why I consider it an immensely valuable skill to have.
We also covered this topic in detail in a deep dive some months back: Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
The problem
Modern devices (like smartphones) have access to a wealth of data that is well suited for training ML models.
To get some perspective, consider the number of images you have on your phone right now, the number of keystrokes you type daily, etc.
That’s plenty of data, isn’t it?
And this is just one user: you.
But applications can have millions of users. The amount of data we can train ML models on is unfathomable.
So what is the problem here?
The problem is that almost all data available on modern devices is private.
Images are private.
Messages you send are private.
Voice notes are private.
Because this data is private, it usually cannot be aggregated in a central location. That is a problem because, traditionally, ML models are trained on centrally located datasets.
But this data is still valuable to us, isn’t it?
We want to utilize it in some way.
The solution
Federated learning smartly addresses this challenge of training ML models on private data.
Here’s the core idea:
Instead of aggregating data on a central server, dispatch a model to an end device.
Train the model on the user’s private data on their device.
Fetch the trained model back to the central server.
Aggregate all models obtained from all end devices to form a complete model.
That’s an innovative solution because each client possesses a local training dataset that remains exclusively on their device and is never uploaded to the server.
Yet, we still get to train a model on this private data.
Furthermore, federated learning distributes most of the computation to users’ devices.
As a result, the central server does not need the enormous compute that it would have demanded otherwise.
This is the core idea behind federated learning.
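To make the four steps above concrete, here is a minimal sketch of one common way to do the aggregation: a weighted average of client models, where each client is weighted by the size of its local dataset (often called federated averaging). Everything in it is an assumption made for illustration: the model is a tiny linear regressor, the clients are simulated in a single process with synthetic data, and the names `local_train`, `federated_average`, `run_rounds`, and `make_client` are hypothetical. A real deployment would also handle communication, client sampling, and secure aggregation.

```python
import random

def local_train(global_w, data, lr=0.1, epochs=5):
    """Steps 1-2: start from the dispatched model and train on one client's private data."""
    w = list(global_w)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n  # gradient of mean squared error
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def federated_average(client_weights, client_sizes):
    """Step 4: average client models, weighting each by its local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

def run_rounds(clients, dim, rounds=50):
    """Repeat dispatch -> local training -> fetch -> aggregate for several rounds."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        # Step 3: only the trained weights come back, never the raw data.
        local_models = [local_train(global_w, data) for data in clients]
        global_w = federated_average(local_models, [len(d) for d in clients])
    return global_w

def make_client(n, true_w, rng):
    """Hypothetical helper: synthetic private data for one simulated client."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x)) + rng.gauss(0, 0.01)
        data.append((x, y))
    return data

rng = random.Random(0)
true_w = [2.0, -1.0]
clients = [make_client(n, true_w, rng) for n in (20, 50, 30)]
global_w = run_rounds(clients, dim=2)
print([round(w, 2) for w in global_w])
```

Note that the server-side code only ever touches model weights and dataset sizes. The `(x, y)` pairs are read exclusively inside `local_train`, which stands in for the code running on the user’s device.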
The challenges
Of course, there are many challenges to federated learning:
As the model is trained on the client side, how do we reduce its size?
How do we aggregate different models received from the client side?
[IMPORTANT] Privacy-sensitive datasets are typically biased by personal preferences and beliefs. For instance, in an image-related task:
Some clients may only have pet images.
Some clients may only have car images.
Some clients may love to travel, so most images they have are travel-related.
How do we handle such skewness in client data distribution?
What are the considerations for federated learning?
Lastly, how do we implement federated learning models?
These are some of the core topics we discussed in the following ML deep dive: Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
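Before deciding how to handle the skew in client data described above, it helps to quantify it. The sketch below is one simple way to do that, not a method taken from the deep dive: it compares each client’s label distribution with the pooled distribution using total variation distance (0 means identical, 1 means completely disjoint). The client names, label counts, and helper functions are hypothetical, and in a real federated setting the server would not see raw labels, so clients would have to share such statistics in a privacy-preserving way.

```python
from collections import Counter

def label_distribution(labels):
    """Fraction of samples per label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two label distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Hypothetical clients mirroring the examples above.
clients = {
    "client_a": ["pet"] * 40,                                   # only pet images
    "client_b": ["car"] * 40,                                   # only car images
    "client_c": ["travel"] * 30 + ["pet"] * 5 + ["car"] * 5,    # mostly travel
}

all_labels = [label for labels in clients.values() for label in labels]
global_dist = label_distribution(all_labels)

skew = {
    name: total_variation(label_distribution(labels), global_dist)
    for name, labels in clients.items()
}
for name, value in skew.items():
    print(name, round(value, 3))
# client_a 0.625
# client_b 0.625
# client_c 0.5
```

All three clients sit far from the pooled distribution, which is exactly the situation where a plain average of client models can drift away from what centralized training would produce.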
Why care?
The idea behind federated learning struck me as extremely compelling and smart when I first used it in a project at Mastercard.
In my experience, federated learning is one of those training paradigms that deserves much more attention.
I see a great utility for this technique in the near future.
This is because, lately, more and more users have started caring about their privacy.
Thus, more and more ML teams are resorting to federated learning to build ML models, while still preserving user privacy.
In this article, I have put down everything I learned during that exploration.
So, even if you know nothing about federated learning, you are good to go :)
👉 Read it here: Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
👉 Over to you: What are some other challenges with privacy-driven ML?
shouldn't another callout be that some of these large models have been proven to reproduce parts of their training set? not important for the efficacy of training, but easy way to get in trouble