Announcement: We are hiring (fully remote roles)!
At Daily Dose of Data Science, we’re creating the go-to platform for AI and ML professionals seeking clarity, depth, and practical insights to succeed in AI/ML roles—currently reaching 600k+ AI professionals.
We are looking for exceptional technical writers with expertise in AI and ML.
If you're interested, please fill out this hiring form →
Start Date: Immediate.
Location: Virtual.
Salary expectation: $40k-$120k per year.
Experience in community building is a plus but not required.
We’ll follow up with the next steps once you fill the form.
Accelerate tSNE with GPU
One of the biggest issues with tSNE is that its runtime scales quadratically with the number of data points.
Note: We discussed tSNE in complete detail and implemented it from scratch here →
Thus, it typically becomes difficult to use Sklearn's tSNE implementation when your data has 20k+ data points.
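For context, here is a minimal sketch of what a typical Sklearn tSNE run looks like (the dataset and parameter values are purely illustrative); at this scale, the call below can easily take several minutes on a CPU:

```python
import time

import numpy as np
from sklearn.manifold import TSNE

# Illustrative synthetic data: 20k points in 50 dimensions
X = np.random.rand(20_000, 50).astype(np.float32)

start = time.time()
X_2d = TSNE(n_components=2, perplexity=30).fit_transform(X)
print(f"Sklearn tSNE took {time.time() - start:.1f} seconds")
```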
tSNE-CUDA is an optimized CUDA version of the tSNE algorithm. Thus, it provides immense speedups over the standard Sklearn implementation:
As depicted above, the GPU-accelerated implementation is 33 times faster than the Sklearn implementation.
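tSNE-CUDA exposes a Scikit-learn-like API, so switching is mostly a matter of changing the import. Here is a minimal sketch, assuming the tsnecuda package is installed and a CUDA-capable GPU is available (the data and parameter values are illustrative):

```python
import numpy as np
from tsnecuda import TSNE  # GPU-accelerated tSNE

# Same illustrative data as before: 20k points in 50 dimensions
X = np.random.rand(20_000, 50).astype(np.float32)

# Note: only n_components=2 is supported (more on this below)
X_2d = TSNE(n_components=2, perplexity=30, learning_rate=200).fit_transform(X)
print(X_2d.shape)  # (20000, 2)
```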
That said, this implementation only supports n_components=2, i.e., you can only project to two dimensions.
The authors do not intend to support more dimensions since this will require significant changes to the code.
But in my opinion, support for more dimensions hardly matters because tSNE is used to generate 2D projections in 99% of use cases.
These are the benchmarking results by the authors:
They show that on the CIFAR-10 training set (50k images), tSNE-CUDA is 700x faster than Sklearn.
Further reading:
While this was just about tSNE, did you know we can accelerate other ML algorithms with GPUs too? Read this article to learn more: Sklearn Models are Not Deployment Friendly! Supercharge Them With Tensor Computations.
Also, do you know how tSNE works end-to-end? Read this article to learn more: Formulating and Implementing the t-SNE Algorithm From Scratch.
I prepared this Colab notebook for you to get started: tSNE-CUDA Colab Notebook.
👉 Over to you: What are some other ways to boost the tSNE algorithm?
Thanks for reading!