There are two ways it can be handled. The dropout paper says that during inference, the activations of that layer are multiplied by the retention probability p. Whereas in library implementations such as PyTorch and others, the compensation is done during training by scaling the activations. Great write-up. So true that many don't know about the scaling factors, and many senior people who have read the paper say that only the inference-time compensation is correct.
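For illustration, here is a minimal NumPy sketch of the two conventions (the layer, the keep probability, and the variable names are just placeholders, not any library's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
p_drop = 0.5           # probability of dropping a unit
p_keep = 1 - p_drop    # "p" in the paper's notation: probability of keeping a unit

x = rng.standard_normal(5)       # activations of some layer
mask = rng.random(5) < p_keep    # True = keep the unit, False = drop it

# Convention 1 (original paper): no scaling during training;
# compensate at inference by multiplying by the keep probability.
train_out_paper = x * mask
infer_out_paper = x * p_keep

# Convention 2 ("inverted dropout", common in libraries): scale the surviving
# activations by 1/p_keep during training; do nothing at inference.
train_out_inverted = x * mask / p_keep
infer_out_inverted = x

# Either way, the expected activation seen at inference matches training.
```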
Totally, Jaiprasad. Both ways are pretty widely used. Thanks for appreciating :)
Avi TBH, it would be very interesting if you'd create a blog post listing the practical ML questions (with answers) you've asked candidates during interviews!
Never knew this. Thanks a lot for the insights. Amazing that these techniques are implemented in the train() and eval() methods.
The best ML newsletter! Thanks Avi
Thanks so much, Damien :)
Hello,
As mentioned by Jaiprasad R in the comments, the dropout paper mentions multiplying w by p during the evaluation phase (all the neurons are present during evaluation, but the weights w are multiplied by p). I am new to PyTorch and TensorFlow, so I am not sure how they do it (see the quick sketch below).
Been following dailydoseofds for a long time. Thanks for all the great work!
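For what it's worth, a quick way to see how PyTorch handles it: nn.Dropout(p) takes the drop probability, rescales the surviving activations by 1/(1-p) in training mode, and acts as an identity in eval mode. A small illustrative check, not production code:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # p is the *drop* probability here
x = torch.ones(8)

drop.train()               # training mode: units are zeroed at random and
y_train = drop(x)          # survivors are scaled by 1/(1-p), so values are 0.0 or 2.0
print(y_train)

drop.eval()                # evaluation mode: dropout is a no-op,
y_eval = drop(x)           # no masking and no extra scaling
print(y_eval)
assert torch.equal(y_eval, x)
```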
Can you please tell me which online tool you are using to create these beautiful and eye-catching images?
Great article. Didn't know about this concept. Thanks Avi.
Great write-up.
Like this article. Can you suggest some resources from which we can gain a comprehensive understanding of ML? Maybe a book or something.
Great write-up, Avi!