There are two ways it can be handled. The dropout paper says that during inference, the activations of that layer are multiplied by the retention probability p. Whereas in library implementations such as PyTorch and others, the compensation is done during training by scaling the activations. Great write-up. So true that many don't know about the scaling factors, and many senior people who have read the paper say that only the inference-time compensation is correct.
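For illustration, here is a minimal NumPy sketch of the two conventions (the layer, the keep probability, and the variable names are just placeholders, not any library's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
p_drop = 0.5           # probability of dropping a unit
p_keep = 1 - p_drop    # "p" in the paper's notation: probability of keeping a unit

x = rng.standard_normal(5)       # activations of some layer
mask = rng.random(5) < p_keep    # True = keep the unit, False = drop it

# Convention 1 (original paper): no scaling during training;
# compensate at inference by multiplying by the keep probability.
train_out_paper = x * mask
infer_out_paper = x * p_keep

# Convention 2 ("inverted dropout", common in libraries): scale the surviving
# activations by 1/p_keep during training; do nothing at inference.
train_out_inverted = x * mask / p_keep
infer_out_inverted = x

# Either way, the expected activation seen at inference matches training.
```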
Totally, Jaiprasad. Both ways are pretty widely used. Thanks for appreciating :)
Avi TBH, it would be very interesting if you'd create a blog post listing the practical ML questions (with answers) you've asked candidates during interviews!
Never knew this. Thanks a lot for the insights. Amazing that these techniques are implemented in the train() and eval() methods.
The best ML newsletter! Thanks Avi
Thanks so much, Damien :)
Hello,
As mentioned by Jaiprasad R in the comments, the dropout paper mentions multiplying w by p during the evaluation phase (all the neurons are present during evaluation, but the weights w are multiplied by p). I am new to PyTorch and TensorFlow, so I am not sure how they do it (see the quick sketch below).
Been following dailydoseofds for a long time. Thanks for all the great work!
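For what it's worth, a quick way to see how PyTorch handles it: nn.Dropout(p) takes the drop probability, rescales the surviving activations by 1/(1-p) in training mode, and acts as an identity in eval mode. A small illustrative check, not production code:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # p is the *drop* probability here
x = torch.ones(8)

drop.train()               # training mode: units are zeroed at random and
y_train = drop(x)          # survivors are scaled by 1/(1-p), so values are 0.0 or 2.0
print(y_train)

drop.eval()                # evaluation mode: dropout is a no-op,
y_eval = drop(x)           # no masking and no extra scaling
print(y_eval)
assert torch.equal(y_eval, x)
```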
Can you please tell me which online tool you are using to create these beautiful and eye-catching images?
Great article. Didn't know about this concept. Thanks Avi.
Great write-up.
Like this article. Can you suggest some resources from which we can gain a comprehensive understanding of ML? Maybe a book or something.
Great write-up, Avi!