1 Comment
Mie

There is a typo in the mixed-precision loss computation. The loss should be calculated in float16 (the same precision as the weights) and then upcast to float32.
