Did you know that GPUs cannot divide?
What does this mean? It means the hardware has no native divide instruction, so the compiler must emulate division in software.
The result is a sequence of many instructions instead of one, and a much longer execution time than you might expect! And the same is true for square root!
GPU performance is usually marketed in FLOPS. But the devil is in the details: GPUs are designed for machine learning and graphics, workloads dominated by multiplication and addition. In finance, however, division and square-root calculations are everywhere!
The result: a GPU requires >26x more clock cycles than a CPU for a division, and almost 11x more for a square root!
![[Pasted image 20240711152520.png]]