PL5 DeepPrime machine performance prediction

I suggest that DeepPrime processing time can be modelled as three costs: some CPU cycles per pixel, some GPU flops per pixel, and a fixed overhead per image.

My estimates are:
700 CPU cycles per pixel (for rough cycles/second, take the CPU boost clock * 1.5 cores)
1.4 million GPU flops per pixel (for flops/second, take the highest of the GPU's FP32 (single), FP16 (half) or tensor ratings)
0.6 seconds of overhead per image.
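
To make it easier to plug in your own hardware, here is a rough Python sketch of the model. It assumes the three costs simply add up (CPU and GPU work do not overlap), which I have not verified. The function name, parameter names and the example machine (24 MP image, 4.0 GHz boost clock, 10 TFLOPS GPU) are made up for illustration only.

```python
# Rough DeepPrime time model using the three estimates above.
# ASSUMPTION: CPU time, GPU time and per-image overhead are additive.

CPU_CYCLES_PER_PIXEL = 700      # estimate from this post
GPU_FLOPS_PER_PIXEL = 1.4e6     # estimate from this post
OVERHEAD_SECONDS = 0.6          # estimate from this post
EFFECTIVE_CORES = 1.5           # boost clock * 1.5 gives rough cycles/second


def predict_seconds(pixels, cpu_boost_hz, gpu_flops_per_second):
    """Predicted seconds to process one image (hypothetical helper).

    pixels               -- image size in pixels (e.g. 24e6 for 24 MP)
    cpu_boost_hz         -- CPU boost clock in Hz
    gpu_flops_per_second -- highest of the GPU's FP32, FP16 or tensor ratings
    """
    cpu_seconds = pixels * CPU_CYCLES_PER_PIXEL / (cpu_boost_hz * EFFECTIVE_CORES)
    gpu_seconds = pixels * GPU_FLOPS_PER_PIXEL / gpu_flops_per_second
    return cpu_seconds + gpu_seconds + OVERHEAD_SECONDS


if __name__ == "__main__":
    # Illustrative machine, not a measurement: 24 MP, 4.0 GHz, 10 TFLOPS.
    print(f"{predict_seconds(24e6, 4.0e9, 10e12):.2f} s per image")
```

For that made-up machine the model gives 2.8 s of CPU time, 3.36 s of GPU time and 0.6 s of overhead, about 6.76 s per image in total. Comparing that kind of figure with your own timings is the easiest way to check the three constants.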

These estimates would be helpful for predicting the performance of new machines or upgrades.

Would anyone like to confirm or improve these estimates with timings from their own machines?