PL4 GPU benchmarking?

I updated the spreadsheet with my results in PL5, along with quite a variety of different settings added as notes. I imagine these settings would make more difference on larger exports.
I’m currently on 16GB of RAM but expect to upgrade to 32GB in the next few days, at which stage I plan to re-run all tests to allow a direct comparison of different RAM sizes.

Couple of issues - I couldn’t find where you guys got GPU GFLOPs figures from. I can’t see the source for R5 and 1000D images. If anyone can point me in the right direction I’ll populate them too.

I couldn’t find where you guys got GPU GFLOPs figures from.

@rymac
Here you can find the specs of each GPU.

I have added my results with MacBook Pro Silicon 16".
I have found and processed Egypt, R6, 90D, D850 but not R5 or 1000D.

I’ve tidied the spreadsheet up a bit. Someone had left filters on, which were hiding most results. I’ve now removed these and added missing formulas / cell shading to some users’ PL 5.X results. When time permits I’ll set the spreadsheet up to do this automatically for new entries.

There’s no link for the R5/1000D test images. @Savay, as the only person to run these, are you able to provide one, either here or in rows 7/8 of the spreadsheet?

I’ve now automated the cell shading on the spreadsheet for V5.X of Photolab.
Are there any other automations/features that people would like to see?

Is it possible for an admin to split this topic? Would be good to have a thread specifically for the spreadsheet with the initial post being a link to the spreadsheet, rather than a PL4 specific topic with the sheet buried in it.

Settings for image correction make a big difference. I’ve just run the D850 test with PhotoLab 5.1.1 on macOS 12.1 on a MBP 14 M1 Pro 10/16 16GB memory.

With just DeepPrime turned on at 40, the five images processed in 32s. With my default set of corrections for D850 files (Leica M9 colour, lens sharpening, auto-horizon, auto-crop), the process time jumped to 36s. With another more intensive set of corrections but no local corrections, processing jumped to 45s.
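To line these batch totals up with the per-image figures quoted elsewhere in the thread, here is a minimal sketch. It assumes each total above covers exactly the five D850 images; the labels and the function name are illustrative shorthand, not PhotoLab terms.

```python
# Batch totals (seconds) from the D850 test above, five images per run.
# The labels are illustrative shorthand, not PhotoLab preset names.
BATCH_SECONDS = {
    "DeepPRIME only": 32,
    "default corrections": 36,
    "intensive corrections": 45,
}
IMAGES_PER_BATCH = 5


def per_image_seconds(total_seconds: float, images: int = IMAGES_PER_BATCH) -> float:
    """Average export time per image for one batch."""
    return total_seconds / images


baseline = BATCH_SECONDS["DeepPRIME only"]
for label, total in BATCH_SECONDS.items():
    # Extra time relative to the DeepPRIME-only run, as a percentage.
    overhead = (total - baseline) / baseline * 100
    print(f"{label}: {per_image_seconds(total):.1f} s/image, +{overhead:.1f}% vs DeepPRIME only")
```

On these numbers the extra corrections cost between roughly an eighth and two fifths more time per image than DeepPRIME alone.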

These times are quite extraordinary, as my MBP is currently overloaded: I didn’t close many apps before starting up PhotoLab 5 (I didn’t feel like restarting all my other work), memory pressure is at 63 with 9 GB of swap, and I’m tabbing back into my browser, the other application using the most memory, to keep working during the testing.

Great export times don’t prevent PhotoLab 5 from misbehaving on an M1 Pro Mac. Memory for PhotoLab 5 is at 10.5 GB (would be about 4 GB on an Intel Mac). Spotify playback got choppy while exporting and typing slowed down (nothing like as bad as the base M1 Mac Mini with 8/8 configuration though).

The main reason I did these tests right now was a result for the M1 Pro from forum name 4 which I couldn’t believe: 31s, while there is a conflicting result of 77s for the same M1 Pro 10/16 configuration.

This is half the time of the Radeon 5500XT in the charts and faster than or competitive with some of the most powerful GPUs in existence. My Radeon VII with AMD hardware acceleration enabled (Apple disables it by default, one has to add OpenCore to get it) did outperform the M1 Pro (5 or 6 sec per image with the full set of adjustments including DeepPrime) but I don’t have it hooked up right now.

In any case, the M1 Pro does get these good numbers, which I would have expected from the M1 Max but not the M1 Pro. I wonder if the M1 numbers on PhotoLab 5 (almost as good at 35s) hold up to scrutiny. I don’t have a plain M1 here any more to test against. When I tested on my own files, there was a huge difference with PhotoLab 5 on an M1 Mac Mini vs PhotoLab 4.


Just dug up the previous test I ran on 61 D850/D810 images in a real set on an M1 Mac Mini with 8GB memory.

  • PhotoLab 4 - 32m
  • PhotoLab 5 - 10m38s
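As a quick check on the ratio between these two runs, a small sketch that converts the durations to seconds and divides them. The `to_seconds` helper is a hypothetical convenience written for this post; only the two durations and the 61-image count come from the test itself.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration such as '32m' or '10m38s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", duration)
    if not match or not duration:
        raise ValueError(f"unrecognised duration: {duration!r}")
    minutes, seconds = (int(part) if part else 0 for part in match.groups())
    return minutes * 60 + seconds


IMAGES = 61                  # D850/D810 files in the real-world set
pl4 = to_seconds("32m")      # PhotoLab 4
pl5 = to_seconds("10m38s")   # PhotoLab 5

print(f"speed-up: {pl4 / pl5:.2f}x")
print(f"per image: PL4 {pl4 / IMAGES:.1f} s, PL5 {pl5 / IMAGES:.1f} s")
```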

That is about 3x faster, so the M1 35s result looks reasonable. This suggests that for PhotoLab export it doesn’t make much difference which M1 Mac one chooses (M1, M1 Pro, M1 Max). Again, I don’t find the editing experience under Rosetta particularly good in comparison to my Intel Macs, with small delays and micro-stutter as well as those memory leak issues, so I don’t recommend moving to M1 architecture for PhotoLab at all, pushing those of us without Catalina/Big Sur/Monterey Macs to upgrade to either a poor user experience or to Intel Macs which will soon be made obsolete.

All the more reason for DxO not to have cut off Mojave before its time. Capture One 22 does run on Mojave.

Hello,
I have updated my results on the “DxO DeepPRIME Processing Times” spreadsheet:
• PhotoLab 5 (v.5.1.3) with Windows 11 and the latest GPU driver
• PhotoLab 4 (v.4.3.6) with Windows 11 and the latest GPU driver

Summary
For the same hardware platform:
• Upgrading from PhotoLab 4 to PhotoLab 5 is well worthwhile: times are almost halved.
• Upgrading to the latest GPU drivers, the latest minor version of DPL, and Windows 11 is also worthwhile: at least a 15% increase in performance.

Hi everyone! Sorry in advance for the spam and for my level of English.
I have a GTX 680 2GB GPU with a quite old Phenom CPU. DxO PhotoLab (PureRAW too) only partially supports this GPU and processes photos at about 0.08 MP/sec using it. The CPU is even slower. Is it worth upgrading to an AMD APU (not a plain CPU) so that I can use its Vega graphics for processing while keeping my 680 installed at the same time? On paper Vega should perform at least two times worse (lower FLOPS count than the GTX 680). However, the table shown in this thread says that Vega is 2x faster than the RX 560 I used before (and the GTX 680 is much slower in DxO than the RX 560).
Are there any ways to tweak the system to increase performance, or is there no way around an upgrade? I only have money to buy a modern CPU with at least SSE 4.1 instructions and a GPU at least at the 1050 level. Is a 2400G or another APU a good purchase for me?
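To put the 0.08 MP/sec figure in perspective, a rough sketch of the time per image at a steady throughput. The megapixel sizes are approximate values assumed for illustration (they roughly match RAW sizes discussed elsewhere in this thread), and the function name is made up for this example.

```python
def seconds_per_image(megapixels: float, throughput_mp_per_s: float) -> float:
    """Time to process one image at a steady throughput in megapixels per second."""
    if throughput_mp_per_s <= 0:
        raise ValueError("throughput must be positive")
    return megapixels / throughput_mp_per_s


GTX680_MP_PER_S = 0.08  # throughput reported in the post above

# Approximate sensor sizes, assumed for illustration only.
for name, mp in {"10 MP RAW": 10.0, "20 MP RAW": 20.0, "32.5 MP RAW": 32.5}.items():
    secs = seconds_per_image(mp, GTX680_MP_PER_S)
    print(f"{name}: {secs:.0f} s (~{secs / 60:.1f} min)")
```

At that rate even a small RAW takes a couple of minutes, which is why the question of an upgrade comes up at all.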

I basically only ran those to check the scaling vs. PL4 on different RAW types and sizes, and to check whether there are some (CPU) bottlenecks with smaller batches and individual RAWs even on a 5950X (which seems to be the case, at least with the single GFX50 RAW, and there are some slight hints with the 1000D and R6 batches).

Unfortunately I don’t hold any copyright at all for the 20 R5 RAWs; I downloaded them as examples from DPR.
Maybe someone else can provide 20 R5 RAWs?!
However, I’m under the impression that DeepPrime is completely content agnostic and the image content itself doesn’t matter at all, since it is basically crunching all the pixels of each file no matter what.
So theoretically you can use 20 copies of an identical RAW or 20 completely random ones, and the results will be the same. In theory you can use any R5 RAW you stumble upon.

But I think for now the 20 90D RAWs (for which I do have the copyright), with their 32-33 MPix, are sufficiently close to capping out even the fastest GPUs without getting limited by the memory subsystem or the CPU too much, and without the time for one complete run getting too out of hand on slower systems.

The 1000D RAWs, on the other hand, seem to be a tad too small to be of any further interest, since they are only 10 MPix per image.
Even the R6 batch with its 20 MPix files seems to show slight effects of some kind of bottleneck (CPU? I/O?) on the fastest systems…at least with PL5.

The results you mentioned are not conflicting!
One run is performed on the GPU…the other one on the NPU (aka AI accelerator, aka the 16-core Apple Neural Engine or ANE, with its ~11 TOPS, which equals ~11 TFLOPS for fp16). Hence also the difference in the coloured cells in the “GFLOP” rows.

All M1s (and also the A14) have the same 16-core ANE. The Max and the Pro even have the same CPU.
Only the memory controller differs slightly…so it’s no wonder those two perform basically the same while being a tad faster than a regular M1.

In fact, the presence of the same NPU in each A14/M1 derivative suggests that even an iPhone 12 or an iPad Mini could theoretically perform in almost the same ballpark (lower power limit, fewer CPU cores and weaker memory subsystem aside…).

In fact, this also suggests that DeepPrime itself could in theory run reasonably well on some ARM SoCs used for Android devices, Chromebooks or Windows on ARM, since those have pretty strong dedicated AI/ML accelerators too (especially the Snapdragons and Exynos).
Some of the newer ones are almost twice as fast as Apple’s (24-26 TOPS).

I added my results for my new pc which has an i5 12600k (just using built in UHD770 graphics) with Windows 10 and 11.

Windows 10 was not using the performance cores; only the E-cores were used, and as a result it took a painstaking 337s to process the ‘Egypte CPU only’ test. Windows 11, by contrast, used all the cores and did it in 72s.

So on the new 12th gen Intel chips it is worthwhile upgrading to Windows 11 if you are using only the CPU to process images.
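For a sense of scale, the two timings above can be turned into a ratio and a percentage saving. This is only a small sketch; the `speedup` function is a hypothetical helper, and the two numbers are the ones reported in this post.

```python
def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the second run is than the first."""
    return slow_seconds / fast_seconds


win10 = 337  # seconds, E-cores only
win11 = 72   # seconds, all cores in use

print(f"Windows 11 is {speedup(win10, win11):.1f}x faster")
print(f"time saved: {(1 - win11 / win10) * 100:.0f}%")
```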

Is this still the latest thread for this benchmark? I just added my results with DeepPRIME for the Egypte 111MB image using Photolab 6 trialware under Windows 10.

GPU (Radeon RX 470 with 4 GB) - 41 seconds
CPU (Intel i7-4770K with 16 GB) - 208 seconds

I just added my results on a 4080/16. I’m surprised to see how close it is to the 4090 in these benchmarks but not sure how my other hardware may be impacting the scores. Would love to see more 4080 and 4090 along with the new AMD 7900 GPUs.

I have added four rows to the spreadsheet with results for my new Apple Macbook Pro 14 (10C CPU, 16C GPU, 16C ANE). The combinations are for DeepPRIME and DeepPRIME XD using acceleration setting of Apple Neural Engine or GPU Apple M1 Pro. Obviously worth changing the setting from the default (GPU) to use ANE.

Hello,
I updated the graph I made (see here) showing processing speed with DeepPRIME as a function of GPU FP32 score.
I have added the new results for PhotoLab 4, 5, 6, and 6 with DeepPRIME XD.

Please don’t forget to complete the Google spreadsheet with your own results.

Regards,

There’s definitely a good-enough point where PhotoLab works just fine for live editing and export speeds reach 10 to 15 sec per image. One doesn’t have to buy the most expensive graphics card. Almost any card with 8GB VRAM will put a user in the top bracket. Many of the 4GB cards will do fine too. Sorry, I forgot that some people want to use the very slow Prime XD function to generate artificial textures on large batches of images. I’m sticking to Prime.

Looking more closely at the Google spreadsheet it looks like the arbitrary cutoff point for good Prime processing and no slow down in real time previews would be an nVidia GTX 980/1060 or an AMD 480X/580X. Those are all old cards and available for €200 or less. Trying to use DxO PhotoLab 5 or 6 without a card in this category or close is somewhat masochistic.

I recently upgraded my PC (built a new one):

  • WIN 11 PRO
  • PL6, v6.2.0
  • Asus PRIME Z790-P WIFI
  • i7-13700KF, 16 cores
  • 32 GB DDR5
  • Asus NVIDIA GeForce RTX 3060, 4095 MB GDDR6

Tested with test image “Egypte” to jpg, no resizing, DxO Standard preset.

  • High Quality 5 seconds
  • Prime 18 seconds
  • Deep Prime 5 seconds
  • Deep Prime XD 14 seconds

Generally, this is a good result considering earlier experiences. I have been able to export several images at the same time while still being able to do editing work on the next ones without problems. PL can now be used economically in my workflow. I hope this computer will be good for software improvements a long way into the future.

I am curious why the legacy PRIME seems to be slower than the newer versions. I gladly avoid choosing it any more and feel no need to use it.

Unlike DeepPRIME and DeepPRIME XD, PRIME makes no use of your graphics card’s GPU. The only reason that PRIME still exists is for those people whose GPU is not supported for DeepPRIME and DeepPRIME XD processing. In that case, PRIME would be a much faster option, especially on an older and slower computer.

Mark

It must be something like that.
For me it is good to know that Deep Prime is as fast as High Quality and can be used for any image unless XD is wanted.
I am really happy with this now!

Good Morning,

I want to add my results to the table but I can’t download any of the images.
Does anyone have an archive of all the images?

Conny