Some kind of community-generated benchmark for various GPUs would be useful. But it should be based on a standard set of images available to everyone. I’m not sure if there’s an appropriate set of software to capture this information (including CPU, memory, etc.), either. I’ve tried hand timing results from PL4 but my precision isn’t very good.
Simply agree which files to download from a web site like
A large file, like one from a Nikon D850, will extend the processing time and improve accuracy.
If you use, say, 5 files and simply batch process them with only the default DXO correction and DeepPrime enabled, then DXO gives a readout of how long it took to produce the batch.
People can then report the time and CPU/GPU used.
DXO will report the time like this:
Well, I've already given some numbers for a Radeon RX Vega64 under Windows 10 a little bit earlier.
24 images of 32 MPix with my standard preset + DeepPrime are finished in around 180 seconds.
So it runs somewhere between 3-5 MPx/sec depending on the files, batch length and corrections applied.
The CPU does still seem to play a role though, since I've seen numbers from a 32C Zen2 Threadripper with a GTX1660 crunching a single 50 MPix file in 8-10 seconds.
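For comparing runs with different image counts and sensor sizes, the MPix/sec figure is just total megapixels divided by wall-clock seconds. Here is a minimal Python sketch of that arithmetic using the numbers quoted above; the function name is made up for illustration and has nothing to do with DxO itself.

```python
def mpix_per_second(image_count: int, mpix_per_image: float, seconds: float) -> float:
    """Throughput in megapixels per second for one batch export."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return image_count * mpix_per_image / seconds


# 24 images of 32 MPix in about 180 seconds (the Vega 64 batch above).
vega64 = mpix_per_second(24, 32, 180)

# One 50 MPix file in 8 to 10 seconds (the Threadripper + GTX 1660 report).
threadripper_slow = mpix_per_second(1, 50, 10)
threadripper_fast = mpix_per_second(1, 50, 8)

print(f"Vega 64: {vega64:.2f} MPix/s")
print(f"Threadripper + GTX 1660: {threadripper_slow:.2f} to {threadripper_fast:.2f} MPix/s")
```

That works out to about 4.3 MPix/s for the Vega 64 batch and 5 to 6.25 MPix/s for the Threadripper system. The two are not directly comparable, since the files and corrections differ.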
I happen to have an RX580/8GB card here left over from my bitcoin mining days. I repeated the test on the Egypte 111MB image.
AMD Ryzen 2700x CPU, RX580/8GB took 29 secs to output.
AMD Ryzen 2700x CPU, Nvidia 1060/6GB took 35 secs to output.
That sounds reasonable. How about people download these selected images:
- Images #4 through #8 (wedding couple - 5 RAW NEF - NOT JPG - images): nikon_d850_04.nef through nikon_d850_08.nef. These images have ISOs between 1800 and 12800, which should be a good test of DeepPRIME noise processing. NOTE: DxO PL reported ambiguity in which lens was used; I guessed it was an AF-S NIKKOR 105mm f/1.4E ED.
- This is a 111MB image.
Report the total DxO DeepPRIME processing times (IN SECONDS) on this spreadsheet: https://docs.google.com/spreadsheets/d/1Yx-3n_8D3OreyVwQLA-RVqiQCAgNdXDkxScpXTur1Gs/edit?usp=sharing
(I added times reported by dma.)
Excellent spreadsheet. I went in and updated my numbers and added some of the wedding photos. I also reran my Egypte photo runs a couple of times and found that each run was coming out the same at 29 secs, so I updated the spreadsheet to reflect that.
jch2103's Google Sheets spreadsheet is starting to get populated with data. It really looks like DXO has tuned PL4 for an Nvidia 1060 class GPU. I tried an RTX2070 I had here and it was only 10% faster. That is a big jump in cost for only a 10% performance improvement.
What output file type is being produced: JPG, DNG or TIFF? Does this influence the result?
I guess it may depend on the file write time. Choosing a full size export should not depend on file type until DeepPRIME is working.
I used JPG, as my drive with the test photos is an HDD.
My data in the spreadsheet is for jpg output. Read from SSD. Write to SSD.
I have tested several combinations (TIFF/JPEG, with/without resize) on 3 different PCs and none had any significant impact on the numbers.
But that was always from and to SSDs (SATA and M.2, depending on the system).
All my export times were for jpg output. I amended the spreadsheet to suggest that.
I added results with my RTX 3080.
I used an NVMe SSD to be sure storage has no impact on the numbers.
Some general observations.
Canon 6DII file of the night sky, ISO 6400.
PL3: export a 16-bit TIFF, PRIME at 60.
PL4: export a 16-bit TIFF, DeepPRIME at 60.
PL3: 28 seconds. PL4: 17 seconds. Fans rev up with PL3, not with PL4.
The screenshot shows the external GPU usage during export.
I’ve added my DeepPrime times to the spreadsheet.
So far, results have been interesting. It looks like most of my CPU cores are being used, albeit at less than 100% utilization, and the GPU is definitely in play given my fast render times (approximately 8.5 20-megapixel Sony images per minute on a 1080Ti), but weirdly Task Manager is showing my GPU utilization as basically idle for the duration. My laptop, with a 2060, is a bit slower (approximately 6 images per minute), and I can see similar levels of CPU usage, with a GPU that, again, shows as idle most of the time, but spikes to 100% a few times per minute.
I have no idea if any of the utilization numbers I’m seeing are accurate. I’m guessing I’m missing some of the GPU spikes on the 1080Ti if they’re really short, but given that I’m not seeing either CPU or GPU running flat out, RAM is plentiful, and all of my data is stored on an SSD, I have to wonder where the bottleneck is. I’m inclined to think GPU, but it’s certainly not obvious from the numbers that I’m seeing.
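One way to picture the guess about missed spikes: if the GPU only works in short bursts, a monitor that takes one instantaneous reading per second can land between bursts every time. The Python sketch below is only a toy model with made-up burst lengths (200 ms of work once per second). It says nothing about how Task Manager or any other tool actually measures GPU load.

```python
def busy_at(t_ms: int, period_ms: int, burst_ms: int) -> bool:
    """Toy load: the GPU is busy for the first burst_ms of every period_ms."""
    return (t_ms % period_ms) < burst_ms


def sampled_utilization(duration_ms, sample_every_ms, offset_ms, period_ms, burst_ms):
    """Average of instantaneous busy/idle samples, as a percentage."""
    samples = [
        busy_at(t, period_ms, burst_ms)
        for t in range(offset_ms, duration_ms, sample_every_ms)
    ]
    return 100.0 * sum(samples) / len(samples)


# Hypothetical numbers: a 200 ms burst once per second, over one minute.
PERIOD_MS, BURST_MS, DURATION_MS = 1000, 200, 60_000

true_util = 100.0 * BURST_MS / PERIOD_MS
# One reading per second, taken half a second into each period.
coarse = sampled_utilization(DURATION_MS, 1000, 500, PERIOD_MS, BURST_MS)
# One reading every 50 ms.
fine = sampled_utilization(DURATION_MS, 50, 0, PERIOD_MS, BURST_MS)

print(f"true {true_util:.0f}%, 1 s sampling {coarse:.0f}%, 50 ms sampling {fine:.0f}%")
```

In this toy case the once-per-second readings report 0% while the real duty cycle is 20%, and the 50 ms readings recover the 20%. Real monitoring tools may average over their interval instead of taking snapshots, so treat this as an illustration of the idea and not a diagnosis.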
Only 36 sec (DeepPrime, Egypte) on an Nvidia P2000, so that’s around 3 Mpix/sec.
It should not be that fast.
CPU - i9 7900
64 GB RAM (DDR4).
I guess the amount of RAM made all the difference here?
DXO performance settings:
Maximum cache size: 20000 MB
Maximum number of simultaneously processed images: 16
The P2000 is a humble performer; something is not adding up here.
Would developers be in a position to comment?
Task Manager in Windows is not accurate for GPU usage; it also shows near 0% when running DeepPRIME.
Afterburner, which is more accurate, shows around 50% usage for the RTX 3080, but power draw is around 80% of the limit.
I also need at least 2 simultaneously processed images to reach max Mpix/sec.
I’ll do some more tests to see if I can reach higher speed by changing some parameters.
That makes sense. I’ve just fired up a test run with Afterburner monitoring, and that’s showing GPU spikes up to around 65% in places. GPU usage definitely looks like it’s in short bursts, so I’m guessing there’s still room for optimization somewhere, whether in terms of hardware or software.
I was planning on upgrading from the 6700K when Zen 3 ships, and going up from 16 to 32GB RAM as well, so I guess it’ll be interesting to see if a faster CPU with more cores can help keep the GPU loaded, or whether I’m purely GPU bound by the 1080Ti at this point.
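For Nvidia cards, another way to log utilization and power draw during an export is to poll `nvidia-smi`, which ships with the driver. The Python sketch below assumes `nvidia-smi` is on the PATH and reads only the first GPU it lists. The helper names are made up for illustration, and each call to `nvidia-smi` takes some time itself, so the real sampling interval will be longer than the one requested. It will not work for the AMD cards mentioned in this thread.

```python
import subprocess
import time

QUERY = ["nvidia-smi", "--query-gpu=utilization.gpu,power.draw",
         "--format=csv,noheader,nounits"]


def parse_sample(line: str):
    """Parse one CSV line such as '47, 251.32' into (util_percent, watts).

    Fields that are not numeric (for example '[N/A]') come back as None.
    """
    def to_float(text):
        try:
            return float(text)
        except ValueError:
            return None

    util, power = (field.strip() for field in line.split(",")[:2])
    return to_float(util), to_float(power)


def poll(seconds: float = 60.0, interval: float = 0.2):
    """Collect samples from the first listed GPU for roughly `seconds`."""
    samples = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        samples.append(parse_sample(out.splitlines()[0]))
        time.sleep(interval)
    return samples


def summarize(samples):
    """Return (mean, peak) GPU utilization, ignoring unreadable samples."""
    utils = [u for u, _ in samples if u is not None]
    if not utils:
        raise ValueError("no readable utilization samples")
    return sum(utils) / len(utils), max(utils)
```

To try it, start a batch export in PhotoLab and then run `summarize(poll(60))` in a Python session. A low mean with a high peak would fit the short-burst pattern described above.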
I added times for my Surface Book 2 notebook w/ GTX1050, both for running on AC and on battery. Not surprisingly, times are better on AC (D850 images 196 sec, Egypte image 64 seconds) than when on battery (D850 images 252 sec, Egypte image 89 seconds). The integrated Intel 620 GPU processor is marked with an asterisk by DxO (i.e., don’t use); sure enough, it creates errors when trying to run.
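To put the Egypte numbers posted in this thread side by side, here is a small Python sketch that sorts them and shows each time as a multiple of the fastest one. The times are copied from the posts above. The labels are informal, and the systems differ in CPU, RAM and settings, so this is not a pure GPU comparison.

```python
# Egypte (DeepPRIME) export times in seconds, as reported in this thread.
egypte_seconds = {
    "RX 580 8GB (Ryzen 2700X)": 29,
    "Nvidia 1060 6GB (Ryzen 2700X)": 35,
    "Nvidia P2000 (i9 7900)": 36,
    "GTX 1050, Surface Book 2 on AC": 64,
    "GTX 1050, Surface Book 2 on battery": 89,
}

fastest = min(egypte_seconds.values())

# Sort fastest first and show each time as a multiple of the best one.
ranking = sorted(egypte_seconds.items(), key=lambda item: item[1])
for name, seconds in ranking:
    print(f"{seconds:>3} s  {seconds / fastest:4.2f}x  {name}")
```

On these figures the Surface Book 2 on battery takes roughly three times as long as the RX 580 system for the same image.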