PhotoLab Export Worker Management bug PL7.11.1 & PL8.2.1

@DxO_Support-Team I have already submitted a support request, prompted by @Jim_S and the fault he reported in “Has anyone noticed 8.2.1 doesn't delete core task after processing”, and will update it with a reference to this post.

The fault was reported as DxPL not closing all export workers and that is certainly the main manifestation of the problem but I believe that someone in DxO has been working with the export worker management code and completely screwed it up, to use a technical term.

The tests were conducted with NO NR images (for speed) on a Ryzen 5900X (with an RTX 3060, which makes no odds in this case).

This is a video of an extreme case with PL8.2.1 using 6 export workers, which is unlikely to occur in real life but shows why this is not just some new alternative to the old way of handling export tasks but a complete mess in the export worker handling code.

This export works reasonably well until image 16 of 20. At that point DxPL cannot make up its mind as to what it should be doing with the export worker processes: which workers should be shut down, how many should be left running, what day of the week it is, etc. The resulting export time for all 20 images is nothing short of appalling!

So the tests were conducted first with 10 images but once the worker count went above 5 the management of export workers seemed even more confused than it had been. Changing to 20 images produced a more reproducible scenario but still with the wrong outcome!

The summary table for both runs shows the issue with 6 export workers

and the final run looks like this

with 2 export workers left alive, taking up main memory and GPU card memory.

So testing with 10 images with PL8.2.1 yielded

  1. 1 export worker left when 1 copy selected
  2. 2 export workers left when 2 copies selected
  3. 3 export workers left when 3 copies selected
  4. No export workers left when 4 copies selected
  5. 2 export workers left when 5 copies selected
  6. 2 export workers left when 6 copies selected

So testing with 20 images yielded a slight difference when 5 copies have been selected

  1. 1 export worker left when 1 copy selected
  2. 2 export workers left when 2 copies selected
  3. 3 export workers left when 3 copies selected
  4. No export workers left when 4 copies selected
  5. No export workers left when 5 copies selected
  6. 2 export workers left when 6 copies selected

Tests with PL7.11.1 and 20 images also yielded similar results

The same confusion with 6 workers was present in this case but on image 15 I believe and the export copies left running were the same as for the runs with PL8.2.1 and 20 images.

  1. 1 export worker left when 1 copy selected
  2. 2 export workers left when 2 copies selected
  3. 3 export workers left when 3 copies selected
  4. No export workers left when 4 copies selected
  5. No export workers left when 5 copies selected
  6. 2 export workers left when 6 copies selected

Finally testing with PL7.10.0 on an i7-4790K (less than a quarter of the Passmark of the 5900X) yielded the following when testing 6 then 4 then 2 copies (sorry about the order).

No residual export processes were left at any point of the testing but there did appear to be a bit of a hiatus with 6 copies selected part way through the process but it didn’t have a significant impact on the results.

Based on your interesting observations I did a few tests with the maximum number of pictures exported to SSD simultaneously set to 1, 2, 3 and 4 in the settings dialog.

The relevant process is this one “DxO.PhotoLab.ProcessingCore.exe --listening --port=20008 --timeout=7200” where PL starts one per each picture processed simultaneously.

Normally the Windows 10 Task Manager is sufficient to observe the various tasks. In this specific case it does not display PL and its sub-tasks properly, so I have used Process Explorer, which displays them correctly.

If the setting is set to 1 or 2 PL starts one or two “DxO.PhotoLab.ProcessingCore.exe --listening --port=20008 --timeout=7200” tasks to export one or two pictures simultaneously to SSD. After the export completes these tasks remain in memory and are not released. The remaining tasks are used again if further pictures are exported to SSD, but they are never released from memory.

If the setting is set to 3 or 4 PL starts three or four “DxO.PhotoLab.ProcessingCore.exe --listening --port=20008 --timeout=7200” tasks to export three or four pictures simultaneously to SSD. After the export completes the tasks are released from memory one by one, so that no task remains in memory once all the pictures have been exported.
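For anyone who wants to compare worker counts between runs without eyeballing Process Explorer, here is a minimal Python sketch. It assumes the command lines have been copied out of Process Explorer (or any other tool) as plain text; the helper names `parse_worker_command` and `count_workers` are made up for this illustration and are not part of PhotoLab. It also assumes the `--timeout` value is in seconds, which the command line itself does not state.

```python
import shlex

WORKER_EXE = "DxO.PhotoLab.ProcessingCore.exe"


def parse_worker_command(command_line):
    """Return the --key=value options of an export worker command line.

    Returns None when the command line does not belong to an export worker.
    Flags without a value (e.g. --listening) are stored as True.
    """
    # posix=False keeps Windows backslashes intact.
    parts = shlex.split(command_line, posix=False)
    if not parts:
        return None
    executable = parts[0].strip('"').lower()
    if not executable.endswith(WORKER_EXE.lower()):
        return None
    options = {}
    for part in parts[1:]:
        if part.startswith("--"):
            key, _, value = part[2:].partition("=")
            options[key] = value if value else True
    return options


def count_workers(command_lines):
    """Count how many of the given command lines are export workers."""
    return sum(1 for line in command_lines
               if parse_worker_command(line) is not None)


if __name__ == "__main__":
    line = ("DxO.PhotoLab.ProcessingCore.exe "
            "--listening --port=20008 --timeout=7200")
    options = parse_worker_command(line)
    # Assumption: the timeout is expressed in seconds.
    print(options, int(options["timeout"]) / 3600, "hours")
```

Feeding it the process list captured before and after an export gives the number of workers left behind for each setting.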

PL recommends exporting 2 pictures simultaneously; 4 is the maximum possible with my hardware and maxes out the CPU. GPU use is very low during export to SSD.

@winnie_pooh Thank you for confirming my findings, my normal simultaneous copies setting is 3 on the 5900X, 2 on the 5600G and 2 on the i7-4790K.

The 5900X can handle more when DxPL is doing its job properly, but the additional copies don’t make much difference if DP XD(2s) is being used, i.e. it is the GPU that is the bottleneck, and increasing the number of export workers simply means more processes queuing for the GPU!

In the example above I am also using Process Explorer, which I have used on every machine since way before Microsoft acquired the software house that produced that and other utilities.

You know that the export device used has little impact on the time it takes to export, the main time is for the noise reduction done via the GPU. For my examples above I left the noise reduction out of the equation because I wanted the test runs to complete quickly.

In fact I had moved the test data to my N: drive, a 4k/5K NVME, just to keep run times as short as possible, they normally reside on an HDD.

In my original tests I had tested with 2 copies and then ran a rather random test with 5 copies, the 2 copy test left two export workers around but the 5 copy test first closed down the 2 “stuck” copies from the previous test and then ramped up to 5 copies running before finishing the way that it always has, i.e. by terminating all copies of the export worker.

So when I set up the tests for this topic I went from 1 to 2 export workers and then up to 6 in steps of 1, and it was when exporting with 6 copies that PL7.11.1 and PL8.2.1 really showed the chaos that the export worker management has become.

With 16 out of 20 exports complete, i.e. a point when 6 copies are definitely not needed any more, DxPL closed all but one copy then dithered and started another and stopped that one and as a result wasted an inordinate amount of time.

I tried to repeat the test last night with “VideoProc Converter AI” running to capture the bizarre behaviour but DxPL flagged errors on 5 runs until I restarted and ran a test without the video recorder running when it successfully exported all 20 images with the behaviour I have already described.

I did another restart and started the video recorder app, and this time DxPL crashed completely after producing a number of errors and nearly all the rest of the images. I have the videos where it created errors.

Today I repeated the test with V7.11.1 with the video capture running and 6 copies; it exported without any errors but with the same strange behaviour that I reported.

I then ran PL8.2.1 with the recorder running but with 2 copies and it ran without errors and left two export processes still running. I increased the copies to 5 with the recorder running and it repeated the reported behaviour and finished with no export copies running.

Finally I completed the sequence with a run of 6 copies and the video recorder running and finally got the recording I was trying for with no errors.

I will create a video showing the errors from various runs and the successful video and post them here tomorrow.

I terminated and restarted PL8.2.1 with 6 export workers selected, a few minutes ago, started the recorder and wound up with the following

@winnie_pooh, @Jim_S This is the response I got back from DxO Support

Hello Bryan,

Thank you for your patience while we reviewed this issue.

After internal discussions, we can confirm that DxO PhotoLab has not changed how export workers are created or terminated in recent versions (PL7/PL8). Worker processes may occasionally remain active after an export, influenced by several factors, including:
• The number of images exported
• The number of parallel exports set in preferences
• The available system resources (RAM, CPU)

While this behavior may not have been noticed, it is not necessarily a problem. The workers are designed to shut down automatically after 30 minutes of inactivity or when DxO PhotoLab is closed.

If you are experiencing performance issues due to this behavior, you can try:

1. Manually closing DxO PhotoLab after the export completes to ensure all processes are terminated.
2. Reducing the number of parallel exports in your settings to see if that affects the observed behavior.
3. Monitoring your system’s resource usage through Task Manager to see if the running workers significantly impact performance.

That said, if you notice inconsistencies in how the worker processes behave across different export settings, we would be happy to investigate further with more details, including log files or additional observations.
Let us know how it goes, and thank you again for your feedback!

@DxO_Support-Team My comments go something like this

“we can confirm that DxO PhotoLab has not changed how export workers are created or terminated in recent versions (PL7/PL8)”

I only started investigating DxPL after the release of PL5. I was involved in the Beta Test but only really started to dig a whole lot deeper once PL5 had been released.

I have run a lot of export tests since then (I have actually been using the product since OpticsPro 8) and the first time DxPL left any export process running was with PL7.11.1 and PL8.2.1.

The “standard” way that export workers have been handled is that they are automatically terminated after the last export has been created.

“Worker processes may occasionally remain active after an export, influenced by several factors”

Not in my experience with the product - ever.

“If you are experiencing performance issues due to this behaviour, you can try:”

Thank you but we worked that out for ourselves but those are workarounds for a problem created by DxO engineers that needs to be fixed, not swept under the carpet as if that is the “new normal”, actually it is worse than that because it is being touted as the old normal and nothing has changed!

“That said, if you notice inconsistencies in how the worker processes behave across different export settings, we would be happy to investigate further with more details,”

I included details of the situation with 6 workers running when DxPL cannot make up its mind what it should be doing with the export workers and that behaviour is far from normal but you have not addressed that at all.

So DxO’s response is that the product is working as it always has (untrue), that the product has not been changed in any way (a case of spontaneous bug generation obviously) and that there is no problem that cannot be fixed by user intervention to make up for the unnecessary changes to the export worker task management that now exists.

I tried to figure out what the problem is, but there are too many inconsistencies in the description for me to grasp, so I gave up. If you have a problem running 4 export threads in parallel, then don’t do that (classic consultant reply :wink: ). Run the default 2. Why worry?

Strange. Looks like a response from someone misinformed, or a communication error. On my PC both the PL8.3.1/Win and PL7.12.1/Win config files contain the following settings in the ‘DxO.PhotoLab.Processing.Properties.Settings’ section:
DopCorShutdownDelay = 7200 (= 2 hours, not 30 minutes)
MaxExportProcessingThreadCount = 8
(The MaxExportProcessingThreadCount setting probably comes from the number of P-cores on my CPU – i7-14700 has 8 P-cores and 12 E-cores. With HT enabled only on P-cores, that gives 2*8 + 12 = 28 “logical cpus”. )

My user.config contains the setting MaxProcessingThreadCount = 2, same as in Preferences (default). I did a test and the two export processes were shut down after two hours of inactivity, as expected (tested on PL8.3.1/Win11).
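A quick way to read those values and convert them is sketched below in Python. The XML layout is an assumption: it follows the usual .NET settings-file shape (`<setting name="..."><value>...</value></setting>`), and the `SAMPLE` text is a hand-written stand-in, not a copy of a real PhotoLab config file. Only the section name, the two setting names and their values come from the post above; `read_settings` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SECTION = "DxO.PhotoLab.Processing.Properties.Settings"

# Assumed layout, modelled on ordinary .NET settings files.
SAMPLE = """<configuration>
  <applicationSettings>
    <DxO.PhotoLab.Processing.Properties.Settings>
      <setting name="DopCorShutdownDelay" serializeAs="String">
        <value>7200</value>
      </setting>
      <setting name="MaxExportProcessingThreadCount" serializeAs="String">
        <value>8</value>
      </setting>
    </DxO.PhotoLab.Processing.Properties.Settings>
  </applicationSettings>
</configuration>"""


def read_settings(xml_text, section=SECTION):
    """Return {setting name: value string} for every setting in the section."""
    root = ET.fromstring(xml_text)
    settings = {}
    # iter() finds the section wherever it sits in the tree.
    for node in root.iter(section):
        for setting in node.findall("setting"):
            value = setting.findtext("value", default="")
            settings[setting.get("name")] = value.strip()
    return settings


if __name__ == "__main__":
    settings = read_settings(SAMPLE)
    delay_seconds = int(settings["DopCorShutdownDelay"])
    print("Shutdown delay:", delay_seconds / 3600, "hours")

    # Logical CPUs on an i7-14700: HT on the 8 P-cores only, plus 12 E-cores.
    p_cores, e_cores = 8, 12
    print("Logical CPUs:", 2 * p_cores + e_cores)
```

With the values quoted above this gives a shutdown delay of 2 hours and 28 logical CPUs, matching the figures in the post.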

@Wlodek Thank you for trying and I did not write it to cause confusion. Are the inconsistencies in my description or in what is actually happening?

Unfortunately, I was wrong in my assumptions/assertions, so I will need to apologise to @DxO_Support-Team, but that must wait until tomorrow for a more accurate apology.

But I firmly believed that all export workers were shut down once an export was completed. However, I have gone back as far as PL4 and rerun my tests with 30 images and no edits applied, essentially the export processes are then CPU bound and the run times are kept short.

Please note that my tests are run one after another without terminating and restarting DxPL or manually terminating any export processes left running.

DxPL is solely responsible for terminating and starting and re-using (or not) any export processes

The test confirmed my assertions shown in the first post above for PL8.2.1, namely

with 1 export worker selected 1 is left running after the export is completed
with 2 export workers selected 2 are left running
with 3 export workers selected 3 are left running
with 4 export workers selected none are left running? In addition DxPL terminated the export processes left running after testing with 3 export processes before starting its own cycle of exports with new export processes.
with 5 export workers selected none are left running?
with 6 export workers selected 2 are left running but there are delays in producing the exports!!?
with 7 export workers selected 2 are left running with problems again during processing!!?
with 8 export workers selected 2 are left running and problems again during processing!!?
with 10 export workers selected 2 are left running and problems again during processing!!?
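The pattern in that list is easier to see when grouped by outcome. The short Python sketch below does nothing more than restate the observations above as data and sort them into three groups; it says nothing about how DxPL actually decides, and the names `OBSERVED`, `classify` and `group_by_outcome` are invented for the illustration.

```python
# Workers left running after each run (copies selected -> workers left),
# copied from the PL8.2.1 results listed above.
OBSERVED = {1: 1, 2: 2, 3: 3, 4: 0, 5: 0, 6: 2, 7: 2, 8: 2, 10: 2}


def classify(selected, left):
    """Describe what happened to the workers after an export."""
    if left == 0:
        return "all closed"
    if left == selected:
        return "all kept"
    return "some kept"


def group_by_outcome(observed):
    """Group the selected worker counts by the outcome that was observed."""
    groups = {}
    for selected, left in sorted(observed.items()):
        groups.setdefault(classify(selected, left), []).append(selected)
    return groups


if __name__ == "__main__":
    for outcome, counts in group_by_outcome(OBSERVED).items():
        print(outcome, counts)
```

Three distinct behaviours show up: every worker kept for 1 to 3 copies, every worker closed for 4 and 5, and exactly 2 kept from 6 copies upwards.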

The processor is an AMD 5900X with 12 cores (24 threads)

Hence, my conclusion that the norm is that all workers are shut down after an export is completed is wrong and that the “problem” is a recent one is also wrong.

Essentially, above 3 export workers DxPL changes the rules: for 4 and 5 copies it closes down all the workers, and from 6 copies onwards it not only changes the rules again but cannot make up its mind how many workers should remain, closing and re-opening workers during the processing, which has a really bad impact on the run time.

I personally have a deep distrust of any software that starts employing different strategies at different times (for no apparent reason, at least that is what it seems to me).

It looks like you are trying to do academic work on a subject which is not academic and not worth the effort. Curiosity is a very positive thing in general, but at some point you must make a “reasonable” choice on what to work on. You can’t get them all anyway. That’s just my 5 cents coming from personal experience. But don’t get me wrong, I like curiosity, and I’m not too wise anyway, just pretending :wink:

In my case, i7-14700KF + RTX-4070, using DPXD2s most of the time, two export threads (default) are enough to almost saturate the GPU. So basically my choice is between one and two. Using more than two threads could lead to my system/software instability, as it’s often the case with overloaded systems, so I didn’t even try it. I didn’t tweak any performance settings in BIOS, didn’t overtune my Windows Power Plans, because my preference is to keep PC working stably for at least 5 years, rather than driving too fast and crashing. Maybe you have tweaked too many things outside PL? There are many posts in this forum which suggest that overloaded (e.g. overheated) GPUs can generate errors, which might be handled by intermediate code (OpenCL code in the drivers, Vulkan, or whatever), leaving little clue to DxO code of what has actually happened.

Just a subset of well-known “rules”:

  • if it works, don’t touch it
  • if it doesn’t work, don’t do it
  • don’t stress the software, it’s always buggy (that needs elaboration)
  • stressing the hardware too much shortens its life
  • read the manual first, then re-read it with understanding

and so on.

Sorry will post again soon when finished

@Wlodek How dare you suggest that it is a waste of time any more than writing about the pathetic sub 75% view of my images when this

turns to this

with a single click of the mouse wheel.

You are right of course and I have mostly stopped investigating anything with respect to DxPL because it is a futile exercise but @Jim_S posted and I started investigating with the strong belief that DxPL closed the export tasks after a successful export, only to discover it doesn’t but does have some peculiar traits!?

Thank you for the test that showed that the 30 minutes (the text in italics in that post was straight from the response from DxO to my support request) is actually 2 hours.

With my old i7-4790K I ran with 2 export workers as the default and that has now increased to 3 with the 5900X. If it was 4 then I would have all workers shut down after an export (according to the results of my tests)!?

Your i7-14700KF + RTX4070 combination may not get the job done much faster with more than two threads but really should not be particularly unstable with any number of export workers.

I have tweaked nothing outside DxPL and we have had the conversation about overheating before and since then I have installed a liquid cooler but the 5900X runs hot and the case is cramped for such a beast, so I need to look for a bigger case but the tests on the number of copies is only using the CPU and I do not believe that any of my results are caused by overheating issues.

They are caused by “weird” coding. The good news is that most users won’t be dumb enough to use 6 or more export workers to get the job done in real life.

A 3 worker test on the Egypt test image, the snapshot is just after the test completed

and repeated to check if the same three workers were re-used, I believe that they are,

This is the result of the 1 to 10 worker tests using my own image rather than the Egypt image.

The “chaos” doesn’t seem to set in until 7 or more images are involved although typically a video will show DxPL terminating and restarting workers from 6 export workers onwards?

The snapshot has been reworked because the export details roll off the top so were taken from an earlier snapshot in the sequence.

I worked for a hardware/software manufacturer for 36 years delivering 24/7 multi-million pound systems.

I did all of those things with malice aforethought before I ever let my customers near their systems and designed some of those systems (the software), sized the systems, audited theirs and other customers systems and provided 24 hour support.

And I wrote user manuals.

PS:-

The problems I reported earlier as starting with 6 export workers didn’t show up in the tests with 30 of my own images. They originally showed up during the 20-image tests of the Egypt image.

So I returned to the Egypt test, albeit a 30 image test set and made a successful run at 5 workers to set the timing “benchmark” then 6 workers and that appeared O.K. so on to 7 images and this happened

I could recreate this problem with a previous configuration of the 5900X every time I tried to video the 6 export worker process in action. But I then removed the Ramdisk from the startup and also removed a number of other extraneous bits of software and the problem went away, until the above test.

No video recording software is running and a rerun got this, which I have also seen before when trying to video the run!?

I sent the fault report straight to DxO.

I have sent a huge ZIP file to DXO support with many screenshots showing the specific behavior when outputting 1 to 4 images in parallel to the hard disk. I used screenshots because screen video recording interferes with the observed DXO task in terms of CPU/GPU usage.

The raw images 42MB with DeepPrime XD were included and the DXO settings as well with screenshots.

With 3 and 4 images in parallel, the tasks are removed from memory one after the other down to one remaining task.

With 1 and 2 images in parallel, by contrast, 1 task or 2 tasks remain in memory. I have not checked for how long, but it was possible to shut them down in Task Manager without problems.

Basically, the output of four images in the 1 to 4 parallel setting takes approx 01:06-01:25 minutes. Optimum setting might be with 2 pictures in parallel.
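To put those totals on a per-image footing, here is a small Python sketch of the arithmetic. It only uses the two end points quoted above (01:06 and 01:25 for four images); `to_seconds` and `seconds_per_image` are hypothetical helper names.

```python
def to_seconds(mm_ss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_per_image(mm_ss, image_count):
    """Average wall-clock seconds per image for a timed export."""
    return to_seconds(mm_ss) / image_count


if __name__ == "__main__":
    for total in ("01:06", "01:25"):
        print(total, "->", seconds_per_image(total, 4), "s per image")
```

That works out to between 16.5 and 21.25 seconds per image across the four settings.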

The overall performance seems acceptable with DXO version 8.3.1, perhaps a little better than before, but I might be wrong.

The idle performance of DXO on the I9/RTX4000 is between 6-16% GPU load, which is surprisingly high, even though the program only displays one image and nothing is being done.

I hope that DXO development is able to find some more room for improvement in its handling of multithreading.