I tested this myself on my brand-new iMac with a 6-core i5. 17 RAWs, 24MB, the last one with PRIME noise reduction. Note that I still use PL1; results may be different with PL2.
1 in parallel:
about 400% CPU utilization (fluctuating), 600% for the last picture (PRIME)
time: 2:37
2 in parallel:
steady 600% CPU utilization
time: 1:55
3 in parallel:
steady 600% CPU utilization
time: 1:52
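To put those three timings side by side, here is a quick sketch that converts them to seconds and computes the speedup relative to 1 in parallel. The helper name `to_seconds` is just something I made up for this; the numbers are the ones measured above.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'm:ss' string such as '2:37' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Export times measured above, keyed by the "in parallel" setting.
timings = {1: "2:37", 2: "1:55", 3: "1:52"}

baseline = to_seconds(timings[1])
speedups = {n: baseline / to_seconds(t) for n, t in timings.items()}

for n, s in speedups.items():
    print(f"{n} in parallel: {to_seconds(timings[n])} s, speedup {s:.2f}x")
```

So going from 1 to 2 saves 42 seconds (about 27%), while going from 2 to 3 saves only 3 more seconds.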
You can also watch the XPCCore processes, e. g. with the setting at 3: each one uses around 600MB while busy. So having lots of memory is not too important for PhotoLab.
Also, from what you wrote, I think each XPCCore process uses four threads for heavy processing, which would explain the 400% CPU utilization when exporting just one picture at a time. With PRIME the picture is different; it seems to use more cores.
Therefore I would leave the setting at ceil(#cores/4), e. g. 1 for up to 4 cores, 2 for up to 8 cores, 3 for up to 12 cores.
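As a small sketch of that rule of thumb: the function name `suggested_parallel_exports` and the `threads_per_image` parameter are hypothetical, and the value 4 is only my guess from the CPU readings above, not anything documented by DxO.

```python
import math

def suggested_parallel_exports(cores: int, threads_per_image: int = 4) -> int:
    """Rule of thumb from this post: ceil(#cores / 4).

    threads_per_image=4 is an assumption based on the ~400% CPU
    utilization observed when exporting one image at a time.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return math.ceil(cores / threads_per_image)

for cores in (4, 6, 8, 12):
    print(f"{cores} cores -> {suggested_parallel_exports(cores)} in parallel")
```

For my 6-core machine this gives 2, which matches the point where the measured times stop improving much.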
GPU is not used during export, or at least it doesn’t move the needle.
