@John7 There have been a number of discussions in this and other topics about the desirability/need/advisability/… of Keywords in the DOP! For Virtual Copies it is an obvious choice: unless (or until) you export the VC as a new photo, the data will exist nowhere except in the database, unless DxO implement a new ‘xmp’ entry for VC keywords etc. etc., which would have to be written back to the [M]aster photo and …
I see no simple solution to the VC keyword issue unless reliance is placed solely on the database. As for the [M]aster, there have been many and long discussions about the “futility” of them being in the DOP, and I have not yet explored all the possible scenarios where they may/may not help, principally in relation to the handling of VCs.
Please note that I didn’t design or write, and am not in any way responsible for, what exists. I am simply trying to test bits of it to “destruction” in the hope that DxO can build an even better product out of what I find and report, along with others using this forum.
So, it looks like the hierarchies are defined but, as we shall see, not correctly or completely.
The way these keywords are recorded, it would be impossible, or at least difficult, to search for just Orange. We are limited to only being able to search for either of the two top-level keywords, or the complete hierarchies. Which is why the present PL5 search is next to useless for anything but the simplest of searches for flat keywords.
I see this keyword structure in the DOP files (and presumably in the database) as a severe impediment to improving the search mechanism. I really don’t know why DxO didn’t echo the XMP structures, which provide a completely flexible and powerful search mechanism, along with a perfect transmission of the XMP data at export time.
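To illustrate the kind of search being asked for here, this is a minimal sketch in Python, not anything taken from PL5 or its database. It assumes hierarchies are stored as `|`-separated paths (as in lr:hierarchicalSubject); the function names and the sample paths are hypothetical.

```python
def levels(path: str, sep: str = "|") -> list[str]:
    """Split a hierarchical keyword path into its individual levels."""
    return [part.strip() for part in path.split(sep) if part.strip()]


def find_paths(paths: list[str], term: str) -> list[str]:
    """Return every path in which `term` is a complete level (case-insensitive)."""
    wanted = term.casefold()
    return [p for p in paths if wanted in (lvl.casefold() for lvl in levels(p))]


# Hypothetical stored paths, for illustration only
stored = ["Fruit|Orange", "Colour|Orange", "Fruit|Apple"]
orange_hits = find_paths(stored, "Orange")
```

Because each level is compared separately, a search for just Orange finds both hierarchies without the user having to type a complete path.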
When I use my app to write the keywords, I type in Ora and a list of possible alternatives appears…
I choose from the proposed list…
Then I start to type Fru and the list appears again…
… and I choose…
The code behind this then knows about the two hierarchies and writes them correctly to the XMP sidecar, as you saw above.
In addition to which, in reading the metadata, presumably from the XMP file, the PL5 keywords box now shows…
… with Orange duplicated.
This is very confusing and the only way to see why there are two Oranges is to hover over the tokens to get a tooltip to appear…
… which is far from obvious or intuitive.
If I then export to a JPEG (including only keywords) and inspect the metadata, I get…
Thanks. I think we’re all looking forward to seeing the FAQ.
I think I understand your problem in storing ‘interim’ metadata values in the DOP file (i.e., you can’t store them yet in an output file before it’s created…). Perhaps the FAQ will explain this and other issues in more detail.
Nice explanation of how the different keywords-related tags should behave. I hope it helps DxO understand better some of the issues we’ve reported.
And you also point out why using Legacy IPTC is a problem (and should be dropped by DxO as a default). It’s functionally obsolete. The lack of ability to handle things like different language accent marks and much too short field lengths inevitably lead to metadata inconsistency with XMP-based standards.
@sgospodarenko Understood, so for the combination of keywords explicitly entered as animal, mammal, bear and animal|mammal|bear you have the following in the DOP
This seems fine but the standalone version of Orange has been added to the lr:hierarchicalSubject as if it were a hierarchical keyword instead of just a standalone.
So, what happens if I add in the parents of the three Oranges?

We still get three mentions of Orange in the keywords box and the XMP file now looks like this…
All possible keywords are in the right column, whilst the left column shows them arranged into hierarchies. To add a keyword to a parent, you just drag it from the right column to the appropriate keyword in the left column. This way, a keyword can be added to multiple hierarchies and can exist as a standalone keyword at the same time.
The number in the right column shows how many files reference a given keyword at any one time, and pressing the spacebar on a selected line shows a Quicklook panel with a preview of all images that contain that keyword, regardless of where they are in any hierarchies.
This completely avoids the confusion we presently have in the PL5 keywords palette, by ensuring that the operation of structuring the dictionary is completely independent of the operation of applying keywords to files.
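For anyone wondering how a dictionary like that might hang together, here is a rough Python sketch. It is not the code of the app described above; the class, its methods and the sample keywords are all hypothetical, and it assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


class KeywordDictionary:
    """Keeps the flat list of keywords separate from their parent links."""

    def __init__(self):
        self.keywords = set()             # every keyword (the "right column")
        self.parents = defaultdict(set)   # child -> parents (the "left column")

    def add(self, keyword):
        self.keywords.add(keyword)

    def attach(self, child, parent):
        """Equivalent of dragging `child` onto `parent`."""
        self.add(child)
        self.add(parent)
        self.parents[child].add(parent)

    def paths(self, keyword, sep="|"):
        """All hierarchy paths that end at `keyword` (assumes no cycles)."""
        direct = self.parents.get(keyword)
        if not direct:
            return [keyword]
        return [path + sep + keyword
                for parent in sorted(direct)
                for path in self.paths(parent, sep)]


# Hypothetical example: one keyword attached to two parents
dictionary = KeywordDictionary()
dictionary.attach("Orange", "Fruit")
dictionary.attach("Orange", "Colour")
orange_paths = dictionary.paths("Orange")
```

The point of the design is that structuring the dictionary never touches any image file; applying a keyword to a file would be a separate operation that simply picks one of the paths, or the bare keyword.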
Of course, we still have the mess of not knowing which version of Orange we are searching for…
… without having to guess from a list of three and then invoke the tooltip on the inserted token to see if we were right…
@joanna, @jch2103 I started to write a reply to your excellent tutorials but needed to do some experiments of my own to confirm what was going on - plus I haven’t yet read your two new posts and won’t be able to until later today/early tomorrow.
To this end I set up 4 directories for Bridge (‘dc’ only), PM hierarchical but still set to update embedded xmp in RAW, IMatch and PL5 itself. The directories contained my previous test data i.e. 3 JPGs and 1 RAW
The test key groups were applied to 3 of the 4 photos by the respective software, captured to a “Baseline” directory and then processed by PL5 and all 4 exported to a subdirectory.
The keywords used were
Test 1: animal, mammal, bear
Test 2: animal|mammal|bear
Test 3: left empty
Test 4: animal, mammal, bear, animal|mammal|bear
This first post compares how the different packages went about setting up the “baseline” data and contains numerous snapshots! Please ask for any comparison not shown if you want to see a particular combination.
The first two screenshots are a bit weird because they show three separate, standalone, keywords instead of any hierarchical relationship between them. If there is no hierarchical relationship, the hierarchical subject tag should be empty.
The third screenshot (possibly more correctly) shows no hierarchical relationship.
The second three all deal with a badly formatted keyword of animal|mammal|bear, which, as we have previously discussed is only one word, not three, and more than likely requires the authoring software to correctly interpret it. It is unlikely to be universally compatible.
The third three also contain the same invalid keyword, designed to confuse anything that sticks to the “rules”.
If I remember rightly, this kind of “composite” keyword is a speciality of PhotoMechanic and caused @KeithRJ all sorts of grief when he discovered it isn’t necessarily compatible with all other software.
No it doesn’t. Yet another difference between Windows and Mac
I can’t as I don’t pay rent for software that isn’t as good at editing images as PL5
If this is what Lr does, it makes for a very good reason not to use it. Being compatible with Lr doesn’t mean that PL has to copy what they do. DxO should lead the field in excellence in metadata handling rather than sinking to Adobe’s poor and confusing level.
@joanna I am a little concerned that I have not fully understood the rules that you are proposing!? During Beta testing this was mostly new to me, I had never used a hierarchical keyword in my life so the way that I went about testing was to compare various programs and the way that they handled various scenarios to see if I could understand the logic.
I restarted that process in my post above; I was always concerned about mixing PL5, with its “apparent” obsession with ‘hr’ keywords (subject), with ‘dc’ only programs, Adobe Bridge in this test.
So I used one program IMatch to compare the results with PL5 and also to see how your proposals stack up against what these programs are doing with the data. The comparison is Test 2 - animal|mammal|bear and Test 4 - animal, mammal, bear and animal|mammal|bear all explicitly added by the user (me), principally to see what happens!
One problem I had back in Beta testing was that using this approach threw up more inconsistencies than I expected, if we add your proposals then things get even more complicated. So for test 2 we have
IMatch fails to include the flattened keywords in the ‘dc’ Subject which PL5 has included but PL5 leaves out the hierarchical key from the Subject field and, as a consequence ‘dc’ only programs have no sight/idea of the hierarchical nature of the keywords!
Surely for compatibility the Subject must also contain animal|mammal|bear? Hence, in this case shouldn’t the keywords be Subject = animal, mammal, bear, animal|mammal|bear and HR Subject =animal|mammal|bear?
When we get to Test 4 with the “overloaded” set of keywords we have
Based upon your “recommendation” you would suggest that the flat keys have no place in the HR Subject (it does make sense) but I would suggest that animal|mammal|bear should be present in the Subject field to maintain compatibility with ‘dc’ programs. During Beta testing I declared such a key in Bridge and by the time that PL5/LR had finished with the photo the displayed photo in Bridge no longer had the (hierarchical) keyword that Bridge had assigned to it!!??
The other comparisons that I can make show what an absolute “pig’s breakfast” this area is, we may criticise DxO for not getting it “right” (and they had plenty of expertise to tap into) but I am not impressed with what some of the other programs are doing with keywords either! I am not a standards expert but when I see a keyword “vanish” from Bridge I feel that something has gone very wrong!
A league table of what is in the photo after each of the programs has finished putting in the test data. Test 03 is left deliberately blank just in case I need another JPG for a further test.
What is the “right” way to populate the Subject and HR Subject is what we are now discussing in this topic.
I have not included IPTC in the spreadsheet which I can on request (please send bitcoin to the value of …)
As a great friend from the southern US used to say - “Thar’s yer prawblem”
I awoke at around 5:00 this morning, thinking about this, and had to do some research there and then. As a result, I think I may have discovered an explanation for the differences you are seeing.
I am now fairly much convinced, but couldn’t find a definitive reference, that adding “composite” keywords to the dc:subject tag may have come about at a time when somebody wanted to, somehow, record hierarchy before Adobe introduced the lr:hierarchicalSubject tag.
From what I can see, apps like iMatch and PM try to be compatible with that practice, even though it was no longer necessary with the arrival of the lr:hierarchicalSubject tag.
I’m not sure what you mean by “dc only programs”
Adobe Bridge is definitely not a “dc only” program, it both reads and writes the lr:hierarchicalSubject tag.
PL5 tries its best to be MWG compliant and is correct in not writing composite keywords to the dc:subject tag. The only minor niggle that I find with PL5’s writing of lr:hierarchicalSubject is that it sometimes writes non-hierarchical keywords as well.
To my mind, adding such composite keywords to the dc:subject tag should be regarded as a defunct and purely legacy requirement and, upon finding such, any decent app should update them by moving them to the lr:hierarchicalSubject tag.
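As a sketch of what such an update could look like, here is a small Python function. It is only an illustration under my own assumptions: composites use `|` as the separator, the individual levels are also kept as flat keywords in the Subject, and the function name is hypothetical. It does not read or write real XMP.

```python
def migrate_composites(subject, hierarchical, sep="|"):
    """Move composite keywords out of the flat subject list.

    Returns (flat_subject, hierarchical_subject). Composites are appended
    to the hierarchical list, and their individual levels are kept as flat
    keywords (an assumption, not a rule quoted from any specification).
    """
    flat = []
    hier = list(hierarchical)
    for keyword in subject:
        if sep in keyword:
            if keyword not in hier:
                hier.append(keyword)
            parts = [p.strip() for p in keyword.split(sep) if p.strip()]
        else:
            parts = [keyword]
        for part in parts:
            if part not in flat:
                flat.append(part)
    return flat, hier


# The "overloaded" case: flat keywords plus a composite in the same list
flat, hier = migrate_composites(
    ["animal", "mammal", "bear", "animal|mammal|bear"], []
)
```

With that input the composite ends up only in the hierarchical list, and the flat list holds animal, mammal and bear once each.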
I will stick my head above the parapet and dare to say that any app that does write composite keywords to the dc:subject tag is not adhering to MWG guidelines and should not be relied on.
On the other hand, PL5 needs to “tidy up” what it writes to lr:hierarchicalSubject, omitting purely non-hierarchical keywords.
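The “tidy up” amounts to a one-line filter. A minimal Python sketch, assuming `|` is the hierarchy separator; the function name and sample values are hypothetical and this is not PL5’s code:

```python
def tidy_hierarchical(entries, sep="|"):
    """Keep only entries that actually express a hierarchy, without duplicates."""
    tidy = []
    for entry in entries:
        if sep in entry and entry not in tidy:
            tidy.append(entry)
    return tidy


# Hypothetical values: a standalone keyword mixed in with a real hierarchy
cleaned = tidy_hierarchical(["Orange", "Fruit|Orange", "Fruit|Orange"])
```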
Absolutely. It goes against MWG guidelines.
And I would have to disagree if PL5 is to be seen as a “correct” manager of keywords. Which isn’t to say that PL5 shouldn’t read and convert such into its correct form. If a “dc only” app can’t cope with that, it is that app that is at fault, not PL5.
I did try PM for its free trial period and, from what I remember, it could read lr:hierarchicalSubject. I can’t say for iMatch but, if it can’t, I would regard it as deficient and not MWG compliant.
Maybe during beta testing, but now, assigning my test hierarchies in Bridge gives me an XMP file with…
… even though Orange is duplicated in the keywords field.
Even with the matter of an abysmal and cramped UI for keywording, I would say confidently that I would rather use PL5 over any of the others that you have talked about because it is the lost MWG compliant. If the other apps you have tested can’t read what PL5 writes, then they are wrong.
N.B. Since the release of PL5, the integrity of keywording has improved on what it was during the beta, but I would still take the recommendation to try and stick to a single metadata manager rather than move back and forth between them. This becomes even more important in the light of your findings.
N.B.B. Don’t ever use composite keywords in the dc:subject tag. It is apps that allow this that are the source of a lot of the headaches people are finding with compatibility, not PL5, which is “almost” MWG compliant.
The HR subjects marked as wrong in Test 01 are because those apps are unnecessarily, but not necessarily wrongly, adding non-hierarchical keywords to the HR subject. In any case, adding them will not cause problems for searching or transmission of hierarchical data.
For Test 04, the first three are marked wrong for the same reasons as Test 02 - adding of composites to the Subject is incorrect. The last two are “sort of” right and, as with Test 01, there is nothing that should affect either searching or transmission.
Side note
PL5 seems to search based on HR subject. This might not be strictly wrong but it does restrict results and make predicates harder to define. All searches should be based on Subject alone.
I’m not sure if I understand the first part of the statement: I find DPL5 metadata in .dop sidecars, no matter if virtual copies are present or not. This would mean that metadata is written to .dop sidecars unconditionally in order to provide MD to virtual copies that might (or might not) be added later, even if the database had been deleted in the meantime. Note that I have no problem with that.
I do propose that DxO also read such MD from the .dop sidecars in order to restore or complement info to the database. If MD were also present in the image file or .xmp sidecar, some kind of dialog should appear to let the user select what action should be taken.
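To make the proposal a little more concrete, here is a rough Python sketch of the comparison step that would decide whether a dialog is needed at all. It assumes the keywords from each source have already been read into plain lists; the function name and the returned structure are hypothetical, and nothing here reflects how DxO actually handles it.

```python
def reconcile(dop_keywords, xmp_keywords):
    """Compare keywords from the .dop sidecar with those from the image/XMP.

    Returns a dict describing whether the two sources agree and, if not,
    what differs, so the user can be asked which action to take.
    """
    dop, xmp = set(dop_keywords), set(xmp_keywords)
    if dop == xmp:
        return {"status": "in_sync", "keywords": sorted(dop)}
    return {
        "status": "conflict",
        "only_in_dop": sorted(dop - xmp),
        "only_in_xmp": sorted(xmp - dop),
    }


# Hypothetical example: another app added a keyword to the XMP sidecar
report = reconcile(["animal", "bear"], ["animal", "bear", "mammal"])
```

Only a "conflict" result would need to be shown to the user; nothing is overwritten on either side until a choice has been made.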
Imo, a sync warning (@BHAYT) is no valid solution in a world of more than one app editing metadata. Simply overwriting MD read from whatever source has been prioritized by someone who is not the user is not a decent way to act either.
@Joanna sorry about “causing” you to wake up early and be so “rivetted”/“inspired”/“puzzled”/“cross” (delete or add or replace as appropriate) that you could not get back to sleep. The central heating has had the same effect on me! Finally having plumbed the central heating into the mains water supply (via an appropriate non-return valve) we flushed the system out (and pushed up the water bill) because the new pump was still not coping.
One bottle of X800 cleaner later and slowly, slowly the heat started reaching the farthest parts of the bungalow, just in time because we had a hard frost last night. Today/tomorrow the X800 has to come out to be replaced by X400 cleaner which can stay in for 1 month!
I just updated to the latest version of Bridge 11 (not Bridge 12) on the test machine and that has only ever used ‘dc’ fields, at least in all my tests!? Zoner also still seems to stick to ‘dc’ fields even when inputting an hierarchical keyword. By this I mean that those programs only set ‘dc’ fields and only display from ‘dc’ fields. I am currently downloading the latest version of Bridge and when that is installed I will retest and update the post and the table as appropriate.
I understand that now and I am “sorry” for being such a “naughty” tester but if programs that are on the market only use ‘dc’ fields and start using the hierarchical keywords it certainly looks “weird” when the keys you entered vanish after a “visit” to LR or PL5 or …
Sometimes ignorance of the accepted way of how things work is actually an “asset”, it leads to discovering the oddities that those only treading the right path can miss!
I think there will be problems with users of older products that might be disappointed except that the chances are that they will never have used hierarchical keys!?
I think that you meant “it is the most MWG compliant”.
Understood and I discovered that they could be added just by experimenting but if there are any real users out in the world then …
An updated version of the table. If anyone wants the spreadsheet I can make it available and they can add Photo Supreme, Capture 1 or whatever!