When it comes to keywords, there are standards and there are standards. The trick is to use the most universally accepted standard, in order to make the metadata accessible to the majority of management apps.
During the beta and into the release for PL5, I did a lot of work on compatibility, whilst writing my own keyword management app that gave me more flexibility in browsing and, especially, searching after keywords had been assigned.
I found it is essential to understand the exact minimum XMP tags to use for the best results.
The first, and most important, tag is xmp-dc:subject. This is also mapped to xmp:subject and mwg:keywords. Its purpose is to store a flattened list of all keywords, whether they be standalone or part of hierarchies.
This is the tag that the vast majority of software uses for searching. If I want to search for a complete hierarchy, I can simply AND all the constituent keywords in that hierarchy together in the predicate.
So I can have a file marked…
<dc:subject>
<rdf:Bag>
<rdf:li>Entreprise</rdf:li>
<rdf:li>Télécommunications</rdf:li>
<rdf:li>Orange</rdf:li>
</rdf:Bag>
</dc:subject>
And that will allow me to search for either the complete hierarchy or files that contain part of the hierarchy.
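As a rough illustration of that AND search, here is a minimal Python sketch that reads the dc:subject bag from an XMP fragment and tests whether a file carries every keyword in a query. The wrapper rdf:Description element and the function names read_subjects and matches_all are my own assumptions for the example, not anything PhotoLab or another app exposes.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the Dublin Core and RDF prefixes used in XMP.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

# The dc:subject block from the post, wrapped in an assumed rdf:Description
# element so that the fragment parses as standalone XML.
XMP_FRAGMENT = """
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject>
    <rdf:Bag>
      <rdf:li>Entreprise</rdf:li>
      <rdf:li>Télécommunications</rdf:li>
      <rdf:li>Orange</rdf:li>
    </rdf:Bag>
  </dc:subject>
</rdf:Description>
""".strip()


def read_subjects(xmp_text):
    """Return the flattened keyword set stored in dc:subject."""
    root = ET.fromstring(xmp_text)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return {li.text for li in items if li.text}


def matches_all(subjects, wanted):
    """Logical AND: True when every wanted keyword is in the flattened set."""
    return set(wanted) <= subjects


subjects = read_subjects(XMP_FRAGMENT)
print(matches_all(subjects, ["Entreprise", "Télécommunications", "Orange"]))
print(matches_all(subjects, ["Orange"]))
print(matches_all(subjects, ["Fruit", "Orange"]))
```

The first two queries match (the complete hierarchy and a part of it) and the third does not, because Fruit is not in this file's flattened list.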
Take the example of an image library that contains four different images, one with each of the following hierarchies…
Couleur|Orange
Fruit|Orange|Satsuma
Matériel|Télécommunications|Orange
Entreprise|Télécommunications|Orange
Now, imagine I am writing an article on the effect of the word Orange in advertising and I want to search for all images that contain Orange, regardless of any other keywords or hierarchical context.
See what happens in PL7 when I try to search for Orange in more than one hierarchical context…
The first time around, I am offered all four contexts with a count of one file per reference.
Because that first predicate only refers to Orange in the Couleur hierarchy, I then try to add Orange from the Fruit hierarchy…
So, it seems it is impossible, in PL7, as in previous versions, to search for a keyword in multiple hierarchical contexts at the same time.
And this is just one example of the problems that DxO have produced, because they have indexed files using hierarchies rather than just the simple dc:subject tag.
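For comparison, here is a small Python sketch of what indexing on the flattened list looks like for the four-image library above. The file names and the helpers flatten, build_index and search_all are invented for the illustration; this is only a sketch of the idea, not how any particular app implements it.

```python
# Hypothetical four-image library: file names are invented, the hierarchies
# are the four from the example above.
LIBRARY = {
    "img_1.jpg": ["Couleur|Orange"],
    "img_2.jpg": ["Fruit|Orange|Satsuma"],
    "img_3.jpg": ["Matériel|Télécommunications|Orange"],
    "img_4.jpg": ["Entreprise|Télécommunications|Orange"],
}


def flatten(paths):
    """Return the set of individual keywords found in any hierarchy path."""
    return {kw.strip() for path in paths for kw in path.split("|") if kw.strip()}


def build_index(library):
    """Map each keyword to the set of files whose flattened list contains it."""
    index = {}
    for name, paths in library.items():
        for kw in flatten(paths):
            index.setdefault(kw, set()).add(name)
    return index


def search_all(index, *keywords):
    """Files that carry every one of the given keywords (logical AND)."""
    sets = [index.get(kw, set()) for kw in keywords]
    return set.intersection(*sets) if sets else set()


INDEX = build_index(LIBRARY)
print(sorted(search_all(INDEX, "Orange")))
print(sorted(search_all(INDEX, "Télécommunications", "Orange")))
```

Searching for Orange alone returns all four files, whatever the hierarchical context, and adding Télécommunications narrows the result to the two telecoms images.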
What gets written to the DOP file, and also in the database, is…
Keywords = {
{
"Entreprise",
"Télécommunications",
"Orange",
},
{
"Entreprise",
},
{
"Entreprise",
"Télécommunications",
},
},
… which clearly shows that Orange is only accessible through the fully qualified hierarchy, which must include both Entreprise and Télécommunications, thereby precluding any other context.
So, to clarify what should be stored where and why…
According to the metadata guidelines, the dc:subject tag should contain all keywords including those that result from flattening any hierarchies. This is the “working” tag that is used for searching and, if correctly used, should allow any software to find files for any complex logic query.
However, the lr:hierarchicalSubject tag is only meant to be a means of “transmitting” any hierarchical contexts that may exist for the keywords in the dc:subject tag.
e.g.…
<lr:hierarchicalSubject>
<rdf:Bag>
<rdf:li>Entreprise</rdf:li>
<rdf:li>Entreprise|Télécommunications</rdf:li>
<rdf:li>Entreprise|Télécommunications|Orange</rdf:li>
</rdf:Bag>
</lr:hierarchicalSubject>
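The relationship between the two tags can be sketched in a few lines of Python. Given a pipe-separated hierarchy, the hypothetical helper expand_hierarchy returns the flat keywords destined for dc:subject and the cumulative paths destined for lr:hierarchicalSubject. Listing every ancestor path separately follows the example above; other applications may write the paths differently, so treat that as an assumption.

```python
def expand_hierarchy(path, sep="|"):
    """Split one hierarchy into (flat keywords, cumulative hierarchical paths)."""
    parts = [p.strip() for p in path.split(sep) if p.strip()]
    hierarchical = [sep.join(parts[: i + 1]) for i in range(len(parts))]
    return parts, hierarchical


def merge(paths):
    """Combine several hierarchies, keeping first-seen order and no duplicates."""
    subjects, hierarchical = [], []
    for path in paths:
        flat, hier = expand_hierarchy(path)
        for kw in flat:
            if kw not in subjects:
                subjects.append(kw)
        for h in hier:
            if h not in hierarchical:
                hierarchical.append(h)
    return subjects, hierarchical


dc_subject, lr_hierarchical = expand_hierarchy("Entreprise|Télécommunications|Orange")
print(dc_subject)       # goes to dc:subject
print(lr_hierarchical)  # goes to lr:hierarchicalSubject
```

For the example hierarchy this yields the same three dc:subject keywords and the same three lr:hierarchicalSubject entries shown above. The merge helper does the same for a file that carries more than one hierarchy, so that a shared keyword such as Orange appears only once in the flattened list.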
The “bug” that was introduced in PL5, that still exists, and that can cause incompatibility, is all down to the “misuse” of hierarchical tags described above.
Oh, and not forgetting that DxO have not maintained the idea of SPOD (single point of definition), by allowing a keyword to be written to…
- the database
- the DOP file
- an XMP file
Which is why my best recommendation is to never use PhotoLab for keywording unless you are planning on never using any other keyword management software concurrently. And even then, don’t expect to be able to search easily for anything other than simple keywords in non-compound conditions.
See what happens using my software…
- search for Orange…
The search results show 5 images and selecting one of them produces…
… and another…
… etc.