Keyword bugs and enhancements

I think I’m actually ready to file a bug report, but I’m interested in knowing what people think is the right answer.

Yes, I understand some of you hate drag-and-drop for keywords, but as far as I’m concerned, the right answer is not to disallow it (and it’s unlikely that DxO would remove a feature).

Start with keyword hierarchies A|B|C and D|E. Tag a JPG with ‘C’ with “whole hierarchy” selected in Preferences.

PL will create a DOP file with the following:

Keywords = {
{
"A",
}
,
{
"A",
"B",
}
,
{
"A",
"B",
"C",
}

Now, drag-and-drop ‘C’ onto ‘E’. What I would expect is:

Keywords = {
{
"D",
}
,
{
"D",
"E",
}
,
{
"D",
"E",
"C",
}

What I actually get is:

Keywords = {
{
"A",
}
,
{
"A",
"B",
}
,
{
"D",
"E",
"C",
}

Of course, they only promise to select the entire hierarchy when assigning keywords, not when moving them around, so maybe this should be a feature request. I can’t imagine why anyone who has chosen the “whole hierarchy” preference would want drag-and-drop to work this way.

Checking the JPG, it has subject keywords A, B, C, D, E and hierarchies A, A|B, D|E|C. In other words, it conceptually matches what is in the DOP file.

The workaround is to search for all files with hierarchy D|E|C. These will be the same files as were originally A|B|C. Then select all the files found by the search. Deselect C and re-select it. This will enable hierarchies A and A|B. With the same images selected, deselect A and A|B.

@RobiWann talked about LR having a new notation, where the original entry would only include hierarchy A|B|C and the new entry would only include D|E|C. This would fix the problem only if a search for “D|E” also finds all files tagged with “D|E|C” (which might be true for LR, but is not true for PL and might not be true for other tools).

Unchecking “whole hierarchy” allows PL to write equivalent entries. The problem isn’t in the entries, but in how a search for a keyword is performed. In any case, explicitly including the parent keywords might be the best option for maximum compatibility with other tools.
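To make the search point concrete, here is a minimal Python sketch of the ancestor-aware search that the leaf-only notation would need. It is not how PL or LR search; `matches`, `search`, and the file names are hypothetical and only illustrate the idea that a query for “D|E” should also hit a file that stores nothing but “D|E|C”.

```python
def matches(query, hierarchy, sep="|"):
    """True if `query` equals the hierarchy or is one of its ancestors."""
    q = query.split(sep)
    h = hierarchy.split(sep)
    return h[:len(q)] == q


def search(query, files):
    """Return the names of files that have at least one hierarchy under `query`."""
    return sorted(name for name, hierarchies in files.items()
                  if any(matches(query, h) for h in hierarchies))


# Hypothetical files: one uses the leaf-only notation, one writes parents explicitly.
files = {
    "leaf_only.jpg": ["D|E|C"],
    "explicit.jpg": ["D", "D|E", "D|E|C"],
    "other.jpg": ["A", "A|B"],
}

print(search("D|E", files))
```

A tool that only compares whole entries would miss `leaf_only.jpg` here, which is why writing the parent entries explicitly is the safer choice for compatibility.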

@Joanna also mentioned that a hierarchy of just “A” under lr:hierarchicalSubject is an error. However, to the extent that I am reporting a problem with the DOP files, which store keywords in their own way (i.e. no dc:subject or lr:hierarchicalSubject tags), this is irrelevant. In any case, that should probably be a separate bug report (and may well already be one).

I am tired after a long committee meeting but I want to answer this and will do so tomorrow morning.

Got up late - back later :sleeping:

Suffice to say I have a possible solution

From and to where do you drag?

The result feels like you dragged C in the keyword list tool to E in the keyword list tool. That would change the A and D hierarchies, and consequently the sidecar entries of all image files containing the A and/or D hierarchies.

The keyword being dragged is ‘C’. You press the mouse down on ‘C’, hold the button down while dragging the ‘C’ over the ‘E’ and then release the button. In other words, drag-and-drop ‘C’ onto ‘E’.

No. PL will find all files with an entry of A|B|C and change that entry to D|E|C. An image with only the hierarchies A or A|B or D or D|E should not be touched. It would be unnecessary.

What I’d like requires the same number of files to be modified, no more, no less (i.e. files with the hierarchy A|B|C).

OK. After much cogitation, here are my ramblings.

The first thing that strikes me as wrong is the mixing of the keyword “manager”, with which we can add, remove and arrange keywords from the “dictionary”, with the panel for assigning, deleting, etc of keywords for a selected image.

This is nothing short of confusing for a lot of users and can/does lead to all sorts of problems.

Let’s start with @freixas first hierarchy A | B | C.

In my software (which is compatible with most other software), I manage my hierarchies in a separate manager dialog so, for that hierarchy, what I get is…

Choosing that hierarchy in the main image window looks like this…

… and, once the selection has been accepted, like this…

This represents the contents of the dc:subject tag.

There is no explicit display of the hierarchy but it is recorded internally for writing to the lr:hierarchicalSubject tag.

The resulting XMP (either in an image file or an XMP sidecar) looks like this…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>A</rdf:li>
    <rdf:li>B</rdf:li>
    <rdf:li>C</rdf:li>
   </rdf:Bag>
  </dc:subject>
  …
  <lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>A</rdf:li>
    <rdf:li>A|B</rdf:li>
    <rdf:li>A|B|C</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>
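As a rough illustration of how the two lists above relate, the following Python sketch derives both from a list of leaf paths. It is not the code of any of the tools discussed here; `expand` and `xmp_lists` are hypothetical names, and the sketch assumes every parent is written explicitly, as in the XMP above.

```python
def expand(path, sep="|"):
    """'A|B|C' -> ['A', 'A|B', 'A|B|C'] (the path plus all of its ancestors)."""
    parts = path.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def xmp_lists(leaf_paths, sep="|"):
    """Return (flat subjects, hierarchical entries) without duplicates, in order."""
    subjects, hierarchical = [], []
    for path in leaf_paths:
        for entry in expand(path, sep):
            if entry not in hierarchical:
                hierarchical.append(entry)
        for word in path.split(sep):
            if word not in subjects:
                subjects.append(word)
    return subjects, hierarchical


subjects, hierarchical = xmp_lists(["A|B|C"])
print(subjects)      # candidate content for dc:subject
print(hierarchical)  # candidate content for lr:hierarchicalSubject
```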

Opening this image in PL7 shows me this (three images to show the hierarchy that appears when I hover the mouse over)…

The resulting DOP file contains…

			Keywords = {
				{
					"A",
					"B",
					"C",
				},
				{
					"A",
				},
				{
					"A",
					"B",
				},
			},

However, the search field brings up a very restricted and somewhat confusing option set…

Now, to follow your example in my software, I drag C onto the leaf of D | E and get…

At which point, I see that my software does something odd: the algorithm for parsing the entered hierarchies creates a second hierarchy that ends in C but, for some reason I have yet to discover, only the latter D | E | C is retrieved when writing to the XMP…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>A</rdf:li>
    <rdf:li>B</rdf:li>
    <rdf:li>C</rdf:li>
    <rdf:li>D</rdf:li>
    <rdf:li>E</rdf:li>
   </rdf:Bag>
  </dc:subject>
  …
  <lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>D</rdf:li>
    <rdf:li>D|E</rdf:li>
    <rdf:li>D|E|C</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>

So, if nothing else, you have helped me find a possible bug in my software that nobody else has yet found :smiling_face:


And this is part of the problem with designing “foolproof” keyword handling code.

In this case, should dragging have moved the dragged object, or copied it? Because, by moving it, the C no longer belongs to A | B | C but only to D | E | C - or does it?

If I create the following hierarchies in PL7 without dragging…

… then I get exactly the XMP I would expect…

         <lr:hierarchicalSubject>
            <rdf:Bag>
               <rdf:li>A</rdf:li>
               <rdf:li>A|B</rdf:li>
               <rdf:li>A|B|C</rdf:li>
               <rdf:li>D</rdf:li>
               <rdf:li>D|E</rdf:li>
               <rdf:li>D|E|C</rdf:li>
            </rdf:Bag>
         </lr:hierarchicalSubject>

… and the DOP looks like this…

			Keywords = {
				{
					"D",
				},
				{
					"D",
					"E",
				},
				{
					"D",
					"E",
					"C",
				},
				{
					"A",
					"B",
					"C",
				},
				{
					"A",
				},
				{
					"A",
					"B",
				},
			},

So, without drag drop, PL7 seems to be doing the right thing.


As you can see, the DOP file you got resembles the XMP that my code produces, but I have a sneaking suspicion that both mine and DxO’s may be wrong, and this does, indeed, need looking at more deeply.

Finally, I get a sneaking feeling that using the same keyword at the same level, in more than one hierarchy, could be the cause of the problem, but this is going to require a bit more digging.

Look very carefully at what happens when you drag one simple keyword onto another…

The selected keyword is moved, so, in the case of your test hierarchies, using drag-drop, I ended up with -
XMP…

         <dc:subject>
            <rdf:Bag>
               <rdf:li>A</rdf:li>
               <rdf:li>B</rdf:li>
               <rdf:li>D</rdf:li>
               <rdf:li>E</rdf:li>
               <rdf:li>C</rdf:li>
            </rdf:Bag>
         </dc:subject>
         …
         <lr:hierarchicalSubject>
            <rdf:Bag>
               <rdf:li>A</rdf:li>
               <rdf:li>A|B</rdf:li>
               <rdf:li>D</rdf:li>
               <rdf:li>D|E</rdf:li>
               <rdf:li>D|E|C</rdf:li>
            </rdf:Bag>
         </lr:hierarchicalSubject>

DOP…

			Keywords = {
				{
					"D",
				},
				{
					"D",
					"E",
				},
				{
					"D",
					"E",
					"C",
				},
				{
					"A",
				},
				{
					"A",
					"B",
				},
			},

I am one of those who don’t like the interface palette. It’s probably my fault, but still.

For my personal use I prefer two main functions and one deeper-lying additional function.
Function one: using keywords to find images: source files and, if needed, also exported files, or only exported files (when working exclusively inside DxO PL, for its DAM). Advanced settings to exclude or include folders in the search, and AND / OR / exact matching.
Function two: adding keywords to images by selecting one or more images and checking the keyword(s) that need to be added.
Note that both functions need to be in separate palettes/windows so that you don’t accidentally add keywords while you search and select extra keywords to narrow the selection.
Function three is managing/editing keywords, which is a separate function that doesn’t need any images to be visible.
These three functions must be separated and not be active in one window/palette. They could be tabs inside the keyword palette, like the tabs in the tool palettes.

The top-left search bar in DxO could be function one, but some advanced options are missing, like folder exclusion.
The bottom-right keyword palette needs a lock function to avoid unintentionally activating function three: editing and adding keywords and changing the keyword hierarchy.

At the moment I am spending time updating, cleaning and organising the IPTC data and keywords of my exported JPEG archive. Not in DxO, because I don’t know whether things got strange because of me or because of DxO’s DAM.
@BHAYT, Bryan’s link to a watcher will be my next tool for looking at the JPEG properties, to see if they show up correctly on the other side.

Then I have to compare the IPTC and XMP keywords of each exported file with its source file, if it’s there (in the early days, OOC JPEGs were often just edited and overwritten).
That will also be checked with the watcher tool.
Then a complete backup of the archive.
Only then will I open DxO PL in read mode with a fresh DB, to see if the hierarchical structure is taken over (every keyword should be in use, so DxO PL should build exactly the same structure as I have built in Bridge).

From that point I can start to investigate whether I can get used to the v7.2 functionality.
I always liked the EXIF search possibilities at the top left in DxO.

I am not of any help with bug hunting, by the way; enhancements, maybe.

Glad I was able to help.

Yes, the XMP file “matches” what I had reported for the DOP.

I’m thinking I need to do a feature request, not a bug report. DxO does what it promised when the “whole hierarchy” option is selected—it selects parent keywords when a keyword is assigned to an image. They make no promises about what happens with drag-and-drop, and so there is no bug to report.

My feature request would be for a strict hierarchy option.

  • Assigning a keyword to an image also assigns all parent keywords.
  • Unassigning a keyword unassigns all parent keywords.
  • You would not be allowed to disassociate a parent keyword from an image as long as a child keyword is associated.

A drag-and-drop would be equivalent to:

  • Select all images with the keyword.
  • Unassign the keyword from all selected images (following the rules above).
  • Change the keyword’s parent in the hierarchy from the old to the new.
  • Assign the keyword to the still selected images (following the rules above).

Given my example:

  • Select all images with the keyword A|B|C (this is a short-hand for selecting all images with the keyword C whose parent is B and then A).
  • Remove the association of each image with the keywords A|B|C, A|B, and A.
  • Create a new keyword C whose parent is E and then D.
  • Associate every selected image with D|E|C, D|E, and D.

These rules would operate only when the strict hierarchy option is enabled. Whether you would want this option enabled depends on what you want to do with a hierarchy. Without this option, then the current drag-and-drop result given in my example is not really incorrect, although I don’t understand what value it has.
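The proposed strict-hierarchy drag-and-drop can be sketched in a few lines of Python. This is only an illustration of the rules above, not anything PL does: each image is modelled as a set of hierarchy strings, and `ancestors`, `unassign`, `assign`, and `move_keyword` are hypothetical names. Two assumptions are mine rather than part of the proposal: a parent is kept when another assigned child still needs it, and the moved keyword has no children of its own.

```python
def ancestors(path, sep="|"):
    """'A|B|C' -> ['A|B', 'A'] (deepest ancestor first)."""
    parts = path.split(sep)
    return [sep.join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def unassign(paths, path, sep="|"):
    """Remove `path` and every ancestor that no other assigned child still needs."""
    paths.discard(path)
    for parent in ancestors(path, sep):
        if any(p.startswith(parent + sep) for p in paths):
            break  # assumption: a parent stays while another child is assigned
        paths.discard(parent)


def assign(paths, path, sep="|"):
    """Add `path` together with all of its ancestors."""
    paths.add(path)
    paths.update(ancestors(path, sep))


def move_keyword(images, old_path, new_parent, sep="|"):
    """Re-parent the keyword at `old_path` under `new_parent` for all tagged images."""
    new_path = new_parent + sep + old_path.split(sep)[-1]
    selected = [paths for paths in images.values() if old_path in paths]
    for paths in selected:
        unassign(paths, old_path, sep)
    for paths in selected:
        assign(paths, new_path, sep)
    return new_path


images = {
    "tagged.jpg": {"A", "A|B", "A|B|C"},
    "untouched.jpg": {"A", "A|B"},
}
move_keyword(images, "A|B|C", "D|E")
print(sorted(images["tagged.jpg"]))
print(sorted(images["untouched.jpg"]))
```

Only images that carried A|B|C are modified, which matches the requirement that the same number of files is touched as with the current behaviour.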

I’m not sure if I agree or disagree. I don’t have your software to work with and screen captures don’t always convey how something operates.

I want to have a global hierarchy that applies to all images. I also want to apply keywords from that hierarchy to images. With your software, it looks like one selects keywords from a flat list. I need a tree structure, where I can expand and collapse the various branches so that won’t work for me.

Let’s explore another issue using this example:

actions
  lead
  follow
metals
  lead
  gold

Your hierarchy manager seems to have a separate pane for keyword counts. I am only interested in counts for specific keywords. The two uses of “lead” should be counted separately. There is no case in which I would want the counts for the two to be merged.
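The difference between the two ways of counting can be shown with a small Python sketch. The image names and the helper functions `count_by_path` and `count_by_name` are made up for illustration; the point is only that counting by full path keeps the two “lead” keywords apart, while counting by bare name merges them.

```python
from collections import Counter

# Hypothetical images, each tagged with full hierarchy paths.
images = {
    "dance.jpg": ["actions|lead"],
    "ingot.jpg": ["metals|lead", "metals|gold"],
    "pipe.jpg": ["metals|lead"],
}


def count_by_path(images):
    """Count images per full hierarchy path (the two 'lead' keywords stay apart)."""
    return Counter(path for paths in images.values() for path in paths)


def count_by_name(images, sep="|"):
    """Count images per bare keyword name (the two 'lead' keywords are merged)."""
    return Counter(path.split(sep)[-1]
                   for paths in images.values() for path in paths)


print(count_by_path(images))
print(count_by_name(images))
```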

Because of these two requirements, the keyword management panel and the keyword assignment panel I would want would look very similar. Whether this is confusing to users and whether it leads to all sorts of problems depends on your keyword model.

In my opinion, you should either go all in or not at all; in other words, you should either have a flat list of keywords (no hierarchy) or a strict hierarchy. The default appears to be to support a semi-hierarchy (where selecting a child doesn’t automatically select the parent), which is where I suspect a lot of the problems come from. In cases where people have some use for a semi-hierarchy, an interface like yours might have some value.

PL has to support a semi-hierarchy because I get the impression a lot of other people do. But PL could support strict hierarchies as an option.

It’s been a while since I last posted anything here, but I thought I would give an update on what I’ve been doing. I finished writing a tool that checks for inconsistencies between keywords in PL’s database and in the corresponding DOP/XMP files (for RAW images) or DOP/image files (for RGB images).

My tool found 23,212 images where at least one master/virtual image had at least one keyword. Of those, it found 3,024 images with at least one mismatch between the database values and the external values.

Before I ran my tool, I deleted my PL database and indexed all my image files. With XMP synch enabled in Preferences, I then proceeded to tag images with keywords, create new keywords, rename keywords, and move keywords around in the keyword hierarchy. During all this time, I used no other keyword-management tools.

After all this was done, I checked for discrepancies. There should have been none, but I found many. Here is an example of the list of problems for one file:

2022_09_29 10_28_47 2914 C70D.CR2: ERR: (XMP) Missing subject Sunset Crater Volcano NM
2022_09_29 10_28_47 2914 C70D.CR2: ERR: (XMP) Extra subject Sunset Crater Volcano NP
2022_09_29 10_28_47 2914 C70D.CR2: ERR: (XMP) Missing hierarchy Locations|United States|Arizona|Sunset Crater Volcano NM
2022_09_29 10_28_47 2914 C70D.CR2: ERR: (XMP) Extra hierarchy Locations|United States|Arizona|Sunset Crater Volcano NP
*** Problems in file ...

This specific example shows two things:

  • I renamed a keyword entry from “Sunset Crater Volcano NP” to “Sunset Crater Volcano NM”.
  • The edit made its way to the database and the DOP file, but not to the XMP (despite having XMP synch enabled). I know the DOP is OK because my tool noted no DOP problems.

I also spotted many cases in which a keyword present in the database was entirely missing from the corresponding DOP/XMP/image file.

The greatest number of mismatches were in XMP files, but there were also many DOP and RGB mismatches. The bottom line is that PL cannot maintain consistency between its database and its external files, even when there is no external tool interference.
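For readers who want to try a similar check, the comparison itself can be sketched as a set difference. This Python sketch is not the tool described above; `find_mismatches` is a hypothetical function, and it assumes the keywords from the database and from the external file have already been read into plain lists. The message format imitates the log lines shown earlier.

```python
def find_mismatches(filename, source, db, external):
    """Compare database keywords with those read from an external file.

    `db` and `external` are dicts with optional 'subject' and 'hierarchy' lists.
    `source` names the external file type, e.g. 'XMP' or 'DOP'.
    """
    problems = []
    for kind in ("subject", "hierarchy"):
        expected = set(db.get(kind, []))
        actual = set(external.get(kind, []))
        for value in sorted(expected - actual):
            problems.append(f"{filename}: ERR: ({source}) Missing {kind} {value}")
        for value in sorted(actual - expected):
            problems.append(f"{filename}: ERR: ({source}) Extra {kind} {value}")
    return problems


# The database holds the renamed keyword; the XMP still holds the old name.
db = {
    "subject": ["Sunset Crater Volcano NM"],
    "hierarchy": ["Locations|United States|Arizona|Sunset Crater Volcano NM"],
}
xmp = {
    "subject": ["Sunset Crater Volcano NP"],
    "hierarchy": ["Locations|United States|Arizona|Sunset Crater Volcano NP"],
}

for line in find_mismatches("2022_09_29 10_28_47 2914 C70D.CR2", "XMP", db, xmp):
    print(line)
```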

Given that this is not an easily reproduced bug, I don’t know how DxO will respond to a bug report. I don’t have time to prepare one right now, but will try to submit something in the coming week.

For what it’s worth, my tool will also fix the external files so that they match PL’s database. I then delete the database, re-index, and then everything should match. I have run the tool in “fix mode” and things are looking good for the moment.

What if you manually chose File > Metadata > Write to image ?
Is an XML created then?

Manually saving metadata to files creates the entries indeed, be it in RGB files or in XMP sidecar files. The same thing works for DOP sidecars.

As pointed out many times already, I set DPL to NOT read/write/sync settings and metadata, which lets me control the interaction with Lightroom Classic, my “single point of definition” for metadata.

Manual control is somewhat annoying though: There are no keyboard shortcuts!
Moreover, it would be nice to have separate checkboxes for reading and writing metadata in DPL’s settings. Vote for the respective feature request I posted in 2022.

@platypus it makes absolutely no difference if a user uses automatic updating or manual updating if the update process is flawed. In your case you are essentially taking from the Lightroom Classic managed metadata and importing it into DxPL presumably so that data will find its way into the exports made from DxPL.

From @freixas’s analysis the process is flawed at some point or points which may or may not have an impact on metadata “just” coming into DxPL.

@freixas the example you give appears to be a “failed” renaming operation, i.e. the change from “NP” to “NM” has not succeeded in its entirety, I presume only with respect to the xmp sidecar files which would be really “weird”, i.e. it gets the database right (I presume), it gets the DOP right (I presume) but fails to write back the updated xmp sidecar file (I presume) at all!!

Do many failures fall into this category?

Please make a support request so that you can get the problem(s) on the DxO support “radar”.

Situations where files that should be written are not being written should not really be that difficult for DxO to spot, even if that is only one of a number of issues detected by your data integrity/consistency software.

  • It’s an XMP file (in XML format)
  • The problem is not that the XMP doesn’t exist, it’s that it doesn’t match the database.

I suspect File > Metadata > Write to image would fix the mismatches in the XMP file, but haven’t personally verified it.

I’ve been doing some testing on what happens when you visit a file (by entering a folder). I need to re-check everything to make sure, but my notes say that an XMP file is automatically used for metadata/keywords if one exists and the image has never been indexed. The XMP is also automatically used for metadata/keywords if one exists, the image has been indexed, but the XMP’s modification date is later than the modification date for the image in the database. If my notes are correct, then the automatic XMP import option is not needed—unless your goal is to completely disable PL’s ability to use whatever external keyword/metadata handling is going on.
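The rule in those notes can be written down as a small decision function. This Python sketch only restates the unverified notes above; it is not taken from PL, and `xmp_is_used` and its parameters are hypothetical. A missing XMP or a never-indexed image is represented by `None`.

```python
from datetime import datetime
from typing import Optional


def xmp_is_used(xmp_mtime: Optional[datetime],
                db_mtime: Optional[datetime]) -> bool:
    """Return True if, per the notes above, the XMP would be read on folder entry.

    xmp_mtime: modification time of the XMP sidecar, or None if there is none.
    db_mtime:  modification time recorded in the database, or None if the
               image has never been indexed.
    """
    if xmp_mtime is None:
        return False          # no sidecar, nothing to read
    if db_mtime is None:
        return True           # never indexed: the XMP is used
    return xmp_mtime > db_mtime  # indexed: used only if the XMP is newer


print(xmp_is_used(datetime(2024, 1, 5), None))
print(xmp_is_used(datetime(2024, 1, 5), datetime(2024, 1, 4)))
print(xmp_is_used(datetime(2024, 1, 3), datetime(2024, 1, 4)))
```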

I found inconsistencies with all file types—inconsistencies with the database and with each other. For example, the XMP file might have been correct, but not the DOP.

The number of failures, as I’m sure you understand, is irrelevant since a single bug can produce thousands of errors. Every failure I found can be categorized as a failure to properly update external files when a keyword change occurred, whether it was that a keyword was added, deleted, or renamed.

It’s a JavaScript/JScript/JS… dialect, similar to JSON. Even if you change ‘name =’ to ‘“name”:’, JSON parsers will complain (unless they are buggy) about trailing commas like in ‘WatermarkTextColor = {1,1,1,}’. If you have Windows and notepad++, you can use the FSTool plugin and its JSFormat tool for a more friendly display.

I have written a parser for DOP files. As far as I can tell, the syntax is DxO’s own. It might look like all sorts of things, but I don’t know that it is. Simplistic parsers will fail. There are at least two different quoted string formats, one with double quotes and one with double square brackets, that you have to be careful with. JSON would interpret double square brackets as nested arrays, and I don’t think that’s correct.

My parser is written in PHP and I’m happy to share the code with anyone who wants it. As I had to reverse-engineer the syntax from my DOP files, there may be constructs that don’t show up in my DOP files that my parser will fail on. Caveat emptor. My parser reports a syntax error when it encounters something it doesn’t know about, and then I fix it.
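To give an idea of what such a parser involves, here is a minimal recursive descent sketch in Python. It is not the PHP parser offered above and does not claim to cover the real DOP syntax: it assumes only the constructs visible in this thread (`name = value`, brace-delimited tables with trailing commas, double-quoted strings, double-square-bracket strings, and numbers). All names in it are hypothetical, and anything it does not recognise raises a `SyntaxError`.

```python
import re

# Token patterns for a small, assumed subset of the DOP syntax.
TOKEN = re.compile(r'''
    \s*(?:
        (?P<long>\[\[.*?\]\])           # [[ ... ]] quoted string
      | (?P<str>"(?:\\.|[^"\\])*")      # "..." quoted string
      | (?P<num>-?\d+(?:\.\d+)?)        # integer or decimal number
      | (?P<name>[A-Za-z_]\w*)          # bare name such as Keywords
      | (?P<punct>[{}=,])               # structural characters
    )''', re.VERBOSE | re.DOTALL)


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected text at offset {pos}: {text[pos:pos + 20]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self, offset=0):
        j = self.i + offset
        return self.tokens[j] if j < len(self.tokens) else (None, None)

    def take(self, expected=None):
        tok = self.peek()
        if tok[0] is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected token {tok!r} at index {self.i}")
        self.i += 1
        return tok

    def value(self):
        kind, text = self.peek()
        if (kind, text) == ("punct", "{"):
            return self.table()
        if kind in ("str", "long", "num", "name"):
            self.take()
            if kind == "str":
                return text[1:-1]         # escape sequences are left untouched
            if kind == "long":
                return text[2:-2]
            if kind == "num":
                return float(text) if "." in text else int(text)
            return text                   # bare word, returned as-is
        raise SyntaxError(f"unexpected token {(kind, text)!r} at index {self.i}")

    def table(self):
        self.take(("punct", "{"))
        items, fields = [], {}
        while self.peek() != ("punct", "}"):
            if self.peek()[0] == "name" and self.peek(1) == ("punct", "="):
                name = self.take()[1]
                self.take(("punct", "="))
                fields[name] = self.value()
            else:
                items.append(self.value())
            if self.peek() == ("punct", ","):
                self.take()               # separator or trailing comma
            elif self.peek() != ("punct", "}"):
                raise SyntaxError(f"expected ',' or '}}' at index {self.i}")
        self.take(("punct", "}"))
        if fields:
            fields.update(enumerate(items, 1))  # positional entries get 1-based keys
            return fields
        return items


def parse_assignment(text):
    """Parse one top-level `name = value` entry, with an optional trailing comma."""
    p = Parser(text)
    kind, name = p.take()
    if kind != "name":
        raise SyntaxError(f"expected a name, got {name!r}")
    p.take(("punct", "="))
    result = p.value()
    if p.peek() == ("punct", ","):
        p.take()
    if p.peek() != (None, None):
        raise SyntaxError(f"trailing tokens at index {p.i}")
    return name, result


name, value = parse_assignment('''
Keywords = {
    { "A", },
    { "A", "B", },
    { "D", "E", "C", },
},
''')
print(name, ["|".join(path) for path in value])
```

Treating the double-square-bracket form as a single string token, rather than as nested arrays, is the reason a JSON parser cannot simply be reused here.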

If all you want to do is look at a DOP file, there are lots of ways to do that.

I’d love to test it out on my Mac.

I need a few questions answered first so that I can make sure you will be able to use the code.

  • Are you a programmer?
  • Are you familiar with PHP?
  • Do you have PHP installed on your Mac?
  • Are you familiar with parsers?
  • Do you know how a recursive descent parser works?

I can give you the code regardless of your answers, but, for example, if you are not a programmer, it won’t do you much good.

you forgot to ask him SSN# & DOB …

Thank you for sharing with us.

My code won’t do anyone any good without some help. Please answer the questions listed in response # 85 above. The code is PHP code, and the parser won’t do anything on its own, so it’s not very useful to non-programmers.