Build your own AI-app to generate searchable image keywords ... Too easy!

Note: Strictly, not PhotoLab related … but interesting all the same.



I used Firebase Studio to develop an app for me, specifying that I wanted:

  • [screenshot]

To which it responded:

I thought that sounded like a pretty good start, so I told it to proceed;

  • There was a wait of less than 5 minutes, and it came back with … [screenshot]

And, here’s the result (after pointing the generated app at some “snapshots” taken way back in 2003) …


Impressive? Yes … A bit scary? Certainly!

A few years ago, a colleague uploaded all his images to Google and let it add keywords and other metadata like geolocation etc. Adobe offers similar services, and others might too.

It’s an easy way to get keywords for the price of whatever privacy is lost in the process. It’s also a good way for providers to train their artificial intelligence systems, and it conflicts less with copyright and authors’ rights regulations, but only if
a) these things are well documented in the conditions of use and
b) the service providers actually handle them carefully


I wonder if DxO will be joining the pack by doing something of their own or by subscribing to a service…

Questions…

Where do the images go for evaluation? It’s a great opportunity for the provider of the AI app to harvest your images for other purposes.

Where is the keyword thesaurus kept?

Where are the keywords stored? In the file or in a sidecar or in a database?

Does it cope with RAW files?

The difference here, though (and this is the part that particularly interested me), is that the AI-bot generated this app completely from scratch, based only on my simple specifications (plus a little extra refinement that I added to the process … not shown in the screenshots above, for simplification/clarity).

  • I watched the decision-making and code-generation process in real time

  • It wasn’t something that had been developed and made available (as I understand your Google & Adobe examples to have been).

That’s why I categorised this as “a bit scary” … because it indicates where this technology is leading, and it may well replace many jobs (something I was previously sceptical about, for anything other than simple tasks).


The DAM IMatch for Windows can do this.
It can also do this locally, if you use an AI model available offline, such as those served by Ollama or LM Studio.
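For anyone curious what the offline route looks like in practice, here is a minimal Python sketch that asks a locally running Ollama vision model for keywords. The model name (“llava”), the prompt wording and the default port are assumptions, and this is not how IMatch itself does it; it only illustrates that nothing has to leave the machine.

```python
import base64
import json
import urllib.request

# Hypothetical sketch: ask a locally running Ollama vision model for keywords.
# Assumes Ollama is listening on its default port (11434) and that a vision
# model such as "llava" has already been pulled; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def suggest_keywords(image_path: str, model: str = "llava") -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    payload = {
        "model": model,
        "prompt": "List 10 short, comma-separated keywords describing this photo.",
        "images": [image_b64],
        "stream": False,
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(suggest_keywords("snapshot_2003.jpg"))  # file name invented for illustration
```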

But it is indeed scary to see what AI can do today in simple programming tasks.


Good questions, Joanna, but I wasn’t actually interested in the generated outcome. Rather, I was curious about whether an AI “programmer” could truly generate a half-useful result based only on a written set of requirements (a User Specification, if you like); and the answer was, scarily, a resounding “yes”!

“Identify every aircraft in this collection of photos and tell me where and when they were taken.”

That’s an AI app I’d like. My phone can already tell me there’s an aircraft in the photo.

You can try that for yourself … See the link in the opening post of this thread (above).

John

When I say “identify” I mean it in its most literal sense. It’s a research task, not a categorisation task. It would involve other photos taken around the same time, plus some knowledge of the subject and quite possibly some web searches.

Have a look at the keywords on this photo.
Imgur

I think there are several areas involved. In the first case, it’s about adapting or supplementing existing software for personal needs. This definitely makes sense if your own needs differ from the “mainstream”.

I have already described the second case: communication with an AI assistant to control image manipulation within PhotoLab.

The third case is identification. This is facial recognition in the broadest sense, which is linked to as many data points as possible. The question to the AI could also have been specialized so that the pilot’s name should also appear in the metadata. This area seems to me to be the most sensitive.

‘something of their own’? I doubt it. Training such AI models costs million$ and needs specific competencies DxO doesn’t have. What they can do is use some existing LLMs that can run locally and 100% offline…

What worries me most about AI-generated apps is whether the AI stays limited to your request, or goes off and does its own thing in the background, communicating with “big brother”.

Also, if AI had been around 25 years ago, I wouldn’t have been able to work as a software engineer and pay the mortgage on my house or feed and clothe myself.

As it is, there are now folks getting AI to write their dissertations and other university assignments - all without them really knowing what they are talking about.

Look at what happened to “Dave” in 2001: A Space Odyssey.

And I wasn’t suggesting that you “feed” a folder full of aeroplane photos to (something like) the categorisation app that I got the AI-bot to whizz up in my example.

Instead, I’m suggesting you might ask the AI-bot (via the link above) to create an app, specific to your requirements, to identify the aircraft in your photos.

  • We all might be astonished at what it can do … or not !

Indeed - - I’m certainly glad I’m not still working in the IT field.


So far, all these systems have in common that the first results of knowledge acquisition are so “faulty” that they have to be put through correction cycles, whose results are fed back into the knowledge acquisition until the outcome is “sufficiently good”. Somehow this reminds me of a story about the future in which an AI demanded and received ever more computing capacity. When the AI was asked a question about the future and no answer was forthcoming, it replied, on being pressed, that it would take about 2 hours to calculate the next second.

You can make a computer do anything you want it to do, but be careful you don’t make it do more than you wanted, otherwise the human race becomes surplus to requirements.

Microsoft are starting to find out.

@John-M The implication is that AI will replace old-fashioned developers, and that may be partially true, but a tale from the past might (or might not) be appropriate.

I was the UK technical lead for the implementation/installation of a Voice Mail system with a UK mobile start-up.

The product had been written for PacBell, a fixed-wire system, and added to their US exchange system, where the “Message Waiting” icon was a light on the handset turned on by the exchange. Not exactly cutting edge, and it wasn’t going to work with mobile handsets, so the MWNS (Message Waiting Notification System) was born; it fell to me to design that sub-system, and the design went to the UK development team to be written.

The individual who set about writing it decided he would try to program a strictly “GO TO-less” COBOL program, with nested IFs to a truly ridiculous depth, and our Systems Software development group had changed the workings of DMSII to handle new structure types.

The result was a program that attempted to handle ‘Abort Transaction’ exceptions X levels deep (where X tends to a large finite number) in the program logic, i.e. it was never going to work!

With the delivery deadline approaching, we needed a solution, so “muggins” (me) had to rewrite (sorry, write) the product from scratch with less than 3 months to delivery.

It turned out to be 50,000 lines of COBOL. I had the advantage that I knew the design intimately and had written a library of error-handling code for another customer that had been in use for years, and obviously we … didn’t make the deadline! The first version hit the customer’s Test system less than 3 weeks from handover and failed the first tests.

We were actually about three weeks late with the delivery.

So the lessons to be learned are:

  1. If you want a job done then do it yourself, which becomes eminently possible with the help of AI.
  2. AI is only as good as the models on which it is based (trained); what would have happened if it had been based on the code of the original coder on my project!?
  3. AI needs to be kept up to date, which even I wasn’t with respect to the implications of the changes to record and structure locking in the new DMSII release.
    The development test system was being used by multiple programmers at the same time, and they were creating structure locks all over the place. You can’t handle an ‘Abort Transaction’ exception successfully when buried deep in the logic of a program; you need to exit to a higher level in the program, reset any fields that were changed while attempting to process the current (failed) transaction, and start the transaction again, i.e. you build the program to expect such occurrences.
    On the live system my MWNS reported such occurrences (very rare on the operational system) and handled them correctly, but they were being caused by the main application, written in LINC (which generated COBOL), which simply hung and was almost certainly holding on to locks when it shouldn’t have.

I am currently using ChatGPT to write utilities for my own use with PureBasic. Some of the programs work first time or after minor mods, but the AI model suggests more features and things can start to go awry the deeper we go.

So I entered the following request to ChatGPT

Please write a PureBasic program to backup .dop and .xmp sidecar files from a user selected directory to a user selected location which is either a subdirectory of the directory being backed up or one selected by the user. Please output details containing the directory backed up and the backup location and counts of the number of .dop files and the number of .xmp files backed up. When one backup operation is completed please prompt the user to either select another directory for backing up or terminate.

The first copy failed with a syntax error; ChatGPT seems to get confused by the various dialects of Basic that exist, but the amended version worked fine, albeit without a proper GUI.
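As an illustration of the behaviour being asked for, here is a rough Python sketch of the same copy-and-count logic. It is not the PureBasic code ChatGPT actually produced (that code isn’t reproduced here); folder selection is taken from the command line instead of a requester dialog, and the function names are invented.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical sketch of the behaviour requested above: copy .dop and .xmp
# sidecar files from a chosen directory to a chosen backup location and
# report counts. Folder selection comes from the command line here rather
# than from the file requesters the PureBasic version used.
def backup_sidecars(source: Path, destination: Path) -> tuple[int, int]:
    destination.mkdir(parents=True, exist_ok=True)
    dop_count = xmp_count = 0
    for entry in source.iterdir():
        suffix = entry.suffix.lower()
        if entry.is_file() and suffix in (".dop", ".xmp"):
            shutil.copy2(entry, destination / entry.name)
            if suffix == ".dop":
                dop_count += 1
            else:
                xmp_count += 1
    return dop_count, xmp_count

if __name__ == "__main__":
    src, dst = Path(sys.argv[1]), Path(sys.argv[2])
    dops, xmps = backup_sidecars(src, dst)
    print(f"Backed up {src} -> {dst}: {dops} .dop files, {xmps} .xmp files")
```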

So you get

The next enhancement, using the existing UI, was to date/timestamp the backups, and then ChatGPT suggested:

" Let me know if you’d like to:

  • Include a timestamped subfolder even for custom destination,
  • Add log files inside the backup directory,
  • Or do recursive backups including subdirectories.

Happy to help with all of that!"

and it had previously suggested

" Let me know if you’d like:

  • Recursive subdirectory traversal
  • Logging to a file
  • Progress bars or GUI version

I’m happy to extend it!"

So friendly! I will ask it (!?) to add the logging to the current version first, and then ask for a GUI version.
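For readers following along, here is a minimal Python sketch of what the timestamped-subfolder and log-file suggestions above would amount to; again, this is not the generated PureBasic code, and the folder naming and log format are invented for illustration.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical sketch of two of the suggested enhancements: place each run
# in a timestamped subfolder of the chosen destination, and keep a log file
# inside that subfolder recording what was copied.
def timestamped_backup_folder(destination: Path) -> Path:
    stamp = datetime.now().strftime("%Y-%m-%d_%H%M%S")
    folder = destination / f"backup_{stamp}"
    folder.mkdir(parents=True, exist_ok=True)
    return folder

def log_copy(folder: Path, message: str) -> None:
    # Append one line per copied file to backup.log inside the backup folder.
    with open(folder / "backup.log", "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat(timespec='seconds')}  {message}\n")
```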

Hmmm. That seems like a long way around for that particular task.

In macOS Finder, all I would do is to create a search for files with DOP or XMP extensions…

Then create a destination folder, select all of the found files and drag them to it.

Why write apps when the OS can do it?

Simples!

For the fun of it?

Anyway, macOS also has Automator. It can do a lot of things by just adding a few statements, but it takes a bit of real knowledge to get started. Maybe some AI could also create Automator scripts or Finder services, or whatever they are called under Windows.

Another Example

@platypus You know me too well.

Arguably, it was to see how reliable the process could be, given that I hadn’t asked ChatGPT to do exactly that task before, and I wanted to demonstrate the process in this topic.

The current incarnation now looks like this, and includes a log file in the chosen backup location

@platypus The good news is that you can also use ChatGPT simply by creating a user account. All my efforts have been done on a free account, which does limit certain things: once your allotted requests are exhausted, you have to wait a certain length of time before continuing.

Please note how I structured my original request: too ambiguous and who knows what you will get; too tight a specification might strangle the product’s creativity, while allowing the product a wider scope might (or might not) provide an interesting insight into other possibilities.

ChatGPT seems to understand PureBasic reasonably well (and Python too), but does occasionally produce lines of code that are not syntactically correct. During this set of tests I am using only ChatGPT-generated code, and the product now looks like this

@joanna Why not? The code can then be added to a product which is a combination of a number of smaller experiments. The application above has grown considerably since the earlier version; the screen was ChatGPT’s idea, based on my requests, and is simple but gets the job done.

ChatGPT is suggesting

"Would you like:

  • A progress bar or progress % display?
  • More image extensions or a user-defined filter?
  • Option to compress the backup folder as a ZIP?

Let me know!"

It certainly needs a lot more image extensions, or simply to be told to treat all files that are not .dop or .xmp as “image” files. Because I am running tests on each version, it would also be good if it could remember what was used the previous time, to minimise my input completely.
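Both of those wishes are small additions in principle. A rough Python sketch, assuming a simple JSON settings file and a “treat everything that isn’t a sidecar as an image” rule; the file name and structure are invented for illustration:

```python
import json
from pathlib import Path

SETTINGS_FILE = Path.home() / ".sidecar_backup_settings.json"  # invented name
SIDECAR_SUFFIXES = {".dop", ".xmp"}

def is_image(path: Path) -> bool:
    # Treat every regular file that is not a sidecar as an "image".
    return path.is_file() and path.suffix.lower() not in SIDECAR_SUFFIXES

def load_last_choices() -> dict:
    # Remember the folders used last time, to minimise prompting.
    if SETTINGS_FILE.exists():
        return json.loads(SETTINGS_FILE.read_text(encoding="utf-8"))
    return {}

def save_last_choices(source: Path, destination: Path) -> None:
    SETTINGS_FILE.write_text(
        json.dumps({"source": str(source), "destination": str(destination)}),
        encoding="utf-8",
    )
```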

It would also be useful to run it alongside PL to make sure that there is no interference between the two programs, i.e. so the backup could be run at any point in a PhotoLab session, including on the current directory!

Maybe another Windows user could enlighten me as to whether there are any comparable OS features or third-party utilities, similar to or the same as the two Mac features/products mentioned here.