Photolab and Adobe DNG files - archiving old formats

My concern over DNG is that, much like TIFF and RAW files, it seems to be a container format, meaning that not all DNG files are compatible with all DNG editors. DxO is a case in point, as it creates DNGs that are intended for further processing in other apps and that it cannot open itself. This seems perverse to me, as it is capable of opening RAW files that it has edited. I believe it does this through the use of its sidecar file, so why not do the same with DNGs? I also extend the question to TIFFs, as there is no reason other than file size why a TIFF should not contain both an edited image and the source RAW file, along with JPEG previews and the processing instructions from multiple image editing applications.

Getting back to my original question, I am still mystified why DxO PL is unable to open a DNG based on a RAW file when it cannot open the RAW original. Other image editing apps can (e.g. Apple Photos, Aperture), so why not DxO? After all, isn’t the whole point of DNG to make the images available in applications that can’t open the raw original?

I confused CRW and CR2. But despite that, my point was that not all CR2 files are equal. I can’t find a canonical (pardon the pun) reference for this, but I have heard it anecdotally, and I know for a fact it is true of Pentax .PEF files, because I’ve had software in the past that could not open all .PEF files until additional camera support was included.

I also assumed (though thinking in more modern terms, so likely .CR2) that any proprietary RAW format would also include lossless compression.

As for support… that I know of… macOS, iOS, Apple Photos (macOS and iOS), Aperture (discontinued), Lightroom, Photoshop, Pixelmator Photo (iOS), Luminar, Affinity Photo, and DxO PhotoLab (with some limitations) is a pretty decent list for starters. I only have DNG files (excepting some really old photos in JPEG) and it has been a long time since I’ve been unable to open one.

This is a concern for me, too. As I think I mentioned previously, for a time I would convert my .PEF files to .DNG files specifically so I could open them in Aperture until such time as my camera model was added to the supported camera list. That is one of the key points of DNG…

From Wikipedia.

Canon’s new CR3 file format, on their mirrorless camera range, does offer a C-RAW compression option; it is lossy, but apparently not noticeably so except when shot under particularly poor lighting conditions. Some of their cameras that shoot CR2 offer S-RAW or M-RAW options, both with a lower pixel count. Maybe someone does offer truly lossless compressed RAW, but I know much less about non-Canon digital cameras.

I get the impression from internet reading that in the past converting a Canon Raw to the DNG format did save some disk space, but that Nikon and others already employed lossless compression in their Raw files, meaning there was no space-saving advantage to converting to DNG.

One “possible” advantage to doing the conversion is that meta data can be safely written to the DNG and updated as necessary, whereas only a few applications are brave enough to write IPTC data back into camera raw files. The advantage depends mostly on how you manage raw files and their sidecar files. I am engaged in a major overhaul of my photo library structure and have discovered a few issues with xmp.

These issues include discovering many xmp orphans that were left behind when the parent image was deleted. These are very small text files, so they take little space and just clutter the display in the Finder. I will be writing a utility app to locate and remove these files. A slightly trickier issue is discovering some raw files that have not been renamed to the format I have standardised on, which incorporates the capture date in SQL format as a prefix. The app “A Better Finder Rename” can rename the image files, but I have had to write another utility to update the names of the xmp sidecars.
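A minimal sketch of what such an orphan finder could look like. It assumes sidecars sit in the same folder as the raw file and are named either `image.xmp` or `image.ext.xmp`; the extension list and the function name are illustrative only and do not come from any existing tool.

```python
from pathlib import Path

# Assumed raw extensions; extend the set to match your own cameras.
RAW_EXTENSIONS = {".crw", ".cr2", ".nef", ".orf", ".rw2", ".pef", ".dng"}


def find_orphan_sidecars(folder, sidecar_ext=".xmp"):
    """Return sidecar files in `folder` that have no matching raw file beside them."""
    folder = Path(folder)
    raws = [p for p in folder.iterdir() if p.suffix.lower() in RAW_EXTENSIONS]
    # A sidecar may be named after the raw stem (IMG_1.xmp)
    # or after the full raw file name (IMG_1.cr2.xmp).
    claimed = {p.stem.lower() for p in raws} | {p.name.lower() for p in raws}
    orphans = []
    for p in folder.iterdir():
        if p.suffix.lower() == sidecar_ext and p.stem.lower() not in claimed:
            orphans.append(p)
    return sorted(orphans)
```

The function only lists candidates. It is probably safer to review the list (or move the files to a holding folder) before deleting anything.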

The bumps in the xmp road that I mention above caused me to look at writing iptc data such as keywords directly into the original raw files. There is an urban myth that raw files are never modified, and indeed most applications do just read them, mostly because they are complex structures that involve the maintenance of a number of tables of data containing “tag fields” that point to specific data. Adding or inserting meta or other data involves updating these pointers, which means there is a chance of errors that would render the raw file unreadable. So most software just writes a simple xml-based sidecar. However, PhotoMechanic and ExifTool will update the camera raw file with iptc data if you ask nicely (all except .raf files).

I experimented with some copies of Olympus raw files adding keywords using ExifTool. No problems occurred and the keywords were visible in “some” applications e.g. Apple’s Preview. DxO PhotoLab would not display them in the UI but it did add them to JPEG derivatives.

So to recap, converting a raw to DNG could avoid messing around with sidecar files.

best wishes
Simon

…easy to test: Convert all .cr2 files from a folder and see what you get.

60 camera original cr2 files: 1.35 GB
Same converted by Adobe DNG Converter: 1.08 GB

As we can see, there are some savings: about 20% here. When I compare metadata with exiftool, I see that the DNG file has roughly 25% fewer lines; the missing ones are Canon-specific information that might be necessary or not.

Make your own choice. Mine is to stay with what my cameras give me.

The leading DAM software packages - Photo Supreme, Daminion, and iMatch - also all have the option to write metadata to RAW files.

I found this page which suggests Canon bodies do in fact use lossless compression. (No mention of Pentax, which do use it and have for a long time.)

The intent to “not modify RAW files” is, I think, misplaced. There has long been a rule of thumb when using programs like Photoshop to never touch the original pixel data. The same should hold true with RAW files. The original pixels should never be modified, but adding meta-data is incredibly useful for many reasons including those mentioned by @skids.

@platypus the difference you see in file size may be down to the efficiency of the compression algorithm. I know when I was first experimenting with DNG they were slightly larger than the same image in PEF and that, along with speed advantages, led me to stay with PEF in camera for a while. But both formats were compressed.

One strong indicator that compression is in use is variation in file sizes. If you think about it, without compression every RAW image (assuming consistent camera settings for quality/dimensions) should have the same file size. Mine vary considerably, and I reckon I could pick out shots with a lot of blue sky simply by noting the smaller file size, because large areas of homogeneous colour compress very well. Conversely, a picture of a crowd or foliage will have a lot of detail and therefore compress less.
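A rough way to put a number on that variation is sketched below. The function names and the default `.pef` extension are assumptions for illustration. A spread near zero hints at uncompressed data, but embedded JPEG previews also vary in size, so a small non-zero value proves little either way.

```python
from pathlib import Path
from statistics import mean, pstdev


def size_spread(sizes):
    """Return the coefficient of variation (std dev / mean) of a list of file sizes."""
    if not sizes:
        raise ValueError("no file sizes given")
    return pstdev(sizes) / mean(sizes)


def raw_sizes(folder, ext=".pef"):
    """Collect the sizes in bytes of all files with the given extension in a folder."""
    return [p.stat().st_size for p in Path(folder).iterdir() if p.suffix.lower() == ext]
```

For example, `size_spread(raw_sizes("2017"))` would give one figure for a whole folder, where `"2017"` stands for any folder of images shot with the same camera settings.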

I am tending to agree with you. As I type my computer is renaming files in the folder that stores images taken in 2017. There are 7000 odd images and each one has an xmp file, so the folder contains 14000 files.

Some of the images did not follow my naming convention of ReverseDate_Time_CameraName.Ext, e.g. “2014_08_20_202607_P1234560.rw2”. Renaming the images is simple using a tool like “A Better Finder Rename”, but the .xmp and .dop files take more effort as they are not read by the renaming tools. Most of the sidecars just contain iptc meta data such as keywords, and it would be much more convenient to have this data stored inside the raw files themselves.
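A sketch of how the sidecars could be brought back in line after the raw files have been renamed. It assumes the renamed raw ends in `_<original camera name>` as in the example above, that the sidecars still carry the original camera name, and that camera names contain no underscore; the function name is hypothetical.

```python
from pathlib import Path

SIDECAR_EXTENSIONS = {".xmp", ".dop"}


def plan_sidecar_renames(folder, raw_ext=".rw2"):
    """Return (old_path, new_path) pairs for sidecars whose raw file has been renamed.

    Assumes a renamed raw such as 2014_08_20_202607_P1234560.rw2 and sidecars
    still named P1234560.xmp or P1234560.rw2.dop. Camera names that contain
    an underscore are not handled.
    """
    folder = Path(folder)
    # Map original camera name -> new stems; more than one stem means ambiguity.
    by_camera_name = {}
    for raw in folder.iterdir():
        if raw.suffix.lower() == raw_ext and "_" in raw.stem:
            camera_name = raw.stem.rsplit("_", 1)[1]
            by_camera_name.setdefault(camera_name, []).append(raw.stem)
    plan = []
    for side in sorted(folder.iterdir()):
        if side.suffix.lower() not in SIDECAR_EXTENSIONS:
            continue
        old_stem, _, rest = side.name.partition(".")
        new_stems = by_camera_name.get(old_stem, [])
        if len(new_stems) == 1:  # skip missing or ambiguous matches
            plan.append((side, side.with_name(f"{new_stems[0]}.{rest}")))
    return plan
```

Returning a plan rather than renaming immediately allows the list to be checked first; applying it is then just `for old, new in plan: old.rename(new)`. Ambiguous cases, such as the same camera-generated name used for several images, are deliberately left alone.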

The often quoted issue with writing iptc data into raw files seems to be that the raw file formats are undocumented, along with the often quoted myth that raw files are never modified. However, it seems to me that most raw files are based on the tiff format version 6. This describes how to store a multitude of data, including multiple images, within a single file. These parts of the file can be thought of as blocks of unrelated data. Raw and tiff files both use fixed-format tables that list data tags pointing to blocks of data. I suspect that the “undocumented” portion of a raw file refers to the sensor data and the “makers notes”.

In other words, it is not too difficult to walk through a tiff or a raw and identify blocks of bytes that are, for example, a jpeg or the raw data and so on. Adding iptc data is just a case of adding the data in either an already defined block or a new block, then adding a tag pointer to the data, and then updating the pointers that point to all the other data blocks that follow the insertion point.

While writing to the raw file is more complex than creating an xmp sidecar file, it is no more complex, as far as I can see, than writing iptc data to a tiff file. If true, it does raise the question of why sidecar files were created in the first place, as there is no reason why DxO, Adobe or anyone else could not add their own processing data into the raw files using “private tags”.

I am beginning to suspect that sidecar files were created for the convenience of programmers rather than photographers or computer users.

best wishes
Simon

The biggest problem with EXIF data/metadata in raw files is that there isn’t a “standard” and the data isn’t open. Different raw developers and raw-digging tools can read different amounts of data out of the EXIF.
Why? I think it’s the manufacturers’ choice not to reveal all the data inside the raw file.
Writing into the raw file’s EXIF section can make things even more fragile and failure-prone.
An XMP file as a sidecar removes a lot of this risk.
I do agree with you that there should be a standard and that every reader should read the same info.

DxO chose DOP files, in which they place the image-specific database info as a kind of replaceable “image developing data”, so you didn’t need the database at all. (If the cache grew too big and slowed the app down, you could delete it, reindex, and be done.)
Now that keywords are written only to the database, this feature is broken.
So they have to figure out whether they want to use the DOP files to store this EXIF/metadata info (everything that’s not in the raw file’s EXIF), or use the XMP file system for that and basically write all EXIF data to a same-named sidecar that accompanies the raw file, giving 3 files per “image” in your original DAM.

Older cameras could produce TIFFs and JPEGs, not raw files; I suppose you were talking about those:

Yes, it’s easier to create a sidecar file than to make the raw file encoding fully open/disclosed for every programmer who wants to do something with the files.

I started to use XnView MP (freeware) as something of a DAM in front of DxO PL, because the search function is quite handy once you use it (with XMP, and with the companion-files setting so that it moves and deletes everything together: raw, dop and xml).

We will have to wait a bit until the passthrough of EXIF/metadata in DxO PL becomes fully controllable (edits made inside that affect both the front and the rear) while the DxO database stays just as it was: a cache-driven copy which can easily be rebuilt. (This would support moving removable SSDs around, and you could log in at any PC if you bring your workspace profile with presets and such with you.)
(JPEGs and IPTC) and raw and source TIFF XMP.

Professional users will use real DAM applications and would be frustrated by those “half-baked simplified DAM systems”, but they have far more images to wade through and are also working on the clock.

Creating linear DNGs with extra edited EXIF/metadata inside would clean up the list of files, but at what cost? (Not being able to read DxO DNGs back into DxO PL is one of them, so you can’t preprocess the raw files to a file system and go back later to process the image.)

An amateur casual shooter,

Peter

Hi Peter,

I had forgotten that XnView allows “companion” files (sidecars) to be associated and then moved and deleted as per the raw “lead” file.

I am using “NeoFinder” which does not seem to have the same option but I am able to write my own utility apps to clean up any .xmp / .dop orphans.

One of the strengths of DxO PL3 is the fact that it creates .dop files, which means I can avoid owning a database of meta data and corrections, with the added advantage that I can create my own tools to glue my DAM app to DxO.

One negative of PL2 and PL3 is that if an “Open With” command is issued on a raw file that is in a folder of thousands of images, PhotoLab adds all the images to its database, which takes an age.

The solution is to drag and drop the image into an album or use a tool that uses the DxO provided Lightroom -> DxO command line tool.

My present plan is to stick with original raw files, .xmp and .dop files, and store my images in folders based on month of capture. This should mean that there are never more than a couple of thousand images even in a busy month, which in turn means I can allow PL3 to catalog a complete month, or even use Capture One in session mode, given that neither DxO nor Phase One is anywhere near producing a Digital Asset Management solution that comes close to the power and reliability of XnView, NeoFinder or the other stand-alone DAM applications.

As for Raws, the majority are Tiff files, so I think developers could be writing meta data to them.
However, I accept that it is complex and difficult to test, especially when compared to editing a simple xml file. There are also advantages to having .xmp and .dop files: they are tiny in comparison to the raw files they refer to, meaning that changes to meta data and edits get backed up very quickly, and any write errors do not damage the original raw image file.

best wishes
Simon

Hmm, I am ABSOLUTELY :slight_smile: not an expert, not even informed at a basic level (I know some basic things, that’s it), but isn’t the difference between a tiff and a raw file the demosaicing? So is it possible that cameras which deliver tiff-based raw files process the sensor data, including the demosaicing, before storing it?
In that case, does PRIME still work with those files?

Neither am I anything like an expert although I recently wrote an application that extracted the JPEG preview file from the raw files I have on my computer (crw, nef, orf, rw2).

While a tiff and a raw look like single files, they are actually a series of blocks of data. The first few bytes of either type include a pointer to a “table” of data. This table is a list of records, each of a fixed length (12 bytes). Each record includes a numeric ID known as a tag, an indication of how the binary data should be read (e.g. ASCII text), where the data is in the file and how long it is. These tables are known in tiff speak as “Image File Directories” or IFDs.

So to find the jpeg preview I first had to find the start of the table and then look for the tag that is used to identify jpeg data. Next the code reads the location and size information and then extracts the block of bytes from the file. In this example this block of data is a jpeg image.
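The walk described above can be sketched roughly as follows. This is a minimal illustration and not the application mentioned earlier: it assumes the preview is referenced from the top-level IFD chain by the standard tags 0x0201 (JPEG offset) and 0x0202 (JPEG length), whereas many raw formats keep previews in sub-IFDs or maker notes, which this sketch does not follow. Canon’s older CRW uses a different, non-tiff container, so it would not apply there. The function names are hypothetical.

```python
import struct

JPEG_OFFSET_TAG = 0x0201  # JPEGInterchangeFormat: where the JPEG bytes start
JPEG_LENGTH_TAG = 0x0202  # JPEGInterchangeFormatLength: how many bytes it has


def read_ifd(data, offset, endian):
    """Return ({tag: (type, count, value)}, next_ifd_offset) for the IFD at `offset`."""
    (n,) = struct.unpack_from(endian + "H", data, offset)
    entries = {}
    for i in range(n):
        # Each record is 12 bytes: tag (2), data type (2), count (4), value or offset (4).
        # The last field is read as a 32-bit integer, which is only right for LONG values.
        tag, typ, count, value = struct.unpack_from(endian + "HHII", data, offset + 2 + 12 * i)
        entries[tag] = (typ, count, value)
    (next_ifd,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * n)
    return entries, next_ifd


def find_jpeg_preview(data):
    """Return the first JPEG block referenced from the top-level IFD chain, or None."""
    endian = {b"II": "<", b"MM": ">"}.get(data[:2])
    if endian is None:
        return None  # not a tiff-style header
    # Bytes 2-3 hold a magic number (42 for tiff; some raw formats use another
    # value), which is not checked here.
    (ifd_offset,) = struct.unpack_from(endian + "I", data, 4)
    seen = set()
    while ifd_offset and ifd_offset not in seen:  # guard against offset loops
        seen.add(ifd_offset)
        entries, ifd_offset = read_ifd(data, ifd_offset, endian)
        if JPEG_OFFSET_TAG in entries and JPEG_LENGTH_TAG in entries:
            start = entries[JPEG_OFFSET_TAG][2]
            length = entries[JPEG_LENGTH_TAG][2]
            return data[start:start + length]
    return None
```

It would be called on the whole file contents, for example `find_jpeg_preview(open(path, "rb").read())`, and the returned bytes can be written straight to a `.jpg` file.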

The latest update to the Tiff specification is dated 1992 which is well before the invention / adoption of digital raw files so the specification is unlikely to include information on how raw data should be written. So if raw data were stored in a tiff structure it would no longer be a tiff because it would not meet the spec.

So most of the camera manufacturers took the tiff structure, changed its internal ID and saved their raw data and a collection of their own private meta data within a tiff-like structure. This includes their private data along with jpeg previews, iptc meta data and exif data. So reading a raw file (or a tiff) is a little like being given a book that starts with an index. Some of the pages referred to by the index are in a Foreign Language while others are in English. I can read the index and the pages that are in English, but while I can see the Foreign Language pages I don’t understand them. If I add extra pages midway through the book, I have to update the index page because the page numbers will have changed. Both raw and tiff files may include multiple indexes (IFDs).

Getting back to raw files. The tiff structure could be used to store any binary data, such as a spreadsheet or word processor file. However, if this type of data is included then it’s not a tiff, just a data structure. So you are correct in suggesting that tiffs only include demosaiced data; it’s just that any that don’t are called raw files. Also, a tiff file may include any number of images, as may a raw file (my raw files seem to hold two jpegs plus the raw data).

Lastly, why not change the tiff format to include raw data? Well, Adobe own tiff, so they did, and it’s called DNG (I’m probably being simplistic here!).

From a programmer’s point of view, it is far simpler to test code schemes for xmp, tiff, dng and, say, dop files than to test code schemes against three hundred odd raw formats. So we are stuck with xmp unless we adopt DNG.

Sorry for yet another long post,

best wishes
Simon

That sounds like an expert to me! :grinning:
About the text: good explanation. I can follow you, but not swim on my own, so to speak.
I can see/understand that raw files are just blocks of data listed in an IFD and identified by a numeric ID. And I know (read somewhere) that raw uses 12 bits for storing the exposure data and the other bits for something else (I forgot what, probably a redundancy check).

(And you are a clever programmer to decode those files for your own use.)

Well, I know there are “linear” DNGs, which are tiffs (demosaiced) with a floating WB (every raw-developing app can produce them), and Adobe’s DNG is maybe a real raw in a DNG container:

Quote
Because DNG is a file format based on the TIFF format, it can not only be used to store RAW data, but also RGB data. So you can have a DNG file that is a RAW file, but you can also have a DNG file that is not a RAW file but a so-called “linear” RGB file. That is quite confusing, certainly because some RAW converters use that option. What you actually get is a TIFF file in a DNG envelope.
Quote
Some? Most DNG export in raw converters is linear. So that’s why I hesitate to say that Adobe’s DNG converter produces a fully raw DNG.

But then again, who am I to claim this knowledge?

That is true, and/or you need an independent raw-to-DNG converter with DAM functionality (adding tags and such) which can read almost all types of raw files. I believe Adobe Bridge can be this, if it can add tagging metadata to the DNG.
help text adobe DNG converter
For my own use, I hope that DxO PL keeps evolving its DAM functionality and its DNG setup:
reading real DNGs (in case I encounter raw files which aren’t on its list) and linear DNGs, and writing linear DNGs which you can bring back into DxO to create semi-finished products.

Better a long, clear text than a short one that needs extra explanation :slight_smile:
Regards
Peter

Whatever the ifs and possibilities might be, the question boils down to this i.m.o.

  1. Trust Adobe and convert
  2. Don’t trust Adobe and keep your originals

As with all either-or choices, this choice is bringing in risks that we can probably not fathom satisfactorily.

To get the best of both worlds, keep the originals and convert to dng too. Bringing edit history, keywords etc. into these files depends on what converter we use. Being able to read meta and development data and interpreting them correctly can limit the selection of apps though. DPL might be a wallflower in this dance, I’m afraid.

No matter how good option 1 is it is an additional processing stage that can go wrong or introduce errors. Whereas option 2 means that it is always possible to start processing from the beginning.

The problem is really one of terminology: a tiff file conforms to both the structure rules and the data type laid out in the tiff specification. A raw file follows the structure rules but not the data type. A dng is an extension of tiff that allows raw data and requires certain meta data, but it also allows demosaiced data, as do raw files, because the programmer can place anything they want into them.

What I find perplexing is that DxO PL is unable to open dngs based on raw files from cameras it has no knowledge of and also the reports that Capture One makes a better conversion from camera raws than it does from dngs based on the same raw files. These two issues seem to contradict the major “selling” point of dng files which is future proofing the raw data.

I think that I am going to ignore dng for the time being and possibly look at using Exiftool to read xmp keyword data and write it into the original raws. I will have backups before I try this!
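A hedged sketch of how that ExifTool step might be driven from a script. It assumes ExifTool is installed and on the path, that each raw file has a sidecar with the same base name and an .xmp extension, and that the keywords live in the XMP `dc:Subject` list; the function names are hypothetical. Without the `-overwrite_original` option, ExifTool leaves the untouched file beside the new one with `_original` appended to its name, which suits a cautious first run on copies.

```python
import subprocess


def keyword_copy_command(folder, raw_ext="orf"):
    """Build an ExifTool command that copies XMP keywords into same-named raw files."""
    return [
        "exiftool",
        "-tagsfromfile", "%d%f.xmp",      # sidecar with the same directory and base name
        "-IPTC:Keywords<XMP-dc:Subject",  # copy the XMP keyword list into IPTC keywords
        "-ext", raw_ext,                  # only process raw files of this type
        str(folder),
    ]


def copy_keywords(folder, raw_ext="orf"):
    """Run the command; raises CalledProcessError if ExifTool reports a failure."""
    subprocess.run(keyword_copy_command(folder, raw_ext), check=True)
```

Passing the arguments as a list avoids any shell quoting problems with the `<` in the tag-copy argument.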

best wishes
Simon

While camera manufacturers add their own proprietary ‘makernotes’, there are three main metadata interchange standards in use by the photography industry - IPTC, EXIF and XMP. And even though different camera manufacturers have their own RAW formats, to the best of my knowledge they all follow very similar principles, so the risk of anything going wrong is very, very low, provided you stick to well regarded software.

IMHO the very much bigger risk, long term, is that the sidecar files become misplaced or deleted - just like old printed photographs that have become separated from the albums that once held the hand-written metadata to explain them, compared to those that have the metadata safely written on the reverse of the actual photo. Probably not a problem while you’re in charge of them, but perhaps when your collection is eventually passed on to someone else, especially if they aren’t a photographer and have no idea what a sidecar file does…

A very good analogy!

Exactly! My “spring clean” of my image collection has revealed horrors, including orphan .xmp files where the image has been deleted but the .xmp missed and, worse, two or more .xmp files in different folders referring to a single image and containing different IPTC data. Also, all those .xmp files increase the workload when trying to sort things out. The jump from seven and a half thousand images in a folder to fifteen thousand files in a folder is a significant burden when those extra files have to be checked.

Other problems include some early digital images having lost their Exif data, and the same camera-generated file name being used several times for different images.

My simple solution to the xmp keyword issue is to add principal keywords to the file name. In the longer term I shall also investigate using a tool to copy keywords from the xmp files into the raw file (or perhaps a dng copy). This also means that the xmps have to be renamed as well if I decide to keep them. A further problem is the size of my collection: some 80,000 images and 1 TB, which means that any operation applied to every image takes many, many hours.

Spot on. I am now thinking about how my collection of family snaps can be passed to the next generation. It’s quite ironic that there is no issue with images taken before 1999, yet we have to have these discussions about the modern image format. Progress!

best wishes
Simon

Hi,

This page may be of interest: https://exiftool.org/idiosyncracies.html#raw

Simon

I was trying to write a program in Pascal to add keywords to my nef files using ExifTool. I read the keywords from about 65000 images and stored them in a text file on disk. It took about 22 minutes. It was my first try. I stopped when I discovered that I could use ViewNx2 with my camera, the D750. I think it’s the latest model that is supported. I will continue with my program, but slowly. :blush:

Correct me if I’m wrong.
The raw contains the sensor data as 12- or 14-bit digital values (Nikon). It is written to disk as 12- or 14-bit values. To keep it simple, ignore any compression.
A tiff file contains rgb data in 8 or 16 bits depth, resulting in 24 or 48 bits pixels.
A dng file contains…?
A linear rgb file is an image converted out of the sensor data but with no color space corrections?
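To put rough numbers on the bit depths above, here is a small calculation. It assumes a round 24-megapixel sensor and ignores compression, previews and metadata; the function name is made up for the example. The point it illustrates is that a raw file stores one value per photosite, while an rgb tiff stores three values per pixel.

```python
def uncompressed_megabytes(pixels, bits_per_sample, samples_per_pixel=1):
    """Size in megabytes (10**6 bytes) of pixel data stored without compression."""
    return pixels * samples_per_pixel * bits_per_sample / 8 / 1_000_000


PIXELS = 24_000_000  # assumed 24-megapixel sensor

raw_14bit = uncompressed_megabytes(PIXELS, 14)     # one value per photosite
tiff_8bit = uncompressed_megabytes(PIXELS, 8, 3)   # R, G and B for every pixel
tiff_16bit = uncompressed_megabytes(PIXELS, 16, 3)

print(raw_14bit, tiff_8bit, tiff_16bit)  # 42.0 72.0 144.0
```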

George