I'm nothing like an expert either, although I recently wrote an application that extracts the JPEG preview from the raw files I have on my computer (crw, nef, orf, rw2).
While a TIFF and a raw file each look like a single file, they are actually a series of blocks of data. The first few bytes of either type include a pointer to a “table” of data. This table is a list of fixed-length records (12 bytes each). Each record includes a numeric ID known as a tag, an indication of how the binary data should be read (e.g. ASCII text), where the data is in the file and how long it is. These tables are known in TIFF speak as “Image File Directories” or IFDs.
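A minimal Python sketch of that layout, assuming the standard TIFF 6.0 header (two byte-order bytes, a 16-bit ID, then a 32-bit pointer to the first table). The function names read_header and read_ifd are made up for illustration, and the last four bytes of each record are returned undecoded because they hold either the value itself or the location of the data, depending on its type and length.

```python
import struct

def read_header(data):
    """Return (endian, magic, first_ifd_offset) from the 8-byte header."""
    if data[:2] == b"II":
        endian = "<"   # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian byte order
    else:
        raise ValueError("not a TIFF-style header")
    # 16-bit ID (42 for a plain TIFF) followed by the 32-bit pointer to the first IFD
    magic, first_ifd = struct.unpack_from(endian + "HI", data, 2)
    return endian, magic, first_ifd

def read_ifd(data, offset, endian):
    """Return (records, next_ifd_offset) for the table starting at `offset`."""
    (count,) = struct.unpack_from(endian + "H", data, offset)
    records = []
    for i in range(count):
        pos = offset + 2 + 12 * i          # every record is exactly 12 bytes
        tag, field_type, n = struct.unpack_from(endian + "HHI", data, pos)
        raw = data[pos + 8:pos + 12]       # inline value or offset, left undecoded here
        records.append((tag, field_type, n, raw))
    # the table ends with a pointer to the next table (0 means there is none)
    (next_ifd,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return records, next_ifd
```

Called on the bytes of a file, read_header gives the position to pass to read_ifd, and the next_ifd_offset it returns can be fed back in to reach any further tables.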
So to find the JPEG preview I first had to find the start of the table and then look for the tag that is used to identify JPEG data. Next the code reads the location and size information from that record and extracts the block of bytes from the file. In this example this block of data is a JPEG image.
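Those steps might look roughly like this in Python. This is a sketch under assumptions, not the code from my application: it assumes the preview is described by the standard JPEGInterchangeFormat (0x0201) and JPEGInterchangeFormatLength (0x0202) tags and that both sit in one of the top-level tables. Many raw formats keep the preview in a sub-table or in the maker's private data instead, which this does not follow. The name extract_jpeg_preview is made up.

```python
import struct

JPEG_OFFSET_TAG = 0x0201   # JPEGInterchangeFormat: where the JPEG starts (assumed tag)
JPEG_LENGTH_TAG = 0x0202   # JPEGInterchangeFormatLength: how many bytes it is (assumed tag)

def extract_jpeg_preview(data):
    """Return the embedded JPEG bytes, or None if no top-level table describes one."""
    endian = {b"II": "<", b"MM": ">"}.get(data[:2])
    if endian is None:
        raise ValueError("not a TIFF-style header")
    (ifd_offset,) = struct.unpack_from(endian + "I", data, 4)
    seen = set()                       # guards against a table chain that loops
    while ifd_offset and ifd_offset not in seen:
        seen.add(ifd_offset)
        (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
        values = {}
        for i in range(count):
            pos = ifd_offset + 2 + 12 * i
            tag, field_type, n, value = struct.unpack_from(endian + "HHII", data, pos)
            if field_type == 4 and n == 1:   # a single 32-bit LONG stored in the record
                values[tag] = value
        if JPEG_OFFSET_TAG in values and JPEG_LENGTH_TAG in values:
            start = values[JPEG_OFFSET_TAG]
            return data[start:start + values[JPEG_LENGTH_TAG]]
        # move on to the next table; 0 ends the chain
        (ifd_offset,) = struct.unpack_from(endian + "I", data, ifd_offset + 2 + 12 * count)
    return None
```

Typical use would be to read the whole file with open(path, "rb").read(), pass the bytes in, and write the result out as a .jpg if it is not None.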
The latest update to the TIFF specification is dated 1992, which is well before the invention / adoption of digital raw files, so the specification is unlikely to include information on how raw data should be written. So if raw data were stored in a TIFF structure it would no longer be a TIFF, because it would not meet the spec.
So most of the camera manufacturers took the TIFF structure, changed its internal ID and saved their raw data and a collection of their own private metadata within a TIFF-like structure. This includes their private data along with JPEG previews, IPTC metadata and EXIF data. So reading a raw file (or a TIFF) is a little like being given a book that starts with an index. Some of the pages referred to by the index are in a foreign language while others are in English. I can read the index and the pages that are in English, but while I can see the foreign language pages I don’t understand them. If I add extra pages midway through the book I have to update the index page because the page numbers will have changed. Both raw and TIFF files may include multiple indexes (IFDs).
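To make the page-number point concrete, here is a toy Python illustration. The index is just a made-up dictionary of tag to byte offset, and shift_offsets is a hypothetical helper, not part of any library. A real file also has offsets buried inside the foreign language pages, which is exactly why they cannot be safely updated by someone who can't read them.

```python
def shift_offsets(index, insert_at, added):
    """Return a new index after `added` bytes are inserted at position `insert_at`.

    Entries that point at or beyond the insertion point move by `added`;
    entries before it are unchanged.
    """
    return {
        tag: offset + added if offset >= insert_at else offset
        for tag, offset in index.items()
    }

# Made-up index: three blocks of data at byte offsets 100, 500 and 900.
index = {"preview": 100, "private": 500, "raw": 900}

# Insert 64 bytes of new metadata at offset 300: everything after it moves.
updated = shift_offsets(index, insert_at=300, added=64)
print(updated)
```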
Getting back to raw files. The TIFF structure could be used to store any binary data, such as a spreadsheet or word processor file. However, if this type of data is included then it's not a TIFF, just a data structure. So you are correct in suggesting that TIFFs only include demosaiced data; it's just that any that don't are called raw files. Also, a TIFF file may include any number of images, as may a raw file (my raw files seem to hold two JPEGs plus the raw data).
Lastly, why not change the TIFF format to include raw data? Well, Adobe own TIFF, so they did, and it's called DNG (I’m probably being simplistic here!).
From a programmer's point of view it is far simpler to test code against XMP, TIFF, DNG and, say, dop files than to test it against three-hundred-odd raw formats. So we are stuck with XMP unless we adopt DNG.
Sorry for yet another long post,
best wishes
Simon