The Metadata in TIFF files

Version 2 (Robin Mills, 26 Apr 2015 23:29)

h1. The Metadata in TIFF files

The Tagged Image File Format (TIFF) is a container.  It is very flexible and can hold multiple pages, different colour spaces and frame configurations, as well as metadata.  The specification is available from: https://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf

The format is quite simple.  A fixed 8 byte header (a 2 byte byte-order mark, II or MM, a 2 byte magic number 42, and a 4 byte offset) provides the offset of the first directory of records.  A directory starts with a 2 byte count of the records it holds, followed by that many 12 byte records or "tags".  The final entry in the directory is always a 4 byte offset to the next directory.  An offset of 0 terminates the directory chain.
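
The header and directory chain described above can be sketched in a few lines of Python.  This is only an illustration, not the exiv2 implementation: the function name @walk_ifds@ and the hand-built sample file are assumptions made for the example.

```python
import struct

def walk_ifds(data: bytes):
    """Return (directory_offset, record_count) for every directory in a TIFF byte string."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian (Intel)
    elif byte_order == b"MM":
        endian = ">"  # big-endian (Motorola)
    else:
        raise ValueError("not a TIFF: bad byte-order mark")

    # Remaining 6 header bytes: magic number 42 and the offset of the first directory
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF: bad magic number")

    directories = []
    seen = set()  # guards against a corrupt file whose chain loops back on itself
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(endian + "H", data, offset)
        directories.append((offset, count))
        # The offset of the next directory follows the last 12 byte record
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return directories

# Hand-built big-endian file: header, then one directory holding one record
header = b"MM" + struct.pack(">HI", 42, 8)
directory = struct.pack(">H", 1) + struct.pack(">HHII", 0x0100, 3, 1, 40 << 16) + struct.pack(">I", 0)
print(walk_ifds(header + directory))  # [(8, 1)]
```

With the first directory at offset 8, its first record sits at offset 10, just after the 2 byte count.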

A tag is a 12 byte record: Tag, Type, Count and Offset, which are 2, 2, 4 and 4 bytes respectively.  The Tag defines the purpose of the record (Width, Height, ColorSpace etc.) and the Type defines the nature of the data.  Count is the number of values of that Type.  If the values fit in 4 bytes they are stored in the Offset field itself; otherwise Offset is the position in the file at which to read the raw data for this tag.
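
As a rough sketch of how one 12 byte record might be decoded: the function name @decode_entry@ and the sample records are made up for illustration, and @TYPE_SIZES@ lists only five of the TIFF 6.0 field types.

```python
import struct

# Size in bytes of a single value for a few TIFF 6.0 field types
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8}  # BYTE, ASCII, SHORT, LONG, RATIONAL

def decode_entry(record: bytes, endian: str = "<"):
    """Split a 12 byte record into Tag, Type, Count and a note on where the data lives."""
    tag, type_id, count, offset = struct.unpack(endian + "HHII", record)
    if TYPE_SIZES[type_id] * count <= 4:
        # Small enough to be stored in the Offset field itself
        return tag, type_id, count, ("inline", record[8:12])
    # Too big for 4 bytes: Offset says where in the file to read the data
    return tag, type_id, count, ("offset", offset)

# One SHORT fits in the field; 2500 BYTEs have to live elsewhere in the file
small = struct.pack(">HHI", 0x0112, 3, 1) + struct.pack(">HH", 6, 0)
large = struct.pack(">HHII", 0x02BC, 1, 2500, 194)
print(decode_entry(small, ">"))
print(decode_entry(large, ">"))
```

The first call reports the data as inline, the second as an offset of 194.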

This is all very well described in the specification.  The code in tiffimage.cpp (TiffImage::printStructure()) decodes the directories.

The TIFF container is so flexible that it is used as the structure for most RAW formats, including Adobe's DNG.  Additionally, the TIFF container is used to store the metadata that lies inside the Exif and Iptc data blocks within JPEG and PNG files.

*Example:*

The version of exiv2(.exe) which ships with v0.25 provides the option -pS to reveal the structure of the TIFF and the option -pX to extract the raw XMP/xml data.  The option -pa prints the metadata in human readable format.<pre>$ exiv2 -pa ~/tif
Exif.Image.ImageWidth                        Short       1  40
Exif.Image.ImageLength                       Short       1  470
Exif.Image.BitsPerSample                     Short       3  8 8 8
Exif.Image.Compression                       Short       1  LZW
Exif.Image.PhotometricInterpretation         Short       1  RGB
Exif.Image.StripOffsets                      Long        1  2694
Exif.Image.Orientation                       Short       1  right, top
Exif.Image.SamplesPerPixel                   Short       1  3
Exif.Image.RowsPerStrip                      Short       1  1092
Exif.Image.StripByteCounts                   Long        1  5086
Exif.Image.PlanarConfiguration               Short       1  1
Exif.Image.Predictor                         Short       1  Horizontal differencing
Exif.Image.SampleFormat                      Short       3  Unsigned integer data
Exif.Image.XMLPacket                         Byte      2500  (Binary value suppressed)
Xmp.dc.title                                 LangAlt     1  lang="x-default" this is a title
/Users/rmills/tif: (No IPTC data found in the file)
$ exiv2 -pS ~/tif
STRUCTURE OF TIFF FILE: /Users/rmills/tif
 address |    tag                      |      type |    count |   offset | value
      10 | 0x0100 ImageWidth           |     SHORT |        1 |  2621440 | 40
      22 | 0x0101 ImageLength          |     SHORT |        1 | 30801920 | 470
      34 | 0x0102 BitsPerSample        |     SHORT |        3 |      182 | 
      46 | 0x0103 Compression          |     SHORT |        1 |   327680 | 5
      58 | 0x0106 PhotometricInterpret |     SHORT |        1 |   131072 | 2
      70 | 0x0111 StripOffsets         |      LONG |        1 |     2694 | 0
      82 | 0x0112 Orientation          |     SHORT |        1 |   393216 | 6
      94 | 0x0115 SamplesPerPixel      |     SHORT |        1 |   196608 | 3
     106 | 0x0116 RowsPerStrip         |     SHORT |        1 | 71565312 | 1092
     118 | 0x0117 StripByteCounts      |      LONG |        1 |     5086 | 0
     130 | 0x011c PlanarConfiguration  |     SHORT |        1 |    65536 | 1
     142 | 0x013d Predictor            |     SHORT |        1 |   131072 | 2
     154 | 0x0153 SampleFormat         |     SHORT |        3 |      188 | 
     166 | 0x02bc XMLPacket            |      BYTE |     2500 |      194 | <?xpacket begin="..." id="W5M0Mp ...
$ exiv2 -pX ~/tif | xmllint -pretty 1 -
<?xml version="1.0"?>
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">this is a title</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
$ </pre>
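
The large numbers in the offset column of the -pS listing for single SHORT tags (for example 2621440 for an ImageWidth of 40) look odd at first.  They are what you would get if the 2 byte value sits in the first half of the 4 byte Offset field of a big-endian file and the whole field is then printed as a LONG.  That the file is big-endian is an inference from the numbers, not something the listing states.  A small check of the arithmetic, with a made-up helper name:

```python
import struct

def offset_field_as_long(short_value: int, endian: str = ">") -> int:
    """Store one SHORT at the start of a 4 byte field and read the field back as a LONG."""
    field = struct.pack(endian + "HH", short_value, 0)
    return struct.unpack(endian + "I", field)[0]

for name, value in [("ImageWidth", 40), ("ImageLength", 470), ("RowsPerStrip", 1092)]:
    print(name, value, offset_field_as_long(value))
# ImageWidth 40 2621440
# ImageLength 470 30801920
# RowsPerStrip 1092 71565312
```

In a little-endian file the same field reads back as the value itself, so @offset_field_as_long(1, "<")@ gives 1.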

The XMLPacket is stored at offset 194 in the file and consists of 2500 bytes.  You can extract the XMP directly with the following command, which says: set the block size to 1 byte, skip 194 bytes and dump the next 2500 bytes:<pre>$ dd bs=1 skip=194 count=2500 if=~/tif  | xmllint -pretty 1 -
2500+0 records in
2500+0 records out
2500 bytes transferred in 0.005714 secs (437526 bytes/sec)
<?xml version="1.0"?>
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">this is a title</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
$ </pre>The -pX option produces the same result without using dd and is much more convenient to use.
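
For anyone without dd to hand, the same "skip N bytes, read M bytes" step could be written in Python roughly as follows.  The function name @read_block@ is an assumption for the example; it is not part of exiv2.

```python
import os

def read_block(path: str, skip: int, count: int) -> bytes:
    """Rough equivalent of: dd bs=1 skip=<skip> count=<count> if=<path>"""
    # expanduser handles a leading ~ the way the shell does for if=~/tif
    with open(os.path.expanduser(path), "rb") as f:
        f.seek(skip)
        return f.read(count)
```

Calling @read_block("~/tif", 194, 2500)@ on the file above should return the same 2500 bytes that dd pipes to xmllint.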

To demonstrate that the Exif metadata block in a JPEG is itself a TIFF file, extract it with dd and print its structure.<pre>$ exiv2 -pS ~/jpg
STRUCTURE OF JPEG FILE: /Users/rmills/jpg
 address | marker     | length  | data
       2 | 0xd8 SOI  
       4 | 0xe1 APP1  |   14862 | Exif..II*.....................
   14868 | 0xe1 APP1  |    2720 | http://ns.adobe.com/xap/1.0/.<
   17590 | 0xed APP13 |     110 | Photoshop 3.0.8BIM.......6....
   17702 | 0xe2 APP2  |    4094 | MPF.II*...............0100....
   21798 | 0xdb DQT   |     132 
   21932 | 0xc0 SOF0  |      17 
   21951 | 0xc4 DHT   |     418 
   22371 | 0xda SOS   |      12 
$ dd bs=1 skip=12 count=14862 if=~/jpg of=bull.tif
14862+0 records in
14862+0 records out
14862 bytes transferred in 0.038783 secs (383211 bytes/sec)
$ exiv2 -pS bull.tif
STRUCTURE OF TIFF FILE: bull.tif
 address |    tag                      |      type |    count |   offset | value
      10 | 0x010f Make                 |     ASCII |       18 |      134 | NIKON CORPORATION.
      22 | 0x0110 Model                |     ASCII |       12 |      152 | NIKON D5300.
      34 | 0x0112 Orientation          |     SHORT |        1 |        1 | 1
      46 | 0x011a XResolution          |  RATIONAL |        1 |      164 | 
      58 | 0x011b YResolution          |  RATIONAL |        1 |      172 | 
      70 | 0x0128 ResolutionUnit       |     SHORT |        1 |        2 | 2
      82 | 0x0131 Software             |     ASCII |       10 |      180 | Ver.1.00 .
      94 | 0x0132 DateTime             |     ASCII |       20 |      190 | 2015:02:13 20:46:51.
     106 | 0x0213 YCbCrPositioning     |     SHORT |        1 |        1 | 1
     118 | 0x8769 ExifTag              |      LONG |        1 |      210 | 210
    4080 | 0x0103 Compression          |     SHORT |        1 |        6 | 6
    4092 | 0x011a XResolution          |  RATIONAL |        1 |     4168 | 
    4104 | 0x011b YResolution          |  RATIONAL |        1 |     4176 | 
    4116 | 0x0128 ResolutionUnit       |     SHORT |        1 |        2 | 2
    4128 | 0x0201 JPEGInterchangeForma |      LONG |        1 |     4184 | 4184
    4140 | 0x0202 JPEGInterchangeForma |      LONG |        1 |    10670 | 10670
    4152 | 0x0213 YCbCrPositioning     |     SHORT |        1 |        1 | 1
$ </pre>
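
The skip=12 in the dd command above steps over the 2 byte SOI marker, the 2 byte APP1 marker, the 2 byte segment length and the 6 byte identifier Exif\0\0, which leaves the file pointer on the TIFF header.  A sketch of the same extraction in Python, which finds the segment instead of relying on a fixed offset, might look like this.  The function name @extract_exif_tiff@ is made up, and the sketch assumes a well-formed file with no padding bytes between segments.

```python
import struct

def extract_exif_tiff(jpeg: bytes) -> bytes:
    """Return the TIFF bytes embedded in the first Exif APP1 segment of a JPEG."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows, stop looking
            break
        # The segment length is big-endian and counts its own two bytes
        (length,) = struct.unpack_from(">H", jpeg, pos + 2)
        payload = jpeg[pos + 4 : pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]  # drop the identifier, keep the TIFF
        pos += 2 + length
    raise ValueError("no Exif APP1 segment found")
```

For a segment of length 14862 this returns 14854 bytes (the length minus its own 2 bytes and the 6 byte identifier), slightly fewer than the dd command copies.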