The Metadata in PNG files
Portable Network Graphics (PNG) is a raster graphics file format that provides a portable, legally unencumbered, well-compressed, well-specified standard for lossless bitmapped image files.
A PNG file always starts with an 8-byte signature: 137 80 78 71 13 10 26 10 (decimal values). The remainder of the file consists of a series of chunks, beginning with an IHDR chunk and ending with an IEND chunk.
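As a minimal sketch, the signature check can be written in a few lines of Python. The names `PNG_SIGNATURE` and `has_png_signature` are illustrative only and do not come from Exiv2 or any other library.

```python
# The 8 signature bytes, written with the decimal values given above.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(data: bytes) -> bool:
    """Return True if data begins with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE
```

Bytes 2 to 4 of the signature are the ASCII letters "PNG", which is why the file type is often recognizable in a hex dump.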
1. Chunks
Each chunk consists of four parts:
Field | Size | Description |
---|---|---|
Length | 4 bytes | An unsigned integer giving the number of bytes in the chunk's data field. The length counts only the data field, not itself, the chunk type code, or the CRC. Zero is a valid length. Although encoders and decoders should treat the length as unsigned, its value must not exceed 2^31-1 bytes. |
Chunk type | 4 bytes | Type codes consist only of uppercase and lowercase ASCII letters and are case sensitive. The case of the first letter indicates whether the chunk is critical (uppercase) or ancillary (lowercase); more details are given below. The case of the second letter indicates whether the chunk is public (uppercase: defined in the specification or in the registry of special-purpose public chunks) or private (lowercase: not standardised). The third letter is reserved for future expansion and must be uppercase to conform to the PNG specification. The case of the fourth letter indicates whether the chunk is safe to copy by editors that do not recognize it: lowercase means safe to copy, uppercase means unsafe to copy. |
Chunk data | Length bytes | The data bytes appropriate to the chunk type, if any. This field can be of zero length. |
CRC | 4 bytes | A CRC (Cyclic Redundancy Check) calculated on the preceding bytes in the chunk, including the chunk type code and chunk data fields, but not including the length field. The CRC is always present, even for chunks containing no data. |
Fig.1. The chunk layout.
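The layout above can be walked with a short loop. The following Python sketch is illustrative only: `make_chunk` and `iter_chunks` are hypothetical helper names, all integers are read as big-endian as PNG requires, and `zlib.crc32` is used because it computes the same CRC-32 that PNG uses. Truncated files are not handled gracefully here.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk: length, type, data, CRC."""
    # The CRC covers the type and data fields, but not the length field.
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (offset, chunk_type, data) for each chunk, verifying its CRC."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8  # the first chunk starts right after the signature
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != stored_crc:
            raise ValueError(f"bad CRC in {chunk_type!r} chunk at offset {pos}")
        yield pos, chunk_type, data
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


# Usage: a skeleton file with only a 13-byte IHDR and an empty IEND chunk.
ihdr_data = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
png = PNG_SIGNATURE + make_chunk(b"IHDR", ihdr_data) + make_chunk(b"IEND")
for offset, chunk_type, data in iter_chunks(png):
    print(offset, chunk_type.decode("ascii"), len(data))
```

Each chunk occupies 12 bytes plus its data length, so a 13-byte IHDR at offset 8 is followed by the next chunk at offset 33.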
Chunks declare themselves as critical or ancillary. Chunks that are necessary for successful display of the file's contents are called critical chunks. If a decoder encounters a critical chunk it does not recognize, it must abort reading the file or supply the user with an appropriate warning.
Name | Multiple | Ordering constraints | Content |
---|---|---|---|
IHDR | No | Must be the first | the image's width, height, bit depth, color type, compression method, filter method, and interlace method. |
PLTE | No | Before IDAT | the palette; list of colors. |
IDAT | Yes | Multiple IDATs must be consecutive | the actual image data, which may be split among multiple IDAT chunks. Such splitting increases the file size slightly, but makes it possible to generate a PNG in a streaming manner. |
IEND | No | Must be the last | it just marks the image end. The chunk's data field is empty. |
Fig.2. Some standard critical chunks.
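As an illustration of how a critical chunk's data field is decoded, here is a sketch for IHDR. Its data field is 13 bytes: besides width, height, and bit depth, the PNG specification also defines one-byte color type, compression method, filter method, and interlace method fields. `ImageHeader` and `parse_ihdr` are hypothetical names chosen for this example.

```python
import struct
from typing import NamedTuple


class ImageHeader(NamedTuple):
    width: int               # 4 bytes, big-endian
    height: int              # 4 bytes, big-endian
    bit_depth: int           # 1 byte
    color_type: int          # 1 byte
    compression_method: int  # 1 byte
    filter_method: int       # 1 byte
    interlace_method: int    # 1 byte


def parse_ihdr(data: bytes) -> ImageHeader:
    """Decode the 13-byte data field of an IHDR chunk."""
    if len(data) != 13:
        raise ValueError(f"IHDR data must be 13 bytes, got {len(data)}")
    return ImageHeader(*struct.unpack(">IIBBBBB", data))
```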
Chunks that are not strictly necessary to meaningfully display the contents of the file are known as ancillary chunks. They include gamma values, background color, textual metadata, etc. If a decoder encounters an ancillary chunk that it does not understand, it can safely ignore it.
Name | Multiple | Ordering constraints | Content |
---|---|---|---|
cHRM | No | Before PLTE and IDAT | the chromaticity coordinates of the display primaries and white point. |
gAMA | No | Before PLTE and IDAT | gamma values. |
iCCP | No | Before PLTE and IDAT | an ICC color profile. |
sBIT | No | Before PLTE and IDAT | the color-accuracy of the source data. |
sRGB | No | Before PLTE and IDAT | the standard sRGB color space. |
bKGD | No | After PLTE; before IDAT | the default background color. It is intended for use when there is no better choice available, such as in standalone image viewers. |
hIST | No | After PLTE; before IDAT | the histogram, or total amount of each color in the image. |
tRNS | No | After PLTE; before IDAT | transparency information. For indexed images, it stores alpha channel values for one or more palette entries. For truecolor and grayscale images, it stores a single pixel value that is to be regarded as fully transparent. |
pHYs | No | Before IDAT | the intended pixel size and/or aspect ratio of the image. |
sPLT | Yes | Before IDAT | a palette to use if the full range of colors is unavailable. |
tIME | No | None | the time that the image was last changed. |
iTXt | Yes | None | UTF-8 text, compressed or not, with an optional language tag. |
tEXt | Yes | None | text that can be represented in ISO/IEC 8859-1. |
zTXt | Yes | None | compressed text with the same limits as tEXt. |
Fig.3. Some standard ancillary chunks.
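The letter-case rules for chunk type codes can be checked mechanically, which is how a decoder decides what to do with a chunk it does not recognize. The sketch below uses a hypothetical helper, `chunk_properties`. It relies on the PNG convention that a lowercase fourth letter means "safe to copy".

```python
def chunk_properties(chunk_type: str) -> dict:
    """Decode the four property flags encoded in a chunk type's letter case."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk type must be exactly four ASCII letters")
    first, second, third, fourth = chunk_type
    return {
        "critical": first.isupper(),       # uppercase: critical, lowercase: ancillary
        "public": second.isupper(),        # uppercase: public, lowercase: private
        "reserved_ok": third.isupper(),    # must be uppercase in conforming files
        "safe_to_copy": fourth.islower(),  # lowercase: safe to copy
    }


# Usage: classify a few of the chunk names from the tables above.
for name in ("IHDR", "bKGD", "tEXt"):
    print(name, chunk_properties(name))
```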
2. Textual information chunks - the metadata in PNG
The iTXt, tEXt, and zTXt chunks (text chunks) are used for conveying textual information associated with the image. These chunks are where the metadata of a PNG file is found.
Each of the text chunks contains as its first field a keyword that indicates the type of information represented by the text string. The following keywords are predefined and should be used where appropriate:
Keyword | Explanation |
---|---|
Title | Short (one line) title or caption for image |
Author | Name of image's creator |
Description | Description of image (possibly long) |
Copyright | Copyright notice |
Creation Time | Time of original image creation |
Software | Software used to create the image |
Disclaimer | Legal disclaimer |
Warning | Warning of nature of content |
Source | Device used to create the image |
Comment | Miscellaneous comment; conversion from GIF comment |
Other keywords may be invented for other purposes. A keyword must be at least 1 and at most 79 characters long. Keywords of general interest can be registered with the maintainers of the PNG specification.
According to the XMP Specification, an XMP packet is embedded in a PNG file by adding an iTXt chunk with the keyword 'XML:com.adobe.xmp'. The PNG specification (version 1.2) defines no standard location for Exif or IPTC data. When Exiv2 writes Exif or IPTC data, it stores them in zTXt chunks, encoded as ASCII text.
2.1 tEXt Textual data
The chunk data has the following layout:
Keyword | Null separator | Text |
---|---|---|
1-79 bytes | 1 byte | n bytes |
The text is interpreted according to the ISO/IEC 8859-1 (Latin-1) character set. The text string can contain any Latin-1 character. Newlines in the text string should be represented by a single linefeed character (decimal 10).
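A minimal sketch of decoding a tEXt data field follows. `parse_text_chunk` is a hypothetical name, and the input is assumed to be the chunk's data field only, without the length, type, and CRC fields.

```python
from typing import Tuple


def parse_text_chunk(data: bytes) -> Tuple[str, str]:
    """Split a tEXt data field into (keyword, text), both decoded as Latin-1."""
    keyword, separator, text = data.partition(b"\x00")
    if not separator:
        raise ValueError("missing null separator after keyword")
    if not 1 <= len(keyword) <= 79:
        raise ValueError("keyword must be 1 to 79 bytes long")
    return keyword.decode("latin-1"), text.decode("latin-1")


# Usage: byte 0xE9 is the Latin-1 encoding of the letter e with an acute accent.
print(parse_text_chunk(b"Title\x00caf\xe9"))
```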
2.2 zTXt Compressed textual data
The chunk data has the following layout:
Keyword | Null separator | Compression method | Compressed text |
---|---|---|---|
1-79 bytes | 1 byte | 1 byte | n bytes |
The zTXt chunk contains textual data, just as tEXt does; however, zTXt takes advantage of compression. The zTXt and tEXt chunks are semantically equivalent, but zTXt is recommended for storing large blocks of text. The only compression method currently defined is 0 (deflate/inflate compression).
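Decoding a zTXt data field differs from tEXt only in the extra method byte and the decompression step. The sketch below assumes the compressed text is a zlib stream, which is what compression method 0 denotes in PNG. `parse_ztxt_chunk` is a hypothetical name.

```python
import zlib
from typing import Tuple


def parse_ztxt_chunk(data: bytes) -> Tuple[str, str]:
    """Split a zTXt data field into (keyword, decompressed text)."""
    keyword, separator, rest = data.partition(b"\x00")
    if not separator or not rest:
        raise ValueError("missing null separator or compression method byte")
    compression_method = rest[0]
    if compression_method != 0:
        raise ValueError(f"unknown compression method {compression_method}")
    text = zlib.decompress(rest[1:])  # method 0: zlib-wrapped deflate data
    return keyword.decode("latin-1"), text.decode("latin-1")


# Usage: build a sample data field and decode it again.
sample = b"Software\x00\x00" + zlib.compress(b"Exiv2")
print(parse_ztxt_chunk(sample))
```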
2.3 iTXt International textual data
The chunk data has the following layout:
Keyword | Null separator | Compression flag | Compression method | Language tag | Null separator | Translated keyword | Null separator | Text |
---|---|---|---|---|---|---|---|---|
1-79 bytes | 1 byte | 1 byte | 1 byte | 0 or more bytes | 1 byte | 0 or more bytes | 1 byte | 0 or more bytes |
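This chunk is semantically equivalent to the tEXt and zTXt chunks, but the textual data is in the UTF-8 encoding of the Unicode character set instead of Latin-1. The compression flag is 0 for uncompressed text and 1 for compressed text; the compression method has the same meaning as in zTXt.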
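The iTXt layout has more fields, but it can be split the same way. The sketch below assumes the conventions of the PNG specification: a compression flag of 0 means uncompressed text, a flag of 1 means text compressed with method 0 (zlib), the keyword is Latin-1, the language tag is ASCII, and the translated keyword and text are UTF-8. `parse_itxt_chunk` is a hypothetical name.

```python
import zlib


def parse_itxt_chunk(data: bytes) -> dict:
    """Split an iTXt data field into its five logical parts."""
    keyword, separator, rest = data.partition(b"\x00")
    if not separator or len(rest) < 2:
        raise ValueError("missing null separator or compression bytes")
    compression_flag, compression_method = rest[0], rest[1]
    language, sep1, rest = rest[2:].partition(b"\x00")
    translated, sep2, text = rest.partition(b"\x00")
    if not (sep1 and sep2):
        raise ValueError("missing null separator in iTXt data")
    if compression_flag == 1:
        if compression_method != 0:
            raise ValueError(f"unknown compression method {compression_method}")
        text = zlib.decompress(text)
    elif compression_flag != 0:
        raise ValueError(f"invalid compression flag {compression_flag}")
    return {
        "keyword": keyword.decode("latin-1"),
        "language": language.decode("ascii"),
        "translated_keyword": translated.decode("utf-8"),
        "text": text.decode("utf-8"),
    }


# Usage: an uncompressed XMP-style chunk with empty language tag and translated keyword.
sample = b"XML:com.adobe.xmp\x00\x00\x00\x00\x00" + "<x>caf\u00e9</x>".encode("utf-8")
print(parse_itxt_chunk(sample))
```

With an empty language tag and an empty translated keyword, the keyword is followed by five zero bytes: separator, flag, method, and the two remaining separators.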
Example:
This example uses the version of exiv2(.exe) that ships in v0.25 to show the structure of the file. The option -pa prints all metadata, -pS prints the structure of the file, -pX extracts the raw XMP/xml data.
841 rmills@rmillsmbp:~/gnu/exiv2/trunk/website $ exiv2 -pa ~/png
Exif.Image.ImageWidth SLong 1 320
...
Exif.Thumbnail.JPEGInterchangeFormatLength Long 1 4376
Iptc.Application2.ObjectName String 4 ovni
...
Iptc.Application2.RecordVersion Short 1 4
Xmp.dc.title LangAlt 1 lang="x-default" this is the title
842 rmills@rmillsmbp:~/gnu/exiv2/trunk/website $ exiv2 -pS ~/png
STRUCTURE OF PNG FILE: /Users/rmills/png
address | index | chunk_type | length | data
8 | 0 | IHDR | 13 |
33 | 1 | zTXt | 8769 | Raw profile type exif..x...[r..
8814 | 2 | zTXt | 270 | Raw profile type iptc..x.=QKn.
9096 | 3 | iTXt | 2524 | XML:com.adobe.xmp.....<?xpacket
11632 | 4 | iCCP | 1404 | icc..x...i8.......af\...w_3...
13048 | 5 | sBIT | 3 |
13063 | 6 | zTXt | 87 | Software..x...A.. ......B....}.
13162 | 7 | IDAT | 8192 | x...Y.$Wv&v.{.{l.T.......[w.=m$
...
119814 | 20 | IDAT | 8192 | k........!..B*.....\*.(!..0.s..
128018 | 21 | IDAT | 3346 | .Y.L@I$M.Z[.0A ...K#.t.0+.G(.jX
131376 | 22 | IEND | 0 |
843 rmills@rmillsmbp:~/gnu/exiv2/trunk/website $ exiv2 -pX ~/png | xmllint -pretty 1 -
<?xml version="1.0"?>
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">this is the title</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
844 rmills@rmillsmbp:~/gnu/exiv2/trunk/website $
Most of the file consists of IDAT chunks, which contain the image itself (the pixels). More interestingly, the zTXt chunks contain the Exif and IPTC data, and the iTXt chunk contains the XMP/XML. Both the Exif and IPTC data blocks are stored using the TIFF container specification. The example in the TIFF document illustrates how to extract and print the structure of Exif data written in TIFF format: http://dev.exiv2.org/projects/exiv2/wiki/The_Metadata_in_TIFF_files
References
1. PNG Specification, Version 1.2 | http://www.libpng.org/pub/png/spec/1.2/PNG-Contents.html |
2. Wikipedia. PNG. | http://en.wikipedia.org/wiki/Portable_Network_Graphics |
3. Image File Formats - JPG, TIF, PNG, GIF Which to use? | http://www.scantips.com/basics09.html |
4. Benefits of the PNG Image Format | http://www.atalasoft.com/png.aspx |
Updated by Robin Mills over 6 years ago · 16 revisions