Extracting raw ICC profile data that's embedded in the Exif?

Added by Ray NA almost 11 years ago

Hi,

Is there a way to extract the ICC profile from an image file that has it embedded as part of the Exif, in the IFD0 structure? I just need to get at this data from the Exiv2 API - the API does NOT need to interpret the data.

Looking through the output from exiftool on images, it appears that Capture NX2 2.2.5 embeds the ICC profile in the Exif data. I know only the basics of the Exif structures, not what is valid and what isn't.

With that said, exiftool tells me that the ICC_Profile has tag 0x8773 and is contained within the IFD0 directory:

  JPEG APP1 (41806 bytes):
  ExifByteOrder = II
  + [IFD0 directory with 23 entries]
  | 0)  SubfileType = 1
  | 1)  ImageWidth = 320
  | 2)  ImageHeight = 212
  | 3)  BitsPerSample = 8 8 8
  | 4)  Compression = 1
  | 5)  Make = NIKON CORPORATION
  | 6)  Model = NIKON D300
  | 7)  Orientation = 1
  | 8)  SamplesPerPixel = 3
  | 9)  XResolution = 300 (300/1)
  | 10) YResolution = 300 (300/1)
  | 11) PlanarConfiguration = 1
  | 12) ResolutionUnit = 2
  | 13) Software = Capture NX 2.2.5 W
  | 14) ModifyDate = 2011:01:05 12:30:16
  | 15) Artist =
  | 16) ReferenceBlackWhite = 0 255 0 255 0 255 (0/1 255/1 0/1 255/1 0/1 255/1)
  | 17) Copyright =
  | 18) ExifOffset (SubDirectory) -->
... snip ...
  | 19) ICC_Profile (SubDirectory) -->
  |     - Tag 0x8773 (532 bytes, undef[532]):
  |         0218: 00 00 21 ec 4e 4b 4f 4e 02 20 00 00 6d 6e 74 72 [..!.NKON. ..mntr]
  |         0228: 52 47 42 20 58 59 5a 20 07 d9 00 02 00 14 00 11 [RGB XYZ ........]
  |         0238: 00 07 00 0a 61 63 73 70 41 50 50 4c 00 00 00 00 [....acspAPPL....]
  |         0248: 6e 6f 6e 65 00 00 00 01 00 00 00 00 00 00 00 00 [none............]
  |         0258: 00 00 00 00 00 00 f6 d6 00 01 00 00 00 00 d3 2d [...............-]
  |         [snip 452 bytes]
  | + [ICC_Profile directory with 9 entries]
  | | ProfileHeader (SubDirectory) -->
  | | + [BinaryData directory, 128 bytes]
  | | | ProfileCMMType = NKON
  | | | - Tag 0x0004 (4 bytes, string[4]):
  | | |     0210: 4e 4b 4f 4e                                     [NKON]
  | | | ProfileVersion = 544
  | | | - Tag 0x0008 (2 bytes, int16s[1]):
  | | |     0214: 02 20                                           [. ]
  | | | ProfileClass = mntr
  | | | - Tag 0x000c (4 bytes, string[4]):
  | | |     0218: 6d 6e 74 72                                     [mntr]
  | | | ColorSpaceData = RGB 
  | | | - Tag 0x0010 (4 bytes, string[4]):
  | | |     021c: 52 47 42 20                                     [RGB ]
  | | | ProfileConnectionSpace = XYZ 
  | | | - Tag 0x0014 (4 bytes, string[4]):
  | | |     0220: 58 59 5a 20                                     [XYZ ]
  | | | ProfileDateTime = 2009 2 20 17 7 10
  | | | - Tag 0x0018 (12 bytes, int16u[6]):
  | | |     0224: 07 d9 00 02 00 14 00 11 00 07 00 0a             [............]
  | | | ProfileFileSignature = acsp
  | | | - Tag 0x0024 (4 bytes, string[4]):
  | | |     0230: 61 63 73 70                                     [acsp]
  | | | PrimaryPlatform = APPL
  | | | - Tag 0x0028 (4 bytes, string[4]):
  | | |     0234: 41 50 50 4c                                     [APPL]
  | | | CMMFlags = 0
  | | | - Tag 0x002c (4 bytes, int32u[1]):
  | | |     0238: 00 00 00 00                                     [....]
  | | | DeviceManufacturer = none
  | | | - Tag 0x0030 (4 bytes, string[4]):
  | | |     023c: 6e 6f 6e 65                                     [none]
  | | | DeviceModel = 
  | | | - Tag 0x0034 (4 bytes, string[4]):
  | | |     0240: 00 00 00 01                                     [....]
  | | | DeviceAttributes = 0 0
  | | | - Tag 0x0038 (8 bytes, int32u[2]):
  | | |     0244: 00 00 00 00 00 00 00 00                         [........]
  | | | RenderingIntent = 0
  | | | - Tag 0x0040 (4 bytes, int32u[1]):
  | | |     024c: 00 00 00 00                                     [....]
  | | | ConnectionSpaceIlluminant = 0.9642 1 0.82491
  | | | - Tag 0x0044 (12 bytes, fixed32s[3]):
  | | |     0250: 00 00 f6 d6 00 01 00 00 00 00 d3 2d             [...........-]
  | | | ProfileCreator = 
  | | | - Tag 0x0050 (4 bytes, string[4]):
  | | |     025c: 00 00 00 00                                     [....]
  | | | ProfileID = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  | | | - Tag 0x0054 (16 bytes, int8u[16]):
  | | |     0260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  | | 0)  ProfileDescription = Nikon Adobe RGB 4.0.0.3001
  | |     - Tag 'desc' (118 bytes, type 'desc'):
  | |         037c: 64 65 73 63 00 00 00 00 00 00 00 1b 4e 69 6b 6f [desc........Niko]
  | |         038c: 6e 20 41 64 6f 62 65 20 52 47 42 20 34 2e 30 2e [n Adobe RGB 4.0.]
  | |         039c: 30 2e 33 30 30 31 00 00 00 00 00 00 00 00 00 00 [0.3001..........]
  | |         03ac: 00 1b 4e 69 6b 6f 6e 20 41 64 6f 62 65 20 52 47 [..Nikon Adobe RG]
  | |         03bc: 42 20 34 2e 30 2e 30 2e 33 30 30 31 00 00 00 00 [B 4.0.0.3001....]
  | |         [snip 38 bytes]
... snip ...
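
The header fields exiftool lists above sit at fixed offsets inside the first 128 bytes of the profile (0x000c for the class, 0x0010 for the color space, 0x0014 for the connection space, 0x0024 for the 'acsp' signature). As a rough sanity check on whatever bytes come out of the tag, something like the following sketch could confirm that they look like an ICC profile. The struct and function names are made up for illustration and are not part of Exiv2 or exiftool.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of a few ICC header fields (names are illustrative).
struct IccHeaderSummary {
    bool valid = false;          // true only if the 'acsp' signature was found
    std::string profileClass;    // e.g. "mntr"
    std::string colorSpace;      // e.g. "RGB "
    std::string connectionSpace; // e.g. "XYZ "
};

// Read a 4-character signature at the given byte offset.
// The caller must ensure offset + 4 <= data.size().
static std::string fourCC(const std::vector<std::uint8_t>& data, std::size_t offset) {
    return std::string(reinterpret_cast<const char*>(data.data()) + offset, 4);
}

// Inspect the fixed-size ICC header; returns valid == false if the buffer is
// shorter than the 128-byte header or the signature at 0x24 is not 'acsp'.
IccHeaderSummary summarizeIccHeader(const std::vector<std::uint8_t>& data) {
    IccHeaderSummary s;
    if (data.size() < 128) return s;
    if (fourCC(data, 0x24) != "acsp") return s;
    s.valid = true;
    s.profileClass = fourCC(data, 0x0c);
    s.colorSpace = fourCC(data, 0x10);
    s.connectionSpace = fourCC(data, 0x14);
    return s;
}
```

Fed the bytes from the hex dump, this should report class "mntr", color space "RGB " and connection space "XYZ ". It does not interpret the rest of the profile.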

I have been extracting preview images from Nikon NEFs via the Exiv2 API. However, if I use the aRGB color space when processing the NEF, the preview (JPEG) image is also in aRGB.

Once I have the preview image, I'd like to grab the embedded ICC profile so that I can ask ImageMagick's C API to convert the image to the standard sRGB color space. That way any extracted preview image can be uploaded to the web and viewed in any browser (hence the sRGB requirement).

I've asked on the IM boards, and at the moment ImageMagick also does not provide a way to grab the (uninterpreted) ICC profile if it's in the Exif data.

Thanks


Replies (8)

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Andreas Huggel almost 11 years ago

If "uninterpreted" means you just need the 532 bytes of data in the above example (00 00 21 ec 4e 4b 4f 4e 02 20 ...), then you should easily be able to get that: It's the value of Exif.Image.0x8773 if I understand the sample correctly.

For Exiv2 to decode that ICC sub-directory, it will need to become aware of this structure, which requires configuration along the lines of this slightly outdated tutorial in the wiki.

Can you please create an issue for this feature and attach a sample image with such a tag?

Andreas

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Ray NA almost 11 years ago

Thanks, yes the 532 bytes is exactly what I need, although different ICC Profiles will be different sizes of course.

I will log an issue/feature request along with a suitable image over the weekend (currently working at a client site).

Having a look around, I found http://www.color.org/ICC_Minor_Revision_for_Web.pdf, which states that the 0x8773 tag is used to identify the embedded ICC profile within the IFD structure of a given TIFF/JPEG etc.

See section "B.3 Embedding ICC profiles in TIFF files", p. 73, if you need this reference.
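
For reference, this is roughly what locating that tag by hand involves. It is a minimal sketch under some assumptions: the buffer starts at the TIFF header ("II" or "MM", i.e. after the "Exif\0\0" prefix of a JPEG APP1 segment), only IFD0 is searched, and the tag is stored with type BYTE or UNDEFINED so that its count is a byte count. `findIccInIfd0` is a hypothetical helper, not an Exiv2 or ImageMagick function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace {

// Reads unsigned integers in the byte order declared by the TIFF header.
struct Reader {
    const std::vector<std::uint8_t>& buf;
    bool littleEndian;

    std::uint32_t u16(std::size_t off) const {
        return littleEndian ? buf[off] | (buf[off + 1] << 8)
                            : (buf[off] << 8) | buf[off + 1];
    }
    std::uint32_t u32(std::size_t off) const {
        return littleEndian ? u16(off) | (u16(off + 2) << 16)
                            : (u16(off) << 16) | u16(off + 2);
    }
};

} // namespace

// Returns the raw value of tag 0x8773 from IFD0, or an empty vector if the
// tag is absent or the block looks malformed.
std::vector<std::uint8_t> findIccInIfd0(const std::vector<std::uint8_t>& tiff) {
    const std::uint32_t kIccTag = 0x8773;
    const std::size_t n = tiff.size();
    if (n < 8) return {};
    const bool le = tiff[0] == 'I' && tiff[1] == 'I';
    const bool be = tiff[0] == 'M' && tiff[1] == 'M';
    if (!le && !be) return {};
    const Reader r{tiff, le};
    if (r.u16(2) != 42) return {};                 // TIFF magic number

    const std::size_t ifd = r.u32(4);              // offset of IFD0
    if (ifd > n - 2) return {};
    const std::size_t entries = r.u16(ifd);
    for (std::size_t i = 0; i < entries; ++i) {
        const std::size_t e = ifd + 2 + 12 * i;    // each entry is 12 bytes
        if (e + 12 > n) return {};
        if (r.u16(e) != kIccTag) continue;
        const std::uint32_t type = r.u16(e + 2);
        if (type != 1 && type != 7) return {};     // BYTE or UNDEFINED only
        const std::size_t count = r.u32(e + 4);
        // Values of up to 4 bytes are stored inline; larger ones via an offset.
        const std::size_t start = count <= 4 ? e + 8 : r.u32(e + 8);
        if (start > n || count > n - start) return {};
        const auto first = tiff.begin() + static_cast<std::ptrdiff_t>(start);
        return std::vector<std::uint8_t>(first, first + static_cast<std::ptrdiff_t>(count));
    }
    return {};
}
```

The returned bytes are uninterpreted, which is all that is needed to hand the profile on to a color management library.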

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Ray NA almost 11 years ago

"you should easily be able to get that: It's the value of Exif.Image.0x8773"

Can I ask if I am using the interface correctly?

 Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open(filename);
 image->readMetadata();
 const Exiv2::ExifData&  ed = image->exifData();
 Exiv2::ExifData::const_iterator  d;

 const char*  iccproftag = "Exif.Image.0x8773";
 if ( (d = ed.findKey(Exiv2::ExifKey(iccproftag)) ) == ed.end()) {
     std::cout << "unable to find ICC profile" << std::endl;
 }
 else
 {
     // count() and size() both report the size of the data (i.e. 532 bytes)
     std::cout << "lookup=" << iccproftag << " => " << d->key() << "=>  @" << d->idx() << " count=" << d->count() << " size=" << d->size() << std::endl;

     // BUT the dataArea() size_ is 0!!!
     Exiv2::DataBuf  buf = d->dataArea();

     if (buf.size_ > 0)
     {
         // Build "<basename>.icc"; basename() may modify its argument,
         // so it works on a scratch copy rather than on path itself
         char tmp[PATH_MAX];
         char path[PATH_MAX];
         strcpy(tmp, filename);
         strcpy(path, basename(tmp));
         strcat(path, ".icc");
         std::cout << "  dumping ICC to " << path << std::endl;

         // msk is defined elsewhere (not shown)
         // open() returns -1 on failure, and 0 is a valid descriptor
         int fd;
         if ( (fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0666 & ~msk)) >= 0) {
             write(fd, buf.pData_, buf.size_);
             close(fd);
         }
     }
     else
     {
         const Exiv2::Value&  val = d->value();
         Exiv2::DataBuf  vbuf =  val.dataArea();
         std::cout << "  empty dataArea() ???  value size=" << val.size() << ", dataArea().size=" << vbuf.size_ << std::endl;
     }
 }

Running this, I have:

lookup=Exif.Image.0x8773 => Exif.Image.InterColorProfile=>  @26 count=532 size=532
  empty dataArea() ???  value size=532, dataArea().size=0

so the dataArea() is always empty even though the reported size is 532 bytes??

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Andreas Huggel almost 11 years ago

I noticed that tag already has a name; it's called Exif.Image.InterColorProfile. Something like this is how its value can be extracted and stored in a file:

    Exiv2::Exifdatum& md = exifData["Exif.Image.InterColorProfile"];
    if (md.size() > 0) {
        Exiv2::DataBuf buf(md.size());
        md.copy(buf.pData_, Exiv2::invalidByteOrder);
        Exiv2::writeFile(buf, "iccprofile");
    }

Note that while the exifData[...] notation is shorter, it creates an entry if it doesn't exist yet, so it may be better to use ExifData::findKey().
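
The same trade-off exists in `std::map`, which allows a dependency-free illustration. To be clear, this is an analogy with a standard container and not Exiv2 code; the `Metadata` alias and the `lookup` helper are invented for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Stand-in for a key/value metadata container (illustrative only).
using Metadata = std::map<std::string, std::string>;

// Find-style lookup: returns a pointer to the value, or nullptr if the key
// is absent. The container is taken by const reference and never modified.
const std::string* lookup(const Metadata& md, const std::string& key) {
    const auto it = md.find(key);
    return it == md.end() ? nullptr : &it->second;
}
```

By contrast, evaluating `md["Exif.Image.InterColorProfile"]` on a map that lacks that key inserts an empty entry as a side effect, so a later check for the tag would succeed even though the image never contained it.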

The data area on the other hand is a somewhat obscure (and not very well documented) feature, which is used e.g., for the image data of thumbnail tags, where the actual tag value is only an offset that points to the area where the image data is found.

BTW, the NEF image you posted in #756 also contains the ICC profile in Exif.Nikon3.ICCProfile. (Unfortunately, the JPG doesn't have this tag either.)

Andreas

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by evan pan almost 6 years ago

Hi, Andreas.
Thanks for sharing your problem. As for me, I have seldom tried to deal with that kind of problem before. Have you ever worked it out? I wonder whether you have any experience with the PDF extraction process, because there is something wrong with my PDF reader. I need to convert PDF into text or other formats. Any suggestion will be appreciated. Thanks in advance.

Best regards,
Pan

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Robin Mills almost 6 years ago

I was unaware of this conversation. However, I have good news: #1074 will provide support for ICC profiles in v0.26. v0.26 will be ready later this year.

RE: Extracting raw ICC profile data that's embedded in the Exif? - Added by Robin Mills almost 6 years ago

The Linux utility pdftotext appears to do a good job of converting PDF to ASCII. If you're looking for a library to perform PDF to TXT conversion, the utility pdftotext uses libpoppler.so.52, about which I know nothing at all.

I know a great deal about PDF because I was a Senior Computer Scientist at Adobe Systems in San Jose, California for more than 10 years. I intended to add PDF metadata support in v0.26 #1138. However it's been deferred to make the schedule. It's quite unlikely that Exiv2 would support extracting text from a PDF file. The scope of Exiv2 is metadata. Reading content isn't metadata.

Adobe have excellent libraries for PDF support; however, I don't know the licensing arrangements. I'm an engineer and not a business guy. I intended to add PDF support to Exiv2 using podofo. I know very little about podofo, and #1138 would give me a much better feel for that library. podofo has a number of sample applications, including podofoextracttxt. Extracting an ASCII document from a PDF is non-trivial because PDFs are usually generated by printer drivers, which can break lines of text into many little strings. The reassembly of those strings into simple words and spaces is a clever trick. I've tried podofoextracttxt and, although it reports all the strings, it doesn't appear to perform the reassembly.
