Auto-Tag PDFs based on PDF colour

Hi all,

From time to time I scan documents to PDF: some in black and white, some in colour, and some with both.

Is there a way to automatically add a tag, for example “colour”, if the added PDF is a colour scan? Or “b/w” if the scan is black and white only?

Thanks a lot in advance.
Thomas

I don’t think so. DT would basically have to print the PDF to “see” the colors.

Theoretically it’s possible by analyzing a bitmap representation of the PDF, but as this is a rather unusual request and the first of its kind, there are currently no such plans.

You’re right, of course. I didn’t take into account that the OP was talking about a scanned PDF. As to their question: it could be possible using some ObjC scripting (i.e. getting the bitmap representation of the PDF and somehow™ analyzing it).

@ttrepper:
Something like this

ObjC.import('Foundation');
ObjC.import('AppKit');
(() => {
  const app = Application("DEVONthink 3");
  app.includeStandardAdditions = true;
  // Use the first record currently selected in DEVONthink
  const r = app.selectedRecords()[0];
  const path = $.NSString.alloc.initWithString(r.path());
  const pdfData = $.NSData.dataWithContentsOfFile(path);
  const pdfImg = $.NSPDFImageRep.imageRepWithData(pdfData);
  const pageCount = pdfImg.pageCount;
  for (let i = 0; i < pageCount; i++) {
    // Select the page, then render it through an NSImage into a bitmap
    pdfImg.setCurrentPage(i);
    const temp = $.NSImage.alloc.init;
    // Without a representation the image is empty and has no TIFF data
    temp.addRepresentation(pdfImg);
    const rep = $.NSBitmapImageRep.imageRepWithData(temp.TIFFRepresentation);
    console.log(rep.bitsPerPixel);
  }
})()

Run from Script Editor with a record selected in DT, this outputs the bits per pixel for each page of the PDF. If you run it on a b/w scan and it prints “2” to Script Editor’s console for each page, the script can be used as a building block for your auto-tagging task. Otherwise, somebody else might have a better idea.

In my sample run, it output 32 for every page, but then I always scan in color.
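
If bits per pixel turns out to be the same for colour and b/w scans, the "somehow analyze it" step could look at the pixel samples instead. Below is a rough sketch of that idea in plain JavaScript, not tested against DEVONthink. Everything in it is an assumption for illustration: the helper names isColourBitmap and tagForPages are made up, the samples are assumed to be interleaved 8-bit R, G, B (optionally A) values, and the tolerance and minFraction defaults are guesses that would need tuning for real scans. How you get the samples out of the NSBitmapImageRep is left open here.

```javascript
// Hypothetical helpers, not part of DEVONthink or AppKit.

// pixels:          flat array of 8-bit samples, interleaved per pixel
//                  (R,G,B,A,R,G,B,A,... or R,G,B,R,G,B,...)
// samplesPerPixel: 4 for RGBA, 3 for RGB
// tolerance:       largest channel difference still treated as grey
//                  (allows for scanner noise and slight paper tint)
// minFraction:     share of pixels that must be coloured before the
//                  page counts as a colour page
function isColourBitmap(pixels, samplesPerPixel = 4, tolerance = 16, minFraction = 0.01) {
  // Fewer than three samples per pixel means there are no colour channels
  if (samplesPerPixel < 3) return false;
  const pixelCount = Math.floor(pixels.length / samplesPerPixel);
  if (pixelCount === 0) return false;

  let coloured = 0;
  for (let p = 0; p < pixelCount; p++) {
    const o = p * samplesPerPixel;
    const r = pixels[o], g = pixels[o + 1], b = pixels[o + 2];
    // A grey pixel has (nearly) equal R, G and B values
    const spread = Math.max(r, g, b) - Math.min(r, g, b);
    if (spread > tolerance) coloured++;
  }
  return coloured / pixelCount >= minFraction;
}

// Picks one tag for the whole document from per-page results.
// Assumption: a single colour page is enough to tag the PDF "colour".
function tagForPages(pageIsColour) {
  return pageIsColour.some(Boolean) ? "colour" : "b/w";
}
```

You would call isColourBitmap once per page inside the loop, collect the results in an array, and pass that array to tagForPages to get the tag to set on the record. Documents that mix both kinds of pages end up as “colour” with this rule; a third tag would be easy to add if you want to tell them apart.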
