Just sharing this to add a little back to the community.
If you want to update the metadata keywords (or other fields) that are actually stored inside a PDF in DT, this is the most efficient way I’ve found of doing so.
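use AppleScript version "2.4"
use framework "Foundation"
use framework "Quartz" -- PDFDocument (PDFKit) is loaded via Quartz
use scripting additions

-- Convert AppleScript raw data into NSData
on NSdataFromASdata(asData)
	return (current application's NSArray's arrayWithObject:asData)'s firstObject()'s |data|()
end NSdataFromASdata

-- Convert NSData back into AppleScript raw data
on ASdataFromNSdata(nsData)
	set theCode to current application's NSHFSTypeCodeFromFileType("'rdat'")
	return (current application's NSAppleEventDescriptor's descriptorWithDescriptorType:theCode |data|:nsData) as data
end ASdataFromNSdata

-- theRecord (a DT record) and documentKeywords (a list of strings) must be set before this point
set pdfDoc to (current application's PDFDocument's alloc()'s initWithData:(my NSdataFromASdata(data of theRecord)))
set PDFAttributes to current application's NSMutableDictionary's dictionaryWithDictionary:(pdfDoc's documentAttributes())
-- Replace (not merge) the keywords stored in the PDF
PDFAttributes's setValue:(documentKeywords as list) forKey:"Keywords"
set pdfDoc's documentAttributes to PDFAttributes
-- Writing the data back through DT lets it see the change
set data of theRecord to (my ASdataFromNSdata(pdfDoc's dataRepresentation()))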
If you just update the PDF directly outside DT, DT is unaware of the change. This approach ensures DT indexes the changes as they are made. You can use the same approach to update other fields in the PDF metadata.
Parts of this are shamelessly lifted from “Convert NSData into raw data string” (post #4 by ShaneStanley, AppleScript forum at Late Night Software Ltd.).
If I understand the code correctly, it
- gets the current PDF metadata as the PDFDocument’s documentAttributes in an NSMutableDictionary,
- then sets the Keywords key in this dictionary to a list of documentKeywords,
- and replaces the documentAttributes with this modified dictionary.
I suppose that this overwrites any previously defined keywords instead of updating (i.e. merging) them, but I may be wrong.
In any case, I think the code could be made much simpler (and forgo ASdataFromNSdata as well as NSdataFromASdata) by
- creating the PDFDocument from theRecord’s path property, by getting a URL from that and then calling initWithURL for the PDFDocument
- after setting the documentAttributes, writing out the PDFDocument with writeToFile
Might be a tad faster, too :wink:
What do you mean by “outside DT” – with a script, in an app?
That is correct; in my use case I wanted to replace them. But it’s simple for somebody else to change if required.
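For anyone who would rather merge than replace, here is a minimal sketch in plain JavaScript. mergeKeywords is a hypothetical helper, not part of DEVONthink or PDFKit, and it assumes the existing keywords have already been read from the PDF and converted to a plain array of strings. Treating keywords as case-insensitive is also just an assumption of this sketch.

```javascript
/* Hypothetical helper: merge new keywords into the existing ones,
   keeping the original order and dropping case-insensitive duplicates. */
function mergeKeywords(existing, additions) {
  const seen = new Set();
  const merged = [];
  for (const kw of [...(existing || []), ...(additions || [])]) {
    const trimmed = String(kw).trim();
    const key = trimmed.toLowerCase();
    /* Skip empty strings and anything already collected */
    if (trimmed !== '' && !seen.has(key)) {
      seen.add(key);
      merged.push(trimmed);
    }
  }
  return merged;
}

/* Example: 'Tax' duplicates the existing 'tax', so only 'receipt' is appended */
const merged = mergeKeywords(['invoice', 'tax'], ['Tax', 'receipt']);
console.log(merged.join(', '));
```

The merged list would then be used in place of documentKeywords when setting the Keywords entry.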
If you go via initWithURL and update the PDF file, DT does not detect the change. If you view the file in DT or via other tools, the keywords have been updated, but if you search for those keywords DT doesn’t find them.
Did you try running DT’s indicate method after writing the PDF? That is supposed to update the index.
I did try ‘indicate’ but it didn’t pick up on the keyword changes in the PDF metadata.
Weird. Perhaps @cgrunenberg has an idea.
indicate is used to index files/folders; synchronize is the right command.
Unsurprisingly, you are correct. Thanks a bunch.
Here’s a JavaScript version of @MrSkooby’s script, following the path I suggested before:
ObjC.import('PDFKit');
(() => {
  const app = Application("DEVONthink 3");
  /* Get the record to add keywords to */
  const rec = app.getRecordWithUuid('x-devonthink-item://83E31EAE-991E-41A3-A146-20E4A77295E8');
  /* Create a PDFDocument from the record's path converted to a URL */
  const PDFDoc = $.PDFDocument.alloc.initWithURL($.NSURL.fileURLWithPath($(rec.path())));
  /* Get the documentAttributes as a mutable dictionary so we can change its keywords */
  const PDFAttributes = $.NSMutableDictionary.dictionaryWithDictionary(PDFDoc.documentAttributes);
  /* Set the "Keywords" entry of the PDFAttributes to something unique
     for testing purposes */
  PDFAttributes.setValueForKey($([$('blurbsel')]), $('Keywords'));
  /* Update the PDF's documentAttributes */
  PDFDoc.documentAttributes = PDFAttributes;
  /* Write back the PDF */
  PDFDoc.writeToFile($(rec.path()));
  /* Force DT to update its index for this record */
  app.synchronize({record: rec});
})()
The setValueForKey call is a bit weird because it needs Objective-C data structures. $([…]) converts a JavaScript array into an NSArray, while $('…') converts a JavaScript string into an NSString. Therefore, $([$('blurbsel')]) is an NSArray with an NSString as its only element.
I was (or I think I was) using indicate against the path property of a record. The same path I used to update the PDF file when I was testing the initWithURL route. But it didn’t appear to work. All searches for the newly added keywords failed.
Unfortunately I didn’t spot synchronize, or if I did I assumed it was related to the “sync” process. Oh well, learned something new!
Thanks.