When I annotate a PDF and copy and paste the annotations from the inspector into an RTF file, DT automatically adds a prefix to each highlighted word, sentence, or paragraph (when they are highlighted separately), for example: 2 Highlight 2020-11-20, 21:39:28 (page, annotation type, date and time).
In some cases this can be useful, but for my purposes it usually makes reading the annotations extremely tedious. Is there any way to eliminate the prefix? My only workaround has been to OCR the whole annotations section in the inspector, which is not a realistic solution.
-- Remove page, annotation type and date from manually copied PDF annotation
-- 1. Copy your annotation in the annotation inspector
-- 2. Run the script
-- 3. Paste
-- NOTE: in a short test "thePattern" worked with my locale;
-- for any other locale you'll probably have to adjust the date-matching part: \\d\\d\\.\\d\\d\\.\\d\\d\\, \\d\\d\\:\\d\\d\\:\\d\\d
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
set thePattern to "^\\d+\\t\\w+ ?\\w+\\t\\w+\\t\\d\\d\\.\\d\\d\\.\\d\\d\\, \\d\\d\\:\\d\\d\\:\\d\\d\\t"
set theText to the clipboard as string
set theText_clean to my regexReplace(theText, thePattern, "")
if theText_clean ≠ theText then
    set the clipboard to theText_clean & return
    display notification "Copied cleaned annotation"
else
    display dialog "There's either no annotation in your clipboard or you have to adjust the Regex" with title "Clean annotation"
end if
return theText_clean

-- Replace every match of thePattern (an ICU regex) in theText with theReplacement
on regexReplace(theText, thePattern, theReplacement)
    try
        set theString to current application's NSString's stringWithString:theText
        -- Use the NSString length (UTF-16 units) so the range covers the whole string
        set newString to theString's stringByReplacingOccurrencesOfString:(thePattern) withString:(theReplacement) options:(current application's NSRegularExpressionSearch) range:{location:0, |length|:theString's |length|()}
        return newString as string
    on error error_message number error_number
        activate
        display alert "Error: Handler \"regexReplace\"" message error_message as warning
        error number -128
    end try
end regexReplace
A very nice script, thank you.
It does not work, even though I checked that the annotations are in the clipboard. Error message below. Probably related to the date format. I will figure it out.
I am trying to use the script for generic regex text clean-up.
May I ask where and how in the script I would put the replacement text if I want to use it for something else?
Thank you.
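For generic clean-up, the two things to change are thePattern and the third argument of regexReplace, which is the replacement text (an empty string in the script above, so matches are simply deleted). Below is a minimal sketch, assuming the same handler as in the script above. The pattern and replacement (collapsing runs of spaces into a single space) are only an illustration and are not part of the original script.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Hypothetical example: collapse every run of two or more spaces into one space
set thePattern to " {2,}"
set theReplacement to " "

set theText to the clipboard as string
set theText_clean to my regexReplace(theText, thePattern, theReplacement)
set the clipboard to theText_clean
return theText_clean

-- Replace every match of thePattern (an ICU regex) in theText with theReplacement
on regexReplace(theText, thePattern, theReplacement)
    set theString to current application's NSString's stringWithString:theText
    set newString to theString's stringByReplacingOccurrencesOfString:thePattern withString:theReplacement options:(current application's NSRegularExpressionSearch) range:{location:0, |length|:theString's |length|()}
    return newString as string
end regexReplace
```

Because the search uses NSRegularExpressionSearch, the replacement string is treated as a template, so $1, $2 and so on refer to capture groups in the pattern.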
If you post an annotation that doesn’t work I’ll take a look. If you don’t want to post your real username replace it, e.g. if your real username is “petE 31” change it to “userA 31”.
-- Remove page, annotation type and date from manually copied PDF annotation
-- 1. Copy your annotation in the annotation inspector
-- 2. Run the script
-- 3. Paste
-- NOTE: This seems to work with the date formats 2021-11-20 or 20.11.20, and with or without an author name
-- However, you will likely need to adjust the pattern, as I'm by no means familiar with Regex
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions
set thePattern to "^\\d+\\t(\\w+ ?\\w+)\\t(\\w+ ?\\w+)?\\t(\\d\\d\\d\\d-\\d\\d-\\d\\d|\\d\\d\\.\\d\\d\\.\\d\\d), \\d\\d:\\d\\d:\\d\\d\\t"
set theText to the clipboard as string
set theText_clean to my regexReplace(theText, thePattern, "")
if theText_clean ≠ theText then
    set the clipboard to theText_clean & return
    display notification "Copied cleaned annotation"
else
    display dialog "There's either no annotation in your clipboard or you have to adjust the Regex" with title "Clean annotation"
end if
return theText_clean

-- Replace every match of thePattern (an ICU regex) in theText with theReplacement
on regexReplace(theText, thePattern, theReplacement)
    try
        set theString to current application's NSString's stringWithString:theText
        -- Use the NSString length (UTF-16 units) so the range covers the whole string
        set newString to theString's stringByReplacingOccurrencesOfString:(thePattern) withString:(theReplacement) options:(current application's NSRegularExpressionSearch) range:{location:0, |length|:theString's |length|()}
        return newString as string
    on error error_message number error_number
        activate
        display alert "Error: Handler \"regexReplace\"" message error_message as warning
        error number -128
    end try
end regexReplace
I really don’t understand what is going on. As of 5 minutes ago, DT is adding my name to the prefix. I am not on drugs and not completely crazy. I have no clue what is going on. I have multiple proofs that the name was not there before. Perhaps it was another PDF?
I will retest your initial solution.
You are very patient!
I doubt that’s possible. If you’ve copied annotations in DEVONthink 3.6 before and they had no author name, then there’s no way I can think of that annotations would now include an author name without changing DEVONthink preferences.
I’m new to AppleScriptObjC so I’m not sure why this happens. However, it should be possible to compile by adding a space on a blank line and then trying to compile again. If that doesn’t work, I’m out of ideas. The script works over here.
I now understand. When the annotations are created with the native DT PDF viewer, the author name is included. When the PDF is annotated in an external app (in this case PDF Expert), the name is absent.
Thank you SO MUCH for your help.
I will make the changes using the regex101 site.
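For anyone adjusting the pattern, here is a rough sketch of a variant that treats the whole author column as optional, so it should cover annotations made both in DT's own viewer and in an external app. It assumes the prefix fields are separated by tabs, as the patterns above do. The sample lines are made up, and the script runs against them instead of the clipboard so the pattern can be checked in Script Editor first.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Assumed layout: page <tab> type <tab> [author <tab>] date, time <tab> text
-- (?m) lets ^ match at the start of every line, so several copied annotations are handled at once
-- (?:[^\\t\\r\\n]*\\t)? makes the whole author column optional
set thePattern to "(?m)^\\d+\\t\\w+(?: \\w+)?\\t(?:[^\\t\\r\\n]*\\t)?(?:\\d{4}-\\d\\d-\\d\\d|\\d\\d\\.\\d\\d\\.\\d\\d), \\d\\d:\\d\\d:\\d\\d\\t"

-- Made-up sample lines: the first has an author name, the second does not
set sampleText to "2" & tab & "Highlight" & tab & "userA 31" & tab & "2020-11-20, 21:39:28" & tab & "first note" & linefeed & "3" & tab & "Highlight" & tab & "2020-11-20, 21:40:02" & tab & "second note"

return my regexReplace(sampleText, thePattern, "")

-- Replace every match of thePattern (an ICU regex) in theText with theReplacement
on regexReplace(theText, thePattern, theReplacement)
    set theString to current application's NSString's stringWithString:theText
    set newString to theString's stringByReplacingOccurrencesOfString:thePattern withString:theReplacement options:(current application's NSRegularExpressionSearch) range:{location:0, |length|:theString's |length|()}
    return newString as string
end regexReplace
```

If the result still contains a prefix, the real clipboard text probably differs from the assumed layout (for example a different date format), and the date part of the pattern is the first place to look.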