If DevonThink could find images inside RTF files …

… that would certainly set it apart from similar programs.

For all I know, there may be a secret regex that accomplishes this. Imagine hitting CMD-F, typing in the magic code (or ticking a checkbox), and then replacing all images with nothing.

Wow, would that save time when removing dozens of avatars embedded in forum text, or what? I often select and clip—using CMD-) to Take Rich Note—a long swathe of Reddit text. It’s such a nightmare to delete 20+ embedded images that I’ve actually considered doing a plain text clip instead and losing many important links (along with bold and italics that might make a difference).

Anyway, if such power does not exist, it would definitely attract public interest if it came into existence, say, in the next update. Holy cow!

Too bad the Clutter-Free checkbox for whole-webpage capture (which I never use) can't optionally be applied to selected-region capture via CMD-) as well, so that images are removed there too.

The days of avatar and other crap-image clutter are upon us. We have faith that DevonThink can conquer this foe!

Removing images from HTML documents

It’s not very difficult to remove images from HTML programmatically. Something along these lines should do the trick:

/* Regular expression matching <img> tags in HTML (case-insensitive) */
const RE = /<img\b[^>]*>/gi;
const app = Application("DEVONthink 3");
/* Get all selected records and loop over them */
app.selectedRecords().forEach(r => {
  /* Only process HTML records */
  if (r.type() === "html") {
    /* Remove all occurrences of the RE from the record's HTML source.
       (source is used instead of plainText because the plain text of an
       HTML record should not contain any <img> tags.) */
    r.source = r.source().replace(RE, "");
  }
});

Note: Code is not tested at all and requires at least one HTML record to be selected.

Removing images from RTF(D)

I’m uncertain if DT ever creates an RTF file with embedded images. When I clipped a web document in RTF format, I got an RTFD, which is in fact a folder. What one could do (and that’s really quite convoluted):

  • get the path of the “document” (in fact, the path of the folder)
  • export that folder to some temporary place on the computer (like /tmp)
  • open the file TXT.rtf in this folder
  • in that file, remove all references to images. Note that they might also come in the form of URLs to external images such as avatars.
  • write the modified contents back to disk
  • import this RTF file again into DT
  • remove the folder from /tmp and the old RTFD from DT

As I said: Convoluted, but not impossible.
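
The "remove all references to images" step might look roughly like this in plain JavaScript. This is only a sketch: it assumes that attachments in TXT.rtf show up as groups of the form {{\NeXTGraphic filename ...}¬}, which is what RTFD packages written by Apple's text system tend to contain but which nothing in this thread confirms. stripRtfdAttachments is a made-up name; reading and writing the file, the import into DT, and references to external image URLs are all left out.

```javascript
// Hypothetical helper: remove attachment groups from the text of TXT.rtf.
// Assumption: attachments look like {{\NeXTGraphic name.png \width48 ...}¬}.
// File names that contain braces are not handled.
const ATTACHMENT_RE = /\{\{\\NeXTGraphic[^{}]*\}[^{}]*\}/g;

function stripRtfdAttachments(rtf) {
  // Delete every attachment group and leave the rest of the RTF untouched
  return rtf.replace(ATTACHMENT_RE, "");
}

// Small made-up sample; \u00AC is the "¬" that follows the inner group
const sample = "Hello {{\\NeXTGraphic avatar.png \\width48 \\height48}\u00AC} world";
console.log(stripRtfdAttachments(sample));
```

The image files themselves stay in the RTFD folder, so they would still have to be deleted separately.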

More than you asked for

Also, this will remove all images. Which is not necessarily helpful if someone included an image to make a point (as opposed to getting on your nerves with an avatar).
So, you might as well just save the web document as text (or markdown), perhaps.
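
If only the avatars are the problem, the regular-expression approach for HTML could be narrowed down. The following is a minimal sketch in plain JavaScript, assuming that avatar images can be recognised by the word "avatar" somewhere in the tag (class name or URL). That is a guess about the markup of a given site, not something that holds in general, and removeAvatarImages is a hypothetical name.

```javascript
// Hypothetical heuristic: drop only <img> tags that look like avatars.
const IMG_RE = /<img\b[^>]*>/gi;
// Assumption: avatar tags mention "avatar" in their class or src attribute
const AVATAR_HINT = /avatar/i;

function removeAvatarImages(html) {
  // Keep every tag that does not match the hint
  return html.replace(IMG_RE, tag => (AVATAR_HINT.test(tag) ? "" : tag));
}

const html = '<p><img class="avatar" src="/u/42.png">See <img src="chart.png" alt="chart"></p>';
console.log(removeAvatarImages(html));
// prints: <p>See <img src="chart.png" alt="chart"></p>
```

Images that make a point survive this way, at the price of having to adjust the hint per site.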

Use \uFFFC in Nisus Writer Pro or TextSoap.
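
For context: U+FFFC is the Unicode "object replacement character", which Apple's text system uses as the placeholder for an attachment in rich text, so searching for it finds the embedded images. A small plain JavaScript sketch of what such a search matches (the function names and the sample string are made up for illustration):

```javascript
// U+FFFC marks the position of an attachment in the text
const OBJECT_REPLACEMENT = /\uFFFC/g;

function countAttachmentPlaceholders(text) {
  // match() returns null when nothing is found, hence the fallback
  return (text.match(OBJECT_REPLACEMENT) || []).length;
}

function removeAttachmentPlaceholders(text) {
  return text.replace(OBJECT_REPLACEMENT, "");
}

const note = "alice\uFFFC wrote: see the link. bob\uFFFC replied.";
console.log(countAttachmentPlaceholders(note));  // prints: 2
console.log(removeAttachmentPlaceholders(note)); // prints: alice wrote: see the link. bob replied.
```

On a plain string this only removes the placeholder characters. In a rich-text editor the attachment hangs on that character, so replacing it with nothing should take the image with it.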

This script creates RTF records from selected RTFD records.

Note: This script creates new records. They have a new UUID, i.e. existing links still link to the original record and NOT to the new one.

(It is possible to write the RTF data back into the original record. However, DEVONthink then doesn’t know that the record has changed and still shows its type as RTFD. If one afterwards opens the record and changes its content (e.g. by typing a space and deleting it), then DEVONthink changes the type to RTF. If someone wants to use such a script, let me know.)

-- Create RTF records from RTFD

-- Note: This script creates new records. They have a new UUID, i.e. existing links still link to the original record and NOT to the new one.

use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

property moveOriginalRecordToTrash : true

tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Please select some RTFD records."
		
		repeat with thisRecord in theRecords
			set thisRecord_Type to (type of thisRecord) as string
			if thisRecord_Type is in {"rtfd", "«constant ****rtfd»"} then
				set thisRecord_Path to path of thisRecord
				my importRTFVersion(thisRecord_Path, thisRecord)
			end if
		end repeat
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
		return
	end try
end tell

on importRTFVersion(theRecord_Path, theRecord)
	try
		set theURL to current application's NSURL's fileURLWithPath:theRecord_Path
		set {theAttributedString, theError} to current application's NSAttributedString's alloc()'s initWithURL:theURL options:(missing value) documentAttributes:(missing value) |error|:(reference)
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		
		set theAttributedString_Range to {location:0, |length|:theAttributedString's |length|()}
		set theDocumentAttributesRTF to {NSDocumentTypeDocumentAttribute:(current application's NSRTFTextDocumentType)}
		set theData to (theAttributedString's RTFFromRange:(theAttributedString_Range) documentAttributes:theDocumentAttributesRTF)
		
		set theTempDirectoryURL to my createTempDirectory()
		
		set theTempURL to ((theTempDirectoryURL's URLByAppendingPathComponent:(current application's NSProcessInfo's processInfo()'s globallyUniqueString()))'s URLByAppendingPathExtension:"rtf")
		set {successWriteRTF, theError} to (theData's writeToURL:theTempURL options:(current application's NSDataWritingAtomic) |error|:(reference))
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		set theTempPath to (theTempURL's |path|()) as string
		
		tell application id "DNtp"
			try
				set theImportedRecord to import theTempPath name (name without extension of theRecord) to (location group of theRecord)
				
				tell theImportedRecord
					set aliases to aliases of theRecord
					set comment to comment of theRecord
					set creation date to creation date of theRecord
					try
						set custom meta data to custom meta data of theRecord
					end try
					set exclude from search to exclude from search of theRecord
					set exclude from see also to exclude from see also of theRecord
					set exclude from Wiki linking to exclude from Wiki linking of theRecord
					set label to label of theRecord
					set locking to locking of theRecord
					set rating to rating of theRecord
					set state to state of theRecord
					set tags to tags of theRecord
					try
						set thumbnail to thumbnail of theRecord
					end try
					set unread to unread of theRecord
					set URL to URL of theRecord
				end tell
				
				if moveOriginalRecordToTrash then
					move record theRecord to trash group of (database of theRecord)
				end if
				
			on error error_message number error_number
				if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
				return
			end try
		end tell
		
		set {successDeleteDir, theError} to (current application's NSFileManager's defaultManager()'s removeItemAtURL:(theTempDirectoryURL) |error|:(reference))
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		
	on error error_message number error_number
		activate
		if the error_number is not -128 then display alert "Error: Handler \"importRTFVersion\"" message error_message as warning
		current application's NSFileManager's defaultManager()'s removeItemAtURL:(theTempDirectoryURL) |error|:(missing value)
		error number -128
	end try
end importRTFVersion

on createTempDirectory()
	try
		set theTempDirectoryURL to current application's |NSURL|'s fileURLWithPath:((current application's NSTemporaryDirectory())'s stringByAppendingPathComponent:("" & space & (current application's NSProcessInfo's processInfo()'s globallyUniqueString())))
		set {successCreateDir, theError} to current application's NSFileManager's defaultManager's createDirectoryAtURL:theTempDirectoryURL withIntermediateDirectories:false attributes:(missing value) |error|:(reference)
		if theError ≠ missing value then error (theError's localizedDescription() as string)
		return theTempDirectoryURL
	on error error_message number error_number
		activate
		if the error_number is not -128 then display alert "Error: Handler \"createTempDirectory\"" message error_message as warning
		error number -128
	end try
end createTempDirectory

Does that copy the content of the images into the RTF?

Do you mean whether it OCRs the images and replaces them with the OCR results? No :)

It just removes all images.

No, I was not thinking about OCR. Rather: does it copy the content (aka bytes) of the images into the RTF, like a formatted note in DT does? Apparently not.

Btw: what’s the difference here between RTF and RTFD? AFAICT, RTF would allow referencing or embedding images, too.

RTFD always links images; it’s just a link to an internal resource in the RTFD package.

No, RTF is always without any images or other attachments.

The specification knows a pict element, though. Is that never used?

No idea. You can test what happens in DEVONthink and other apps when you add an image to an RTF file. It will be saved as RTFD.

It seems that RTF as defined by Microsoft does support embedded images. Apple’s implementation, however, doesn’t. Always nice to have standards…
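
To check whether a given RTF file from some other application actually carries an embedded picture in the Microsoft sense, one could look for a \pict group in its source. A rough plain JavaScript sketch; hasEmbeddedPicture is a hypothetical name and the sample strings are made up, with the hex data shortened to a few bytes:

```javascript
// Rough check: does the RTF source contain a group starting with \pict?
// The lookahead keeps longer control words that merely begin with "pict" from matching.
const PICT_RE = /\{\\pict(?![a-zA-Z])/;

function hasEmbeddedPicture(rtf) {
  return PICT_RE.test(rtf);
}

const withPicture = "{\\rtf1\\ansi Text {\\pict\\pngblip\\picw48\\pich48 89504e47}}";
const withoutPicture = "{\\rtf1\\ansi Just text}";
console.log(hasEmbeddedPicture(withPicture));    // prints: true
console.log(hasEmbeddedPicture(withoutPicture)); // prints: false
```

This only detects the group. It does not decode or remove the picture data.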


WordService provides a service to remove attachments. In addition, the hidden preference RichNotesWithoutAttachments makes it possible to always capture RTF and never RTFD.

Thank you so much for this. I’m a DIY guy and should have considered using Emacs to scan through RTFD documents, kill the image-inclusion bits, and delete the associated files in the hidden directory.

Here is the quick and easy non-DIY answer! Thanks so much. I’ll try it out now.

Amazing! Thank you. I’m going to try it out now.

It works! Amazing script. I used to program in Obj-C (command-line tools only) and still to this day have not yet included the Foundation framework in any of my scripts and taken them to the next level. Lovely work. Thanks.
