Can DT auto-convert the encoding of iOS-created UTF-16LE HTML files?

Hi,

I cannot work out why (perhaps some iOS/iCloud/Shortcuts update), but since 4 April 2024 the DEVONSave v3 shortcut has been creating HTML files in UTF-16LE.
These render fine on DTTG but they are not rendered properly on DT.

Is there a way to script DT so that it reopens the files as UTF-16LE and saves them back in the default encoding?

(The reason I use DEVONSave v3 rather than the extension/bookmarklet: it really saves the text as displayed on screen, including Safari-translated articles.)

Are the files rendered as expected in Safari on the Mac? If that’s the case, then exporting and reimporting them might help.

Thanks for the suggestion

While the exported files do render fine in Safari, the reimported files do not render at all (not even the text) in DT :cry:

Archive.zip (12.0 KB)
These are two such files

The HTML code actually claims that the encoding is UTF-8 but it isn’t. Did DEVONthink To Go create or just import/receive these HTML files?

Thanks. I will check the whole chain.

The next release will improve the compatibility to such files.

Thanks.

It’s a weird file.

If I understand correctly, the file is encoded as UTF-16LE and DT does not detect it as HTML because it cannot even parse the “<html>” tag.

So far I have not found where in the shortcut UTF-16LE is set.
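As a rough illustration of the point above: in UTF-16LE every ASCII character is stored as two bytes, the second being 0x00, so a parser that reads the bytes as UTF-8 would see NUL bytes between the letters of “<html>”. The sketch below dumps the first bytes of a file as hex and checks for a UTF-16LE byte-order mark (ff fe). The function names `hex_head` and `has_utf16le_bom` are hypothetical helpers, and it assumes a shell with `head -c`, `od` and `tr` available. Whether the files discussed here actually start with a byte-order mark is not confirmed in this thread.

```shell
#!/bin/sh
# hex_head FILE [N]: print the first N bytes (default 16) of FILE as a
# continuous lowercase hex string. Hypothetical helper, not part of DT.
hex_head() {
    head -c "${2:-16}" "$1" | od -An -tx1 | tr -d ' \n'
}

# has_utf16le_bom FILE: succeed if FILE starts with the UTF-16LE
# byte-order mark (bytes ff fe). Hypothetical helper.
has_utf16le_bom() {
    [ "$(hex_head "$1" 2)" = "fffe" ]
}
```

For example, `hex_head page.html 8` on a UTF-16LE file that begins with `<html>` and has no byte-order mark would be expected to show `3c` and `68` each followed by `00` (`page.html` is a placeholder name).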

At least the first one is just broken, as it contains two html elements. That is invalid HTML, regardless of the encoding.

I have the same issue with ALL files. Changing to content=“text/html; charset=utf-16” does not solve the issue in DT; it still interprets the file as UTF-8.

Apart from the two </html> tags added at the end, which I’ve corrected in my Shortcut script, a hex view shows some weird start characters.

My attached sample, marked as UTF-8, displays perfectly in Safari and Edge on macOS, but wrongly in DT. Trying to edit anything in DT completely corrupts the file. BBEdit displays it correctly in text mode, but after changing the encoding to UTF-16 in BBEdit, the file shows nothing in DT, although it still displays correctly in Safari and Edge.

Perhaps a menu entry to change the encoding, as web browsers have, would be useful.

Slave Ships From Space - Dark Worlds Quarterly.html.zip (12.1 KB)

[Edit to add that DTTG renders it right]

See above, the next release will fix this (v3.9.7).

I’m trying to sort out this issue with a smart rule, but I cannot make it work. My idea is to convert the received file from UTF-16 to UTF-8. From the terminal the conversion works and I get a new UTF-8 file that works inside DT. However, when I try to do it via a smart rule, the resulting file ends up orphaned.

I think I need to “get” the resulting file, add it to the same folder as the original, and delete the original, but I don’t know how to do it. Any help is appreciated.
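For reference, the terminal conversion described above can be sketched as a small shell function that replaces the file in place rather than leaving a second copy behind. This is only a sketch, assuming `iconv` is installed and the input really is UTF-16LE. The function name `to_utf8` and the `.utf8.tmp` suffix are hypothetical choices, not anything DT or DEVONSave provides. It does not touch the charset declared inside the HTML.

```shell
#!/bin/sh
# to_utf8 FILE: convert FILE from UTF-16LE to UTF-8 in place.
# Hypothetical helper; the original is only replaced if iconv succeeds.
to_utf8() {
    src="$1"
    tmp="${src}.utf8.tmp"
    # iconv writes the converted text to stdout, so capture it in a
    # temporary file first instead of overwriting the source directly.
    if iconv -f UTF-16LE -t UTF-8 "$src" > "$tmp"; then
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Writing to a temporary file first means a failed conversion leaves the original untouched. Inside DT, files changed on disk this way may still need the record to be updated, which is what the smart rule discussion below is about.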

The smart rule just generates a new file; it neither replaces the original file nor sets the source of the record.

I tried this in the script, but now, apart from the orphaned files, it logs “skipped” for each file.

I’m absolutely hopeless at this kind of thing. What I want to accomplish is to convert the same file from UTF-16 to UTF-8, and then continue with the DT part of the DEVONsave script.

on performSmartRule(theRecords)
	repeat with theRecord in theRecords
		set thePath to path of theRecord
		set newPath to (texts 1 thru -5 of thePath) & "_UTF-8.html"
		do shell script "iconv -f utf-16 -t utf-8 " & quoted form of thePath & " > " & quoted form of newPath
		delay 1
		set theGroup to incoming group -- this points to the global inbox
		import newPath to theGroup
	end repeat
end performSmartRule

In the end, @BLUEFROG did it. It was that easy, and I wasn’t able to do it.

on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
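			-- Convert the file on disk from UTF-16LE to UTF-8 and store the result as the record's source
			set source of theRecord to (do shell script "iconv -f UTF-16LE -t UTF-8 " & (quoted form of (path of theRecord as string)))
			-- Tag the record so that converted files can be recognized
			set tags of theRecord to ((tags of theRecord) & "UTF-8")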
		end repeat
	end tell
end performSmartRule

This will be useful until the next update, which should solve the issue.
