Controlling character encoding on RTF export?

Hi all, although I guess this is really a question for DEVONtechnologies:

I’ve been creating RTF/RTFD notes in DTPO, written in YAML format so that I can later retrieve them as structured documents. I’ve used mixed means to enter these notes: some via TextEdit/DTPO and some via TextMate.

On export to TEXT, I get encoding issues: it looks like the TextEdit-entered notes come out as UTF-16 and the TextMate ones as UTF-8, so the YAML parser chokes on the two-byte encoding. Are these encodings set by the text editors themselves (i.e. TextEdit and TextMate) rather than by DTPO?

Any thoughts on how I can guarantee a single-byte encoding (i.e. UTF-8) on export (as TEXT) from DTPO? I assume that getting the “plain text” element via AppleScript would simply reflect the byte encoding of the original RTF, rather than converting to UTF-8?
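In the meantime, since the exports are just files on disk, one workaround is to normalize them after export. A rough Ruby sketch, with loud assumptions: the `Export/` path and `*.txt` glob are placeholders for wherever your exports land, and any file without a byte-order mark is treated as already being UTF-8:

```ruby
# Sketch: normalize exported text files to UTF-8 by sniffing the byte-order
# mark. Assumptions: exports live under 'Export/', end in '.txt', and a
# BOM-less file is already UTF-8.
def normalize_to_utf8(bytes)
  data = bytes.dup.force_encoding(Encoding::BINARY)
  if data.start_with?("\xFF\xFE".b)          # UTF-16LE BOM
    data.force_encoding(Encoding::UTF_16LE).encode(Encoding::UTF_8).delete_prefix("\uFEFF")
  elsif data.start_with?("\xFE\xFF".b)       # UTF-16BE BOM
    data.force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8).delete_prefix("\uFEFF")
  else
    data.force_encoding(Encoding::UTF_8)     # no BOM: assume UTF-8
  end
end

# Rewrite every exported file in place as UTF-8:
Dir.glob('Export/**/*.txt').each do |path|
  File.binwrite(path, normalize_to_utf8(File.binread(path)))
end
```

Running this over the export folder before feeding the files to YAML should make the mixed-editor origin of the notes irrelevant.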

Thanks, Charles

OK, I see that it’s TextEdit doing this: it seems to encode an RTF document as UTF-8, but once the document becomes an RTFD, the text gets encoded as UTF-16LE. In that format, the first two bytes of the exported text are the byte-order mark (BOM), which can be used (although that isn’t its purpose) to identify a UTF-16 document.
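So for my own scripts, sniffing those first two bytes is enough to tell the formats apart. A tiny sketch, assuming (as above) that anything without a BOM is UTF-8:

```ruby
# Sketch: classify exported bytes by their leading byte-order mark.
# Assumption: a file with no BOM is treated as UTF-8.
def sniff_encoding(bytes)
  bom = bytes.dup.force_encoding(Encoding::BINARY).byteslice(0, 2)
  case bom
  when "\xFF\xFE".b then 'UTF-16LE'
  when "\xFE\xFF".b then 'UTF-16BE'
  else 'UTF-8'
  end
end
```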

Best, Charles

OK, final, or should I say actual, question: what is the encoding of the “plain text” element of a DTPO database record when the stored document is an RTFD?

If I “get” the plain text of the RTFD with AppleScript, it returns a UTF-16LE string, but if I get plain_text with rb-appscript, I get a UTF-8 string.
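For what it’s worth, on the Ruby side I’ve stopped caring which bridge does the conversion and just normalize whatever string comes back. A sketch of that (the UTF-16LE sample here merely stands in for what the AppleScript route appears to return; it isn’t fetched from DTPO):

```ruby
# Sketch: make the encoding of a scripting-bridge result deterministic,
# whichever encoding it arrives in. The UTF-16LE sample stands in for
# what the AppleScript route appears to hand back.
def ensure_utf8(str)
  str.encoding == Encoding::UTF_8 ? str : str.encode(Encoding::UTF_8)
end

from_applescript = "name: Charles".encode(Encoding::UTF_16LE)
puts ensure_utf8(from_applescript).encoding  # => UTF-8
```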

Any thoughts on where the conversion is taking place?

Best, Charles