Inconsistent behavior of URL command createHTML

URL commands

I have noticed that the URL command createHTML does not always create an HTML document.

The URL I am sending to DevonThink looks like this:

x-devonthink://createHTML?title=A%20Simple%20Guide%20to%20Minimalist%20Landscape%20Photography&location=https://digital-photography-school.com/photograph-minimalist-landscape/&source=dummy-representation-of-html

As you can see, the “title” is percent-encoded and, although you cannot see this, the “source” is as well. In the above example I have left out the actual HTML because it can be many kilobytes of data.
The “location” field is not encoded. Should it be?

Occasionally, when I send this command to DevonThink, it responds by creating a document of the type Web Internet Location. I can see that DT connects to the site specified in the location; this is typically slow and takes tens of seconds. When I send the same request a second time, after the first request has finished, it does the right thing and creates a document of the type HTML text.
When I send the command a second time before the first request has finished, it is added to the fetch queue and creates another document of the type Web Internet Location.

In all situations the exact same request is sent to DT.

To make certain that the encoding is done properly, I am using the following Swift method to encode the string:

func encodedString(_ string: String) -> String? {
    string.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)?
        .replacingOccurrences(of: ";", with: "%3B")
        .replacingOccurrences(of: "&", with: "%26")
}

Does anyone have a clue why DT is not responding consistently? I can understand that I may have made a mistake in how I construct the URL, but I would think that should not lead to inconsistent behavior.

I have not yet found a way to trigger this behavior. I have tried stopping and starting DT in between requests, but that does not cause the behavior to stop or start.

Regards,

Fred

Yes, all fields should be encoded.

Perhaps you could share the complete code that you use to “send” the command to DT?

Also, only replacing semicolons and & signs is not sufficient in many cases (e.g., you’d have to percent-encode the space character as well for a URL). And it’s not clear what exactly you encode there: The URL? The HTML contents?

@BLUEFROG: Would one have to specify the location and a source parameter with the createHTML command? To me, that makes very little sense, as the location should specify the URL from which to load the HTML, while source seems to refer to “string”.

Here is the complete code:

struct Devon: Loggable {

    static let devonThinkCreateHTMLCommand = "x-devonthink://createHTML?title=%Title&location=%Location&source=%Source"

    private static func encodedString(_ string: String) -> String? {
        string.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)?
            .replacingOccurrences(of: ";", with: "%3B")
            .replacingOccurrences(of: "&", with: "%26")
    }

    static func exportToDevonThink(article: InMemoryArticle) {
        Task {
            if let html = encodedString(article.html),
               let title = encodedString(article.title),
               let url = encodedString(article.url.absoluteString) {
                let command = devonThinkCreateHTMLCommand
                    .replacingOccurrences(of: "%Title", with: title)
                    .replacingOccurrences(of: "%Source", with: html)
                    .replacingOccurrences(of: "%Location", with: url)
                let debugCommand = devonThinkCreateHTMLCommand
                    .replacingOccurrences(of: "%Title", with: title)
                    .replacingOccurrences(of: "%Source", with: "dummy-representation-of-html")
                    .replacingOccurrences(of: "%Location", with: url)
                debug("URL = '\(debugCommand)'")
                if let url = URL(string: command) {
                    info("Create HTML for \(article.title)")
                    NSWorkspace.shared.open(url)
                }
            }
        }
    }
}

The debug command is only added to log the command.

This is not standalone code, so it is not something you can compile and run.

Fred

As you can see, the encoding is happening on all fields. I thought that location was not encoded, but it turned out to be encoded as well.

I can understand that my algorithm is wrong or broken. However I would expect the behavior of DT to be identical in each case.

Just for background, I am sending the URL and the HTML because the HTML is coming from the given URL but has been processed and cleaned for readability. The processing also removes ads etc.

Fred

All you need is the location parameter for the createHTML command and yes, the entire URL needs to be url-encoded, e.g.…

x-devonthink://createHTML?location=https%3A%2F%2Fwww.devontechnologies.com%2Fapps%2Fdevonthink

No, the URL components need to be URL encoded. You still want the & and the ? in the URL command, but you must encode them in the HTML and the location. The slashes in the location URL do not need to be encoded, nor does the colon.
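A minimal sketch of that distinction: the `?`, `=` and `&` that structure the command stay raw, while every parameter value is encoded on its own. The helper name `makeCommand` is hypothetical, and the allowed set (RFC 3986 unreserved characters only) is an assumption chosen for safety; it also encodes colons and slashes, which is harmless and matches the `%3A%2F%2F` form shown earlier.

```swift
import Foundation

// RFC 3986 "unreserved" characters; everything else gets percent-encoded.
let unreserved = CharacterSet(charactersIn:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")

/// Hypothetical helper: builds an x-devonthink command string from raw values.
/// Parameter names are inserted as-is; only the values are encoded.
func makeCommand(_ command: String, parameters: [(String, String)]) -> String? {
    var pairs: [String] = []
    for (name, value) in parameters {
        guard let encoded = value.addingPercentEncoding(withAllowedCharacters: unreserved) else {
            return nil
        }
        pairs.append("\(name)=\(encoded)")
    }
    // The separators are added after encoding, so they stay raw.
    return "x-devonthink://\(command)?" + pairs.joined(separator: "&")
}

let command = makeCommand("createHTML", parameters: [
    ("title", "A & B"),
    ("location", "https://example.com/a?b=c"),
])
print(command ?? "encoding failed")
```

Because the values are encoded before the string is assembled, an `&` or `=` inside the title or the HTML can no longer be mistaken for a parameter separator.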

But regardless: The HTML created by that is of limited usability. It’s missing all styles, since those are referred to by absolute URLs without a host part. Even creating a formatted note or a web archive using the URL command with only the location and no source parameter results in HTML without any styling.

I meant the URL in the location.

The HTML created by that is of limited usability.

Usability is a matter of perspective :wink:

Does that really make sense? If you expect DT to load the data from the location, why send the HTML? If you don’t want DT to load the data from the location, why set this parameter?

And as you wrote initially, DT connects with the site. Which seems to indicate that it does use the location, presumably ignoring the source.

Regardless: You must URL encode the location, title and source parameters. I tried understanding the description of addingPercentEncoding:withAllowedCharacters and gave up. What a mess, and who told them to never give useful examples? Anyway, urlQueryAllowed seems to be kind of ok here. Though – why would you then hand-encode ; and &? The latter is certainly not allowed in a URL query.
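For what it is worth, a small sketch of what `.urlQueryAllowed` actually does may explain the hand-encoding: the set treats `&`, `;`, `=` and `+` as legal query characters and leaves them untouched, so they have to be handled separately when they occur inside a value. The function names below are hypothetical, and removing `;&=+?` from the set is just one possible choice.

```swift
import Foundation

// Stock set: spaces are encoded, but "&" and ";" pass through unchanged.
func encodeWithQueryAllowed(_ value: String) -> String? {
    value.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
}

// Same set minus characters that should not appear raw inside a value.
// This does in one pass what the chained replacingOccurrences calls do.
func encodeParameterValue(_ value: String) -> String? {
    var allowed = CharacterSet.urlQueryAllowed
    allowed.remove(charactersIn: ";&=+?")
    return value.addingPercentEncoding(withAllowedCharacters: allowed)
}

print(encodeWithQueryAllowed("a&b;c d") ?? "nil")  // a&b;c%20d
print(encodeParameterValue("a&b;c d") ?? "nil")    // a%26b%3Bc%20d
```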

Certainly. And it appears that the discourse page(s) are more useful to DT if clipped without CSS :wink: I just tried clipping from FF as HTML and web archive – same results as with the URL commands. So, the behavior is consistent.

This is clearly not true. The HTML is beautifully styled. Here is a PDF of an article I created in DT. Compare that with the original: https://digital-photography-school.com/how-make-storytelling-landscape-photos-4-steps/

How to Create Landscape Photos That Tell Stories.pdf (3.1 MB)

The question is not about how useful this is. The question is about why I get a weblink in some cases and an HTML document in other cases. And if I do it a second time it is always right.

I want the HTML I provide in an HTML document and the URL to be registered as where it is coming from.

No, I wrote initially that DT sometimes connects to the site. This is not consistent behavior.

Regarding the semicolon, I found out the hard way that if I don’t encode it in the HTML the command fails.

Regarding the ampersand, I am encoding it inside the fields to prevent it from acting as a URL parameter separator.

The variable devonThinkCreateHTMLCommand contains the necessary ampersands for the URL.

This is coming from the help of DT:

Commands:

  • createFormattedNote: Creates a formatted note.
  • createHTML: Creates a new HTML document.
  • createMarkdown: Creates a Markdown document.
  • createPDF: Creates a PDF.
  • createRTF: Creates a rich text document.
  • createWebArchive: Creates a web archive.
  • createBookmark: Creates a new bookmark.

This states that createHTML is supposed to create an HTML document. Sometimes it doesn’t. Should I report this as a bug?

And what is produced when it doesn’t? A bookmark? If so, then no, it’s not a bug.

Yes, I wrote that in the initial article. It is called a Web Internet Location, which is a bookmark I guess. I would think that ‘createHTML’ should always create an HTML document. If I wanted to have a bookmark I would use ‘createBookmark’.

Why does it create a bookmark sometimes and always an HTML document when sent a second time?

This is well documented here and elsewhere: If a web capture fails for whatever reason, a bookmark is captured so at least there’s something to work with. There is no 100% bullet-proof web capture method as even things like poor network quality or slow/unresponsive servers can inhibit downloading the data to create a document.

I would think that createHTML doesn’t ask for a web capture. Is the fact that I provide both a URL and HTML causing this? I could try to not provide the URL or inject it inline in the HTML to not lose the original URL.
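A rough sketch of the "inject it inline" option: put the original URL into a `<base href>` element right after the opening `<head>` tag. The helper name `addingBaseTag` is hypothetical, and whether DEVONthink makes any use of a `<base>` element is an assumption that is not confirmed anywhere in this thread; the sketch only shows how the URL could travel inside the HTML itself.

```swift
import Foundation

/// Hypothetical helper: inserts <base href="..."> after the opening <head> tag.
func addingBaseTag(to html: String, originalURL: String) -> String {
    // Escape characters that would break the attribute value.
    let href = originalURL
        .replacingOccurrences(of: "&", with: "&amp;")
        .replacingOccurrences(of: "\"", with: "&quot;")
    let tag = "<base href=\"\(href)\">"

    // Match <head> or <head ...attributes>, but not <header>.
    guard let head = html.range(of: "<head(\\s[^>]*)?>",
                                options: [.regularExpression, .caseInsensitive]) else {
        return tag + html  // No <head> found: prepend as a fallback.
    }
    var result = html
    result.insert(contentsOf: tag, at: head.upperBound)
    return result
}

let cleaned = "<html><head><title>t</title></head><body></body></html>"
print(addingBaseTag(to: cleaned, originalURL: "https://example.com/a?x=1&y=2"))
```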

Just a wild idea: you could create a DT record using AppleScript or JavaScript from Swift. That would allow you to set the HTML content and the URL without triggering a download. I think.
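Something along these lines, as an untested sketch: the Swift side only has to build the AppleScript source and escape the strings that go into it. The `create record with` property names (`name`, `type`, `URL`, `source`) and the `"DNtp"` application id are assumptions to verify against the scripting dictionary of your DEVONthink version; the function names are made up.

```swift
import Foundation

/// Escapes a Swift string for use as an AppleScript double-quoted literal.
func appleScriptLiteral(_ s: String) -> String {
    let escaped = s
        .replacingOccurrences(of: "\\", with: "\\\\")  // backslashes first
        .replacingOccurrences(of: "\"", with: "\\\"")  // then double quotes
    return "\"\(escaped)\""
}

/// Hypothetical helper: AppleScript source that asks DEVONthink to create an
/// HTML record. Property names are assumptions, see the note above.
func createRecordScript(title: String, url: String, html: String) -> String {
    """
    tell application id "DNtp"
        create record with {name:\(appleScriptLiteral(title)), type:html, URL:\(appleScriptLiteral(url)), source:\(appleScriptLiteral(html))} in current group
    end tell
    """
}

print(createRecordScript(title: "Test", url: "https://example.com", html: "<p>Hi</p>"))
```

On macOS the resulting source could be passed to `NSAppleScript(source:)` and run with `executeAndReturnError(_:)`. No URL encoding is involved on this route, only the AppleScript string escaping shown above.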