Do you see the same problem when you import to the filesystem instead of a database? Maybe in the latter case, only bookmarks are imported, not files?
I wasn't until I tried it out. I have to admit that I find the GUI irritating and the results inconsistent: importing to the filesystem seemed to work OK, whereas importing into the database did not give the results promised by the documentation.
DEVONthink doesn't change links while downloading (and never did). It only looks for items in your database(s) having the full absolute URL (e.g. after resolving relative links) while browsing and should use them if found. Please send me the database and I'll have a look at it.
In that case, I repeat my suggestion for the OP to use a command line tool that can change the URLs, e.g. wget.
The URLs are fine. The question at hand is this part of behavior:
So you open the all.html page, which has a link to subdir/page.html. When you click that link, DT should "resolve the relative link" to be "the full absolute URL" http://pat.local/dt-html-import-site/subdir/page.html, and since there's a document that was downloaded that has that exact URL, use that item.
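The behaviour being asked about can be sketched in a few lines of Python. This only illustrates resolving a relative link against the page's URL and then looking for an imported item with that exact URL; it is not DEVONthink's actual code. The `imported` dictionary, its values, and the URL assumed for all.html are hypothetical and exist only for this example.

```python
from urllib.parse import urljoin

# Hypothetical index of imported items, keyed by their full absolute URL.
imported = {
    "http://pat.local/dt-html-import-site/all.html": "all.html (imported)",
    "http://pat.local/dt-html-import-site/subdir/page.html": "page.html (imported)",
}

def resolve_link(page_url, href):
    """Resolve a (possibly relative) link against the URL of the page containing it."""
    return urljoin(page_url, href)

def find_imported(page_url, href):
    """Return the imported item for the link, or None if it would have to be fetched."""
    return imported.get(resolve_link(page_url, href))

base = "http://pat.local/dt-html-import-site/all.html"
print(resolve_link(base, "subdir/page.html"))
print(find_imported(base, "subdir/page.html"))
```

If the lookup returns an item, no request to the web server should be needed; if it returns `None`, the page has to be fetched, which fails when the server is unreachable.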
I emailed you the test database I used.
I seem to be missing something here. Until now, I thought that you wanted to view downloaded HTML pages while you're offline (cf. your first post here). Now you say that you want DT to "resolve the relative link" (i.e. subdir/page.html) to the absolute URL (i.e. http://host/something/subdir/page.html).
But this is, as far as I can tell, the exact behaviour DT shows now, with the obvious caveat that it can't "resolve" this relative link when the relevant server is offline (either because it is turned off or because you have no connection).
I just tried it with a website here (mind you: one hosted outside of my local network; maybe that's relevant?) and it works exactly as expected: if WLAN is off, DT displays the HTML pages as they were downloaded. So that seems to work as described in the documentation, at least in this case.
Note: Since @pete31 saw the same behaviour as @padillac, also with a local web server, maybe it is related to the fact that the URL's domain is .local? At least in @pete31's example, I'm certain that the loopback interface will be used. Depending on the network layer, someone/something might take a shortcut there and handle server/connection detection differently than for a "real" interface?
The problem is the local webserver
I apologize for shouting
@BLUEFROG and I see no problem, but @pete31 and @padillac do, because @BLUEFROG and I apparently tried external websites (i.e. not hosted by a server running on the current machine).
I just tried to replicate @padillac's setup with a local Apache and saw the exact same behaviour: DT displays the website OK as long as the server is running. If the server is turned off, DT complains about it not being available. I think that this is inconsistent: it should not matter on which server the site is hosted.
So maybe @cgrunenberg could have a look into the code and figure out how the local interface (lo) and the other(s) (en0 etc.) are handled differently?
Indeed. As described multiple times previously, the process is:
- Use the Download Manager to import a site
- Open one of the imported items
- Click a link, and have DT open the imported document that corresponds to that link, rather than requesting it from the web server.
Does it work when you click a link on one of the pages? That's the question of this thread. The individual pages all work fine. It's when clicking on a link from one page to another that DEVONthink does not load the imported document.
Interesting observation, and one that I found plausible. However, I removed the pat.local entry from /etc/hosts and restarted the machine, and DT still tries to connect to pat.local (which now just times out because the domain doesn't exist).
I will try it with an external site at some point though, to see if it behaves differently. fwiw, @BLUEFROG confirmed that he saw the same issue that I reported.
Yes for an external site.
No for a locally stored site.
.local is a special domain, so it might be handled differently than other TLDs.
Okay! I think we cracked it. I changed it to pat.chicken and it works as expected now. Thank you for digging into this.
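One possible reason the rename made a difference: .local is reserved for multicast DNS (RFC 6762), while a made-up name like pat.chicken is not on the list of special-use names. The following is a minimal sketch of such a check, assuming a small hard-coded set of reserved top-level labels. `SPECIAL_USE_TLDS` and `is_special_use` are names invented for this example, and nothing here claims to be what DEVONthink or WebKit actually does internally.

```python
# Special-use top-level labels: "local" is reserved for multicast DNS (RFC 6762);
# "localhost", "test", "example" and "invalid" are reserved by RFC 6761.
# This set is an illustrative subset, not a complete registry.
SPECIAL_USE_TLDS = {"local", "localhost", "test", "example", "invalid"}

def is_special_use(hostname):
    """Return True if the last label of the hostname is a special-use name."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return tld in SPECIAL_USE_TLDS

for host in ("pat.local", "pat.chicken"):
    print(host, is_special_use(host))
```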
Well, the links appear not to work in DTTG, which was the whole point. Oh well. Hopefully DTTG3 adds this functionality.
(If DTTG2 should support this same functionality, please let me know)
The WebKit doesn't seem to handle local and remote requests the same way but the next release will fix this.
The joys of opaque frameworks
Ok, I read and reread this entire thread. The solution seems to be pat.chicken.
Seriously, I want to do the same as padillac wanted to do:
- Import a website
- Navigate it in DT3 offline (either with no network connection, or with the website down)
I imported a website, subdirectory (complete), Files [all options selected], follow links in subdirectories
When I selected the main page [html] and turned off wifi, I had the same issue, with the page showing "The internet connection appears to be offline".
I am using DEVONthink 3 on my iMac. I do not know what he meant by changing from pat.local to pat.chicken, and I do not think that applies to me, though it could.
Am I missing the obvious solution?
Which version of macOS do you use and what's the URL of the page?
If (and that is actually a very small if) the website uses JavaScript to load parts of the page, this behaviour is expected and completely normal. One of the consequences of Web 2.0, I'd say.
@pete31 shared an example of the Apple developer documentation recently. It consists of just a bare HTML scaffolding, and every single part is filled in at "run time" (aka when the page is opened) by JavaScript: the documentation is retrieved from a server, which obviously will not work when your machine is offline.
So "importing a website for offline use" will probably not do what one would naively expect in many cases nowadays.
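A rough way to tell whether an imported page is such a scaffold is to compare how much visible text it contains with how many scripts it loads. The sketch below uses only Python's standard `html.parser`. The class and function names and the 200-character threshold are arbitrary assumptions for this example, so treat the result as a hint, not a reliable test.

```python
from html.parser import HTMLParser

class ScaffoldCheck(HTMLParser):
    """Count <script> tags and visible text characters in an HTML document."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text outside <script>/<style> counts as visible content.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())

def looks_like_js_scaffold(html, min_text_chars=200):
    """Heuristic: the page loads scripts but carries very little text of its own."""
    checker = ScaffoldCheck()
    checker.feed(html)
    checker.close()
    return checker.scripts > 0 and checker.text_chars < min_text_chars

scaffold = ('<html><head><title>Docs</title><script src="app.js"></script></head>'
            '<body><div id="app"></div></body></html>')
print(looks_like_js_scaffold(scaffold))
```

A page flagged this way will most likely show little or nothing offline, no matter how it was imported.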