Frustrated with slow-loading HTML pages

This is a continuation of a previous post.

I am in the process of going through about 1500 web pages which I imported in a batch from DA. As we know, the only format for doing this seems to be HTML. My problem is that HTML pages are relatively slow to load in DT, and when you have to process that many pages, a few seconds each time can really add up.

So, I would like to either import these as Web Archives or batch convert them to Web Archives. I understand that might take some time, but I don’t mind letting the machine run unattended to allow that to happen. However, from the lack of responses to my post on this, I assume it is not possible to do something like this at this time.

Failing some kind of batch import/convert to webarchive, is there anything else I can do to speed up the loading of pages? I would swear that it sometimes takes longer than just loading a web page in my browser, particularly when I am getting spinning balls when I delete or move pages (I am going to try rebuilding the database to see if that fixes the spinning ball issue).

You might have a look at the scripts posted in this thread:

Looks great but will the script work with HTML pages as well as bookmarks?

No. But just replace “is link” with “is html”.

Ok, fantastic…I will try it out later.

Ok, I am trying the script now and it crashed before it finished. Is this related to not being able to find a site? What exactly is supposed to happen in that case?

Ok, here is what seems to be happening. The script is choking on particular pages (no idea why), then creating a 0 byte HTML record, and then crashing. It’s not related to unavailable pages because the normal, single convert to webarchive works.

I noticed a report about this on the forum page where the script was located, and it said that this was supposed to have been fixed?

This should indeed be fixed. Which version of DT Pro are you using?


Could you send a crash log (see folder ~/Library/Logs/CrashReporter) to our support? Then I could check if it’s still the old issue or “just” another WebKit issue. Thanks!

Only the script is “crashing” in that it just stops. I don’t see a Crash Report in the folder that is relevant.

Each download of a web archive might last up to 60 seconds (the internal timeout) if the server is not responding or very slow.

I don’t think that is the case since I can quickly create a webarchive using the context menu command. This seems to be the same problem as reported earlier.

Hmm, maybe that could be it, because the last choke was on a slow-loading page.

Is there a way to modify the script to move on in case of a timeout rather than just stopping? I could go back later, look for 0 byte files, and try to recreate the archives for those. As it is, it makes a batch convert kind of difficult, as I have had to restart the process 5 times already.
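
To illustrate the "move on and retry later" idea, here is a minimal sketch in Python. It is not the DEVONthink script from the thread and does not call any DEVONthink API. The names `convert_all` and `fake_convert` are made up for illustration, and the real conversion step is assumed to be something that either raises an error on a timeout or returns empty data, like the 0 byte records mentioned above.

```python
from typing import Callable, Iterable, List, Tuple


def convert_all(
    urls: Iterable[str],
    convert: Callable[[str], bytes],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try every URL; collect failures instead of stopping at the first one."""
    converted: List[str] = []
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            data = convert(url)  # hypothetical conversion step
        except Exception as exc:  # e.g. a timeout raised by the converter
            failed.append((url, repr(exc)))
            continue
        if not data:
            # Treat an empty result like a 0 byte record: note it and move on.
            failed.append((url, "empty result"))
            continue
        converted.append(url)
    return converted, failed


def fake_convert(url: str) -> bytes:
    """Stand-in converter used only to exercise the loop."""
    if "slow" in url:
        raise TimeoutError("no response")
    if "empty" in url:
        return b""
    return b"archive data"


if __name__ == "__main__":
    done, retry = convert_all(
        ["http://example.com/a", "http://example.com/slow", "http://example.com/empty"],
        fake_convert,
    )
    print(done)                         # pages that converted
    print([url for url, _ in retry])    # pages to try again later
```

The point of the design is that one bad page only adds an entry to the retry list, so the batch does not have to be restarted from the beginning each time.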

Tested again and I don’t think loading time is the issue.