Frustrated with slow loading HTML pages

sgmiller · December 11, 2006, 12:03pm

This is a continuation of a previous post.

I am in the process of going through about 1500 web pages which I imported in a batch from DA. As we know, the only format for doing this seems to be HTML. My problem is that HTML pages are relatively slow to load in DT, and when you have to process that many pages, a few seconds each time can really add up.

So, I would like to either import these as Web Archives or batch convert them to Web Archives. I understand that might take some time, but I don’t mind letting the machine run unattended to allow that to happen. However, from the lack of responses to my post on this, I assume it is not possible to do something like this at this time.

Failing some kind of batch import/convert to webarchive, is there anything else I can do to speed up the loading of pages? I would swear that it sometimes takes longer than just loading a web page in my browser, particularly when I am getting spinning balls when I delete or move pages (I am going to try rebuilding the database to see if that fixes the spinning ball issue).

cgrunenberg · December 11, 2006, 12:57pm

You might have a look at the scripts posted in this thread:
http://www.devon-technologies.com/phpBB2/viewtopic.php?t=2475

sgmiller · December 11, 2006, 1:03pm

Looks great but will the script work with HTML pages as well as bookmarks?

cgrunenberg · December 11, 2006, 1:06pm

No. But just replace “is link” with “is html”.

sgmiller · December 11, 2006, 1:12pm

Ok, fantastic…I will try it out later.

sgmiller · December 11, 2006, 1:36pm

Ok, I am trying the script now and it crashed before I finished. Is this related to not being able to find a site? What exactly is supposed to happen in that case?

sgmiller · December 11, 2006, 1:37pm

Also, I notice that it now says “God” under my forum name. I assume this is a promotion of some sort?

sgmiller · December 11, 2006, 1:48pm

Ok, here is what seems to be happening. The script is choking on particular pages (no idea why), then creating a 0 byte HTML record, and then crashing. It’s not related to unavailable pages because the normal, single convert to webarchive works.

I noticed a report about this on the forum page where the script was located, and it said that this was supposed to have been fixed?

cgrunenberg · December 11, 2006, 2:03pm

This should indeed be fixed. Which version of DT Pro are you using?

sgmiller · December 11, 2006, 2:19pm

1.2.1

cgrunenberg · December 11, 2006, 2:22pm

Could you send a crash log (see folder ~/Library/Logs/CrashReporter) to our support? Then I could check if it’s still the old issue or “just” another WebKit issue. Thanks!

sgmiller · December 11, 2006, 2:26pm

Only the script is “crashing” in that it just stops. I don’t see a Crash Report in the folder that is relevant.

cgrunenberg · December 11, 2006, 2:28pm

Each download of a web archive might last up to 60 seconds (the internal timeout) if the server is not responding or very slow.

sgmiller · December 11, 2006, 2:30pm

I don’t think that is the case since I can quickly create a webarchive using the context menu command. This seems to be the same problem as reported earlier.

sgmiller · December 11, 2006, 2:35pm

Hmm, maybe that could be it because the last choke was on a slow-loading page.

Is there a way to modify the script to move on in case of a timeout rather than just stopping? I could go back later, look for 0 byte files, and try to recreate the archives for those. As it is, it makes a batch convert kind of difficult, as I have had to restart the process 5 times already.
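
To make the idea concrete, here is a rough sketch of the "skip and retry later" behaviour I have in mind. It is written in Python only for illustration and is not the script from the linked thread or any DT scripting call; convert_all and fake_convert are made-up names, and fake_convert just stands in for whatever actually downloads the web archive.

```python
from typing import Callable, Iterable, List, Tuple


def convert_all(
    urls: Iterable[str], convert: Callable[[str], bytes]
) -> Tuple[List[str], List[str]]:
    """Convert every URL, skipping failures instead of stopping the batch."""
    done: List[str] = []
    failed: List[str] = []
    for url in urls:
        try:
            data = convert(url)
        except Exception:
            # A timeout or any other error: remember the page and move on.
            failed.append(url)
            continue
        if data:
            done.append(url)
        else:
            # An empty result is treated like the 0 byte records above.
            failed.append(url)
    return done, failed


def fake_convert(url: str) -> bytes:
    """Hypothetical stand-in for the real conversion step."""
    if url == "slow":
        raise TimeoutError("server did not respond in time")
    if url == "empty":
        return b""
    return b"archive:" + url.encode()


done, failed = convert_all(["a", "slow", "empty", "b"], fake_convert)
print("converted:", done)
print("retry later:", failed)
```

The point is only that a failed page goes onto a "retry later" list and the loop keeps going, so one bad page does not force a restart of the whole batch.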

sgmiller · December 11, 2006, 2:52pm

Tested again and I don’t think loading time is the issue.