Handling of IE-saved websites

I have imported the whole digital archive I collected over 12 years, and so far I am happy with what DT does with it. But there is a problem: for most of those 12 years I was a Windows user ( :blush: [size=85]I was young and needed the money…[/size]), so a lot of what I collected from the net was saved with Internet Explorer.

Explanation: Internet Explorer used to save a webpage (I’m not up to date on how it’s handled today) as the HTML file plus a subfolder containing all images etc.

So, the typical structure for a webpage named “Important Information” would be:
FILE Important Information.htm
FOLDER Important Information-Files

  • whitespace.gif
  • next_button.gif
  • photo1.jpg
  • etc. whatever…
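
To get an overview of how many pages follow this layout, the pairing of HTML file and companion folder can be detected automatically. The following is only a minimal Python sketch: the function name `find_ie_pages` and the constant `FOLDER_SUFFIXES` are made up for illustration, and the `_files` suffix is an assumed alternative to the `-Files` naming described above.

```python
from pathlib import Path

# Assumed companion-folder suffixes: "-Files" as in the example above,
# plus "_files" as a possible variant. Adjust to match the real archive.
FOLDER_SUFFIXES = ("-Files", "_files")


def find_ie_pages(root):
    """Return (html_file, resource_folder) pairs found anywhere below root."""
    pairs = []
    for html in sorted(Path(root).rglob("*")):
        # Only regular files ending in .htm or .html are candidates.
        if not html.is_file() or html.suffix.lower() not in (".htm", ".html"):
            continue
        for suffix in FOLDER_SUFFIXES:
            # "Important Information.htm" -> "Important Information-Files"
            folder = html.with_name(html.stem + suffix)
            if folder.is_dir():
                pairs.append((html, folder))
                break
    return pairs
```

Calling `find_ie_pages("/path/to/exported/archive")` (a placeholder path) returns the list of pairs, so `len(...)` shows how many pages are affected before anything is converted.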

Because I just imported this stuff into my DT database, I have a lot of subfolders now. Not nice, but the real problem is that DT does not display these webpages correctly - I suppose because the HTML-file looks for images in a subfolder that is not really there. And because images sometimes carry information, I have to do something about it.

So. The clean solution would be to use only web-archives. When I open the original file with Safari, everything works fine, and I could save the thing as a web-archive. But I have about 220 categories, around 60,000 files… It’s impossible to do all this by hand!

I stared at Automator until blood dripped from my forehead, but I found no way to set up some kind of automatic conversion process. (Every time I give Automator a try, the actions I’d need are missing… :confused: )

I have zero experience in AppleScript; I can only imagine that it must be possible to let the computer do this by itself: take an old IE-saved webpage and replace it with a web-archive.

Any hints, any comments welcome!

P.S.: Of course this does not have to be done in DT itself. I’d export everything, do the conversion and recreate the database from scratch by importing everything.

Indexing instead of importing these files/folders should fix the problem. Or you could convert selected items (with a valid URL) to web archives using the script posted in the thread viewtopic.php?f=2&t=7535.
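
If the conversion is done outside DT on the exported files, a batch run could look roughly like the Python sketch below. It is not the script from the linked thread. It assumes the macOS command-line tool `textutil` and its `-convert webarchive` and `-output` options; whether the resulting archives embed the images from the companion folders should be checked on a few pages first. The names `webarchive_command` and `convert_tree` are made up for illustration.

```python
import os
import subprocess


def webarchive_command(html_path):
    """Build the textutil call that writes a .webarchive next to html_path."""
    target = os.path.splitext(html_path)[0] + ".webarchive"
    return ["textutil", "-convert", "webarchive", "-output", target, html_path]


def convert_tree(root, dry_run=True):
    """Collect one command per .htm/.html file under root.

    With dry_run=True (the default) nothing is executed; the commands
    are only returned so they can be inspected first.
    """
    commands = []
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith((".htm", ".html")):
                commands.append(webarchive_command(os.path.join(folder, name)))
    if not dry_run:
        for command in commands:
            # Assumption: runs on macOS, where textutil is available.
            subprocess.run(command, check=True)
    return commands
```

Running `convert_tree("/path/to/exported/archive")` (a placeholder path) only lists what would be done; passing `dry_run=False` performs the conversion. The original `.htm` files and their folders are left untouched, so they can be removed once the web-archives have been checked.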