How to create a database from historical documents?

I’m currently reading a large microfilm collection, and I thought this would be the ideal time to start using DT to do more than just index my Documents folder.

My old way of doing things was to read through the reel of microfilm and put all my notes into one long Word document, titled something like “Painfully long microfilm collection (PLMC), reel 1 of 60.”

I started by creating a new RTF file for each document on the reel, but after a while I began to worry about how I would recreate the correct reel order if the files got re-sorted. So I went back and retitled all the documents like this: “PLMC:01, 001 1750-01-01 Letter to Dead Guy”

PLMC is the name of the microfilm collection, 01 the reel, 001 means it was the first document on the reel, 1750-01-01 the date, then the title of the document. The documents on the film are not numbered in any way, so I was forced to create my own system.
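For what it's worth, a naming scheme like the one just described can be checked and sorted mechanically. Here is a minimal Python sketch, assuming every title follows the exact layout of the example above; the names `ReelEntry`, `parse_name`, `format_name`, and `reel_order` are made up for illustration and have nothing to do with DT itself.

```python
import re
from typing import NamedTuple


class ReelEntry(NamedTuple):
    collection: str  # e.g. "PLMC"
    reel: int        # reel number within the collection
    position: int    # position of the document on the reel
    date: str        # document date, kept as text (YYYY-MM-DD)
    title: str       # free-text title


# Assumed layout: "PLMC:01, 001 1750-01-01 Letter to Dead Guy"
NAME_PATTERN = re.compile(
    r"^(?P<collection>[A-Z]+):(?P<reel>\d+), (?P<position>\d+) "
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<title>.+)$"
)


def parse_name(name: str) -> ReelEntry:
    """Split a title into its parts, or raise if it does not fit the scheme."""
    m = NAME_PATTERN.match(name)
    if m is None:
        raise ValueError(f"name does not follow the scheme: {name!r}")
    return ReelEntry(
        m["collection"], int(m["reel"]), int(m["position"]), m["date"], m["title"]
    )


def format_name(entry: ReelEntry) -> str:
    """Rebuild the title, zero-padding the reel to 2 digits and the position to 3."""
    return (
        f"{entry.collection}:{entry.reel:02d}, {entry.position:03d} "
        f"{entry.date} {entry.title}"
    )


def reel_order(names):
    """Sort titles by collection, then reel, then position on the reel."""
    return sorted(names, key=lambda n: parse_name(n)[:3])
```

Sorting on the parsed numbers rather than on the raw text means the order survives even if some titles were padded inconsistently.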

This would cover my bases, but then it became apparent that I had to create an RTF for every single document on the reel, even the ones I don’t care about. I thought about just numbering the documents I took notes on, but then what if I want to go back and take notes on some that I skipped?

How can I preserve the order on the reel without all of this pain-in-the-butt numbering and creating of empty files? Any ideas?

I don’t think there’s a good solution to your problem. My wife is an acquisitions librarian (currently, changing next week) and she gets to screw around with microfilm all the time and hates it.

I would suggest creating all of the empty files. That might sound crazy, but I’ll explain a little bit more:

I have a very dense folder hierarchy: 1,571 groups. Each and every document has a name like “VI.A.03.b.i::0023. [[Journal]] It’s amazing what one can find in one’s belly button.” I do that so that I can create backups more simply and elegantly. After all, I don’t really need to back up the groups, I just need the documents. With the complete folder hierarchy and each file’s position within its group encoded in the file name, it’s a simple matter to reconstruct anything I need to reconstruct without having to traverse insane hierarchies in the Finder.
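To make the reconstruction idea concrete, here is a rough Python sketch that rebuilds the grouping from the names alone. It rests on my reading of the example name: the part before “::” is the group path with levels separated by dots, the digits after it are the position within the group, and the double-bracketed word is some kind of type tag. Those are assumptions, and `HIERARCHY_PATTERN`, `split_name`, and `rebuild` are illustrative names, not part of any real tool.

```python
import re

# Assumed layout: "VI.A.03.b.i::0023. [[Journal]] Some title"
HIERARCHY_PATTERN = re.compile(
    r"^(?P<path>[^:]+)::(?P<position>\d+)\. "
    r"\[\[(?P<kind>[^\]]+)\]\] (?P<title>.+)$"
)


def split_name(name):
    """Return (group levels, position, kind tag, title) for one document name."""
    m = HIERARCHY_PATTERN.match(name)
    if m is None:
        raise ValueError(f"name does not follow the scheme: {name!r}")
    return m["path"].split("."), int(m["position"]), m["kind"], m["title"]


def rebuild(names):
    """Map each group path to its document titles, ordered by position."""
    tree = {}
    for name in names:
        levels, position, _kind, title = split_name(name)
        tree.setdefault("/".join(levels), []).append((position, title))
    return {
        path: [title for _position, title in sorted(items)]
        for path, items in sorted(tree.items())
    }
```

Given a flat pile of documents, `rebuild` returns something like `{"VI/A/03/b/i": [...titles in order...]}`, which is enough to recreate the groups by hand or by script.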

So you might consider creating all of the empty files and throwing them in a “Lost & Found” or “Empty Files” or some other kind of group where they won’t clutter up your working groups. Of course, it only works if they have a suitably descriptive name, but it sounds like you are working like that already.

Alternately, you could do what I did and skip the things you don’t want to work with right now. The nightmare would be that you’d work with documents 1, 237, and 350-423 of a series and number them 1-76 or something. Then you go back and want to do documents 2-10, which involves renaming about 75 documents. Brutal, right?

Well, as I said above, my files are named like “VI.A.03.b.i::0023. [[Journal]] It’s amazing what one can find in one’s belly button.” I wrote three scripts, or really one script plus two variations. Each grabs the “0023” part and either a) adds 1 to that number, b) subtracts 1 from it, or c) asks for a number to add or subtract. So in the above example you’d select the “0002” document (which should really be “0237”), run the third script, and tell it to add 235. Then you’d select documents “0003” through “0076” and tell it to add 347. And in a flash, all the documents are numbered properly.
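The scripts themselves aren't posted here, so the following is only a sketch of the renumbering logic, written in Python rather than AppleScript. It assumes the number is the run of digits between “::” and the next period, and that the padding width should be preserved; `shift_number` is a hypothetical name.

```python
import re

# Assumed: the sequence number sits between "::" and the following "."
NUMBER_PATTERN = re.compile(r"::(\d+)\.")


def shift_number(name: str, offset: int) -> str:
    """Add offset (which may be negative) to the sequence number in a name."""

    def bump(match):
        digits = match.group(1)
        new_value = int(digits) + offset
        if new_value < 0:
            raise ValueError("offset would make the number negative")
        width = len(digits)  # keep the original zero-padding width
        return f"::{new_value:0{width}d}."

    new_name, count = NUMBER_PATTERN.subn(bump, name, count=1)
    if count == 0:
        raise ValueError(f"no sequence number found in: {name!r}")
    return new_name
```

With this, `shift_number("VI.A.03.b.i::0002. [[Journal]] Title", 235)` gives the “0237” name, and an offset of 347 turns “0003” into “0350”. Adding or subtracting 1 is just the same call with an offset of 1 or -1.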

Those probably don’t answer your actual question, but maybe they’ll give you an idea of how to handle the shortcomings of whichever route you decide to take. In general, I think it’s best to pick the method whose main shortcoming is a lot of repetitive, simple, mindless tasks, because an AppleScript can then be written to take care of them.