My family has stored boxes of old family documents. We would like to digitise, organise and then potentially tease some stories out of this shared history.
Most of it is handwritten and so we have decided that taking photos, rather than scanning, would be best for future-proofing our work. For example, handwriting recognition might one day be an option we could apply to the existing images.
It seems likely that each document will be stored in RAW file format. It also seems likely that there will be JPEG versions of each document created for viewing by the various members of the family as we sort through it all. There seems to be the potential for TIFF versions as well, perhaps even multiple TIFF versions, if a piece of content is used in multiple places later.
I’m wondering how I can keep these versions of each document associated with each other. I would be particularly pleased if metadata (of a file, or associated with a file?) was also associated with the other files. For example, if a member of the family identifies the date a document was written in the JPEG version, can I apply or associate this with the original RAW file too?
So far, at a file system level, it seems that the only way would be via filename. This seems prone to breaking.
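To make the filename approach concrete, here is a minimal Python sketch. It assumes a purely hypothetical naming convention: every version of a document starts with the same document ID, and any variant label (such as a second TIFF) follows a double underscore. The function names and the JSON sidecar idea are illustrative assumptions, not a feature of any particular tool.

```python
from collections import defaultdict
from pathlib import Path

SEPARATOR = "__"  # hypothetical: "0001__book.tif" is a variant of document "0001"


def document_id(path):
    """Return the shared document ID encoded at the start of the filename."""
    return Path(path).stem.split(SEPARATOR, 1)[0]


def group_versions(paths):
    """Group filenames by document ID so RAW, JPEG and TIFF versions sit together."""
    groups = defaultdict(list)
    for p in paths:
        groups[document_id(p)].append(Path(p).name)
    return {doc: sorted(names) for doc, names in groups.items()}


def sidecar_name(path):
    """Name of one shared metadata file per document (an assumed convention)."""
    return document_id(path) + ".json"


if __name__ == "__main__":
    files = ["0001.dng", "0001.jpg", "0001__book.tif", "0002.dng"]
    print(group_versions(files))
    print(sidecar_name("0001__book.tif"))
```

With this layout, a date identified while looking at the JPEG would be written to the shared sidecar rather than to the JPEG itself, so it applies to the RAW and TIFF versions as well. It still breaks if someone renames a file by hand, which is the weakness mentioned above.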
Does anyone have any suggestions, or references, that might help me figure out how to deal with this workflow, be it inside or outside of DT3? I feel that this is a problem that must have been addressed many times before, and not just inside the proxy workflows of video editing.
I guess I could enclose associated files in their own folder, and then provide hard links to files for users. Perhaps something like Hazel could help me do this. Doesn’t seem like the right way to go though.
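For reference, the hard-link idea could be sketched in Python roughly as follows. The function name and the "viewing folder" layout are assumptions made for illustration; `os.link` is the standard library call that creates a hard link, and it only works when both paths are on the same volume.

```python
import os
from pathlib import Path


def link_into_view(source, view_dir):
    """Hard-link `source` into `view_dir` and return the path of the link.

    Both names point at the same data on disk, so no copy is made.
    Assumes source and view_dir live on the same file system.
    """
    source = Path(source)
    view_dir = Path(view_dir)
    view_dir.mkdir(parents=True, exist_ok=True)
    target = view_dir / source.name
    if not target.exists():
        os.link(source, target)
    return target
```

One caveat with this approach: some applications save by writing a new file and replacing the old one, which silently breaks the link between the two names.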
Yes, I would use tags and place the files in a hierarchy of time-stamped folders based on date or origin, ex:
That’s how I keep my raw photos (since the Aperture days) as an external file system. But I don’t need down to a day, rather 2018/2018-10-02/, 2018/2018-10-03/ and so on, as I don’t get up to 365 folders per year.
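As a rough illustration of that year/date layout, a small Python helper could build the folder path from a date. The function name is hypothetical and the layout simply mirrors the 2018/2018-10-02/ pattern mentioned above.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_folder(d):
    """Return a relative folder path of the form YYYY/YYYY-MM-DD for date d."""
    return PurePosixPath(f"{d.year:04d}", d.isoformat())


if __name__ == "__main__":
    print(dated_folder(date(2018, 10, 2)))  # 2018/2018-10-02
```

For old family papers, the date used here would presumably be the date the document was written or photographed, whichever the family decides to organise by.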
Normal OCR software is not very good at handwriting recognition, but if you generate a PDF and upload it to OneDrive or Google Drive, they have a service that OCRs those kinds of documents and, at least while the PDF is in their cloud, you can search inside handwritten documents. Supposedly, Apple Notes does the same but I’ve not tested it.
The main problem here is that the moment you get the PDF back, the OCR is gone. They do it internally. GoodNotes has handwriting OCR as well. You can import a PDF into GoodNotes, then OCR it and then export it.