As a historian, I’m offering my experience with using DT and Gemini in the hopes it will benefit fellow historians.
I understand that the concept of DT was the idea of a historian back in 2004, if I recall correctly. It has obviously gotten way more sophisticated and useful since then. I wish I had known of the program 5 years ago - I simply could not do my work without it.
I have many thousands of newspaper clippings, mostly from newspapers.com but from other sources too. The clippings from other sources are PNG files that need more processing than the ones from newspapers.com. I was recently encouraged to try Gemini to transcribe my clippings. I got off to a rough start with Gemini before I found a prompt that works well. My workflow, though it has a lot of steps, is pretty smooth now. I’ve learned a lot about using DT and Gemini and would like to offer the following suggestions:
Do not make long clippings - try to stay around 6 to 8 column inches.
While you can create a Gemini prompt that will transcribe multiple columns across multiple pages, I haven’t found it worth the time to do so. It is easier just to clip single columns and use the wonderful DT feature of combining and deleting multiple pages. That feature alone is worth the price of DT!
When processing multiple files, it is best to try to stay under a 2.5 MB threshold for file size. Gemini seems to “choke” on files larger than that, and the output frequently requires a lot of editing. If you think the file you are clipping will be very large, combine and delete only a subset of the files. Once you have processed the file with Gemini, you can then combine all of the files.
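If you have a lot of files to process, a short script can flag the ones over the threshold before you send anything to Gemini. This is only a minimal Python sketch, not part of DT or Gemini: the folder name "clippings", the list of file extensions, and treating 2.5 MB as 2,500,000 bytes are all assumptions you would adjust.

```python
from pathlib import Path

# Assumption: roughly 2.5 MB, taken as decimal megabytes. This is a rule of
# thumb from experience, not a documented Gemini limit.
SIZE_LIMIT_BYTES = 2_500_000


def oversized_files(folder, limit=SIZE_LIMIT_BYTES, suffixes=(".png", ".pdf", ".jpg")):
    """Return (file name, size in bytes) for each file in folder larger than limit."""
    flagged = []
    for path in sorted(Path(folder).iterdir()):
        # Only look at regular files with one of the chosen extensions.
        if path.is_file() and path.suffix.lower() in suffixes:
            size = path.stat().st_size
            if size > limit:
                flagged.append((path.name, size))
    return flagged


if __name__ == "__main__":
    # "clippings" is a placeholder folder name; nothing happens if it is missing.
    folder = Path("clippings")
    if folder.is_dir():
        for name, size in oversized_files(folder):
            print(f"{name}: {size / 1_000_000:.2f} MB")
```

Anything the script lists is a candidate for combining in smaller subsets first.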
Hallucinations: While Gemini is likely the best AI engine out there for transcribing old newspaper clippings, it isn’t perfect. I’ve read it is a little bit over 98% accurate, which I think is correct. But it DOES hallucinate and here are the situations I’ve found where it does so frequently:
The file is too large.
The text is fuzzy and out of focus.
The text is very faint.
In these cases, it is useful to go back and cut the file into smaller chunks.
For those who think OCR is the best way to create a searchable PDF from difficult-to-read images, think again. OCR is essential - I run all of my clippings through DT’s built-in OCR engine first. But it is not the last step. OCR mangles words, and on top of that there are all of the words that the reporter or typesetter mangled in the original. Once you see what Gemini produces from an OCR file, you will wonder how you were able to find anything at all in your database! Even so, it is essential that you go over the output of Gemini - it is not perfect!
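One possible way to focus that review is to compare the OCR text with Gemini's transcription and look only at the places where they disagree. The sketch below is an assumption about how you might do that with Python's standard difflib module; the function name differing_words is made up for illustration, and it is not a feature of DT or Gemini. A disagreement may be an OCR error that Gemini fixed or a hallucination, so each one still needs a human eye.

```python
import difflib


def differing_words(ocr_text, gemini_text):
    """Return (ocr_words, gemini_words) pairs for every stretch where the texts differ."""
    ocr_words = ocr_text.split()
    gemini_words = gemini_text.split()
    # Compare word by word rather than character by character.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=gemini_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ocr_words[i1:i2]), " ".join(gemini_words[j1:j2]))
            )
    return differences


if __name__ == "__main__":
    # Made-up sample text, only to show the shape of the result.
    ocr = "The mayor spoke at tbe meeting"
    gemini = "The mayor spoke at the meeting"
    for ocr_part, gemini_part in differing_words(ocr, gemini):
        print(f"OCR: {ocr_part!r}  Gemini: {gemini_part!r}")
```

Passages that agree in both versions are not necessarily right, so this narrows the proofreading rather than replacing it.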
I create an annotation file for each clipping because annotation files are searchable and also enable you to create links between the clippings and other sources of information in your database.
One thing I’ve discovered about creating annotation files: After clicking the “Export to Docs” option in Gemini (at the bottom of the output document), open the exported document and save it to Microsoft Word, which is the first option. Selecting another option will result in a file that does not span the full width of DT’s view pane.
Happy transcribing!