I have hundreds of hours of documentaries and lecture videos in DT3, and I also have the text transcriptions of each video with timestamps included in the text. I’ve tried pasting these transcriptions into Finder Comments, which does make them searchable, but the search only brings up the video in the list of search results… it doesn’t actually take me to the location within the text and highlight the search term like DT does when searching documents.
While it’s great to be able to locate any video by “searching the dialogue” of the video, I still have to then scroll through the Finder Comments (or copy & paste the text externally and search in another app) to find the timestamp where that word or phrase actually appears in the video.
Is there another place I could put the transcripts that would allow search results to be highlighted, or am I trying to use DT for something it wasn’t designed to do?
Put the transcript text in a plain text file inside DT.
Set the url field of the plain text file to the item link of its corresponding video file.
UPDATE:
The following JXA script (a product of my coding exercise) quickly opens the video at the desired timestamp, provided that the URL of the transcript file is the item link of the video.
Usage: Select the timestamp text (formatted as either mm:ss or HH:mm:ss) in the transcript file, and then run this script.
If the selected text contains more than one valid timestamp, only the first is used. This allows selecting the timestamp by triple-clicking its line.
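(() => {
  const app = Application('DEVONthink 3');
  app.includeStandardAdditions = true;

  // Text currently selected in the frontmost window (the transcript).
  const s = app.thinkWindows[0].selectedText();

  // Target position in seconds; stays 0 if no timestamp is found.
  let t = 0;
  if (s) {
    // Try HH:mm:ss first, then fall back to mm:ss.
    let timestamp = s.match(/(\d\d):(\d\d):(\d\d)/);
    if (timestamp) {
      t = (+timestamp[3]) + 60 * (+timestamp[2]) + 3600 * (+timestamp[1]);
    } else {
      timestamp = s.match(/(\d\d):(\d\d)/);
      if (timestamp) t = (+timestamp[2]) + 60 * (+timestamp[1]);
    }
  }

  // The transcript's URL field holds the item link of the video.
  const video = app.getRecordWithUuid(app.thinkWindows[0].contentRecord().url());

  // Open the video in a new tab, jump to the timestamp, and bring the tab to the front.
  const videoTab = app.openTabFor({record: video, in: app.thinkWindows[0]});
  videoTab.currentTime = t;
  app.thinkWindows[0].currentTab = videoTab;
})()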
Just keep the transcripts as separate files, that will work much better.
While the comments field doesn’t have the same restrictions as finder comments and can include more text, I think it’s still intended for limited comments. Not full transcripts of media.
I guess you used the comments field because you want the video and transcript to be connected in some way, but there are better ways to go about that.
Item links. Either in the URL field like @meowky mentions or a custom metadata field (for example called “Transcript/Source”).
Annotations. Create an annotation file for the video and paste the transcript into that. This is shown in the inspector below the comments field, but it is also a separate document.
I think there’s also a way to make a pre-existing text document the annotation of another item, but I don’t remember. Might require scripting.
PS: item links can include timestamps. So if the videos are in a format you can play in DEVONthink… It should be possible to convert all your timestamps into clickable links taking you straight to that point in the video. Maybe Find & Replace with regular expressions is enough, maybe a script is necessary to convert the time format.
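A rough sketch of what such a conversion could look like in plain JavaScript. It assumes the item link accepts a `?time=<seconds>` parameter (worth checking against the DEVONthink documentation) and that the transcript is Markdown, so the links become clickable. `linkifyTimestamps` and the item link in the usage line are made-up placeholders.

```javascript
// Hypothetical helper: wraps every hh:mm:ss or mm:ss timestamp in a Markdown link
// that points at the video's item link plus an assumed "?time=<seconds>" parameter.
function linkifyTimestamps(transcript, itemLink) {
  return transcript.replace(/\b(?:(\d\d):)?(\d\d):(\d\d)\b/g, (whole, hh, mm, ss) => {
    // hh is undefined for mm:ss timestamps, so fall back to 0 hours.
    const seconds = +ss + 60 * mm + 3600 * (hh || 0);
    return `[${whole}](${itemLink}?time=${seconds})`;
  });
}

// Usage with a placeholder item link:
const linked = linkifyTimestamps('00:01:15 Hello', 'x-devonthink-item://ABC');
// linked === '[00:01:15](x-devonthink-item://ABC?time=75) Hello'
```

Whether Find & Replace alone can do this depends on whether the link needs seconds; the callback above is what does the time conversion.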
… Looks like meowky already figured something out!
It uses only one match and processes the whole match instead of capturing groups (i.e. either hh:mm:ss or mm:ss)
split(':') converts the match to an Array, which reverse() then puts in reverse order
Now we have an Array with [ss, mm] or [ss,mm,hh]
The final line calculates a time value in seconds from that. If no hh entry (reversedTimestamp[2]) exists, it’s replaced by 0.
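The steps above can be sketched roughly as follows. This is a reconstruction from the description, not the original poster's code; the function name `timestampToSeconds` is hypothetical, and it returns 0 when nothing matches, like the `t = 0` default in the first script.

```javascript
// Hypothetical helper: seconds for the first hh:mm:ss or mm:ss timestamp in text, else 0.
function timestampToSeconds(text) {
  // One regular expression, no capturing groups: the seconds part is optional.
  const match = text.match(/\d\d:\d\d(?::\d\d)?/);
  if (!match) return 0;

  // "01:02:03" -> ["01", "02", "03"] -> ["03", "02", "01"], i.e. [ss, mm, hh]
  const reversedTimestamp = match[0].split(':').reverse();

  // A missing hh entry (mm:ss input) is replaced by 0.
  return +reversedTimestamp[0]
    + 60 * reversedTimestamp[1]
    + 3600 * (reversedTimestamp[2] || 0);
}

timestampToSeconds('01:02:03'); // 3723
timestampToSeconds('12:34');    // 754
```

Reversing the array means seconds and minutes always sit at index 0 and 1, whichever of the two formats was matched.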
Caveats: the RE is not very robust, it assumes that time stamp components are always specified as two digit values and that the separator is always a colon. I have no idea how probable three- or one-digit hours are, though.
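Remark: You do not need to apply + to every string in your calculation of t. Doing that for the first one (the seconds) suffices: the other strings are multiplied, and * converts its operands to numbers anyway, so the whole expression becomes a numeric one. Only the bare seconds term would otherwise be concatenated as a string.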
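A quick way to see this in plain JavaScript (the variable names are just for illustration):

```javascript
// Components as strings, as they come out of a regex match or split(':').
const [hh, mm, ss] = '01:02:03'.split(':');

// Unary + on the seconds only: the multiplications coerce mm and hh to numbers.
const withUnaryPlus = +ss + 60 * mm + 3600 * hh;   // 3723 (a number)

// Without it, "03" + 120 is string concatenation, and so is everything after.
const withoutUnaryPlus = ss + 60 * mm + 3600 * hh; // "031203600" (a string)
```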
Thanks for the lesson! I was actually waiting for your instructions on that part of the script. I like your solution. Beautifully clean.
HH:mm:ss is the timestamp format for SRT files. I omitted the milliseconds. My impression is that SRT is the most widely used format for subtitles, and I haven’t actually seen a subtitle file in another format. Therefore I decided not to consider alternative timestamps.
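Should the milliseconds ever be wanted, a minimal sketch for a full SRT-style timestamp (HH:mm:ss,mmm) might look like this. `srtTimeToSeconds` is a hypothetical name, and accepting a dot as well as a comma before the milliseconds is an assumption made for leniency, not part of the scripts above.

```javascript
// Hypothetical helper: converts the first "HH:mm:ss,mmm" timestamp found in a
// string to seconds (with a fractional part), or returns null if there is none.
function srtTimeToSeconds(stamp) {
  const m = stamp.match(/(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})/);
  if (!m) return null;
  const [, hh, mm, ss, ms] = m;
  // The * and / operators coerce the strings; the bare seconds need unary +.
  return 3600 * hh + 60 * mm + +ss + ms / 1000;
}

srtTimeToSeconds('00:01:15,500 --> 00:01:18,000'); // 75.5
```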
Thank you again for sharing this script. Had no idea doing this was even possible. I’ve been trying it out all evening on some very long transcripts, and this is going to save me HOURS of research time! I’m very grateful.