Search Video Text via Finder Comments

I have hundreds of hours of documentaries and lecture videos in DT3, and I also have the text transcriptions of each video with timestamps included in the text. I’ve tried pasting these transcriptions into Finder Comments, which does make them searchable, but the search only brings up the video in the list of search results… it doesn’t actually take me to the location within the text and highlight the search term like DT does when searching documents.

While it’s great to be able to locate any video by “searching the dialogue” of the video, I still have to then scroll through the Finder Comments (or copy & paste the text externally and search in another app) to find the timestamp where that word or phrase actually appears in the video.

Is there another place I could put the transcripts that would allow search results to be highlighted, or am I trying to use DT for something it wasn’t designed to do?

Thanks.

Try the following:

  1. Put the transcript text in a plain text file inside DT.
  2. Set the URL field of the plain text file to the item link of its corresponding video file.

UPDATE:

The following JXA script (a product of my coding exercise) quickly opens the video at the desired timestamp, provided that the URL of the transcript file is the item link of the video.

Usage: Select the timestamp text (formatted as either mm:ss or HH:mm:ss) in the transcript file, and then run this script.

  • If the selected text involves multiple legitimate timestamps, only the first will be considered. This allows selecting the timestamp by triple-clicking.

(() => {
  const app = Application('DEVONthink 3');
  app.includeStandardAdditions = true;
  
  const s = app.thinkWindows[0].selectedText();
  let t = 0;
  if (s) {
    let timestamp = s.match(/(\d\d):(\d\d):(\d\d)/);
    if (timestamp) {
      t = (+timestamp[3]) + 60 * (+timestamp[2]) + 3600 * (+timestamp[1]);
    } else {
      timestamp = s.match(/(\d\d):(\d\d)/);
      if (timestamp) t = (+timestamp[2]) + 60 * (+timestamp[1]);
    }
  }
    
  const video = app.getRecordWithUuid(app.thinkWindows[0].contentRecord().url());
  const videoTab = app.openTabFor({record: video, in: app.thinkWindows[0]});
  videoTab.currentTime = t;
  app.thinkWindows[0].currentTab = videoTab;
})()

Tested and worked well on one .srt file.
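
The timestamp parsing in the script can also be tried outside DEVONthink. Here is a minimal sketch in plain JavaScript that mirrors the two matches above; the function name `firstTimestampToSeconds` is made up for illustration and is not part of the script:

```javascript
// Hypothetical helper: returns the seconds value of the first timestamp
// found in `text`, or 0 if there is none (the same fallback the script uses).
function firstTimestampToSeconds(text) {
  if (!text) return 0;
  // Try HH:mm:ss first, so that "00:01:02" is not misread as mm:ss.
  let m = text.match(/(\d\d):(\d\d):(\d\d)/);
  if (m) return 3600 * Number(m[1]) + 60 * Number(m[2]) + Number(m[3]);
  m = text.match(/(\d\d):(\d\d)/);
  if (m) return 60 * Number(m[1]) + Number(m[2]);
  return 0;
}

// A triple-clicked SRT timing line contains two timestamps; only the first counts.
const seconds = firstTimestampToSeconds('00:01:02,500 --> 00:01:05,000'); // 62
```

Because `match` without the `g` flag stops at the first hit, the end time of the cue is simply ignored.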

Just keep the transcripts as separate files, that will work much better.

While the comments field doesn’t have the same restrictions as Finder comments and can hold more text, I think it’s still intended for short comments, not full transcripts of media.

I guess you used the comments field because you want the video and transcript to be connected in some way, but there are better ways to go about that.

  1. Item links. Either in the URL field like @meowky mentions or a custom metadata field (for example called “Transcript/Source”).
  2. Annotations. Create an annotation file for the video and paste the transcript into that. This is shown in the inspector below the comments field, but it is also a separate document.
    • I think there’s also a way to make a pre-existing text document the annotation of another item, but I don’t remember. Might require scripting.

PS: item links can include timestamps. So if the videos are in a format you can play in DEVONthink… It should be possible to convert all your timestamps into clickable links taking you straight to that point in the video. Maybe Find & Replace with regular expressions is enough, maybe a script is necessary to convert the time format.

… Looks like meowky already figured something out!
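
To illustrate the time-format conversion mentioned in the PS, here is a rough sketch in plain JavaScript. It assumes the `?time=<seconds>` parameter that is used on item links further down in this thread; the helper name and the item link are placeholders, not real values:

```javascript
// Hypothetical helper: builds an item link that points at a timestamp.
// Assumption: the link accepts "?time=<seconds>", as used later in this thread.
function timestampLink(itemLink, timestamp) {
  // "01:02:03" -> [3, 2, 1]; for "mm:ss" the missing hours default to 0.
  const [ss, mm, hh = 0] = timestamp.split(':').reverse().map(Number);
  return `${itemLink}?time=${ss + 60 * mm + 3600 * hh}`;
}

// Placeholder item link, not a real record.
const link = timestampLink('x-devonthink-item://EXAMPLE-UUID', '01:02:03');
// link is 'x-devonthink-item://EXAMPLE-UUID?time=3723'
```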

This is awesome. Thank you so much for sharing this!

Never thought of doing it this way. I will try both of these. Thank you!

The double match nagged me. So I came up with the code below instead. Not implying that your solution was bad – I just flinch at repetitive patterns :wink:

  const timestamp = s.match(/\d\d:\d\d(?::\d\d)?/);
  if (timestamp) {
    const reversedTimestamp = timestamp[0].split(':').reverse();
    // Assign to the t declared earlier (let t = 0) so the value is usable after this block.
    t = +reversedTimestamp[0]
        + 60 * reversedTimestamp[1]
        + 3600 * (reversedTimestamp[2] || 0);
  }

It uses only one match and processes the whole match (i.e. either hh:mm:ss or mm:ss) instead of capturing groups.

  • split(':') converts the match to an Array, which reverse() then turns upside down
  • Now we have an Array with [ss, mm] or [ss, mm, hh]
  • The final line calculates a time value in seconds from that. If no hh entry (reversedTimestamp[2]) exists, it’s replaced by 0.

Caveats: the RE is not very robust, it assumes that time stamp components are always specified as two digit values and that the separator is always a colon. I have no idea how probable three- or one-digit hours are, though.
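
If one- or three-digit hours ever do turn up, a more tolerant pattern is not hard to write. This is only a sketch under that assumption, with a made-up function name; it still expects colons as separators and two-digit seconds:

```javascript
// Hypothetical variant: accepts optional hours of 1 to 3 digits and
// minutes of 1 or 2 digits, e.g. "5:07", "1:02:03" or "100:00:00".
function looseTimestampToSeconds(text) {
  const m = text.match(/(?:(\d{1,3}):)?(\d{1,2}):(\d\d)/);
  if (!m) return null; // no timestamp found
  // m[1] is undefined when the hours part is absent, so it defaults to 0.
  const [, hh = 0, mm, ss] = m;
  return 3600 * Number(hh) + 60 * Number(mm) + Number(ss);
}
```

The looser the pattern, the more likely it is to match things that are not timestamps at all, so the strict two-digit version remains the safer choice for SRT files.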

Remark: You do not need to apply + to every string in your calculation of t. Doing that for the first one suffices: the multiplications already convert their string operands to numbers, so only the bare first term would otherwise be concatenated as a string.
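
The effect of the unary + is easy to check in any JavaScript console. A small sketch with made-up sample values:

```javascript
// [ss, mm, hh] as strings, e.g. the result of '01:02:03'.split(':').reverse()
const parts = ['03', '02', '01'];

// Unary + on the first term only; "*" coerces the other strings to numbers.
const numeric = +parts[0] + 60 * parts[1] + 3600 * parts[2];
// numeric is the number 3723

// Without the unary +, the first "+" sees a string and concatenates.
const concatenated = parts[0] + 60 * parts[1] + 3600 * parts[2];
// concatenated is the string '031203600'
```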

Thanks for the lesson! I was actually waiting for your instructions on that part of the script :wink: I like your solution. Beautifully clean.

HH:mm:ss is the timestamp format for SRT files. I omitted the milliseconds. My impression is that SRT is the most widely used format for subtitles, and I haven’t actually seen a subtitle file in another format. Therefore I decided not to consider alternative timestamps.
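
For reference, a full SRT timing line looks like `00:01:02,500 --> 00:01:05,000`. If the milliseconds were ever needed, parsing the whole line could look roughly like this (plain JavaScript, hypothetical function name, not part of my script):

```javascript
// Hypothetical helper: parses an SRT timing line into start/end seconds.
function parseSrtTiming(line) {
  const m = line.match(
    /(\d\d):(\d\d):(\d\d),(\d{3})\s*-->\s*(\d\d):(\d\d):(\d\d),(\d{3})/
  );
  if (!m) return null; // not a timing line
  // "*" and "/" coerce the captured strings; s needs an explicit Number().
  const toSeconds = (h, min, s, ms) => 3600 * h + 60 * min + Number(s) + ms / 1000;
  return {
    start: toSeconds(m[1], m[2], m[3], m[4]),
    end: toSeconds(m[5], m[6], m[7], m[8]),
  };
}

const cue = parseSrtTiming('00:01:02,500 --> 00:01:05,000');
// cue.start is 62.5 and cue.end is 65
```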

I agree that in that case, the RE is just fine. And who wants to watch a video that’s longer than 99 hours, anyway.

When I paste this into Script Editor and compile it, I get the following:

Syntax Error
Expected expression but found “>”.

You must set Script Editor’s language to JavaScript, using the pop-up menu at the top left of the script window.

Wow, I’m a genius. :man_facepalming:t3: Thank you so much. This is an amazing community!!

It is, isn’t it :smiling_face::heart:

Thank you again for sharing this script. Had no idea doing this was even possible. I’ve been trying it out all evening on some very long transcripts, and this is going to save me HOURS of research time! I’m very grateful.

Hello, this looks very promising.

I am new to DEVONthink and Mac in general, so I’m getting the lay of the land here.

Is there an updated version of this code for DEVONthink 4 Pro?

I have MacWhisper which gives me very accurate .srt files.

I wish to have the timestamps not only searchable but clickable so that when you click any timestamp in the .srt file (which currently I have separate to the .mp4 file, linked by URL), it takes you directly to that point in the video and you can play from there.

Would this be possible? I haven’t got a coding background, making an attempt to learn the basics for running scripts etc.

How would I add the script above into my workflow, and is it a one-off addition or does it need to be run every time I import new video + transcript files?

Thanks guys

DEVONthink 4 is able to transcribe audio & video files too, see Settings > AI > Transcribe and menu Data > Recognize. E.g. transcribing to annotations includes clickable timestamps.

Yes, but from what I’ve seen in the ScreenCastsOnline tutorial, the annotations come back in large blocks or paragraphs of a few minutes per timestamp. I want something more granular, down to the second or minute, like what an .srt file offers.

Welcome, @theseamaster

.srt is a plain text file. Plain text has no text styling, so I don’t think that’s possible.

macOS can detect URLs in plain text and make them clickable, but a timestamp is not a URL. I guess you could replace all the timestamps with x-devonthink-item:// links, but that would be unreadable IMO. And it would no longer be a proper SRT.

I think that’s one explanation for approaching the issue like @meowky did with the script.

You would need to run the script every time you want to “open” the timestamp as a link. The script creates an item link on the fly based on the selected text and the URL property of the item, then launches it. It doesn’t actually change any text in the transcript file.

As meowky explained:

I don’t really know JavaScript, but I think it might be enough to change the application name to make it compatible with DT4. That is, instead of
const app = Application('DEVONthink 3');
you would use
const app = Application('DEVONthink');

To add the script to DEVONthink:

  • Open the Script Editor application
  • Create a new document
  • Set the Script Language to JavaScript in the pop-up menu
  • Copy and paste the script code
  • Compile the script (click the Hammer icon or choose Script > Compile in the menu bar)
  • Save it with a name that makes sense to you
  • Move the script file to DEVONthink’s script folder. (See Automation > AppleScript > Installation in the manual or built-in Help).
    You can easily find it by selecting “Open Scripts Folder” from DT’s Script menu (the scroll icon)

Now you can launch the script by selecting it in the Script menu. But for something like this, you probably want to use a keyboard shortcut. (See: How to Assign Shortcuts to Scripts)

If you want clickable links, you need to convert the transcript to a different format like markdown or rich text and then convert the timestamps to actual links.
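
As a rough idea of what the first half of that conversion could look like, here is a sketch in plain JavaScript that flattens SRT cues into Markdown paragraphs, each starting with its start time. The function name is made up, and it assumes well-formed SRT (cue number, timing line, text, blank line). The timestamps stay plain text here; turning them into links is a separate step.

```javascript
// Hypothetical helper: converts SRT text to Markdown paragraphs of the
// form "HH:mm:ss cue text", dropping cue numbers, end times and milliseconds.
function srtToMarkdown(srt) {
  return srt
    .replace(/\r\n/g, '\n')        // normalize Windows line endings
    .trim()
    .split(/\n{2,}/)               // cues are separated by blank lines
    .map(block => {
      const lines = block.split('\n');
      // lines[0] is the cue number, lines[1] the timing line, the rest is text.
      const start = (lines[1] || '').match(/\d\d:\d\d:\d\d/);
      if (!start) return null;     // skip anything that is not a proper cue
      return `${start[0]} ${lines.slice(2).join(' ')}`;
    })
    .filter(Boolean)
    .join('\n\n');
}
```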

Okay. I had a transcript of a podcast episode lying around and converted the SRT to markdown in DEVONthink 3.

I actually managed to write a JavaScript (my first!) that converts the timestamps to item links with the proper time parameter. I used @chrillek’s code for the time format conversion.
It runs on the currently displayed document, but that can of course be changed.

(It only works on markdown documents, and the URL property should still be an item link to the audio/video file.)

(() => {
  const app = Application('DEVONthink'); // for DT3 use Application('DEVONthink 3')
  app.includeStandardAdditions = true;
  
  const r = app.contentRecord;
  if (!r || r.recordType() !== 'markdown') { // for DT3 use r.type()
    app.displayAlert('No markdown document open');
    return;
  }
  const baseURL = r.url();
  if (!baseURL) {
    app.displayAlert('No item link provided!');
    return;
  }
  
  const txt = r.plainText();
  const regex = /\d\d:\d\d:\d\d/g;
  
  const newTxt = txt.replaceAll(regex, timestamp => {
    const reversedTimestamp = timestamp.split(':').reverse();
    const t = +reversedTimestamp[0]
      + 60 * reversedTimestamp[1]
      + 3600 * (reversedTimestamp[2] || 0);
    const timeURL = `[${timestamp}](${baseURL}?time=${t})`;
    return timeURL
  });
  r.plainText = newTxt
})()

All the item links add a lot of text. My test file more than doubled in size from 63,6 KB to 154,2 KB.

It’s much more efficient to use an HTML base element instead of repeating the item link a million times. That way the file size only increased to 81,3 KB. But note that these links only work in markdown preview mode, because the document needs to render for the base element to take effect.

(() => {
  // ...
  const baseTag = `<base href="${baseURL}">`
  
  const txt = r.plainText();
  const regex = /\d\d:\d\d:\d\d/g;
  
  const newTxt = txt.replaceAll(regex, timestamp => {
    const reversedTimestamp = timestamp.split(':').reverse();
    const t = +reversedTimestamp[0]
      + 60 * reversedTimestamp[1]
      + 3600 * (reversedTimestamp[2] || 0);
    const timeURL = `[${timestamp}](?time=${t})`;
    return timeURL
  });
  r.plainText = [baseTag, newTxt].join('\n\n')
})()

I think standard SRT timecodes do include milliseconds? The script above doesn’t account for that. This modification does, though it could probably be more elegant:

  const regex = /\d\d:\d\d:\d\d(?:,\d\d\d)?/g;
  
  const newTxt = txt.replaceAll(regex, fullMatch => {
    const timestamp = fullMatch.match(/\d\d:\d\d:\d\d/)[0];
    const reversedTimestamp = timestamp.split(':').reverse();
    const t = +reversedTimestamp[0]
      + 60 * reversedTimestamp[1]
      + 3600 * (reversedTimestamp[2] || 0);
    const timeURL = `[${timestamp}](${baseURL}?time=${t})`;
    return timeURL
  });

Wow, OK, I will have to test this when I am back in the hot seat. Would the script above be a one-and-done run, or would I have to re-run it manually every single time, as was said above? That wouldn’t be practical, as I have hundreds of videos I will be working with. Thank you for the detailed response. Yes, SRT files do include milliseconds, as seen in the image below. Also, I have the MacWhisper app, which allows multiple export formats, but .srt is the only one with the exact timecodes.

To clarify, I wouldn’t mind adding the script to each video/.srt pair to link them; that would be straightforward enough. What I meant by not being practical is linking every timecode to every point in the video manually using the script above. I gather from what you said that your custom JavaScript can convert the timestamps in the .srt file to item links when the URL property is linked to the media file? That sounds exactly like what I am looking for.