Searching Closed Captions in Video Files

I’m sure there are other workarounds, but it’d be awesome if there were a way for DEVONthink to index and search embedded closed captions in .mp4 or .mov files. I like to save YouTube videos, etc., that sometimes have embedded closed captions to my database for later reference, and it’d be cool if all the content inside the video became part of DEVONthink’s database.

I’ve never used this, but if you could post a download link, I’ll check whether this might be doable in the future.

Sorry for the delay here. Did you want an example .mp4 with embedded captions? Or just an .srt file?

I too would like to see DT search videos with embedded closed captions. I see that the video player inside DT can enable subtitles and that DT can read SRT files. Attached are both an SRT file with sample captions and an mp4 embedded with those captions.

For the time being, I keep the captions as text inside the mp4 file’s annotation section. The downside is that if I set the search scope to the group containing the mp4, nothing is returned, as annotations are kept in a different location. Searching annotations yields results, but navigating to the video is too much hassle: it requires searching for the video, which clears your search results, and once the video is found, scrolling to the timestamp noted in the SRT file.

It would be very helpful if DT could read an SRT file or an mp4 with embedded captions and auto-generate timestamps (frame links, as they’re called in DT) inside the annotation section of the mp4 along with the captions.
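As a stopgap, something like this could be scripted outside DT: parse the SRT and emit one Markdown line per caption, each prefixed with a timestamped link. A minimal sketch, assuming the `x-devonthink-item://UUID?time=SECONDS` form mirrors what Copy Frame Link produces (the query-parameter format and the UUID below are assumptions, not documented API):

```python
import re

# Hypothetical item UUID -- in practice, paste the one from DT's "Copy Item Link".
ITEM_UUID = "0A1B2C3D-4E5F-6789-ABCD-EF0123456789"

SRT_BLOCK = re.compile(
    r"(\d+)\s*\n"                                    # caption index (unused)
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"   # start time
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*\n"               # end time (ignored)
    r"(.*?)(?:\n\n|\Z)",                             # caption text
    re.S,
)

def srt_to_frame_links(srt_text: str, uuid: str = ITEM_UUID) -> list[str]:
    """Turn each SRT caption into a Markdown line with a timestamped link."""
    lines = []
    for m in SRT_BLOCK.finditer(srt_text.strip()):
        h, mnt, s, ms = int(m.group(2)), int(m.group(3)), int(m.group(4)), int(m.group(5))
        seconds = h * 3600 + mnt * 60 + s + ms / 1000  # offset into the video
        text = " ".join(m.group(6).split())            # collapse multi-line captions
        lines.append(f"[{h:02}:{mnt:02}:{s:02}](x-devonthink-item://{uuid}?time={seconds}) {text}")
    return lines
```

The resulting lines could be pasted into the annotation, giving one clickable link per caption instead of hand-copying hundreds of frame links.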

I understand SRT is a human-readable text file. Would it work if you simply duplicated it and renamed the duplicate’s extension to txt?

I presume the real challenge is with embedded subtitles that have no accompanying SRT file or similar.

Yep, already doing that by copy & pasting the SRT contents into the mp4 annotations.

Yep, text annotations would be more helpful if they were links that DT auto-creates/recognizes from the SRT format. Similar to the way PDFs can be OCR’d with the click of a button, the idea would be to take each caption’s start time and create a DT frame link. When the link is clicked, it would take you to the video, at that specific frame/time.

This can already be done manually by clicking the :gear: button of the video player and selecting Copy Frame Link. However, ain’t nobody got time to be creating frame links for hundreds of individual captions per video when you have hundreds of online course videos. :sweat_smile:


Are there any updates on this? Or did you still need me to upload a sample .mp4 with embedded captions?

This isn’t supported currently and the next release(s) won’t change this yet.

Okay thanks for the update.

This would be amazing.
Currently I use Descript for searching transcripts, and I’ve worked out a somewhat funky way of indexing the Descript folder with DT3, but the process is a bit slow and heavy.

Would be great to see the videos all linked together with the text all native inside DT3.
This post was from 2020, so I thought I’d just say I’m still interested in this, and I think that with all the video material out there, it would be valuable for anybody to be able to identify relationships between videos using DT3’s AI.



It’s great there is willingness from DT to look into this!
– And I would second this, for several reasons (of different natures :slight_smile: ):

One is that I also have a practical need here.

Two, for reasons of principle I have given elsewhere.

Three, I just found that my Vimeo Pro account also allows for automatic captioning, which shows where the train is heading in terms of information work with video and its possibilities.

It would be really, really awesome if one could tap into this universe of augmented information, which already exists in text form.

@jsn: I am also interested in all this, and I do think video – especially where it is augmented by some kind of explicit text info – is (potentially) an elemental source for info-work. And just like you, I would be thrilled if there were ways to ‘plug’ (‘meta-texted’) video into DT’s AI, and generally into the DT system!

– Given that you seem to be working with videos and DT, and given your interest, I would be very curious to hear how you practically go about working with DT and video ‘as is’ – and what your ideas about possible scenarios are. It would be great to hear about your perspective and experiences here.

Would be great to learn from each other!


PS: this invite to share experiences around video & DT also goes to @MosCool_Noel and @joshgibson, obviously.

What I am doing is simple.

1. I save the videos into DT3.
2. I use Descript to edit and perfect the transcripts, at least for the technical terms that are important and often mangled by transcription services.
3. I export those transcripts to the clipboard and paste them as RTF into DT3. This captures my notes and highlights.

I have also found it effective to use DT3 as an indexer for the local Descript folder on my computer. Unfortunately, Descript doesn’t use filenames that match the video titles, but that’s okay because the titles are in the JSON files that contain the transcripts, so if I find keywords, I can locate the video.
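That keyword-to-title lookup can also be scripted directly against the JSON files. A minimal sketch – the `title` and `transcript` field names here are assumptions, since Descript’s actual JSON schema may differ; adjust the keys to match your files:

```python
import json
from pathlib import Path

def find_videos_by_keyword(descript_dir: str, keyword: str,
                           title_key: str = "title",
                           transcript_key: str = "transcript") -> list[str]:
    """Scan every .json file in the folder and return the video titles whose
    transcript contains `keyword` (case-insensitive).
    NOTE: the default field names are hypothetical, not Descript's documented schema."""
    hits = []
    for path in sorted(Path(descript_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, OSError):
            continue  # skip unreadable or non-transcript files
        text = str(data.get(transcript_key, ""))
        if keyword.lower() in text.lower():
            hits.append(data.get(title_key, path.stem))  # fall back to filename
    return hits
```

Running it over the indexed Descript folder would give the matching titles directly, without round-tripping through DT3’s search.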
