Capturing an entire Reddit thread as markdown?

jmay · January 6, 2021, 8:40pm

I’ve been using the Markdown Clutter-Free option in the DEVONthink extension to capture pages, and it mostly works very well.

On Reddit pages it accurately extracts the main article text, but none of the comments. In some cases I would like to capture an entire thread. Is this something that could be automated entirely within DT? I could write my own command-line script to do this, but I’m hoping someone else has already done all the work and I can be lazy…
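
For reference, a command-line script along those lines might look roughly like the sketch below. It is only a starting point and not something DEVONthink provides. It assumes that appending `.json` to a Reddit thread URL returns a two-element list (the post listing, then the comment listing) with `kind`, `data`, `children`, `author`, `body` and `replies` fields. The function names `fetch_thread`, `render_comment` and `thread_to_markdown` are made up for illustration, and collapsed "more comments" placeholders are simply skipped.

```python
import html
import json
import urllib.request


def fetch_thread(url):
    """Download a thread as JSON (assumes the '<thread url>.json' endpoint)."""
    request = urllib.request.Request(
        url.rstrip("/") + ".json",
        headers={"User-Agent": "thread-to-markdown-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def render_comment(node, depth=0):
    """Return markdown lines for one comment and its replies.

    Each nesting level adds one '> ' blockquote prefix.
    """
    if node.get("kind") != "t1":  # assumed: "t1" marks a comment; skip the rest
        return []
    data = node["data"]
    quote = "> " * depth
    lines = [f"{quote}**{data.get('author', '[deleted]')}**:"]
    for line in html.unescape(data.get("body", "")).splitlines():
        lines.append(f"{quote}{line}")
    lines.append("")
    # assumed: "replies" is an empty string when a comment has no replies
    replies = data.get("replies") or {}
    for child in replies.get("data", {}).get("children", []):
        lines.extend(render_comment(child, depth + 1))
    return lines


def thread_to_markdown(thread):
    """Convert the assumed [post listing, comment listing] pair to markdown."""
    post = thread[0]["data"]["children"][0]["data"]
    lines = [f"# {post.get('title', '')}", ""]
    selftext = html.unescape(post.get("selftext", ""))
    if selftext:
        lines += [selftext, ""]
    for child in thread[1]["data"]["children"]:
        lines.extend(render_comment(child))
    return "\n".join(lines).rstrip() + "\n"
```

Something like `print(thread_to_markdown(fetch_thread(url)))` could then be redirected to a `.md` file and imported into DT; rate limits and authentication are not handled here.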

BLUEFROG · January 6, 2021, 9:34pm

Welcome @jmay
There is no script available for this that I’ve ever seen. Looks like you’ll have to get off the couch…

iupaish · April 25, 2021, 3:33am

https://github.com/pl77/redditPostArchiver does this perfectly. You can even run this from a pre-built docker image!

BLUEFROG · April 25, 2021, 7:01pm

Welcome @iupaish

Thanks for the suggestion. Perhaps it will be of use to the OP.

PS…

sheesh! Lots of dependencies

jmay · April 26, 2021, 8:05pm

Yikes, lots of bits to install here. I’ll experiment with this. It may be possible to host it somewhere and put a web UI in front of it.

BLUEFROG · April 26, 2021, 11:07pm

Yeah. We’re not big fans of dependencies in here and avoid them when feasible. That’s why we’re rarely heard saying, “Just install software X and osaxen Y, type in brew install lmnop, power the Mac on and off twice while dancing in the moonlight swinging an old OS/2 mouse in each hand, …”