I need help replacing the em dash (—) with two hyphens (--) across 21,000+ notes. Any help at all would be greatly appreciated. Thanks!
This is indeed tricky but these threads might be useful:
Personally, I would probably drop into the UNIX world to do that, though that certainly isn’t for everyone. Something like this from the terminal:
Script removed because it could modify binary files if the user is not careful. See next post instead for a safer version.
Naturally, make sure you have a backup first.
I really should have added a few caveats to that script. Don’t let it work on binary files like pictures or PDFs. Obviously you need to change the “/path/to/your/DEVONthink/database.dtBase2” part to match your database location, but you do want to keep the “/Files.noindex” at the end.
This version will only modify txt and rtf files and so should be safe.
cd /path/to/your/DEVONthink/database.dtBase2/Files.noindex
find . -type f \( -name '*.txt' -o -name '*.rtf' \) -print0 \
| xargs -0 grep -l '—' \
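| while IFS= read -r file
do
    # IFS= and -r keep leading spaces and backslashes in file names intact
    echo "Fixing: ${file##*/}"
    # Chain with && so a failed sed never overwrites the original file
    sed 's/—/--/g' "${file}" > /tmp/new &&
    touch -r "${file}" /tmp/new &&
    mv /tmp/new "${file}"
done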
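Before pointing this at a real database, it may help to try the same pipeline on a throwaway directory. The following is only a sketch, assuming bash; the scratch directory, the sample file names, and the `.rewrite.tmp` temporary file are made up for illustration and are not part of the script above.

```shell
#!/usr/bin/env bash
# Scratch directory with made-up sample files; nothing here touches a real database.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/emdash.XXXXXX") || exit 1
printf 'alpha — beta\n' > "${scratch}/has dash.txt"
printf 'no dash here\n' > "${scratch}/plain.txt"

cd "${scratch}" || exit 1

# Hypothetical temp file inside the scratch directory; its name does not
# match *.txt or *.rtf, so find never picks it up.
tmp="${scratch}/.rewrite.tmp"

find . -type f \( -name '*.txt' -o -name '*.rtf' \) -print0 \
| xargs -0 grep -l '—' \
| while IFS= read -r file
do
    echo "Fixing: ${file##*/}"
    # Only replace the file if sed and touch both succeed.
    sed 's/—/--/g' "${file}" > "${tmp}" &&
    touch -r "${file}" "${tmp}" &&
    mv "${tmp}" "${file}"
done

# Show the results: the first file should now contain "--",
# the second should be unchanged.
cat "${scratch}/has dash.txt"
cat "${scratch}/plain.txt"
```

If the output looks right, the same loop can be run against a copy of the database before the real thing.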
If you just want to see a list of files that have the em dash without actually changing anything, try this.
cd /path/to/your/DEVONthink/database.dtBase2/Files.noindex
find . -type f \( -name '*.txt' -o -name '*.rtf' \) -print0 \
| xargs -0 grep -l '—'
Is there a purpose for the “rm …” after the “mv …”? That’ll generate “rm: /tmp/new: No such file or directory” errors, unless it’s “rm -f …”.
Doh! You’re right. The rm is unnecessary. I’ll edit the post.
Your script’s ${file##*/} usage inspired me to look up the ${parameter##word} expansion in the bash man page. I realized several of my own scripts could use it instead of running a basename process.
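For anyone else curious about that expansion, here is a small sketch comparing `${file##*/}` with `basename`, assuming bash. The path is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical path, for illustration only.
file="./a/b/Meeting notes.txt"

# ${file##*/} removes the longest prefix matching '*/', leaving the last
# path component, without starting a separate process.
name_expansion="${file##*/}"

# basename gives the same result for this path, at the cost of one
# extra process per call.
name_basename="$(basename "${file}")"

# Related forms: ${file%/*} removes the shortest suffix matching '/*'
# (leaving the directory part), and ${name%.*} removes the extension.
dir_part="${file%/*}"
stem="${name_expansion%.*}"

echo "${name_expansion}"
echo "${name_basename}"
echo "${dir_part}"
echo "${stem}"
```

The two are not identical in every case: for a path with a trailing slash, such as `a/b/`, `basename` returns `b` while `${file##*/}` yields an empty string, so the expansion fits best where the input is known to be a file path, as it is in the output of `find -type f`.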
Thanks for your help. I’ll try to digest this and let you know what happens.