Renaming files according to a regex pattern

Hi. I’ve read a number of the other posts dealing with this issue, and I seem to be doing what you recommend in those other posts (as well as in the manual).

The regular expression is: (ABC\-\w{3}\-\d{8})

I.e. to match some text like ABC-JLU-08798799

I’ve tried this as a ‘batch process’ (which won’t even allow me to click past ‘ok’). It just refuses to take this.

I can set this up as a smart rule, but there I’m choosing the option ‘content matches’ and I’m unsure if this is doing anything with regular expressions or not.

I’d like to rename the files to the match of that regex, so I type \1 as the renaming parameter.

Neither of these options works. Can anyone explain what’s going on, or how I should be doing this better?

Thanks!

In the smart rule Actions panel select “scan text” and “regular expression.” In the next (or a following) action line, select “change name” and enter the “\1”.

Edit: I haven’t used Batch Process, but it appears to have the same selections as the smart rule…scan text—>Regular expression, Change Name
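For anyone who wants to check the pattern outside DEVONthink first, here is a minimal sketch of what "scan text" followed by "change name" with \1 amounts to. It assumes Python's re module as a stand-in for DEVONthink's regex engine; extract_id is a hypothetical helper name, not anything DEVONthink provides. The hyphens are left unescaped because they only need escaping inside a character class.

```python
import re

# Same pattern as in the question; group 1 is what \1 refers to.
PATTERN = re.compile(r"(ABC-\w{3}-\d{8})")

def extract_id(text):
    """Return the first ABC-XXX-NNNNNNNN identifier found in text, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

# The captured group would become the new document name.
print(extract_id("Invoice ref ABC-JLU-08798799, page 1"))
```

If this prints the identifier you expect for a sample of your document text, the pattern itself is fine and the problem lies elsewhere.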

@strickvl:
Tested batch process with your example and it works. Smart Rule looks the same.


@wmc is correct and I have confirmed it’s working here too.

Thank you @BLUEFROG and @wmc. I’m not sure what was going on previously, but it is working now. Perhaps it had something to do with the folders (documents being batch processed) containing very large numbers of documents. I have replicated this with a very small test batch now. I guess I’ll have to do this with smaller batches each time rather than 100,000+ in one go…

I guess I’ll have to do this with smaller batches each time rather than 100,000+ in one go…

Yes, this is strongly advised. As I’ve mentioned on these forums more than once, my father advised me, “You know how to eat an elephant? One bite at a time.”

I seem to have encountered the same problem even within Finder. Some operations (like deleting files) appear only to work with c. 15,000 files at a time. If you give it more than 20,000, it will just spin and spin.

From various other people and Stack Overflow, I think I hit some sort of memory barrier. It tries to load the entire set of files (or the list of all the names) into memory, or something like that, and above a certain point it’s just too many. There are other ways of batch deleting files in the terminal, but these amount to doing what you just said: you iterate through the list of files, one by one, deleting each file as you get to it. I guess whatever DEVONthink does with the batch process is more like the former than the latter.
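The "one bite at a time" approach for deleting files can be sketched like this. It is only an illustration in Python, not what Finder or DEVONthink do internally; delete_in_batches and its batch_size default are hypothetical names and values chosen for the example.

```python
import os
from itertools import islice

def delete_in_batches(directory, batch_size=1000):
    """Delete the regular files in directory, batch_size at a time.

    Only one batch of paths is held in memory at once.
    Returns the number of files deleted.
    """
    deleted = 0
    while True:
        # Rescan the directory and take at most batch_size file paths.
        with os.scandir(directory) as entries:
            files = (e.path for e in entries if e.is_file(follow_symlinks=False))
            batch = list(islice(files, batch_size))
        if not batch:
            return deleted
        for path in batch:
            os.remove(path)
            deleted += 1

# Example call (the path is a placeholder):
# delete_in_batches("/path/to/folder", batch_size=5000)
```

Subfolders are left alone, and the batch is collected before anything is removed so the directory is not modified while it is being listed.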

In any case, lesson learned :wink:

Hey, I did not want to start a new topic, as I have a similar question. After building and using my Databases for several years now, I noticed a flaw in my file naming procedure. All files are named:
YYYY-MM-DD Originator - Topic.pdf

Originator can be a company, shop, authority - it can be more than one word
Originator and Topic are separated by a hyphen
Topic can be multiple words

Now this sorting is nice and useful for most folders - but I figured that sometimes it makes more sense to have it like:
Originator - YYYY-MM-DD - Topic

And this is my question: can I use RegEx to manipulate the file names? In Finder I used betterRenamer for such things, and that would be a workaround. But I would like to use more of the features in DT. Any hints?

Thanks + Regards,
Nils

Yes. There’s a script in DT’s script menu under the heading “Rename” (no surprise there). Search for
(\d{4}-\d{2}-\d{2})\s+([^-]+)\s+-\s+(.*)\.pdf
and then replace that with
\2 - \1 - \3.pdf
No guarantee, of course. And I’m not sure about the final pdf – it’s quite possible that DT does not consider this, so you have to leave it out of the RE and the replacement string.
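To try the reordering before touching real files, here is a small sketch using Python's re module, which is an assumption on my part and not the engine the DT script uses. The pattern writes the day group as \d{2} and escapes the dot before pdf; reorder_name is a hypothetical helper name.

```python
import re

# Groups: 1 = date, 2 = originator, 3 = topic.
PATTERN = re.compile(r"(\d{4}-\d{2}-\d{2})\s+([^-]+)\s+-\s+(.*)\.pdf")

def reorder_name(name):
    """Turn 'YYYY-MM-DD Originator - Topic.pdf' into 'Originator - YYYY-MM-DD - Topic.pdf'.

    Names that do not match the pattern are returned unchanged.
    """
    return PATTERN.sub(r"\2 - \1 - \3.pdf", name)

print(reorder_name("2021-03-05 Acme Corp - Invoice March.pdf"))
```

One limitation worth knowing: because the originator group is [^-]+, an originator that itself contains a hyphen will not match, and such names come back unchanged.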

Thx for this answer - it helped me a lot to get started with RegEx-based renaming :smiley:

Just one remark - \d{4} didn’t work for me, but rewriting it to [0-9]{4} finally did it.
Of course, it would have been nice to get some error or log message pointing to why the renaming didn’t work (source regex not matching? replacement string not working? not enough / too many items?)…

Welcome @fex

There is no known issue with \d{4}. I just tested it with no issue. However, yes the range would also work.

My goal was to reformat file names from YYYY_MM_DD_text to YYYYMMDD_text

At first I tried (\d{4})_(\d{2})_(\d{2})_(.*) with the replacement \1\2\3_\4 - and my file names didn’t change at all.

After changing it to ([0-9]{4})_([0-9]{2})_([0-9]{2})_(.*) with the same replacement \1\2\3_\4, my file names finally changed…

So for me it looks like the \d syntax was at fault - but I would be interested to hear what else it could be…

EDIT: DEVONthink V3.8
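As a cross-check, here is a sketch of the same rename in Python's re module, which understands both spellings. That is an assumption about a different engine and says nothing definite about the DEVONthink script; compact_date is a hypothetical helper name.

```python
import re

SHORTHAND = r"(\d{4})_(\d{2})_(\d{2})_(.*)"
BRACKETED = r"([0-9]{4})_([0-9]{2})_([0-9]{2})_(.*)"
REPLACEMENT = r"\1\2\3_\4"

def compact_date(name, pattern=BRACKETED):
    """Rewrite YYYY_MM_DD_text as YYYYMMDD_text; non-matching names are unchanged."""
    return re.sub(pattern, REPLACEMENT, name)

# In an engine that supports \d, both patterns give the same result.
print(compact_date("2021_03_05_report", SHORTHAND))
print(compact_date("2021_03_05_report", BRACKETED))
```

If the two spellings agree here but not in the tool you are using, the difference is in that tool's regex dialect rather than in the pattern's logic.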

And please ignore the typo in the filename in the RTFD :stuck_out_tongue: The actual filename is correctly shown in the window’s title bar.

Interesting :thinking:
Could that have anything to do with using the Rename with RegEx script vs Smart Rules?

Yes.
Were you using the script?

Yes

This script uses the command line utility sed. Since I have no access to my Mac, I can’t check right now, but it’s possible that this tool still does not understand the newer (like more than 20 years old) syntax. Apple is not known for updating its command-line utilities often.
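For background: \d is a Perl-style shorthand and is not part of the POSIX basic or extended regular expression syntax, which is why bracket expressions like [0-9] are the portable spelling. If a tool only accepts POSIX classes, a pattern can be translated with something like the naive sketch below. to_posix_classes is a hypothetical helper, and it assumes the shorthands appear only outside [...] classes and are not preceded by an escaped backslash.

```python
import re

# Perl-style shorthands and their POSIX-compatible equivalents.
POSIX_EQUIVALENTS = {
    r"\d": "[0-9]",
    r"\w": "[A-Za-z0-9_]",
    r"\s": "[[:space:]]",
}

def to_posix_classes(pattern):
    """Naively replace the d, w and s shorthands with POSIX bracket expressions."""
    return re.sub(r"\\[dws]", lambda m: POSIX_EQUIVALENTS[m.group(0)], pattern)

print(to_posix_classes(r"(\d{4})_(\d{2})_(\d{2})_(.*)"))
```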
