I’m making sheet music for film music recording sessions. This involves a lot of PDFs for a lot of instruments. My software outputs the files based on the piece of music and sorts them according to the instruments in that particular piece. In this example the Intro has Violin 1 and 2, Cello and Contrabass.
What I am doing at the moment is, after everything has been exported, I make Finder-searches for each instrument and move them into a designated folder, so the printer can print the Violin 1 folder in one go.
My thought was to circumvent this process by applying some rules in DT instead. Selecting all PDFs and classifying just pushes them into my databases.
Is there a way to have DT sort them into subfolders based on instruments? The beginning of each filename is “Project_musical cue name_version numbering” and the end of the file name will always be “- XX_INSTRUMENT NAME”.
I would like to be able to drag the lot into the DT inbox, perform a rule, and drag it all out again in the correct folder structure.
Side question: where do I find more details and explanation about the expressions and the other settings of your smart rule, so that I can learn them and replicate?
For general stuff about smart rules, see Automation > Smart Rules and Batch Processing in the manual. For more detailed descriptions of individual components, see Appendix > Smart Rule Events and Actions. Item scanning (Scan Name & Scan Text) is covered on p. 252 in the most recent PDF version. The File action is listed on p. 253.
I think Regular Expressions/regEx itself is outside the scope of the manual and only mentioned briefly. It is a feature with a long history in computing, widely used across all manner of software. Chrillek has written an introduction here:
I have found CotEditor a big help in learning regEx. It has syntax highlighting that helps you read and understand the patterns (I think BBEdit does the same). It also highlights different capture groups in your text in different colors, so you get instant feedback. And it has a handy little cheat sheet.
It really is an amazing little editor. I appreciate the juicy Find menu which lets you assign shortcuts to all the individual Find & Replace commands…
I just spotted the Multiple Replace, which I’ve somehow overlooked! I’ve actually really wanted something like that.
Using the Multiple Replacement feature, you can process multiple text replacements at once in succession. The replacement rules can be stored as a named preset and reused when you need.
As @troejgaard mentioned, there is some minimal discussion in the cited chapter, but RegEx is a deep magic
That being said, here is a small extension of the OP’s original inquiry using two captured parts of the name for defining a group and a subgroup then applying another part as a tag on the documents before they’re filed…
would love to know this but damm understanding and learning this takes hours upon hours.
wish there was an AI to summarize this kind of info and get down to 20 min max to understand and create your own smart rule/ batch processing etc.
DevonThink has good gems but dammm by the time I understand it all it would be next year.
especially the info below
for instance in the main Inbox. I have items that contain different tags and would like to move each type of tags to a specific folder. all I know how to do now is to create a smart rule for each tag. if I have like 20 tags I have to create 20 different smart rules? lol
would love to know this but damm understanding and learning this takes hours upon hours.
Most things worth knowing take far more than mere “hours upon hours” to learn.
Do you think we learned this by AI? Or do you think we’ve devoted time and energy to learning what we know – yes, even amidst other duties needing to be done? And the majority of people you read posts from in here are not computer science majors. My background is graphic arts and printing and self-taught in tech.
wish there was an AI to summarize this kind of info and get down to 20 min max to understand and create your own smart rule/ batch processing etc.
You don’t need AI to learn to make a smart rule or batch process. Even most smart actions with an Apply Script command don’t have overly complex scripts. And there are many examples of scripts on these forums to examine and play about with.
DevonThink has good gems but dammm by the time I understand it all it would be next year.
You’re not going to “understand it all”, especially something as deep as regular expressions can go. @chrillek’s post is a good basic starting point on our forums. Our help has a few simple examples and a link to the ICU standard we use for regular expressions.
PS: My illustrations show the regular expression I employed, the smart actions I used with the captured groups from the expression, and even a diagram showing how the captured parts of the name were used in creating groups and tagging a document.
And here’s a bonus, with simple language, of what’s going on in the one I prototyped for the OP, including how it parsed the OP’s filename…
Regex has a reputation for being hard to read (especially for the tech-uninitiated). That does not mean it’s hard to learn. Simple regexes (such as the one used by @BLUEFROG here) are straightforward and very easy to understand, once you have a grasp of the meaning of common morphemes (such as \d, which matches a single digit).
This is IMO not really accurate. Learning regex basics takes perhaps 30 minutes. Getting used to using regex takes hours upon hours, but that is a process of learning through actual work (i.e. internship), rather than an upfront cost.
In this regard, learning regex is not much different from learning Microsoft Excel, or another feature-rich tool for work.
Agreed… a basic understanding with contrived examples and stacked decks is usually easy enough.
However, getting very good at regex, i.e., applying it to a variety of data sets, many often with non-conforming information – takes a much longer time.
So what you’re learning for your current problem may not be the answer to future ones. It depends on the data to process.
It’s like with a natural language – that too takes time to learn. Or to play a musical instrument.
And above all experience, which comes with using regular expressions or a natural language.
Back to your task of sorting files by their tags into different groups. Regular expressions don’t help you there, I think. A simple script, however, can do that. I’ll try to take a look at that later.
Thanks for the great explanations @BLUEFROG. I feared it might have something to do with RegEx
I can see I needed to be more specific in that the first bit of the text string can literally be anything since projects have their own formatting, version numbering, and also different numbers of underscores and dashes in the file names, so I’m really only trying to capture the two last digits, underscore, and instrument name. They are always right after the last or only instance of " - ".
If you have the time I would greatly appreciate a nudge, but otherwise I’ll try the regex101 website to work it out. I have already built a couple of regexes before.
By the way: The reason I’m dragging it back into Finder is to share it to the production server.
followed by at least one of anything, which is captured in the first (and only) capturing group.
So, that capturing group would contain “the two last digits, underscore, and instrument name”. I assume (since I didn’t really understand your requirements for the target folder name) that you want every instrument in its own folder. So, instead of using (.+) at the end, you’d use \d+_(.+) to put the instrument’s name in the first (and only) capturing group.
Combine that with - to - \d+_(.+) and you get something that works with… well only those filenames with only one dash in them. But you said you wanted to target the last dash.
Which is the perfect use case for the greediest RE of all, namely .*. That one matches any character any number of times – from zero to potentially infinity. The complete RE would be
.*-\s+\d+_(.+)
translating to
Match anything up to a dash
followed by at least one space (\s+)
followed by at least one digit (\d+)
followed by an underscore (_)
Save anything after that in the first and only capturing group
I threw in \s+ to make sure that the RE also matches filenames with more than one consecutive space character in them.
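If you want to try the RE outside of DT first, here is a minimal Python sketch. DT uses ICU regular expressions; I am assuming Python's re module treats these basic constructs (.*, \s+, \d+, a capturing group) the same way. The helper name instrument and the second filename are made up for illustration.

```python
import re

# The RE from this post. Assumption: Python's re behaves like ICU here.
PATTERN = re.compile(r".*-\s+\d+_(.+)")

def instrument(name):
    """Return what follows the last '- NN_' in name, or None if no match."""
    m = PATTERN.search(name)
    return m.group(1) if m else None

names = [
    "-Intro_v1.2.3 - 01_VIOLIN I",           # sample from this thread
    "Project - 02_Cue_v3 - 04_CONTRABASS",   # hypothetical: two ' - NN_' parts
]
for n in names:
    print(n, "->", instrument(n))

# Without the greedy .* in front, the search stops at the FIRST
# ' - NN_' it finds and captures too much.
loose = re.search(r"- \d+_(.+)", names[1]).group(1)
print("without .* prefix:", loose)
```

The greedy .* swallows as much of the name as it can, so the rest of the pattern ends up anchored at the last dash that is followed by spaces, digits and an underscore.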
It all depends on the input and on what “work” means. For the input
“-Intro_v1.2.3 - 01_VIOLIN I” (see your first post)
your RE .*-\s+(.+)
will save “01_VIOLIN I” in the first capturing group. If you then use \1 in the File action of your smart rule, the file will be put in the group “01_VIOLIN I”. Basically, all files mentioned in your first post will be put each one in their own group. Is that what you want?
In your post, you said
which I’d interpret as “I want a folder for VIOLIN I, another one for VIOLIN II and another one for CONTRABASS”.
That is not what your RE will give you with the sample data you posted. But the one I posted previously does.
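To make the difference concrete, here is a rough Python sketch that groups a few names by the first capturing group of each RE, roughly what using \1 as the destination would do. The filenames are hypothetical (I am assuming the number in front of an instrument can differ from cue to cue), group_by is my own helper, and Python's re module is assumed to behave like ICU for these patterns.

```python
import re
from collections import defaultdict

WITH_NUMBER = re.compile(r".*-\s+(.+)")     # captures e.g. "01_VIOLIN I"
NAME_ONLY = re.compile(r".*-\s+\d+_(.+)")   # captures e.g. "VIOLIN I"

def group_by(pattern, names):
    """Collect names under the text of the first capturing group."""
    groups = defaultdict(list)
    for name in names:
        m = pattern.search(name)
        key = m.group(1) if m else "(unmatched)"
        groups[key].append(name)
    return dict(groups)

# Hypothetical filenames, not taken from the original post.
names = [
    "Film_Intro_v1 - 01_VIOLIN I",
    "Film_Intro_v1 - 02_VIOLIN II",
    "Film_Outro_v2 - 03_VIOLIN I",
]

for label, pattern in (("with number", WITH_NUMBER), ("name only", NAME_ONLY)):
    print(label)
    for folder, files in sorted(group_by(pattern, names).items()):
        print("  ", folder, "<-", len(files), "file(s)")
```

With these names the first RE produces three groups (one per number and instrument combination), while the second one puts both VIOLIN I files into the same group.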
Unfortunately, your terse description does not explain why the RE I posted did not “work” – perhaps my assumptions were not correct? Was there a syntax error? Was the input different from what you posted before?
Without the necessary information, it’s not really possible to have a fruitful process here.