Here is a suggestion/request for another smart-rule action: split at a delimiter and/or split using a regex pattern. Not sure how feasible this would be, but it seems that most of the mechanisms are already in place.
I have been dealing with long Markdown files that need to be split, and doing so using shell scripts is a pain in the arse. Same goes for AppleScripts.
Interesting concept; how would you set that up? At first incidence of delimiter, or last, or every? Or would that be an optional choice within the rule?
First and foremost, I make this suggestion because I think it would be a really cool addition to the already very powerful automation tools available in DT3 (and one that many apps already have).
@chrillek, this is in my to do list, for sure (but impossible right now).
@Blanc, I think other apps already provide a nice example of how this could work. See, for instance, Scrivener and/or Tinderbox.
@pete31, thanks, I will take a look at it. Right now it seems that it would be a bit slow, though. To update my Wiki's glossary (which is not even close to the whole thing) I need something capable of splitting some 1M chars into 500-800 files/records. I will test and post back with the results.
@BLUEFROG, as to my case in particular: I have transferred most of my writing to Scrivener. In order to update my Markdown DT3 Wiki, I now export the files as one long Markdown text and split it at the h1 Markdown headers. With smart rules it is already possible to properly set the tags, aliases and the name of the records using the option to scan the text with regex. The name, for instance, is given by the pattern ^# (.+?)\n
As for splitting, this is the only action I need to perform outside of DT3. I have been using a shell script as a folder action:
for d in "$@"; do
cd ~/'Databases/md/MD_Splitter' && csplit -k -n 4 "$d" '/^# /' {1000}
done
Like I said, it is a bit of a pain in the arse, as I am not very proficient at shell scripting. It invariably spits out a "no match found" error (but the task gets accomplished nevertheless). I also find it slightly annoying that the files are generated without an extension; since the script throws an error, this in turn means that I needed to add another, separate action to change the extension (DT3 won't recognize the files as text files if they don't have an extension).
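For comparison, here is a minimal sketch of the same split done with awk instead of csplit, so that the pieces get their extension as they are written. The function name `split_md` and the `partNNNN.md` naming are my own assumptions, not anything DT3 or csplit provides, and I have only tried it on small samples.

```shell
#!/bin/sh
# split_md FILE OUTDIR  (hypothetical helper, not part of DT3 or csplit)
# Writes one file per "# " heading into OUTDIR as part0000.md, part0001.md, ...
# Any text before the first heading ends up in part0000.md.
split_md() {
    infile=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        # Start a new output file at every H1 line, and also on the very
        # first line so that a preamble without a heading is not lost.
        /^# / || out == "" {
            if (out != "") close(out)   # avoid running out of open files
            out = sprintf("%s/part%04d.md", dir, n++)
        }
        # Every line goes to the file that is currently open.
        { print > out }
    ' "$infile"
}

# Example call when the script is run with two arguments.
if [ "$#" -ge 2 ]; then
    split_md "$1" "$2"
fi
```

Because awk simply stops at the end of the input, there is no repeat count to exceed and so nothing comparable to the "no match found" message.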
Depending on your setup, you could also run the shell script to split the file from a smart rule. I'd suggest a script like this one:
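#!/bin/sh
# Change into the working directory once; skip everything if cd fails.
cd ~/Databases/md/MD_Splitter && (
# Split each input file at every line that starts with "# ".
for d in "$@" ; do
csplit -k -n 4 "$d" '/^# /' {1000}
done
# Add an extension to the pieces csplit created (default names: xx0000, xx0001, ...).
for d in xx[0-9][0-9][0-9][0-9] ; do
mv "$d" "$d".md
done
)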
This should take care of the renaming/extension issue (but beware: I didn't test it at all. Use at your own risk). Also, it changes into your working directory only once. You can of course choose an extension other than "md" in the mv command.
Perl would actually be a bit of an overkill in this situation, since your condition to split is so simple.
Re your regular expression ^# (.+?)\n: I'd go for a $ instead of \n because it matches the end of the line regardless of the character(s) actually used for it. This might be relevant if your file comes from an environment where \n is not used for end of line (Windows comes to mind). The ? shouldn't be necessary here because you want to gobble up all of the characters until the end of the line anyway.
Thanks for the suggestion. Apparently, there is something buggy about the csplit that comes with macOS. I installed coreutils via Homebrew and it works without any problems.
Occasionally there will be two spaces at the end of the line, so the pattern I am using is actually ^# (.+?)\h*\n, but I guess using $ instead of \n makes perfect sense.
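To make the regex discussion concrete, here is a small sketch of pulling the record name out of the first line with sed while dropping trailing spaces (and a stray carriage return from Windows line endings). The function name `md_title` is made up for illustration, and I use POSIX character classes instead of \h because, as far as I know, sed does not understand the PCRE shorthand \h.

```shell
#!/bin/sh
# md_title FILE  (hypothetical helper)
# Prints the text of an H1 heading found on the first line of FILE,
# without the leading "# " and without trailing whitespace.
md_title() {
    # -n  : print nothing unless asked to
    # -E  : extended regular expressions
    # 1   : only look at the first line
    # The greedy group must end in a non-space character, so the trailing
    # [[:space:]]* soaks up spaces, tabs and a carriage return before $.
    sed -n -E '1s/^# +(.*[^[:space:]])[[:space:]]*$/\1/p' "$1"
}

# Example call when the script is run with a file argument.
if [ "$#" -ge 1 ]; then
    md_title "$1"
fi
```

The group is greedy here, in line with the remark that the ? is not needed; the trailing whitespace is kept out of the name by requiring the group to end in a non-space character.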
I will see if I can fit it into a smart rule and how it would work. I can't remember right now if I can call a shell script directly from a smart rule or if I have to use either a folder action or an AppleScript.
Something like
name=`head -1 "$file" | sed -e 's/\h#//'`
Get the first line of the file with head and feed it to sed to remove what you don't want. As always, I didn't test it.
If either of you have time, could you explain? The name=`head -1 $d` part I understand; the -e option for sed I'm not so sure about. It simply tells sed to execute the following script (or expression), correct? The following s is not part of the regex, but a command, I take it? Subtract, maybe? Then / marks the following \ as a literal character? And the // at the end is there because?
In the shell script, what is the meaning of "&&"?
(I'm happy for you to say go away, this is not a script/regex learning place; I'm sitting in front of a number of websites, trying to teach myself what it is you have conjured up, but I'm not finding it exactly self-evident. As I say, only if you have time on your hands…)
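A short, hedged walk-through of those two points, using a simplified pattern (^# rather than the untested \h# from above) and made-up variable names. In sed, s is the substitute command, written s/PATTERN/REPLACEMENT/: the slashes only separate the parts, and an empty replacement (the two slashes at the end) deletes whatever the pattern matched. The -e option just says that the next argument is the sed script. In the shell, && runs the command on its right only if the command on its left succeeded.

```shell
#!/bin/sh
# s/^# //  : substitute a leading "# " with nothing, i.e. delete it.
title=$(printf '%s\n' '# My heading' | sed -e 's/^# //')
echo "$title"    # prints: My heading

# A && B : B runs only if A exits successfully.
workdir=$(mktemp -d)
cd "$workdir" && echo "cd worked, so this line is printed"

# Here cd fails, so the assignment after && is skipped
# and status keeps its initial value.
status="skipped"
cd "$workdir/does-not-exist" 2>/dev/null && status="ran"
echo "$status"   # prints: skipped
```

That is why the script above is written as cd ... && ( ... ): if the directory does not exist, none of the splitting or renaming is attempted in the wrong place.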