Do you guys think I can use regex or a similar method through AppleScript and/or Automator to split a single text record containing this text into several text files, using the numbers as cues to split the text and create each new file?
If yes, has it been done before? Are you aware of any automation like that?
I’ve been looking around without success, and I gotta tell you: I’ve got a single 1,300-page document that should be split into 50,000 text files… and I’m really afraid of doing it by hand.
The doc is 100% consistent, no problem with that.
It consists of chunks of text separated by numbers.
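For what it’s worth, here is a rough Python sketch of that kind of split. It is not AppleScript, and it rests on a guess about the layout: it assumes each chunk starts with a line that contains only its number. The names split_chunks and write_chunks are made up for illustration. If the real file marks the numbers differently, the HEADER pattern would need adjusting, and a body line that happens to be just a number would be mistaken for a new chunk.

```python
import re
from pathlib import Path

# Assumption: every chunk starts with a line holding only its number, e.g. "42".
HEADER = re.compile(r"^(\d+)[ \t]*$", re.MULTILINE)


def split_chunks(text):
    """Return a list of (number, body) pairs, one per numbered chunk."""
    matches = list(HEADER.finditer(text))
    chunks = []
    for i, match in enumerate(matches):
        # A chunk runs from the end of its header to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip("\n")
        chunks.append((match.group(1), body))
    return chunks


def write_chunks(chunks, out_dir):
    """Write each chunk to <out_dir>/<number>.txt (duplicate numbers overwrite)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for number, body in chunks:
        (out / f"{number}.txt").write_text(body + "\n", encoding="utf-8")
```

Usage would be something like write_chunks(split_chunks(Path("big.txt").read_text(encoding="utf-8")), "out"), where big.txt and out are placeholder names.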
Good question, there will be some checking needed, but again, the text file is super consistent.
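One way the checking could be done before splitting anything, sketched in Python. It assumes two things that may not hold for the real file: that each chunk starts with a line containing only its number, and that the numbers are supposed to run consecutively. The function name check_numbering is hypothetical.

```python
import re

# Assumption: every chunk starts with a line holding only its number.
HEADER = re.compile(r"^(\d+)[ \t]*$", re.MULTILINE)


def check_numbering(text):
    """Return (number, reason) pairs for headers that repeat or break the sequence."""
    numbers = [int(m.group(1)) for m in HEADER.finditer(text)]
    problems = []
    seen = set()
    for index, number in enumerate(numbers):
        if number in seen:
            problems.append((number, "duplicate"))
        elif index > 0 and number != numbers[index - 1] + 1:
            # Assumption: each header should be exactly one more than the previous.
            problems.append((number, "out of sequence"))
        seen.add(number)
    return problems
```

An empty result means every header number was unique and followed its predecessor by one. Anything reported is worth eyeballing by hand, since it may be a stray number inside a chunk rather than a real header.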
Actually, when I said a 1,300-page document, that was just to give you a rough idea of its size when printed. The file I have is just a single text file. But if needed, I can also get the same text already divided into several separate text files. I just thought one text file would be better, you know, a one-script thing.
Wow, AWK! I used to introduce that language in my upper-level computer languages course 25 years ago. These days, I rarely hear anyone mention AWK.
A lot of guides and documentation assume you’re using GNU awk(1). If you get stuck on something that should work but doesn’t, and the errors mention functions that don’t behave the way the guide describes, install GNU awk and use it explicitly in your scripts.
The easiest way to install it is with Homebrew: brew install gawk
We should consider a new thread purely devoted to newfangled text processing tools that slice and dice data. Some of my favorites lately are ack and csvfix (neilb.bitbucket.org/csvfix/). I don’t know how I ever managed to work with CSV files before csvfix, because it’s simply life-changing. For example, you can treat CSV files as if they were SQL and generate statements accordingly. And if that hasn’t already blown your mind, performing operations on block selections from CSV data and running basic validation on the contents of a file probably will.
Bleh! Haha! Thanks for the input. Just my own two cents - I always limit my dependencies and don’t usually suggest things that have to be installed. Just my personal preference. s’all good.