Two other users asked, in two different places, how to get a single large tab-delimited text file into DT as individual files. As far as I know DT does not support this directly, but I figured there must be a workaround. It got me thinking, and I just worked out a way and ran a successful test. So, here’s how to do it:
Before you begin:
- You will need MS Word, or a similar word processing app.
- You will need to download “PDFpen” from smileonmymac.com. (They have a free trial period, so if you are only going to do this once you could get it just for this.)
Here’s the trick – I realized that if you could get the tab-delimited file split BEFORE bringing it into DT, then it would be possible. Here’s how I just did it:
- Take the tab-delimited text file (mine had over 3,500 records from a FileMaker Pro database) and open it in Word. Then do a Find and Replace: Find = Paragraph Mark (^p), Replace = Manual Page Break (^m). Run “Replace All.” This turns each record in the file into a separate page.
- Print the file, but when the print dialog box comes up select “Save as PDF” and save the file as a PDF document.
- Take the PDF document and open it in PDFpen. PDFpen has a Script menu; in that menu select “Split PDF”. It will ask where you want the split files saved, then split each page of your PDF file into an individual PDF doc. (Be warned, this script will take a little while…) Once it completes, if everything has gone correctly up to this point, you will end up with a folder containing one PDF file for each record of your original text file.
- Open DT and open Preferences. Select the “PDF & PS” tab. Next to “Index and Convert”, select “Use built-in pdftotext” AND check “Convert to Plain Text”.
- Now, from the Finder, just drag and drop the folder that contains all the PDF files into DT. It will import them all as individual plain text files.
*The only downside to this method is that each file will be named “Page 001…” and so on. You could of course run the folder of PDFs through a batch renamer program first and give them more meaningful names (I suggest the freeware “R-Name”: www2.mitsuya.nuem.nagoya-u.ac.jp … index.html), but I haven’t figured out how to automatically get a name from the actual record info yet. At least your data will be in DT as individual files, and you could slowly rename them as you use them (I find selecting text and using the contextual menu option “Set Title As” very handy for naming individual docs).
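On the naming question, here is a rough Python sketch of one way a name could be pulled from the record itself. I have not run it as part of the workflow above. It assumes the first tab-separated field of each record makes a sensible title. The function name `title_from_record` and its parameters are made up for illustration.

```python
import re


def title_from_record(record: str, field_index: int = 0, max_len: int = 60) -> str:
    """Build a filename-safe title from one tab-delimited record.

    Assumption: the field at field_index (the first one by default)
    is a reasonable name for the record.
    """
    fields = record.rstrip("\r\n").split("\t")
    raw = fields[field_index].strip() if field_index < len(fields) else ""
    # Slashes, backslashes and colons are awkward in file names; use a dash.
    safe = re.sub(r"[/\\:]+", "-", raw)
    # Collapse runs of whitespace into single spaces.
    safe = re.sub(r"\s+", " ", safe).strip()
    # Fall back to a placeholder when the chosen field is empty.
    return safe[:max_len].rstrip() or "Untitled"
```

Since the split pages come out in record order, the titles could in principle be matched to the “Page 001…” files one by one, but I would check that pairing on a small file first.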
Hope this helps. Before you try this with a file containing thousands of records, I suggest you do a test with a smaller file to make sure it behaves the way you want. Perhaps someone will have a better idea as to how to do this. Again, I think the trick is to get the records split into individual files before you bring them into DT. Of course, hopefully DT will eventually include importing tab-delimited files as a feature.
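For anyone comfortable running a script, the same "split before import" idea could be sketched in Python, skipping the Word and PDF steps. This is not what I did above, just a minimal sketch. It assumes one record per line and a UTF-8 export (an older export may use a different encoding). The function name `split_records` and the “Record 0001.txt” naming are invented for the example.

```python
from pathlib import Path


def split_records(source: Path, out_dir: Path) -> int:
    """Write each non-empty line of a tab-delimited file to its own text file.

    Assumptions: one record per line, UTF-8 text. Returns the number
    of files written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    # newline=None lets Python accept \n, \r\n and old Mac \r line endings.
    with source.open(encoding="utf-8", newline=None) as fh:
        for line in fh:
            record = line.rstrip("\n")
            if not record.strip():
                continue  # skip blank lines between records
            count += 1
            target = out_dir / f"Record {count:04d}.txt"
            target.write_text(record + "\n", encoding="utf-8")
    return count
```

Calling something like `split_records(Path("export.tab"), Path("records"))` (both names are placeholders) should leave a folder of plain text files that can be dragged into DT the same way as the folder of PDFs.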