difficulties importing

Just a thank-you for allowing us to try this most interesting app. I was able to import mail using the AppleScript. It would be nice if there were some sort of indicator that it was working. It was relatively slow on my iMac (500 MHz G3).
I tried importing my Documents folder from OS 9 (145 MB), but after 7 hours it seemed to stall. Any ideas? Also, an OS X Documents folder only imported a few items.

regards

    arthur

Importing documents shouldn’t take hours (a G5 usually needs 1-3 minutes to import 145 MB, a 1 GHz G4 Mac needs 5-15 minutes). Could you provide us with some examples that fail to import (e.g. stuffed files or folders)?

It was the Documents folder that OS X creates from the OS 9 files when you originally install OS X. It was basically a series of nested folders and files: AppleWorks documents, PDFs, images, etc. Nothing stuffed. Approximately 145 MB.

  arthur

Did you use TextLightning or pdftotext (see Preferences > Images & PDF)? I ask because TextLightning sometimes stalls, and DT then waits forever for TL to finish its job. If that happens, just cancel the whole process and import the missing contents afterwards (you might want to deactivate TextLightning too).
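
For anyone scripting their own conversions outside DT, the "waits forever" problem can be avoided by putting a time limit on the external converter. The following is only a minimal sketch of that idea, not how DT calls TextLightning or pdftotext; the function name `run_with_timeout` and the file name in the comment are made up for illustration.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run an external converter and return its output as text.

    Returns None if the converter stalls past the time limit or exits
    with an error, so the caller can skip the file and carry on.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the stalled process here.
        return None
    if result.returncode != 0:
        return None
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a real converter call such as
    # ["pdftotext", "report.pdf", "-"] (hypothetical file name).
    print(run_with_timeout([sys.executable, "-c", "print('converted text')"], 60))
```

Files that come back as `None` can be collected in a list and retried later, which mirrors the advice above to import the missing contents afterwards.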

Thanks for the help. I didn’t have the TextLightning preference checked, so I’m not sure what it was. Anyway, now that I have a better idea of what can and cannot be imported and which file conversions are necessary, I probably won’t have the same problem.
By the way, what other formats will be supported in the standard edition?

   arthur

The next formats will be OPML/mbox (PE) and CSV/TSV (Pro). In addition, support for Word and PostScript documents under Panther will be improved. Afterwards we’ll add QuickTime (audio/video) and TWAIN/Image Capture support/import (maybe including an OCR interface).

Thanks. The easier it is to get files into DEVONthink, the easier it will be to use the wonderful data management/analytical tools that seem to be the crux of this wonderful programme. Personally, I wish I could dump my whole drive into DEVONthink, let it catalogue everything (with automatic updates), and then let me pan for gold, or at least something semi-precious.

   arthur

I tried dropping an entire disk onto DT.  The disk contained lots of text and PDF files, source code, and images.  Things started out fairly quickly but soon bogged down.  After one week (!) DT appeared to be about 75% complete.  Each new file would take hours to import while thrashing the disk.  The disk thrashing was making other applications unresponsive.  When I tried cancelling, DT never stopped, even after importing a couple more documents.  I finally force quit DT.  The four database files totaled around 500 MB.

I have several comments about this:

  1. I would really like to be able to just drop an entire disk or set of folders on DT and not have it choke and other applications become unusable.

  2. I would really like to be able to drop the same disk or folder on DT a month later and have DT add the new files.  Changed files should retain both the old and new contents - perhaps with a preference to just replace.  Moved files should be updated to reflect their new location.

  3. Multi-threading would work wonders.  At the very least keep the user interface live so that DT does not appear to be dead.  Once the interface is separated from the database, requests could be queued.  If the database allows, multiple inquiries while scanning/importing would be ideal.
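
To make the third point concrete, here is a rough sketch of the "queue the requests" idea: a single background thread takes paths off a queue and imports them, so whatever thread drives the interface is never blocked. This is an illustration only and says nothing about DT's internals; `start_import_worker` and the `import_one` callback are hypothetical names.

```python
import queue
import threading


def start_import_worker(import_one, results):
    """Start a background thread that imports queued paths one at a time.

    import_one is a caller-supplied function (hypothetical) that imports a
    single path. Each outcome is appended to results as (path, value), or
    (path, exception) if the import failed.
    """
    pending = queue.Queue()

    def worker():
        while True:
            path = pending.get()
            if path is None:  # sentinel: no more work, stop the thread
                pending.task_done()
                break
            try:
                results.append((path, import_one(path)))
            except Exception as error:
                # One bad file should not stop the rest of the queue.
                results.append((path, error))
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return pending, thread
```

The interface side only ever calls `pending.put(path)`, which returns immediately. Putting `None` on the queue asks the worker to finish, which is also a natural place to hook up a working Cancel button.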

How much data did the disc contain? How much memory was available? And finally - how fast is the processor of your computer? E.g. a G5 imports 500 MB within 10-15 minutes and a more common G4/800 needs around 1-2 hours to finish this job.

BTW:
I doubt DT will ever fit the usage scenario you’ve described - a content indexing application would probably be more practical than a database like DT.

The disk I was importing contains only 4 GB of data, and the computer is a 500 MHz G4 with 768 MB RAM running Mac OS X 10.2.8.  The database already contained probably 200-300 MB from other folders that I had imported.  That worked very well and let me find some files I did not remember having.

The disk ("OldAndInTheWay") contained files that I had either collected or generated over 5 years ago.  If this had worked, I was going to do the same with a more recent disk ("SeldomeScene") that contains infrequently used files.  I occasionally add a batch of files to this disk.  And then there is the unorganized, current folder that contains a variety of PDFs that change daily…

I have not seen a content indexing application that supports as many different file types, let alone web pages, my own comments, and the kitchen sink.  Even for just pdf and text files I prefer the capabilities of DT compared to content indexers.  

It seems to me that DT is very close to being an ideal application for organizing and finding content located on disks and folders.  Why do you say it would not be good for this?  Once the reason for the glacial import is fixed, the only problem I see is keeping the database up to date.  If DT works as well as I thought it would, I could even see changing the folder structure to reflect the date the files were added to the database and using DT Pro to organize the files.

If DT is the wrong tool, what is the right tool?  MTLibrarian might work for this but it crashes on me while indexing.  EasyFind does not look inside pdf files.  Sherlock does not work at all for me under MacOS X.  What else?

I didn’t want to discourage you but using a freeform database like DEVONthink usually causes a speed and memory penalty (indexing should be faster and need less memory than a DT database). But you’re right - there are not many alternatives out there.

One possible solution, therefore, would be to use DT Pro and create multiple databases. And the upcoming "Synchronize" command should simplify the task of keeping the database up to date.
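
Until such a command exists, the bookkeeping behind "what changed since my last import" can be approximated by hand. The sketch below is not the "Synchronize" command and makes no claim about how it will work; it simply compares two snapshots of a folder (path to content digest) and sorts the differences into added, changed, moved, and removed files. The function names are invented for this example, and it reads each file fully into memory, which is fine for a sketch but not for very large files.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 digest of its contents."""
    digests = {}
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            full_path = os.path.join(folder, name)
            with open(full_path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full_path, root)] = digest
    return digests


def compare(old, new):
    """Classify the differences between two snapshots."""
    added = {path for path in new if path not in old}
    removed = {path for path in old if path not in new}
    changed = sorted(p for p in new if p in old and new[p] != old[p])

    # A path that vanished while identical content appeared under a new
    # path is treated as a move rather than a delete plus an add.
    moved = []
    for old_path in sorted(removed):
        match = next((p for p in sorted(added) if new[p] == old[old_path]), None)
        if match is not None:
            moved.append((old_path, match))
            added.discard(match)
            removed.discard(old_path)

    return {
        "added": sorted(added),
        "changed": changed,
        "moved": moved,
        "removed": sorted(removed),
    }
```

Saving the result of `snapshot()` after each import and comparing it with a fresh one a month later yields exactly the list of files that would need to be re-imported.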

Finally, one little workaround is to import only "small" packages (e.g. 100 MB), quit & relaunch DT afterwards and continue with the next part. That way the memory management of OS X gets a break.
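
Splitting a large folder into such packages by hand is tedious, so here is a small sketch that does the grouping. It only produces lists of file paths of roughly 100 MB each; importing every list into DT (and quitting and relaunching in between) is still a manual step. The function names and the default limit are assumptions for this example.

```python
import os


def pack(files, max_bytes):
    """Group (path, size) pairs, in order, into batches of at most max_bytes.

    A single file larger than max_bytes ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for path, size in files:
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def batch_folder(root, max_bytes=100 * 1024 * 1024):
    """Walk root and return its files grouped into batches of about 100 MB."""
    files = []
    for folder, _subfolders, names in os.walk(root):
        for name in sorted(names):
            full_path = os.path.join(folder, name)
            files.append((full_path, os.path.getsize(full_path)))
    return pack(files, max_bytes)
```

Because files are taken in folder order, each batch tends to stay within a few neighbouring folders, which keeps the imported groups easy to recognise afterwards.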