Importing suddenly slow

Hello, and a good holiday time to all!

Is there someone out there who could give me some advice? I treated myself to DTP for Xmas, and am really happy playing and exploring. :smiley:

But I ran into a problem: for testing I tried to import a large collection of files, but noticed that after a while the Mac would not release the screensaver (the backlight went on but the screensaver stayed), requiring a force quit.

The imported folders were archived journal articles, each consisting of an HTML article and a number of JPG thumbnails and full-size images. I explored this with a script (it took me a while to learn and write, and I had some fun) that cut the import into slices (one journal issue each, hence the varying slice sizes) and captured the time required to import each slice. This is what it said:


Database       Import   Files
Content*       Time**   HTML    JPG   Other   Total
     0            108    241    576     124     941
   941            168    236    752     132    1120
  2061            159    277    566     150     993
  3054            223    278    738     148    1164
  4218            216    270    632     154    1056
  5274            318    314    862     192    1368
  6642            362    305    828     168    1301
  7943            328    296    688     140    1124
  9067   =>       345    246    702     136    1084
 10151   =>       842    274    736     146    1156
 11307           1275    263    596     148    1007
 12314           1702    266    854     140    1260
 13574           1459    270    638     156    1064
 14638           2018    286    952     162    1400
 16038           1709    282    660     154    1096

* = number of previously imported files
** for the slice of files in that line; in seconds
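
Because the slices vary in size, the jump is easier to see as seconds per file. A minimal Python sketch of that arithmetic, using only the numbers from the table above (this is not the original import script, and the names `slices`, `seconds_per_file` and `largest_jump` are made up for illustration):

```python
# One tuple per table row:
# (files already in the database, import time in seconds, files in this slice)
slices = [
    (0, 108, 941), (941, 168, 1120), (2061, 159, 993),
    (3054, 223, 1164), (4218, 216, 1056), (5274, 318, 1368),
    (6642, 362, 1301), (7943, 328, 1124), (9067, 345, 1084),
    (10151, 842, 1156), (11307, 1275, 1007), (12314, 1702, 1260),
    (13574, 1459, 1064), (14638, 2018, 1400), (16038, 1709, 1096),
]


def seconds_per_file(rows):
    """Return (database content, seconds per imported file) for each slice."""
    return [(before, secs / total) for before, secs, total in rows]


def largest_jump(rows):
    """Find the slice whose per-file time grew most relative to the previous slice."""
    rates = seconds_per_file(rows)
    worst = max(range(1, len(rates)),
                key=lambda i: rates[i][1] / rates[i - 1][1])
    return rates[worst][0], rates[worst][1] / rates[worst - 1][1]


for before, rate in seconds_per_file(slices):
    print(f"{before:>6} files in database: {rate:.2f} s/file")

start, factor = largest_jump(slices)
print(f"Largest slowdown: slice starting at {start} files, factor {factor:.1f}")
```

On these figures the per-file time creeps up slowly at first, and the biggest relative step is at the slice that starts with 10151 files already in the database.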

(I have a nice diagram, but how do I get it into this post?)

(If you want to compare mileage: AlBook G4, 1.5 GHz, 1 GB RAM, Mac OS X 10.4.10, with the DTP database and the import files on the internal hard disk)

I was surprised to see that the JPGs had the largest impact on import time, as the last third of the imports shows (in a diagram, the lines for JPGs, Total, and Time run almost parallel). Just out of curiosity: why is that so? There should be no words to index. :question:
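
The "parallel lines" impression can be checked roughly with a Pearson correlation between import time and each file count over the last five slices of the table. This is only a sketch: five data points are very few, a correlation says nothing about the cause, and the names `time_s`, `counts` and `pearson` are invented for this example.

```python
from math import sqrt

# Last five rows of the table (database content 11307 and up).
time_s = [1275, 1702, 1459, 2018, 1709]
counts = {
    "HTML": [263, 266, 270, 286, 282],
    "JPG": [596, 854, 638, 952, 660],
    "Other": [148, 140, 156, 162, 154],
    "Total": [1007, 1260, 1064, 1400, 1096],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


for name, series in counts.items():
    print(f"{name:<6} vs. time: r = {pearson(series, time_s):.2f}")
```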

But what really worries me :open_mouth: is the sudden jump in required time after a certain number of files (between 9000 and 10000) have been imported. What is the cause of this behaviour?
And really important: how can I avoid it? More RAM? (BTW: what would be the max for an AlBook G4?)

All the best for the New Year (including many updates and new versions).

Br@

The increasing import times are RAM dependent.

Adding new content to a database has two effects.

The first, and smaller, effect is that the database gets “spread out” somewhat. It’s a good idea after adding a large batch of new content to run Tools > Verify & Repair, followed by Tools > Backup & Optimize. Optimization ‘compacts’ the database, making reading data more efficient.

The second, and often more significant, effect is crossing the threshold from memory operations in RAM to the use of Virtual Memory. Apple’s Virtual Memory allows processes to continue even after there is no free RAM left, but at the expense of swapping data back and forth between physical RAM and the hard drive. Because reading and writing data to disk is much slower than the same operations in RAM, memory-intensive operations can slow dramatically. So construction of the database Concordance (analogous to indexing the text of new content) will slow down when it happens in Virtual Memory.

Your PowerBook G4 is probably limited to 1 GB RAM. The current MacBook and MacBook Pro laptops have a maximum of 4 GB RAM. There’s an old truism: “RAM is good; more RAM is better.”

There are some tricks to speed things up when Virtual Memory comes into play. Quitting other open applications can help. Quitting, then relaunching DT Pro can restore fast database operation, for at least a while, as can restarting. Creating topically designed databases to keep them smaller may be feasible for some content. As I’m managing more than 150,000 documents I’ve got them split into a number of topically designed databases, and that works well for me. (My MacBook Pro’s 100 GB HD couldn’t hold all my databases at once.) But my next Mac laptop will have 4 GB RAM. :slight_smile:

Hello, Bill,

Now that is what I call a fast response. :smiley:
Thank you for the information. I will try to have the script run Repair and Optimize in between the import chunks.

Would quitting and restarting DTP after that be helpful, as a re-initialization of RAM usage? Or does DTP already do that as part of Repair or Optimize?

It appears that the main problem is the virtual memory threshold. Now that I know it lies somewhere after 9000 files in my import scenario, I could enlarge the chunks to some 8000, couldn’t I? Or could that be problematic in ways I don’t know about?
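
To make the idea concrete, here is a minimal Python sketch that groups whole journal issues into batches of at most 8000 files. It assumes that each batch is followed by a quit and relaunch of DTP, or goes into its own database, and that this actually resets the slowdown, which is not confirmed here. The function name `plan_batches` and the limit are hypothetical; the issue sizes are the Total column of the table.

```python
# Files per journal issue, taken from the Total column of the table.
issue_sizes = [941, 1120, 993, 1164, 1056, 1368, 1301, 1124,
               1084, 1156, 1007, 1260, 1064, 1400, 1096]


def plan_batches(slice_sizes, limit):
    """Greedily pack consecutive slices into batches of at most `limit` files.

    A slice is never split; a single slice larger than `limit`
    ends up in a batch of its own.
    """
    batches, current, count = [], [], 0
    for size in slice_sizes:
        # Close the running batch if this slice would push it over the limit.
        if current and count + size > limit:
            batches.append(current)
            current, count = [], 0
        current.append(size)
        count += size
    if current:
        batches.append(current)
    return batches


for number, batch in enumerate(plan_batches(issue_sizes, 8000), start=1):
    print(f"Batch {number}: {len(batch)} issues, {sum(batch)} files")
```

With a limit of 8000 this gives three batches of 7943, 6695 and 2496 files.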

And, I was told today, I could go as far as 2 GB of physical RAM, which is on the shopping list now.

Greetings,
Br@