Help creating external files

Recently my genealogy database program started to crash. I had it, DA and DTP open at startup and would switch between them. Since I never experienced that behavior during 4+ years, I became suspicious that I had insufficient RAM to run all three (512MB). For a week or so now I have run each separately and have experienced no more crashes.

While working through this problem I noticed my DTP database is already at 2GB and I have barely scratched the surface of storing the many documents that I will eventually need to store.

I would like to turn all of my documents into external files. I think I understand I would then merely index each and thereby reduce the size of my DTP database. Is that correct?

Secondly and the “biggie”, how do I turn them into external files? Do I export them, index them and then import the index? Or do I index them first in DTP and then export the originals leaving the indexed files in DTP?

I also am very concerned about what I need to do with the replicates. Here is an example of what I have done so far, to help explain what I am asking. I locate a census copy using DA and put it into DTP as a PDF + text. That gets moved into a group called Census and into a subgroup for the particular year of the census. From that original I create a replicate for each family member listed on the page; so for a family of parents and 12 children I replicate the original 14 times. One goes into a group for each of the 14 persons.

Now, my question(s). How do I create the external files and not lose the replicates? Is it possible? Do I need to export the replicates as if they were the original and then index them?

The bottom line is that I have mulled this over to the point that I am totally confused as to what will work and what won’t work. So any help that can be offered may need to be sort of a “lead me by the hand approach”.

Anyone willing to offer comments?

Wayne:

If you have been importing PDFs by copying them into the database Files folder, there would not be significant memory saving by changing to an Indexed database.

I don’t know how large your PDF documents (census record pages, I assume) are.

512 MB RAM is pretty light for building a big database. I don’t know what Mac you are using, but you might investigate adding RAM, which is pretty cheap these days. OS X alone is using about half of your available RAM.

I’m not sure about the logic of replicating a PDF 14 times for a family of 14. Although there’s only one original PDF file, you are replicating the text content 14 times. If there’s extraneous text (about other people), that will add unnecessarily to your database size. How about creating a rich text document for each person, and using a static hyperlink from a string in the text to the census record document? E.g., for a child, create a rich text document named “Harold Smith”, perhaps add comments, then type and select the string “census data”, control-click on the selected string and select the contextual menu option Link to, choosing the appropriate census page that holds the reference data. That will take less memory, and adds the advantage of allowing notes, comments and links in the document. For example, I would imagine that Harold Smith will grow up, marry and have children – this approach would allow you to create and link to another text document for him as the father of children, and so on.

Here are the statistics on my database:
Groups: 291 (2 replicates)
HTML Pages: 1
XML files: 0
Plain Text: 1 (1 replicate)
Sheets: 0
Quicktime: 0
Images: 239 (332 replicates)
Web archives: 0
Links: 0
Rich texts: 100 (106 replicates)
Records: 0
Words: 1,833 unique, 31,743 total

This is one thing I am aware of. I have an eMac with a PPC G4. I can upgrade memory to 2GB and I intend to do so - maybe 1GB at a time. But as I mentioned before, with my database already at 2.05 GB, and I’ve only begun to scratch the surface of what I want to store, even 2GB of RAM might soon be insufficient.

I’ll have to try this approach. From reading it I’m not sure I understand what the end result will be. I’m not sure, if I’m establishing a link, why I need an RTF. Could I merely link to the original document instead for each of the persons? My picture of what I want to end up with is the ability to open a group of documents, for example, 1910 census records, and see all of the records for that year. Conversely, I want to be able to open the group for myself, for example, and see not just that census record but also any other records such as a birth certificate, an employment record, a marriage record, and so forth. None of the information will change, and whatever notes, comments, etc. are needed I would keep in my genealogy database. That database, BTW, will have the equivalent of a footnote (citation) but also has the capability to link to files outside the database and open them for viewing. That is another reason for thinking about using external files. So you can see I’ve come full circle in my thinking. Maybe I should ask the question this way: forgetting about the size of my DTP database for a moment, what advantages and disadvantages are there to taking all my existing files inside DTP and making them external? Would I even need to index them? Might I just link to them as necessary?

Adding 1GB of RAM will make a world of difference in your daily interaction with your Mac. It’s the cheapest solution with regards to time spent on your project. I’d start out with that before trying anything else.

Hi Community,

I really do agree with anard. Everything I’ve read says that UNIX wants as much RAM as it can get,
and so do DTPro and DA (with an internal archive). That’s OK; it’s a database you are working with.

So here is my experience: I have a database of 1 GB plus 222 MB of files in it.
So I think it is important to check the size of your database. Take a look at the “Files” folder and subtract the internal backup.

Next I checked the RAM usage, for example with the little free app “ExpireSEO” and maybe the free app “Raging Menace - MenuMeters”.

These let you keep an eye on your RAM, the page-ins and page-outs, and much more, such as virtual memory and swap files.

I use 1 GB of RAM, and for a database of 1 GB net that isn’t enough (until V2 of DTPro!?). After working for some hours and opening and closing several databases I lose more and more free RAM, and my PB starts to page in and out and begins to swap to disk (take a look at private/var/vm). Search starts to run slowly, and the “pizza wheel of death” appears.
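For anyone who wants to see how much has been swapped out without opening the folder by hand, here is a minimal Python sketch. It assumes the swap files sit in /private/var/vm as described above; the function name list_swap_files is just something made up for illustration, and the script only lists file sizes, nothing more.

```python
import os

def list_swap_files(vm_dir="/private/var/vm"):
    """Return (name, size_in_bytes) for each regular file in vm_dir, sorted by name."""
    entries = []
    for name in sorted(os.listdir(vm_dir)):
        path = os.path.join(vm_dir, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries

if __name__ == "__main__":
    # Assumption: the default folder only exists on OS X; skip quietly elsewhere.
    try:
        for name, size in list_swap_files():
            print(f"{name}: {size / 1024**2:.0f} MB")
    except (FileNotFoundError, PermissionError):
        print("Swap folder not found or not readable on this machine.")
```

If the list keeps growing while you work, the machine is running short of RAM.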

So I have decided to buy an additional 1 GB of RAM (in the very near future) as the ONLY solution.

If I understand how DTPro works, there are some handicaps, e.g. with phrase searching, if you have only indexed external files.

OK, that’s my experience. I’m interested to hear about the experiences of other users: how much RAM you are using, and the net size of your database (net means without internal backups, plus the size of the Files folder).

Thank you and have a nice day.

Nevertheless, many thanks to all the guys at DEVONthink for the marvellous apps and support.
My credo for DEVONthink: it’s the content that counts.

Using OS X 10.4.6
Using DEVONthink Pro 1.1.1
Using DEVONagent 2.0.1

Since v1.1 the only limitation is that indexed files can’t be edited but everything else is possible (See Also, Classify, Phrase Searching) if the files have been indexed by v1.1 or later.

Have you checked how much RAM is needed if I use an index-only database?

Would I need less RAM, maybe? It’s only a question for my own information, and maybe for the rest of us.
Personally, I need the ability to edit inside my main database.
But I do have other databases for read-only use.

Thanks a lot.

Yes, indexing needs less RAM (usually around 50% of importing).

Dear Mr. Grunenberg,

where can I find a complete comparison of the pros and cons of a “real” database and an index-only database (as seen above: phrase search, RAM requirements and so on…)?

Maybe you will say: read the manual?

Thanks again

There’s no comparison yet available but the major differences are…

  • indexing requires less RAM (around 50%) and creates smaller databases
  • indexed files aren’t editable
  • indexing requires the original files (as they’re referenced and not added to the database)

I’m still trying to understand some of the things with external files. Here is my latest question.

My Genealogy Research Database is 2.05 GB when I use Get Info. In preparation for some testing, I duplicated that folder and, as expected, the copy was 2.05 GB. I then opened the copy in DTP, selected everything and exported it to a folder on my desktop. Lo and behold, I have somehow lost 0.82 GB of data, because the new folder when checked with Get Info is only 1.23 GB. Does that mean that, even though I used the Select All command before exporting, either not everything was selected or the export was not 100% complete?

Actually I’m not sure if you’re talking about folders in the Finder or about groups in DEVONthink. If you’re talking about groups in DEVONthink, then the size shown in the Info panel is not comparable to the size of (exported) files/folders.

E.g. the size in the Info panel includes meta data, indexes, thumbnails etc. but not external files. In addition, DEVONthink uses 16-bit unicode internally.
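To illustrate just the 16-bit Unicode point, here is a small Python sketch. It says nothing about how DEVONthink actually lays out its data (that is not described here); it only shows that the same plain English text takes twice as many bytes when each character is stored in 16 bits as it does in a typical 8-bit export. The sample string is made up.

```python
# A made-up sample string containing only plain ASCII characters.
text = "1910 census, Harold Smith"

# Size when each character is stored in 8 bits (true for ASCII text in UTF-8).
utf8_size = len(text.encode("utf-8"))

# Size when each character is stored in 16 bits (UTF-16, no byte-order mark).
utf16_size = len(text.encode("utf-16-le"))

print(f"characters: {len(text)}")
print(f"8-bit bytes: {utf8_size}")
print(f"16-bit bytes: {utf16_size}")
```

So text held in 16-bit form can account for part of a size difference on its own, before metadata, indexes and thumbnails are counted.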

I guess I don’t know which of the two I’m referring to. I checked everything by using the Get Info command in the Finder window. E.g. I checked my database folder, which is located in my Documents folder, using the Finder for the first comparison. I made the second comparison on the copy of my database using the Finder. I then created a folder on my Desktop, opened the copy of my database in DTP, used the Select All command from the Edit menu, exported and then used the Finder command to Get Info on my Desktop folder. That showed fewer GB.

I’m OK with this even if I do not understand it so long as I am not missing information in the folder containing the export. The folder holding the exported data has some extra files called DEVONtech_storage. I thought from other posts I saw that these files would have the meta data, etc. and therefore that folder size should be the same as the original DB and/or the copy. Unless there are some indexes created internally, that should not be a factor. I have no indexes in my original DB.
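One way to see where the difference comes from is to total the file sizes per top-level item in each folder and compare the two lists. Below is a minimal Python sketch of that idea; the function names folder_size and size_by_top_level are invented for illustration, and the totals are plain byte counts of file contents, so they may not match the Finder’s Get Info figure exactly.

```python
import os

def folder_size(root):
    """Sum the sizes in bytes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total

def size_by_top_level(root):
    """Map each item directly inside root to its total size in bytes."""
    sizes = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            sizes[entry] = folder_size(path)
        else:
            sizes[entry] = os.path.getsize(path)
    return sizes
```

Calling size_by_top_level on the database folder and again on the export folder would show which items (backups, for instance) exist only in the database and how much space they take.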

Database packages contain indexes & backups, of course, so the sizes aren’t comparable.