Best alternatives for not having replicants across databases

dennisrhidalgo · May 29, 2016, 1:46pm

Previous discussions about splitting databases have touched on ways to avoid duplicates on your hard drive after the split. Most of the posts I found predate the recent updates. So, I wonder, what is new?

In all honesty, I can’t yet see the benefit of keeping smaller databases (let’s say, 40 million words), but I am still splitting them out of necessity, to avert catastrophic crashes and data loss. The upside, I think, is that the AI works a bit better on smaller databases. As I suspected, however, I am now losing not data, since there are no more crashes, but time. Jumping from one database to another in search of the right keyword combination is simply time-consuming.

Replicating across databases would have been a pleasant solution. But since that is not possible, could we find a middle ground, perhaps, with indexing? What are the benefits of picking a few word-heavy documents from one database and indexing them in other databases where I might want them to show up in case they become relevant to a search there? I suppose that would not take up more space on my hard drive, but would it increase those databases’ word counts? And if so, would that be any different from (re)including the documents in full (and so defeating the purpose of splitting)? I hope that searching for a solution to this problem does not become the quest for the Holy Grail.

Thanks

BLUEFROG · May 29, 2016, 2:53pm

I can’t picture what you’re describing here.

Also, do you mean you’d index the same files in the Finder in more than one database? If so, then yes, people already do this. And yes, it does add to the word count of both databases.

dennisrhidalgo · May 29, 2016, 3:16pm

I meant choosing a document that belongs to another database through the Finder and indexing it in a new database rather than adding or duplicating it. As for your last point, that’s a bummer.

To help illustrate the jumping picture:
I had to split the original database into 10. The original was about half a billion words.
They are now divided by region, primary and secondary sources, and type of source. For example, Haiti/PS/Newspapers; Haiti/PS/Books; Haiti/SS/Articles, etc. So, if I want to search for a specific name combination (e.g., Granville-insults) across all of the sources (primary, secondary, and regions), I need to jump from one database to another. Then, I must collect/open the results with Preview so that I can keep just a few databases open at the same time, and thus avoid crashing. Not familiar? Perhaps I am dealing with an exceedingly high number of words?

I thought that by indexing a few of the most needed documents into other databases, I would still be able to search them without having to open new databases, while still keeping the word count low.

BLUEFROG · May 29, 2016, 3:24pm

But logical.

Are you using the Full Search (Tools > Search…) or the search field?

dennisrhidalgo · May 29, 2016, 3:47pm

This is such a useful and yet so basic piece of information. It shows how little I know about using DT. I have only searched within a single database before (with the exception of DEVONsphere Express).

After playing a bit with Full Search, I am beginning to appreciate the use of smaller databases. But would having all databases open at the same time while doing a Full Search invite a crash?

Thanks

Greg_Jones · May 29, 2016, 5:07pm

Unlike a document that has been imported into a database, an indexed document can be duplicated across multiple databases, and the resulting entries in those databases will still point to the single instance of the document in the Finder. In other words, duplicating indexed documents works just like replicating documents that are imported into the database.

BLUEFROG · May 29, 2016, 5:43pm

Not necessarily. It depends on available resources and current system stability. More RAM is always good. Rebooting regularly is nice for your machine too.