I've imported for five years... time for more indexing?

padillac · August 9, 2015, 11:57pm

(I did a GA search for ‘index import’ - no quotes - and read everything… but I filtered out similar results, so it was only 10 or so. I may dig deeper later…)

As mentioned in the subject, I’ve used DTPO for a little over five years, and I’ve always imported.

I’ve recently started using Ulysses for writing, and I love it - and if I want my writing to show up in DTPO, I need to use Ulysses’s “external folder” feature, and index that folder into DTPO. No problems there.

Here’s the scenario that bugs me…

Ever since I started using DTPO, I’ve wanted it to “see also” across multiple databases.

I know, Bill D. considers the single-database “see also” restriction to be a feature - but my brain doesn’t work the same way as his, and I don’t agree.

I want the “see also” pane to list the open databases, with checkboxes alongside each one, and let me select which databases to include in see also on the fly. There’s no reason Bill can’t check just one db (or “current database”, to maintain existing functionality).

Alas, every time I upgrade, I comb the release notes for this feature… only to be disappointed.

So, why do I want this feature in the first place? Well, I like to scope “see also” to different levels, on the fly. I would like to, anyway. Consider this setup:

Primary sources database
Notes database
Project A database
Project B database

I’m working on project A… and at any given point, I’m curious about the following things relative to a document in project A.

What other documents have I created for project A that relate to the current document?
What documents from my old project B relate to the current document?
Which of my previous notes relate to this document?
Which primary sources relate to this document?

Basically this lets me widen the net a bit… starting with only the stuff I’ve created for this project, then looking at other projects, notes, and eventually my widely-captured sources. And then zoom right back in on only project A.
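
The widening described above can be sketched as a small model. This is only an illustration of the idea, not anything DTPO provides: the `SCOPE_ORDER` list and the `databases_in_scope` function are hypothetical names, and the database names are the ones from the setup listed earlier.

```python
# Hypothetical model of "widening the net"; this is not a DEVONthink API.
# Databases are ordered from the narrowest context to the widest.
SCOPE_ORDER = ["Project A", "Project B", "Notes", "Primary sources"]


def databases_in_scope(level, order=SCOPE_ORDER):
    """Return the databases a "see also" would cover at a zoom level.

    Level 1 is the narrowest (current project only); the highest level
    covers every database in the list.
    """
    if not 1 <= level <= len(order):
        raise ValueError(f"level must be between 1 and {len(order)}")
    return order[:level]


# Walk outward from the current project, one database at a time.
for level in range(1, len(SCOPE_ORDER) + 1):
    print(level, databases_in_scope(level))
```

Zooming back in on only project A is then just a matter of returning to level 1.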

That’s the dream… and should DTPO ever implement see also across databases, where I get to choose the databases, I will be even more in love with it than I am now.

But it doesn’t, so I’m left wondering what to do…

Should I move all of my database contents to external folders, and then index them into DTPO databases?

Here’s what it would let me do:

Maintain same database structure as before, but with everything indexed instead
Create an Active project database, and duplicate folders from relevant databases to the Active Project database.

So for example… I’m working on Project A. I duplicate the indexed Project A database folder to Active Project. Now I can answer “which project A documents relate to this?” (same as I could do before…). But now I want to see which project B documents relate, so I duplicate the Project B database folder to Active Project, and now I see the document in relation to documents from both contexts.

It will be a lot of duplicating and deleting, but I’m sure I can script parts of it to speed it up…

This has the added bonus that I can actually “zoom in” even farther, by duplicating only a particular folder from a given database.

I don’t really want to give up importing - I really like the fully-contained nature of DTPO databases, and with indexing I pretty much have to give up on replicants (replicants disappear if I move a file, unlike tags which seem to track the file id instead of path). But… my killer usage scenario for DTPO is to be able to “see also” across multiple databases. And I think the only way to do that, for now, is to move everything out of DTPO and index it instead.

Is this the right way to go about it?

There’s also the possibility of using DEVONsphere for wider “see also”… but I haven’t really used it, and don’t know how well it works. Plus now I’m intrigued by the finer-grained possibility of duplicating specific folders to an Active Project database.

korm · August 10, 2015, 12:43am

I’d also like multi-database See Also. Maybe someday?

It’s pretty easy to index the same folders into multiple databases. Not sure I understand your case for large amounts of duplicating, though. Put the common data in a file hierarchy and then index that hierarchy everywhere it’s needed. It’s easy to test – so I’d suggest you do that; starting on a small scale.

On the other hand, if you find that a big chunk of each database is shared with other databases then perhaps you should merge those databases. It’s an analysis you’d want to do from time to time.

DEVONsphere is a good option, and it’s priced reasonably enough that it’s not too dear an experiment. (I don’t get royalties.)

padillac · August 10, 2015, 1:26am

It’s not that I’m duplicating large amounts of data… it’s just a quick way to index something in multiple databases.

Create folder A on disk, index it to database A.
Within database A, select folder A and duplicate to database B… it’s now indexed in B as well.

At least if I understand things right, that’s the same as viewing the folder in the Finder and indexing it to database B. Or doing File->Index and using the dialog…

I want to be able to adjust the “see also” context on the fly - maybe changing the view up after 5 or 10 minutes of working. Right clicking on a folder in A, and duplicating to B, is the quickest way to index that specific folder in B.

And if I’m horribly wrong about this, I’d love to know!

korm · August 10, 2015, 3:12am

Got it. It’s like going through the back door to get to the front door – but it’s only one door in both cases

Allsop · August 10, 2015, 4:41am

I understand the principle, but embarrassingly I cannot work out how to index (1) between databases and (2) within the same database; I have always used copy item link for this. An explanation of how to index in both these cases would be appreciated. Cheers.

BLUEFROG · August 10, 2015, 6:03am

You can’t index between databases or inside a database. You can duplicate indexed instances, but it is indexing the filesystem, not your database contents.
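
A toy model may make the distinction clearer. The sketch below is an assumption-laden illustration, not how DTPO works internally: `ToyDatabase` and its methods are hypothetical, and an "indexed" folder is modeled simply as a stored path that is read from disk on demand.

```python
import tempfile
from pathlib import Path


class ToyDatabase:
    """Hypothetical stand-in for a database; not the DEVONthink API."""

    def __init__(self, name):
        self.name = name
        self.indexed = []  # folders on disk, referenced rather than copied

    def index(self, folder):
        """Record a reference to a folder on disk."""
        self.indexed.append(Path(folder))

    def indexed_files(self):
        """List file names by reading the referenced folders from disk."""
        return sorted(
            f.name for folder in self.indexed for f in folder.iterdir()
        )


with tempfile.TemporaryDirectory() as tmp:
    folder_a = Path(tmp) / "folder_a"
    folder_a.mkdir()
    (folder_a / "draft.txt").write_text("first draft")

    db_a, db_b = ToyDatabase("A"), ToyDatabase("B")
    db_a.index(folder_a)
    # "Duplicating" the indexed folder into B just reuses the same path.
    db_b.index(db_a.indexed[0])

    # A file added on disk afterwards is visible through both databases.
    (folder_a / "notes.txt").write_text("new note")
    files_a = db_a.indexed_files()
    files_b = db_b.indexed_files()
    print(files_a, files_b)
```

In this model both databases point at the same folder on disk, so neither holds its own copy of the contents.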

Allsop · August 10, 2015, 7:39am

Ah! I am guilty of not reading the post closely; I missed the words “on disk” in padillac’s “Create folder A on disk, index it to database A.” Thanks for enlightening me.

FROBGOBLIN · August 10, 2015, 10:15am

I’d like the option to connect different databases, and the lack of integration across them is one reason why I tend to throw almost everything into a single one. For example, I have a large amount of my notes in my journal group, and these notes span lots of different projects, personal, and research interests. It just isn’t possible to separate all of that stuff out unless I begin writing in multiple journals (that isn’t going to happen), and anything that isn’t in the same database as my notes can’t take advantage of the AI. The same issue comes up again and again for other groups as well. There are actually very few things that aren’t somehow connected to everything else. My life is a mess?

But, to be honest, everything in one database is working fine right now. The only minor problem is that I can’t take advantage of having multiple databases. I have several tens of thousands of items in my main DT database right now and it runs OK, but I imagine it will encounter hiccups at some point as I pour more content into my DT. That’s when I might run into a problem and need to “archive” stuff.

BLUEFROG · August 10, 2015, 3:51pm

Instead of creating monolithic databases, maybe you should look at creating smaller databases and just copying the data into them.

Cross-database replicants have underlying technical issues that don’t seem to have any real solution at the moment (just as hard links can’t be used across filesystem boundaries). Also, cross-database linking will be plagued by problems if not all of the required databases are available.
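
For readers unfamiliar with the hard-link analogy, here is a minimal sketch using Python’s standard library. It only illustrates hard links on an ordinary filesystem and says nothing about how replicants are implemented; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    linked = os.path.join(tmp, "linked.txt")

    with open(original, "w") as f:
        f.write("shared contents")

    # A hard link is a second name for the same file, not a copy.
    os.link(original, linked)

    same_file = os.path.samefile(original, linked)
    link_count = os.stat(original).st_nlink
    print(same_file, link_count)

    # If `linked` were on a different filesystem, os.link would raise
    # OSError (errno.EXDEV), because both names must live on one filesystem.
```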

When I worked in Graphics Production / service bureaus (for years and years), the lesson instilled in me (which I hold to this day) is: Create a structure you can hand off to someone else, where no other resources are required to reproduce the job. Even if this means replicating data from other jobs (e.g. “We used that logo on the brochure already. Why are you copying it again??” “Because it’s what’s required to reproduce the job from one package.”)

I guess it depends on what exactly you are doing, but this mindset still helps me in many ways today. (And I promise, I never get any calls that some part of a project is missing.)

Sorry if this is a bit off-topic.

FROBGOBLIN · August 10, 2015, 4:40pm

good advice about duplicating data and thinking like an outside observer who isn’t immersed in the task at hand (your future self).

the problem i have with multiple databases is that just about everything is connected, and this is why i have replicants scattered throughout my database. if the ios app ever works with tags, i’ll probably abandon most of my groups, and rely a bit less on replicants. admittedly, not everyone can tolerate working in such a muddled database! but, a combination of tagging, smart folders, and searching can obviate the need for a lot of organizational drudgery.

for my use case, it’d be sufficient to have multiple databases accessible using see also and the classification (basically, accessing all of the existing databases for these particular features). replicants and links across databases would be nice, i guess, but i imagine it’d be a nightmare to manage on the backend, and i doubt the time invested (assuming it could be done) would be worth it.