folder structure/smart folders

buridan · September 19, 2006, 9:19pm

because my database has become too big for my little albook to manage without the spinning disk of doom for minutes, I decided to break the data down and reimport it. I use mostly smartgroups and i have over 1000 of them hierarchically arranged. They work great… as long as the database is small enough. i’d just like to have the same ones across all of my databases. Is there any way to move my smart groups/folder structure from my old database to my new databases?

Bill_DeVille · September 20, 2006, 12:05am

I doubt that I fully understand the structure of your database.

Are you hierarchically nesting “smart” subgroups inside of smart groups? I’ve never done that. I’m not sure I’ll try that, as I have some lingering reservations.

My initial response is that of course you could select your existing hierarchical structure, export it, and then import the content of the exported folder into a new database. Your organizational structure would be retained in the new database, and the smart group script would still be attached to the exported group(s). But simply doing that would result in a database at least as large as your current database, if your smart groups have comprehensively captured the content of your database. I’m really not sure about that. I’ve only experimented with exporting a single smart group to a new database. In that case, there were no replicants inside the group in the new database, as there was only one instance of each document.

My major confusion is that I think I understand that you wish to break your big database into topical databases, each one smaller than the original. I don’t understand how your thousand or so smart groups could fit into databases with different, selected contents and still be meaningful. I tend to think of organizational hierarchies built to fit the contents of each database, rather than fitting content into a pre-designed structure. But perhaps I don’t understand the nature and purpose of your database.

To do what you want to do, I would suggest that you make a copy of your database and simply delete all of the documents, leaving empty smart groups. Then I would make named copies of each of those empty databases to hold each of your new “split” databases.

Next, I would make a copy of your existing database and open it (use a copy in case you make a mistake). Now run one or more searches designed to filter the contents according to a topical subset of the existing contents, replicating the results of each search to a new group that you will export for import into a matching topical database. You will have to exercise some judgement about the types of search queries to use to make these topical “breakouts” of your existing content.

When you import that “filtered” material into a new empty database, your smart groups should “attract” the items appropriate to each smart group.

This will be a bit slow initially, as you are working from a large and slow database. But if you delete the “filter” groups after each export, choosing to delete all instances of their contents, your working copy of the database should gradually grow smaller and faster.

Will you move every item to its most appropriate new database home? No guarantee, which is why I suggest you retain the original database. You might find it useful to explore it once in a while to look for material that may not have been moved, or that might fit in more than one of your new databases.

I’m still curious about your database structure and purpose.

buridan · September 20, 2006, 12:32am

well if i have a smart group called jackson, then i have smart groups janet jackson, joe jackson, etc. then the smart groups will appear within the other smart group, nested, though they exist in their original location too, which would be people.

can’t do that, it exports to several times larger than it is, and by several i mean the db reads as 6gb, and i stopped the last export at 20gb and it wasn’t 1/2 done. exporting doesn’t work because i have 12000+ pdfs, and each one is in many smart groups, which, when exported, seems to make each copy of those pdfs real on the drive. it takes forever.
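A rough back-of-the-envelope sketch of why the export balloons, assuming (as it appears) that every replicant is written out as a full copy on disk. The function names and the toy file sizes are made up for illustration; only the 6 GB, 20 GB and "not half done" figures come from the numbers above.

```python
def exported_size_gb(doc_sizes_gb, groups_per_doc):
    """Estimate export size if each replicant becomes a real file on disk.

    Assumption: a document that appears in k groups is written k times.
    """
    return sum(size * k for size, k in zip(doc_sizes_gb, groups_per_doc))


def min_blowup_factor(db_size_gb, exported_so_far_gb, fraction_done):
    """Lower bound on the export/database size ratio from a partial export."""
    projected_total = exported_so_far_gb / fraction_done
    return projected_total / db_size_gb


# Hypothetical toy data: three PDFs, each filed in a different number of groups.
toy_sizes = [0.5, 0.25, 0.25]   # GB, 1.0 GB in total
toy_groups = [8, 4, 4]
toy_export = exported_size_gb(toy_sizes, toy_groups)

# Figures from the post: 6 GB database, export stopped at 20 GB, under half done.
blowup = min_blowup_factor(6, 20, 0.5)
print(f"toy export: {toy_export} GB, real export at least {blowup:.1f}x the database")
```

Since the export was less than half done at 20 GB, the full export would have come to more than 40 GB, which is over six times the size of the database itself.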

yes, in any case, that would not resolve the issue, because the imports would be fairly large in themselves. i just want to be able to copy the groups that i have to a new db, either a db with data, or a db that i can import data to.

easily, i just then have to check 4 dbs to get comprehensive instead of one.

my break up is not topical, topical will not work, so i broke my big db into 4 sections of relatively the same size based on data size.
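One way such a size-based split could be done is sketched below. This is only an illustration, not a description of how the split was actually made: it uses a simple greedy rule (largest file first, always into the currently lightest section), and the function name and file sizes are hypothetical.

```python
import heapq


def split_by_size(doc_sizes, n_bins=4):
    """Greedy split: largest document first, always into the lightest bin.

    doc_sizes maps a document name to its size (any unit).
    Returns a list of (total_size, [names]) tuples, one per bin.
    """
    # Heap entries are (running_total, bin_index); bin_index breaks ties.
    heap = [(0, i) for i in range(n_bins)]
    heapq.heapify(heap)
    bins = [[] for _ in range(n_bins)]

    for name, size in sorted(doc_sizes.items(), key=lambda kv: kv[1], reverse=True):
        total, idx = heapq.heappop(heap)
        bins[idx].append(name)
        heapq.heappush(heap, (total + size, idx))

    totals = {idx: total for total, idx in heap}
    return [(totals[i], bins[i]) for i in range(n_bins)]


# Hypothetical file sizes in MB.
sizes = {"a.pdf": 90, "b.pdf": 70, "c.pdf": 60, "d.pdf": 50,
         "e.pdf": 40, "f.pdf": 30, "g.pdf": 20, "h.pdf": 10}
sections = split_by_size(sizes, n_bins=4)
for total, names in sections:
    print(total, names)
```

The greedy rule does not guarantee a perfectly even split, but it keeps the sections close in size without any topical sorting.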

i thought about that, but it would take forever to empty all of those groups, whereas all i really want to do is copy the data structure to a new group, which should be easily do-able.

the model is the filtered garbage collector: everything that i want goes in, the smart groups grab the content and put it into the right categories, and then when i need something i just look in the category and don’t search the garbage collector. every now and then i add a new category. the new model just pluralizes the garbage collectors because the one big one has several faults. it is slow, it crashes on import of more data, and there are actually 14000+ items in the garbage collector, but the database reports 12000+. breaking it up is easy.

the idea is that the data structure should be copyable separately from the data itself. i should be able to drop my data structure onto your data and get some things popping into some smart groups. but i can’t figure out how to move a data structure separate from the data, other than by deleting from a copy.
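A minimal sketch of that idea, treating a smart group as nothing more than a saved query plus nested child groups. This is a conceptual model only: it is not DEVONthink's file format or scripting API, and the `SmartGroup` class, the substring-match query, and the sample documents are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SmartGroup:
    """A saved query plus nested child groups; holds no documents itself."""
    name: str
    query: str                      # hypothetical: a plain substring match
    children: list = field(default_factory=list)

    def matches(self, documents):
        """Return titles of documents whose text contains the query."""
        q = self.query.lower()
        return [d["title"] for d in documents if q in d["text"].lower()]

    def apply(self, documents, path=""):
        """Walk the hierarchy and map each group path to its matches."""
        here = f"{path}/{self.name}"
        result = {here: self.matches(documents)}
        for child in self.children:
            result.update(child.apply(documents, here))
        return result


# The structure is defined once, with no data attached.
people = SmartGroup("jackson", "jackson", [
    SmartGroup("janet jackson", "janet jackson"),
    SmartGroup("joe jackson", "joe jackson"),
])

# Two unrelated "databases" (hypothetical sample documents).
db_one = [{"title": "fcc.pdf", "text": "Janet Jackson and the FCC ruling"}]
db_two = [{"title": "album.pdf", "text": "Joe Jackson released an album"},
          {"title": "newton.pdf", "text": "Isaac Newton on alchemy"}]

filed_one = people.apply(db_one)
filed_two = people.apply(db_two)
print(filed_one)
print(filed_two)
```

The same `people` hierarchy is dropped onto two different document sets and each one gets filed on its own, which is the behaviour being asked for: the structure travels without the data.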

Bill_DeVille · September 20, 2006, 1:45am

buridan wrote:

Sorry, it doesn’t work that way. I can think of only two ways to recreate that organizational structure in a new database: manually recreate it, or copy the database and delete all of the contents of the groups. Actually, the latter mode should be very easy! Just delete all of the unorganized material. Presto. Your smart groups are empty.

I’m awed by the amount of work you’ve done. In the very early days of computing it would have been necessary to do that sort of detailed categorization. I do use some organization of content by groups (a form of tagging), but nowhere near your level of detail. Some people still do additional tagging, such as by keywords (in early databases, that was a necessity if one wished to find anything). I do that only very rarely.

But I could do a search on your raw, unorganized data, with no categorization, for “joe jackson” very quickly, and the database would be much more responsive without that much organizational detail.

I don’t have any special categorization for Janet Jackson in my main database, which contains about 20,000 documents. But I just ran an Exact Phrase search for her name and found four results in 31 milliseconds (the results were all about FCC regulations and a Super Bowl event). But my database doesn’t have any content that mentions Joe Jackson.

I don’t need to create a special category for Isaac Newton in order to look for information about him. I just did a search for his name, which produced 41 results in 16 milliseconds. But if I want to look just at Isaac Newton’s interest in alchemy, I can produce 9 results in 0.191 seconds.

And when DT Pro version 2.0 is released I can use even more powerful search queries to precisely find what I’m looking for in the database.

Note: the artificial intelligence features of DT Pro are enhanced when one uses a good classification scheme for content. That’s obvious, of course, for the Classification feature, but it also helps See Also. Although you don’t use Classify, I’ll bet See Also works very well in your database.

buridan · September 20, 2006, 2:28am

Bill_DeVille:

buridan wrote:

the idea is that the data structure should be copyable separately from the data itself. i should be able to drop my data structure onto your data and get some things popping into some smart groups. but i can’t figure out how to move a data structure separate from the data, other than by deleting from a copy.
Sorry, it doesn’t work that way. I can think of only two ways to recreate that organizational structure in a new database: manually recreate it, or copy the database and delete all of the contents of the groups. Actually, the latter mode should be very easy! Just delete all of the unorganized material. Presto. Your smart groups are empty.

maybe that will work, but I’ve tried the deletion thing before and i think it took forever to delete all instances across the groups…

I’m awed by the amount of work you’ve done. In the very early days of computing it would have been necessary to do that sort of detailed categorization. I do use some organization of content by groups (a form of tagging), but nowhere near your level of detail. Some people still do additional tagging, such as by keywords (in early databases, that was a necessity if one wished to find anything). I do that only very rarely.

it is not so much doing work, as finding relationships in work and using devonthink pro as a memory tool. I have conceptual categories and i have people. I also have topics, but those are used to organize in a different way. The point of people and topics is how i think about and remember schools of thought, discourse, and related matters. Concepts is the topic, it contains currently 84000+ documents in about 400 concepts. Those are concepts that I use in my writing and thinking. If i am reading a paper and it pops up with a new idea that I like, I will make a new concept and see who else and what else uses that idea, so that later i can decide to use that idea or not, but i will always remember it.

Note: the artificial intelligence features of DT Pro are enhanced when one uses a good classification scheme for content. That’s obvious, of course, for the Classification feature, but it also helps See Also. Although you don’t use Classify, I’ll bet See Also works very well in your database.

classification and see also are extremely slow and they actually work somewhat poorly. my reading ranges across about 8 disciplines in the humanities, social sciences, and information technology, and it is quite diverse, so when you do a see also, it might bring up a paper on ontology from philosophy, when actually i need a paper on ontology in library science. Diversity can be problematic for see also and classify in my database. I’ve never seen them work well, but i started with about 6600 documents that i imported because spotlight does not search them well. devonthink pro searches them fine and smartgroups categorize fine, it has just gotten ungodly slow even with 1.5gb ram and plenty of free drive space. so i need to break things down as i described. I had hoped that the design of devonthink would be such that i could just open up the package and find a group descriptors file that i could then copy and insert appropriately in new databases and ‘tada’, but apparently there is no possibility of doing that?

current database is: 1,158,000 unique words and 1.8 billion words total in the 12k document database, and I’d like to put it up to 16k which is about the amount of data i really have.

buridan · September 20, 2006, 4:02am

tried this… and no, it did not empty my smart groups, it left them full of zero byte pdf images that i could not delete or manipulate at all…

Bill_DeVille · September 20, 2006, 6:11am

Interesting. Have you done a Tools > Verify & Repair operation?

Try again with a copy of your database. Delete all but one of your unclassified documents. Then rebuild the database. Are the smart folders empty now?