Change imported PDFs to indexed: How to?

Hi,

I have a database for PDFs: a collection of all the academic papers on my computer. I imported the PDFs, but I did not intend to auto-classify them, etc. I set up a group where at least one replicant of each PDF is stored, like on a bookshelf, not to be moved anywhere.

Now I have changed my mind and want indexed PDFs instead of imported ones. I want to store them all in the folder where my DTPO database resides as well. That should make it easier to work together with Bookends.

But: How to change the records from imported to indexed?

And: I want to create the same folder structure as with the groups in my database.

Thanks for any advice. I am sure it is a beginner’s problem again…

Maria

Very interesting question.

I’m not sure if that works (without any loss of information),
but you could try the following:

  1. export the whole database to a new folder in Finder,
  2. create a new empty database (without touching the old original DB with the imported files, for safety reasons!) and
  3. try to index this exported folder.
  4. tell us, if everything worked as expected :wink:
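
If the export in step 1 does not give you the folder layout you want, the folder tree could also be prepared by script before indexing. A minimal Python sketch, assuming the group hierarchy is written down by hand as a nested dictionary. The names `mirror_groups` and `groups` and the example group names are invented for illustration, and nothing here reads from DEVONthink:

```python
from pathlib import Path


def mirror_groups(groups, root):
    """Create one folder per group under root and return the created paths.

    `groups` is a hand-written nested dict (hypothetical): each key is a
    group name, each value is a dict of its subgroups.
    """
    created = []
    for name, subgroups in groups.items():
        folder = Path(root) / name
        # exist_ok=True makes the function safe to run more than once
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        created.extend(mirror_groups(subgroups, folder))
    return created
```

Calling `mirror_groups({"Bookshelf": {"Journals": {}, "Books": {}}}, "/path/of/your/choice")` would create a `Bookshelf` folder with two subfolders. The PDFs themselves still have to be moved or exported into these folders.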

Martin

Martin,

Your suggestion put me on the right track: I just exported the “bookshelf” folder and indexed it afterwards. It worked well and kept all the metadata,

but not the UUIDs of the documents!

Here is how I checked everything: I had the originally imported documents kept in the database, so after indexing, all PDFs were duplicates, and they turned up nicely in the tag groups I had created. So this was OK, and I could delete the imported files.

But there is one serious drawback: the UUIDs of the PDF documents are different. I can still use links in documents to other documents, but the link that is created when using the notes template (where you can click on the document name to return to the document) does not work any more.

I understand why this happens, so I would ask for another way to change imported documents into indexed ones (one integrated command). Then DT could keep track of the UUIDs.

What do the developers think?

Best,
Maria

Hi Maria,

thanks for the feedback.

Did you index the PDFs in the “old” database?

What would happen if you indexed them into a completely empty DB?

Martin

Martin,

Now I see why you suggested indexing into a new database. I am sorry, but I have already done it and deleted the other files.

The problem is not too big in my case, since I am still in the phase of setting things up. I can fix the links in a few minutes, but I think that others who want to take the same step with data that has a longer history will appreciate better behavior.

Thank you very much again,
Maria

There are plans to add a “consolidate” feature (moving indexed stuff to the database and therefore importing it) but as far as I remember, this is the first request in the other direction.

Thank you for the information.

To me both directions seem plausible. Perhaps this is a problem that can be dealt with in one process: change indexed to imported and change imported to indexed?

Cheers from a non-programmer!

Maria

Maria, as has already been mentioned, I believe the challenge is that the old, imported documents were still in the database while they were indexed. I have converted a database from entirely imported documents to entirely indexed documents and in my experience, it is not necessary to index the documents into a new database. However, the old documents must be removed from the database before indexing. IMHO, changing the UUID away from documents that currently exist in the database while indexing would not be ‘better behavior’!

I don’t think I understand what this means.

If you export some documents, the DEVONtech_storage file is created which contains the UUIDs originally assigned by DT on import, but the indexing function doesn’t check this Property List to assign UUIDs to content it’s indexing.
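
For anyone curious which UUIDs ended up in that file, a Property List can be inspected with a few lines of Python. This is only a sketch: the internal layout of DEVONtech_storage is not described here, so the code assumes nothing about key names and simply collects every string that has the usual 8-4-4-4-12 hexadecimal UUID shape. That shape and the function names are assumptions.

```python
import plistlib
import re

# Assumption: the UUIDs are stored as plain strings in the common
# 8-4-4-4-12 hexadecimal form.
UUID_RE = re.compile(r"^[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$")


def find_uuid_strings(node):
    """Recursively collect every UUID-shaped string in a parsed plist."""
    found = []
    if isinstance(node, str):
        if UUID_RE.match(node):
            found.append(node)
    elif isinstance(node, dict):
        for value in node.values():
            found.extend(find_uuid_strings(value))
    elif isinstance(node, (list, tuple)):
        for item in node:
            found.extend(find_uuid_strings(item))
    return found


def uuids_in_storage_file(path):
    """Parse a property list file (XML or binary) and list its UUID strings."""
    with open(path, "rb") as handle:
        return find_uuid_strings(plistlib.load(handle))
```

`uuids_in_storage_file` only reads the file. It does not change anything in the export folder or in a database.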

In order to do so, it would have to check all databases to make sure that the UUIDs were unique, which in the example cited above is precisely not the case: having not deleted the originals, all UUIDs would have been duplicates. Now imagine that the original database was closed, and DT had no access to it other than opening it and checking…

So DT should assign unique UUIDs in this case. This seems to be “better behavior.”
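
To make the argument concrete, here is a small Python sketch of the check that indexing would need if it tried to reuse stored UUIDs. It is not how DEVONthink works internally; the function name `assign_uuid` and the set `uuids_in_use` (standing in for "every UUID in every database") are hypothetical.

```python
import uuid


def assign_uuid(stored_uuid, uuids_in_use):
    """Reuse the stored UUID only if it is free; otherwise generate a new one.

    `uuids_in_use` is a hypothetical set holding every UUID that already
    exists in any database. Building that set is the expensive part.
    """
    if stored_uuid and stored_uuid not in uuids_in_use:
        chosen = stored_uuid
    else:
        chosen = str(uuid.uuid4()).upper()
        while chosen in uuids_in_use:  # practically never loops
            chosen = str(uuid.uuid4()).upper()
    uuids_in_use.add(chosen)
    return chosen
```

In the case above, where the originals were still in the database, every stored UUID would already be in `uuids_in_use`, so every indexed document would get a fresh one anyway.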

Best wishes, Charles

Greg,

I wrote that I know why this happened. I did not wish to get into the technical details. I wanted to perform a task as simply as possible: Change the same documents from being imported to being indexed. In this case, no documents would exist in the database.

And I see that, had I proceeded with more care, I would not need to fix 5 or so UUIDs, but this is no problem for me. Others who want to do the same in the future should be careful and follow Martin’s instructions more precisely than I did, at least as long as DT does not offer such a command.

But, as Christian wrote, there is something in the making.

Cheers,
Maria

Charles, I believe we are saying the same thing. I did not understand what Maria was requesting, but what I was reading was that the UUID from the documents that are still in the database should be assigned to the documents that are now being indexed. So yes, a unique UUID is indeed the better behavior.

Fair enough.

I think Maria may have had some x-devonthink-item:// links set, and these didn’t survive the export.

A problematic situation, but I’m not sure there’s a solution (except to write an AppleScript).
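
Such a script would essentially rewrite each old link so that it points to the new UUID. The core of that idea, sketched in Python rather than AppleScript and working only on plain text: the dictionary `uuid_map` and the function name `remap_item_links` are assumptions for illustration, and the mapping from old to new UUIDs would have to be built some other way (for example by matching document names).

```python
import re

# Assumption: item links have the form x-devonthink-item://<36-character UUID>.
LINK_RE = re.compile(r"x-devonthink-item://([0-9A-Fa-f-]{36})")


def remap_item_links(text, uuid_map):
    """Replace old UUIDs in item links with new ones; unknown links stay as-is."""
    def replace(match):
        old = match.group(1)
        return "x-devonthink-item://" + uuid_map.get(old, old)

    return LINK_RE.sub(replace, text)
```

Links whose UUID is not in `uuid_map` are left untouched, so the function can be run safely over notes that mix old and new links.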

Best wishes, Charles