Would appreciate feedback from experienced users about whether I am approaching my deployment of DT Pro v1. properly. Does the following make sense as a way to organize our information and make it accessible via DT? I suspect it is a fairly common situation.
I am creating a large database of pdf’d articles from 5 annual conferences of the International System Dynamics Society, an assortment of simulation model files, and URL links to articles accessible through the online version of the System Dynamics Review. DEVONagent is working flawlessly to capture web archives of the abstract pages for each article. All the local files (pdfs and models) are copied to our server, where they are accessible via our network.
We will deploy three copies of DT on laptops that connect to the server when the users are in our offices. We want the 3 DT databases to be identical. The pdf documents will not be housed within each database on the laptops, but will reside only on the server. The databases will simply link to the copies on the server, saving hard disk space on the laptops.
The pdf articles came on CDs, each corresponding to the year of the conference, but they deal with a whole range of topics: transportation, energy, health, project management, environment, systems theory, etc. On the server, the pdf file content from each CD resides in a separate folder labeled by the year of the conference (i.e., one folder per year). I have imported each folder into a laptop-based DT database. In the database, the files from each folder sit in a group whose name is identical to the folder’s name on the server. The file path shown in the DT database leads back from the laptop to the original folder on the server.
I really want to be able to access all these files by the subject matter they relate to, not the year of publication. I could leave them in the groups categorized by year of publication and rely on DT to search by content, but that isn’t generating sufficiently narrow results. So, I have created topical groups (transportation, energy, fisheries, environment, business, project management, etc.). Originally, I created replicants for every pdf and displayed the first page of each pdf in the “vertical split” view of the database to determine which topical group the document most related to. I then manually moved one of the replicants for that file into the topical group, leaving the second replicant in the original group for the year of the conference. I did this so that I could easily see where each paper originated, since this is not included in the pdf document itself.
I now believe that I don’t need to use replicants at all. I should just sort the files into topical groups in the database, leaving them in their original arrangement by year on the server. I could rely on the path to tell me which conference they came from, but it is hard to see in the column view. Instead, I have opted to add the conference year to the Spotlight comments field for all files in a folder, using an Automator “add comments” script, so that when the files are imported into DT, the Spotlight comments show up in the comments field in the DT database.
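To illustrate the idea of recovering the conference year from the server path, here is a minimal Python sketch. It is only an assumption about how the folders might be named: the example paths (`/Volumes/Server/ISDC/2004/...`) and the function name `conference_year` are hypothetical, and the sketch does not touch DT or the Spotlight comments themselves.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# A standalone 4-digit year (19xx or 20xx) that is not part of a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def conference_year(path: str) -> Optional[str]:
    """Return the year found in the nearest enclosing folder name, or None.

    Only folder names are inspected, never the file name itself, and the
    folder closest to the file is checked first.
    """
    folders = PurePosixPath(path).parent.parts
    for folder in reversed(folders):
        match = YEAR_PATTERN.search(folder)
        if match:
            return match.group(0)
    return None
```

The returned string could then be written into whatever comment or label field is convenient, by whatever tool is already doing that job.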
So, the DT database will be grouped topically, but the files in the database will link to files on the server that are grouped by conference year.
First: Does this approach make sense?
Second: If I copy the database on the first laptop and put the copy on a second laptop, will the paths be correct, or will I have to amend them all to reflect the fact that the DTP database is on a different computer and possibly in a different location within the HD directories?
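One way to test this on the second laptop, independent of DT, would be to check whether the server paths resolve there at all. The sketch below assumes the linked paths have been collected into a plain list of strings by some means (how to export them from DT is not covered here); the function name `missing_links` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_links(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist on this machine.

    An empty result suggests the server volume is mounted under the same
    path here as on the laptop where the links were created.
    """
    return [p for p in paths if not Path(p).exists()]
```

If every path comes back missing, the likely cause is that the server is mounted under a different name or not mounted at all, rather than a problem with individual files.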
Third: Is there any way to gain some efficiency in sorting the files into topical groups? By far the most common content element in all the files is “systems”, so DT seems not to know how to distinguish content, which means that the “auto group” command pretty much leaves them all sitting in the same group they were in originally. Interestingly, using only one year of publications as a test (200+ articles), I found in random testing that the “see also” feature worked pretty well at finding related articles. I am not sure why that would happen, since I assume the same AI logic is being applied to the content of the files.
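As a rough illustration of why a word like “systems” may not help separate the files, here is a small Python sketch of inverse document frequency (IDF) weighting, a standard text-retrieval technique. This is not a description of how DT actually works, which I don’t know; the three tiny “documents” and the function name `idf_weights` are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def idf_weights(docs: List[List[str]]) -> Dict[str, float]:
    """Return log(N / document_frequency) for every term in the corpus.

    A term that occurs in every document gets weight 0, so it contributes
    nothing when documents are compared with each other.
    """
    total = len(docs)
    # Count each term once per document, however often it occurs there.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(total / count) for term, count in doc_freq.items()}


# Hypothetical toy corpus: every paper mentions "systems".
docs = [
    ["systems", "transportation", "traffic"],
    ["systems", "energy", "oil"],
    ["systems", "fisheries", "harvest"],
]
weights = idf_weights(docs)
```

In this toy corpus “systems” gets a weight of zero while “energy” or “fisheries” get a positive weight, so only the rarer words distinguish one paper from another.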
Fourth: When DT displays a web archive page, the URL links within the page are active and generally seem to work fine. However, when I attempt to activate a link on a journal publisher web site to download a pdf file of the article that is abstracted on the web archive page, DT crashes. Putting aside the fact that crashing shouldn’t happen, is there something I could do to avoid this problem? I suspect that it might have something to do with the fact that when I access the publisher web site via DT, I have not logged in as a paid subscriber, so I do not have permission to download the pdfs. However, in Safari, this would simply lead to a window explaining that I can’t download since I am not a subscriber. In DT, it results in a crash, at least on the site I am dealing with.