Making ingested file/folder structure available to other OSes?

Hello DT folks,

I am using DTPO on one system in a six-system network (three running OS X 10.5, three running XP). I run DTPO on my main “writing”/“administration” system (separate from the film editing, sound editing, and transcoding systems, etc.) and am uncertain what I should do to create a physical file hierarchy that is at least modestly clear when navigating to this system from one of my other systems. (I also need to grab PDFs etc. to read on other systems VERY frequently.)

I know that I can share references to DT stuff via a local web interface, but what I usually need is to open a project file linked in DTPO (a Final Cut Pro file, etc.) in an external program, while still enjoying DTPO’s metadata/comments and group structuring within DTPO to manage/admin that information.

I have a heck of a project to tackle this week: about 450 GB of documents, data, code, PDFs, images, sounds, etc. on … um … seven hard drives – and I’m nervous that just ingesting everything into DTPO and organizing it there will be of use ONLY to me and hell for the rest of my colleagues or for access from other machines. I also need to think carefully about where to physically locate the DTPO database so that there is room to receive all this information.

Am I missing something about DTPO 2.0 pb that might resolve these issues?

My colleagues keep telling me that I should be using something like Evernote as a better global web inbox (which seems like a lot of wasted web traffic in and out to me, though the cell phone stuff is very cute) and use Path Finder as an almost-as-cool-as-DTPO solution (with the core AI excitement cut out) for managing files locally within file structure so that there is no secondary file management concern.

I hear them … but as far as stuff restricted to my main system, I love DTPO and have a very comfortable workflow throwing open a new text file and grouping things with just enough structure to make my searches useful. I kinda wish DTPO was a full finder replacement and its metadata hierarchy translated directly into physical file folders accessible from other systems.

–Matt

I’d recommend buying a 1TB drive and throwing it all into DTP… everything’ll be accessible to everyone through the web interface with a nice search interface.

Okay… so that means the ingested files are all stored “flat” in the alphanumeric db capture folders, right? So not that easy to search from outside unless you generate the link from within DTPO?

Not a terrible idea, though.

Matt

Experiment with this:

In DT Pro Office Preferences > Server, click the Start button.

In a database, choose File > Database Properties. Check “Share Database”.

Now launch Safari. In Preferences > Bookmarks, check “Include Bonjour”, “Include Bookmarks Bar” and “Collections, Bonjour”.

Now in Safari choose the Bonjour tab in the Bookmarks bar and select your “broadcast” DTPO2 database. A remote user can click on the Loupe (magnifying glass) symbol to invoke the search options, or browse the database. Multiple databases can be accessed via the Web server.

Apple provides (on the Apple Web site) a downloadable utility that allows PCs using Internet Explorer to use Bonjour.

Well, I should’ve first asked if you are still using 1.5 or if you’ve moved to 2.0. The way you phrased your question made me think that you looked at the feature list, saw a few things, and thought “okay… I don’t see anything that’ll help me there, so I’ll ask on the forums.”

Which is good, just to be clear. (And if you’re using DTPO 2.0 and have tried the browser interface, I apologize, and you can stop listening to what I say from now on)

The 2.0 interface has a link, so if someone needs to download some file then it’s easy. Search, click button, “boom.” No problem whatsoever.

But if you need to edit files in an external editor from across the network… in other words, someone views your web interface and opens a file much like they would in Finder or Explorer and can write and save much like they would in Finder/Explorer… that gets difficult :slight_smile: I think that the natural restrictions on the whole browser interface system might make this impossible except through things like WebDAV, which I don’t pretend to know anything about.

It might be possible to ask the DEVONtechnologies developers to add in a “Reveal” button to the web interface, where you click it and it shows the location of the file in Finder/Explorer. I’d expect (being an idiot non-programmer) that this wouldn’t be excruciatingly difficult for them to do, and it would get around some of the hurdles that people have been complaining about with DTPO not being a multiuser system (which it’s not and wasn’t designed to be).

So if that makes sense and you think it’d help you, I’d recommend going to the HelpDesk or the Suggestions forum and formally requesting such a feature. If nothing else, Christian or Eric or someone else would probably explain why it’s not feasible or explain a better way to go about solving this problem.

Just a thought.

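As for this: “I kinda wish DTPO was a full finder replacement and its metadata hierarchy translated directly into physical file folders accessible from other systems.”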

I think it’d be possible to make an AppleScript that would create a “fake” folder hierarchy of real folders and aliases to the files stored within the DEVONthink database. It’d actually be very simple, and it should work flawlessly. All you’d have to do is be sure to run it frequently, or have a loop in the script that would run it frequently.

tries

Okay, here’s a way of doing just what I said. It grabs every single content record, goes through the list, creates all of the necessary directories, and then creates an alias (strictly speaking, a symbolic link made with ln -s) to each file. You end up with a folder hierarchy mirroring (minus replicants) your DEVONthink database and containing links to the original files.

  1. Change thePosixRootPath to wherever you want these files to go.
  2. Run the script.
  3. Refrain from suing me if anything goes wrong.

This isn’t a very fast script, but I think it’s about as stripped down as it can be and still be legible. A mutant like erico or Christian might be able to improve it immensely, but I’ve done just about everything I could do. It deals with spaces and strange characters in file names.

Problems:

  1. It ignores replicants. It will only create one alias to each record – the first replicant of that record in the database. That might make this totally useless. I’m not sure how to fix this, since I use “every content” (which works) and have tried “every record” (which does not).
  2. If you move a file within the DTP hierarchy, it’s up to OS X to fix the alias for you. It should work fine. Anyway, any time you make sufficient changes, you can just blow away the whole directory structure and do it over again.
  3. It works with the current database. You’ll probably want to change this to work with a specific database so that you don’t show your erotic poetry to your coworkers.
  4. This took me way too damned long to do. I spent a couple hours trying to do this with the Finder and it just wasn’t working for me. This way is rather inefficient (since it will repeat a shell call for every single file in your database), but oh well.

tell application "DEVONthink Pro"
	-- Root of the mirror tree; change this to wherever you want the links to go.
	set thePosixRootPath to "/Users/ndouglas/FakeTree"
	set theContents to every content of the current database
	repeat with thisContent in theContents
		-- path: the file on disk; location: the group hierarchy inside the database.
		set thisPath to the path of thisContent
		set thisLocation to the location of thisContent
		set theNewPath to quoted form of (thePosixRootPath & "/" & thisLocation)
		-- Create the matching folder, then link the file into it (ln -s makes a symbolic link).
		do shell script ("mkdir -p " & theNewPath)
		try
			do shell script ("ln -sf " & (quoted form of thisPath) & " " & theNewPath)
		end try
	end repeat
end tell
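
On problem 2: if a link's target does go missing (symbolic links, unlike Finder aliases, are not repaired by OS X), blowing away the whole tree is not the only option. Here is a minimal sketch, in plain Python with no DEVONthink calls, that removes only the dangling links under a mirror folder. The function name prune_dangling_links is made up for illustration, and it assumes the tree contains symbolic links like the ones the script above creates.

```python
import os

def prune_dangling_links(root):
    """Remove symlinks under root whose targets no longer exist; return them.

    Hypothetical helper: not part of the scripts in this thread.
    """
    removed = []
    for dirname, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            link = os.path.join(dirname, name)
            # islink() is true even when the target is gone;
            # exists() follows the link, so it is false for a dangling one.
            if os.path.islink(link) and not os.path.exists(link):
                os.remove(link)
                removed.append(link)
    return removed
```

Real files and healthy links are left alone, so this should be safe to run between full rebuilds, though I'd still try it on a copy first.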

Cool idea for a script, Kallisphoenix! I hadn’t thought of automating this, dunno why, but that’s cool.

If you are worried about speed, here’s a Python/appscript version of the script that links my entire 24 GB database in about 1 minute. AppleScript took about 20 minutes. This script dumps to the current shell directory, but could be modified to do otherwise.

have fun mutants all! :smiling_imp:

erico


#!/usr/bin/env python
# -*- coding: utf-8 -*-
# devonthink dump to links in file system by Eric Oberle
# based on idea from Kalisphoenix
# Uses Appscript for Python and Devonthink Pro by Devon technologies
# dumps entire current database to current directory, creating all necessary directories

import os
import sys

try:
    from appscript import *
except ImportError:
    os.system("sudo easy_install appscript")
    from appscript import *

# The following just shows the args passed.  We are not doing anything with them right now.
for arg in sys.argv:
    print(arg)

dt = app("Devonthink Pro")
base_path = os.getcwd()

# databases[1] is the first database in DEVONthink's list (appscript indexes
# from 1), which may not be the frontmost one.
all_records = dt.databases[1].contents.get()
for this_record in all_records:
    the_path = dt.get(this_record.path)
    the_name = dt.get(this_record.name)
    print(the_path + "   " + the_name + "\n")
    this_location = dt.get(this_record.location)
    new_path = base_path + this_location
    try:
        os.makedirs(new_path)
    except OSError:
        pass  # most often: the directory already exists

    print("Linking " + this_location + " to " + new_path + "\n")
    try:
        os.symlink(the_path, os.path.join(new_path, the_name))
    except OSError:
        print("trouble linking " + the_path + "\n")
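
One difference from the AppleScript version: ln -sf replaces a link that is already there, while os.symlink raises an error if the link path exists, so a second run of the Python script will report "trouble linking" for links it made the first time. A minimal sketch of a replacement step follows. The name force_symlink is hypothetical, and refusing to overwrite real files is my own assumption about what you'd want.

```python
import os

def force_symlink(target, link_path):
    """Create link_path -> target, replacing an existing symlink (like `ln -sf`).

    Hypothetical helper: not part of the script above.
    """
    if os.path.islink(link_path):
        os.remove(link_path)  # drop the old link; the target file is untouched
    elif os.path.exists(link_path):
        # Assumption: a real file here was added by hand, so leave it alone.
        raise OSError("refusing to overwrite real file: " + link_path)
    os.symlink(target, link_path)
```

In the loop above, the os.symlink call could be swapped for force_symlink(the_path, os.path.join(new_path, the_name)) so that re-running the export refreshes the links instead of complaining.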



See, I knew you could improve it, and I knew I’d learn something from it :smiley:

I never would’ve thought that the increase in speed from Python over AppleScript would be that immense, though. Gee whiz.

Edit: Holy crap, that is really fast. In the blink of an eye. Apparently, I’m going to have to learn Python now.

One note is that the code is missing a tab before


print "Linking " + this_location  + " to " + new_path + "\n"  ;

which I blame on phpBB. Another note is that easy_install didn’t work for me, so I had to install appscript by grabbing the .egg file directly and installing from that, or whatever. Just notes for other people who might have no experience with Python.

Yes, Python is extraordinarily fast for moving large amounts of data around, and the os.makedirs function basically aggregates all those mkdirs in a way that far outstrips AppleScript’s do shell script calls. It all happens inside one process instead of spawning a shell for every file, so it is very fast.

If anyone were truly interested, I could tinker around with further optimization by pushing the paths into a parallel array with the record indices. It probably wouldn’t be that hard to modify the script to mirror a single branch of the dtpro tree structure.

Good catch on the easy_install. I edited the script above and I think it is fixed. :unamused:

-erico

Thanks folks – these are some very exciting ideas. I’m setting up to test the Python script on a db – and am really crossing my fingers that I can make it work. Because this is INDEED exactly what I need. Just a touch of structure to make navigating this archive easier is all that I really need. I’ll then be doing most of the heavy lifting/admin/writing in DTPO 2.0.

Wow. This is very exciting, actually. I’m thinking this seems to be a solution. And this kicks my tail into learning python again. A language famous for being the most “elegant” and for quick adoption … very powerful for this purpose. That said, I’m even more thankful that you two have jumped in and coded a solution. Much better than most of the finder-level and low-end scripting stuff I was trying.

–Matt

Thanks folks for the encouragement and ideas. Okay, well, now we’re getting dangerous.

Attached below is what you might think of as the companion script to dt_shadow_export (above), which I’m calling dt_shadow_import. It uses python to scan the current directory, which it presumes (!) was created with dt_shadow_export. Going through this directory, it traverses the folder mirror looking for any “real files” that have been added (via the Finder or another program). If it finds such a file, it imports it to dtpro.

Basically this is a 5% version of the two-way “rooted folder” synchronization that I would so like DEVONthink Pro to have. It allows you to export the dtpro directory structure to a file system mirror, add files to it, and reimport them. Please note that it has no provision to allow you to change the folder structure with the Finder, so don’t even try it. But one can add files in the Finder and they will show up in DEVONthink, and one can edit the files behind the links that were put there by dt_shadow_export and the files will be fine. And you can of course occasionally re-export the whole database using dt_shadow_export again.

What I’d encourage everyone to do with this is first back up your data. This is really dangerous stuff, and you could easily trash everything.

Secondly, if you find this useful, make suggestions, and maybe we can update this to be script triggered inside devonthink.

Third, I’d encourage us all to try to express to Christian how grateful we’d be if he’d consider allowing for a “rooted folder” type within DEVONthink Pro to allow for two-way synchronization that didn’t require this kind of manual triggering, but worked continuously and automatically (via FSEvents monitoring). If there were a special group type in DTPro that had a root in the file system, and exported and imported new files automatically, it would allow DEVONthink to be used to maintain websites and the like through rsync. These scripts are a far lesser solution than what could be done internally in DEVONthink, but they give you an idea.

Please, remember to back up, and let’s keep a dialog going about what we think about the two-way sync.

cheers,

erico
p.s. This wouldn’t even be remotely possible without dtpro 2. Yeah, devonthink 2!!


#!/usr/bin/env python
# dt_shadow_import by Eric Oberle (erico) version .01a
# traverses a directory created by dt_shadow_export, looking for all directory
# entries that are not symbolic links or directories.
# if it finds any real files, it tries to import them to the appropriate location in dtpro.

# warning: this is a proof of concept, has no error checking, and should not be trusted!!!

import os
import sys
try:
    from appscript import *
except ImportError:
    os.system("sudo easy_install appscript")
    from appscript import *

base_path = os.getcwd()
dt = app("Devonthink Pro")
sys.stdout.write("Looking for files to import in " + base_path + ".")

def visit(dirname, names):
    sys.stdout.write(".")  # one progress dot per directory
    # subtract the current directory from the path to get the location in dtpro
    devon_destination = dirname[len(base_path):]
    for name in names:
        subname = os.path.join(dirname, name)
        # directories and the links made by dt_shadow_export are skipped
        if os.path.isdir(subname) or os.path.islink(subname):
            continue
        if os.path.isfile(subname) and name != ".DS_Store":
            sys.stdout.write('\nFound file %s in %s ' % (name, dirname))
            if not dt.exists_record_at(devon_destination + "/" + name):
                sys.stdout.write("Importing " + subname + " to destination " + devon_destination)
                destination = dt.get_record_at(devon_destination)
                new_rec = dt.import_(subname, to=destination)
                if new_rec != k.missing_value:
                    print(". Imported Successfully.")
            else:
                print("Skipping, because already in database.")
                # could check to see if record exists, get it, check modification date, take the later one
                # but maybe better to flush out database....

# os.walk replaces os.path.walk; it does not descend into symbolic links
for dirname, dirnames, filenames in os.walk(base_path):
    visit(dirname, dirnames + filenames)
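
Since this is dangerous stuff, a dry run might be worth doing before letting the import script loose. The sketch below only lists the real files it finds under a mirror folder and never talks to DEVONthink. The name find_new_files is hypothetical, and the location strings it returns simply follow the same "path minus base directory" convention as the script above.

```python
import os

def find_new_files(root):
    """Return (location, name) for each real file under root.

    Hypothetical dry-run helper: skips symlinks and .DS_Store, imports nothing.
    """
    root = os.path.abspath(root)
    found = []
    for dirname, dirnames, filenames in os.walk(root):
        # Same idea as the import script: strip the base directory.
        location = dirname[len(root):] or "/"
        for name in sorted(filenames):
            full = os.path.join(dirname, name)
            if os.path.islink(full) or name == ".DS_Store":
                continue
            found.append((location, name))
    return found
```

Printing the result of find_new_files(os.getcwd()) from inside the mirror folder shows what the import script would consider new, without changing anything. It does not check whether a record already exists in the database.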