Migration from Mariner Paperless to DEVONthink

OK here’s a question for the script capable.
How do you write and debug scripts, like the one above, while working in DT?
I’ve got a test database of files I imported from Paperless and I want to see if I can learn a bit about how this script works so I can customize this one and, later, maybe write my own scripts.

Do you point the rule at a file (as opposed to embedding the script) and edit that file in Script Editor?
Is there some nuance or trick I am missing?

The first recommended step: read the “Automation” chapter of the DEVONthink Handbook, which starts on page 186 of the current version. That should get you going.

Typically I don’t do that because DT caches external scripts referenced in rules; that means that when you change an external script DT won’t recognise that change until the next restart. What I sometimes do is to test the script without a rule; that is, I write a script in Script Editor using the following format:

tell application id "DNtp"
	set theRecords to selected records
	repeat with theRecord in theRecords
		# do stuff
	end repeat
end tell

I can then select records manually and work with the script until I am happy. To be able to embed the script in a smart rule requires minor changes:

on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			# do stuff
		end repeat
	end tell
end performSmartRule

So basically in the initial script we are telling the script to work with the records we have selected, and in the second we are saying work with the records sent by the smart rule.
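The two variants can also live in one file. The following is a minimal sketch, assuming the same `selected records` property used above; the `run` handler is an addition for illustration and is not part of the scripts posted in this thread. Run from Script Editor, it passes the current selection to `performSmartRule`. A smart rule is expected to call `performSmartRule` directly, so the `run` handler would only matter during testing; treat that as an assumption to verify in a test database.

```applescript
-- Runs only when the script is started from Script Editor (assumption:
-- a smart rule calls performSmartRule itself and never uses this handler).
on run
	tell application id "DNtp"
		set theRecords to selected records
	end tell
	performSmartRule(theRecords)
end run

-- Same handler a smart rule would call with the records it matched.
on performSmartRule(theRecords)
	tell application id "DNtp"
		repeat with theRecord in theRecords
			-- do stuff
		end repeat
	end tell
end performSmartRule
```

With this layout, nothing has to be commented in or out when the script moves from Script Editor into the rule.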

Note that actions undertaken by a script cannot be undone! Use a test database and test files, and back up your files prior to experimenting. It is terribly easy to create a loop, or to fail to limit the actions to the requisite subset of records. You could end up with every single record showing the same date or name.

How do you write and debug scripts, like the one above, while working in DT?

Apple’s Script Editor.
I’ve used it for 20+ years.

I generally start with a non-smart rule script in Script Editor, then modify it for smart rule (or other) uses, if needed. In most cases, this just means changing how the theRecords variable is set.

For your testing, here is the code…

set od to AppleScript's text item delimiters

tell application id "DNtp"
	set theRecords to (selected records) -- THIS IS WHERE THE VARIABLE IS SET IN THIS STANDALONE VERSION. REMOVE OR COMMENT OUT THIS LINE FOR USE IN THE SMART RULE
	set AppleScript's text item delimiters to "="
	repeat with thisRecord in theRecords
		set tagList to {} -- reset for each record so tags do not carry over from the previous one
		set theTags to (tags of thisRecord)
		repeat with theTag in theTags
			set attribs to (text items of theTag)
			if (count attribs) = 2 then
				add custom meta data (item 2 of attribs) for (item 1 of attribs) to thisRecord
			else
				copy theTag to end of tagList
			end if
		end repeat
		set tags of thisRecord to tagList
	end repeat
	set AppleScript's text item delimiters to od
end tell

See how it’s almost identical to the smart rule version?

Well that explains the last hour of minor frustration.

Always. I have a database of dummy receipts and data that I am messing with. Because I am new to this kind of work, I close all my non-test databases to prevent any major disasters. Once I have a working script I like and my DEVONthink database looks like my Paperless database, I can create a proper database and workflow with some confidence.

@Blanc and @BLUEFROG both suggest similar workflow so I am going to try that. Thanks.

It’s been a day of testing the work and data flow from Paperless to DEVONthink. Solved most of it.
The one thing I haven’t fixed is making sure the data from the amount and sales tax categories ends up in a currency field in DT.
The above script only works if the amount metadata field is set to single-line text, in which case the Create Expense Report tool doesn’t see it as a value. I think it won’t populate the amount field as a currency value if the dollar sign is in the string the script enters.
So I have to find a way to strip the dollar sign from the string.

P.S.
Doing this work is reminding me why I want to get my data out of Paperless. It’s really awkward to use and I only ended up using it because Neat turned into a flaming pile of what seemed like Hedge-Fund-acquisition based customer mining and feature removal.

So I have to find a way to strip off the dollar sign from the string.

This would be more easily done before the data comes in. However, with some delimiter shuffling (in a handler), you could do it this way…

on performSmartRule(theRecords)
	set od to AppleScript's text item delimiters
	
	tell application id "DNtp"
		set AppleScript's text item delimiters to "="
		repeat with thisRecord in theRecords
			set tagList to {} -- reset for each record so tags do not carry over from the previous one
			set theTags to (tags of thisRecord)
			repeat with theTag in theTags
				set attribs to (text items of theTag)
				if (count attribs) = 2 then
					set val to item 2 of attribs
					set theKey to item 1 of attribs
					if (val as string) begins with "$" then
						set val to my setCurrency(val, od) -- Pass the value and cached delimiter to the handler
						set AppleScript's text item delimiters to "="
					end if
					add custom meta data val for theKey to thisRecord
				else
					copy theTag to end of tagList
				end if
			end repeat
			set tags of thisRecord to tagList
		end repeat
		set AppleScript's text item delimiters to od
	end tell
end performSmartRule

on setCurrency(val, od) -- Strip $
	set AppleScript's text item delimiters to od
	return ((characters 2 thru -1 of val) as string)
end setCurrency

Note: Besides adding the handler, I have made a few other minor modifications to the script.
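As a possible simplification of the dollar-sign step, here is a sketch of a helper that does not depend on text item delimiters at all. The name `stripDollar` is hypothetical and not part of the script above; it only handles a single leading "$" and leaves everything else (thousands separators, other currency symbols) untouched.

```applescript
-- Hypothetical helper (not from the thread): drop one leading "$" from a value.
on stripDollar(val)
	set s to val as string
	if s begins with "$" and (length of s) > 1 then
		-- "text 2 thru -1" yields a string directly, so the current
		-- text item delimiters do not affect the result
		return text 2 thru -1 of s
	end if
	-- no leading "$" (or nothing after it): return the value unchanged
	return s
end stripDollar
```

Inside the `tell` block it would be called as `set val to my stripDollar(val)`, and the "=" delimiter would not need to be set again afterwards.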

I was trying to do this before I asked.
It would require me learning to script either a PDF app or Paperless to have the data be clean of $'s before it hits DT. Trying to focus on learning how scripting can help me in DT as that’s where I have work to do in the next few years.
I am still a little at sea in the scripting world but I am getting a few signs of improvement.

I wouldn’t worry about trying to mess with scripting Paperless at the moment. My revision will strip the dollar sign already.

EDIT:
I’ve pushed on and figured out how to do subcategory and date as bespoke metadata, but there’s something about that category statement that isn’t working. Probably a rogue character or something. At least I know that two if statements in a row do work. Whether it is best practice is TBD.

So here’s what I’m trying to do with all the information and scripts you have so generously provided.

I wanted to have the receipts in DT use entirely unique custom metadata and not use existing metadata fields.

Why? I get the feeling that I will be using things like category and subcategory for other purposes and, as far as I can tell, there isn’t really a silo around the custom metadata for individual databases.

So I tested out creating receipt-specific identifiers in custom metadata and using a short if statement to port the date and category to the new custom metadata fields.

I got it to work with just one if statement, mapping date to daterec, but when I added the second if statement only the date one worked. I am furiously reading script guides trying to find the syntax guidelines, but I am at a loss here.
I’ve put the whole script at the bottom so it’s clear what I’m actually running in Script Editor.

if (theKey as string) = "date" then
	set theKey to "daterec"
end if
if (theKey as string) = "category" then
	set theKey to "categoryrec"
end if

set od to AppleScript's text item delimiters

tell application id "DNtp"
	set theRecords to (selected records) -- THIS IS WHERE THE VARIABLE IS SET IN THIS STANDALONE VERSION. REMOVE OR COMMENT OUT THIS LINE FOR USE IN THE SMART RULE
	set AppleScript's text item delimiters to "="
	repeat with thisRecord in theRecords
		set tagList to {} -- reset for each record so tags do not carry over from the previous one
		set theTags to (tags of thisRecord)
		repeat with theTag in theTags
			set attribs to (text items of theTag)
			if (count attribs) = 2 then
				set val to item 2 of attribs
				set theKey to item 1 of attribs
				if (theKey as string) = "date" then
					set theKey to "daterec"
				end if
				if (theKey as string) = "category" then
					set theKey to "categoryrec"
				end if
				if (val as string) begins with "$" then
					set val to my setCurrency(val, od) -- Pass the value and cached delimiter to the handler
					set AppleScript's text item delimiters to "="
				end if
				add custom meta data val for theKey to thisRecord
			else
				copy theTag to end of tagList
			end if
		end repeat
		set tags of thisRecord to tagList
	end repeat
	set AppleScript's text item delimiters to od
end tell

on setCurrency(val, od) -- Strip $
	set AppleScript's text item delimiters to od
	return ((characters 2 thru -1 of val) as string)
end setCurrency
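The two if statements could also be folded into a single handler, which makes the renaming easier to test on its own. This is only a sketch: `mapKey` is a hypothetical name, and the `ignoring white space` block is an assumption about what the "rogue character" might be (a stray space or tab in the tag). It would not help with other invisible characters. AppleScript string comparisons ignore case by default, so "Category" and "category" already match.

```applescript
-- Hypothetical helper (not from the thread): rename selected keys,
-- pass every other key through unchanged.
on mapKey(theKey)
	set k to theKey as string
	set newKey to k
	-- compare without regard to spaces or tabs around or inside the key
	ignoring white space
		if k = "date" then
			set newKey to "daterec"
		else if k = "category" then
			set newKey to "categoryrec"
		end if
	end ignoring
	return newKey
end mapKey
```

In the script above, the two if blocks would then become `set theKey to my mapKey(item 1 of attribs)`. Logging the key and its length with `log` in Script Editor is one way to check whether a stray character is actually present.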

Just popping back in here to wrap this up.
I’ve got almost everything working now. I’ve ended up using two scripts.
One is essentially the script above that does the following:

ports the PDF keywords to DT tags
    (actually via an import setting, not the script)
ports each DT tag to custom metadata
    (also strips "$" off of strings for currency values)
clears the DT tags from each record

Then the second script:

on performSmartRule(theRecords)
	set od to AppleScript's text item delimiters
	tell application id "DNtp"
		set AppleScript's text item delimiters to ","
		set tagList to {}
		repeat with thisRecord in theRecords
			set tags of thisRecord to (get custom meta data for "tagsrec" from thisRecord)
		end repeat
		set AppleScript's text item delimiters to od
	end tell
end performSmartRule

takes the text from the Tags custom metadata and turns it into DEVONthink tags.
I’m sure this could be done in the first script at the same time that the values for the Paperless tags are being turned into custom metadata but I couldn’t figure out where to put that statement.
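One way to prepare for merging the two scripts is to put the splitting step into a small handler, so the delimiters are changed and restored in one place. The sketch below assumes the tagsrec value is a plain comma-separated string; `splitList` is a hypothetical name. Where exactly to call it in the first script is left open here, since that depends on whether the tagsrec value has already been written for the record at that point.

```applescript
-- Hypothetical helper (not from the thread): split text on a delimiter
-- and put the previous text item delimiters back afterwards.
on splitList(theText, theDelimiter)
	set od to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theDelimiter
	set theItems to text items of (theText as string)
	set AppleScript's text item delimiters to od
	return theItems
end splitList
```

For example, `my splitList("Food,Travel", ",")` gives a list of two strings, which could then be assigned to `tags of thisRecord`. If the source text has a space after each comma, the items keep that leading space.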

It’s all working as a smart rule, triggered on import, running the two scripts sequentially.
I’ve got a bit more testing to do on a wider set of test documents from Paperless before I begin testing the creation of new docs directly in DEVONthink.
Once I’ve got all this working:

  1. importing old Paperless archive into DevonThink
  2. creating new docs in DevonThink via scanning, emailed receipts, iOS photos etc
  3. generating reports that my accountant can use
    Then I can safely move on from Paperless and add the highly coveted
  4. automating as much of items 1 to 3 as possible

Thanks for everyone’s help.

Glad you have something working as you hoped and you’re welcome for the hand up.

FYI
Here’s a fun hitch I’ve discovered today just in case anyone comes down this road after me.

Paperless can import secured PDFs (downloaded bank statements, for example) and add its own internal metadata to them.
But when you export those secured PDFs it can’t write the metadata to the exported files so no metadata gets exported.
Thankfully this doesn’t represent a large amount of my files stuck in Paperless.
Maybe there’s a way to export a CSV of the metadata and bring that metadata into DT but that’s pretty low down on my list of priorities.

I’m finding this horrendously complex to ‘follow’… but for the most part, I understand what is going on, because I ‘need’ to do exactly the same thing.
As I drag PDFs out of Paperless and into DT3, I too am getting DT3 tags created from Paperless fields. I too need to get these tags into my DT3 custom fields.
I’m struggling to follow along with this conversation.
I think I understand that I need a ‘Smart Rule’ that points to an Apple Script, that will extract my keyword tags to my custom fields.
As a noobie to this forum I’m limited with screenshots…

@MickeyT9
I had to pause my work on this personal project for work reasons. I need to get back to it soon, and when I do I can figure out where I left off and try to help out. Might take more than a few days though.

ok thanks, I appreciate the response… in the meantime… I’ll plod along with a few experiments.
I must admit, I’m having 2nd thoughts about migrating to DT3 now… unless I can somehow migrate the last 12 years in Paperless, and make it look and feel ‘at least’ as good… it might not be worth the effort?..

Inspectors are global and display content based on the current selection. They don’t belong to any one database.

This answer may come across as a little bit patronising; if so please feel free to simply ignore it.

In my experience trying to make one app look and function like another is doomed to fail. It raises the question as to why you would want to give up the source app in the first place. And it forces you to bend the receiving app to become something which it was maybe never trying to be, whilst at the same time failing to utilise the advantages inherent to that app.

I transferred all my data from Paperless to DEVONthink a few years back. It took me quite a serious amount of time to do, and I chose to forego the metadata. Like you I had no experience of scripting at the time, and didn’t think to ask people in this forum for help.

I wonder whether it might be worth your taking a step back and defining what it is you actually want to achieve. “I want this apple to look like an orange” may not be the way to go about it. Ignoring the optics of Paperless, what are your goals?

To be more precise, or to provide you with an example of what I’m trying to say: why do you want to have a custom metadata entry for the merchant? It won’t help you search for a receipt, it won’t make grouping receipts easier, but it would make it possible to show the merchant in a list. So whether or not it is worth the effort is very much dependent on what the goal is.

If it is important to group a receipt according to the type of expense, it would be perfectly sufficient to leave the receipts appropriately tagged. You can rename tags so that, e.g., category=Food & Drink could become Food & Drink.

Following on to @Blanc’s comment, how often do you need to reference 12 year-old files? Rather than hauling the entire Paperless archive into DT, might it make sense to only bring in the last year or so?

It’s ok… I’m not troubled by what you’ve written. Thanks for your input.

I can agree with this.

Paperless is not well supported, and has some gaping holes in the coding and application. There are so many flaws and gremlins in the app… I’m seeking out a ‘better’ way. You’re right though, it may not be worth it? I hesitated a long time before buying DT3.

The ‘other’ major reason is that I wanted ‘all’ of my financials in ‘one’ place. One database. I can probably do that in Paperless too?.. But like I said, Paperless has a lot of flaws. I’ve lost a lot of data because of font issues in the PDFs…

I think I mentioned further up the thread, our Financial Year runs from Jul 1st to Jun 30th spanning 2 years (eg, Jul 21 - Jun 22)… so, on occasions when I want to go digging for a receipt, or warranty, I often have to open several databases to try to find it. Plus, whenever I try to ‘remember where I was’… years ago… a receipt is often a good place to pinpoint my location. It’s a great reference point for memory.

I think this is purely a familiarity thing? At least until I can manage a way to achieve the result I’m looking for in DT3… which is a full consolidation of financials… and a breadcrumb trail of receipts. Perhaps I have an ‘archived’ and ‘active’ database?..