I have successfully transferred my data from Paperless to DevonThink. I will explain below. The procedure I used will only be useful to those familiar with both Python and AppleScript. Additionally, there was no attempt to make this general or generic to handle the myriad of ways Paperless is used. My usage of Paperless was uncomplicated. It didn’t take much effort to translate that to DevonThink.
The only part of this process that I may question was creating a custom metadata field called “Title” so I could associate a custom title do my document. I did not want to change the name of the document itself and did not see an alternative field.
I started with the following python script:
import sqlite3
import os
import json
PATH_TO_DB = "/Users/ericg/Desktop/untitled folder 2/DocumentWallet.documentwalletsql"
BASE_PATH_TO_DOCUMENTS = "/Users/ericg/Desktop/untitled folder 2"
QUERY = '''
WITH DOC_TAGS AS (
SELECT tags.Z_14RECEIPTS1 as ID, ZTAG.ZNAME as NAME
FROM Z_14TAGS tags
JOIN ZTAG ON Z_PK = tags.Z_18TAGS
),
GROUPED_DOC_TAGS AS (
SELECT DOC_TAGS.ID, GROUP_CONCAT( DOC_TAGS.NAME, '<<[]>>' ) as TAGS
FROM DOC_TAGS
GROUP BY ID
)
SELECT ZRECEIPT.Z_PK as ID,
DATETIME(ZRECEIPT.ZIMPORTDATE + 978307200, 'unixepoch') as IMPORTED,
ZRECEIPT.ZMERCHANT as TITLE,
ZCATEGORY.ZNAME as CATEGORY,
ZSUBCATEGORY.ZNAME as SUBCATEGORY,
GROUPED_DOC_TAGS.TAGS,
ZRECEIPT.ZNOTES as NOTES,
ZRECEIPT.ZPATH as PATH
FROM ZRECEIPT
LEFT JOIN GROUPED_DOC_TAGS ON GROUPED_DOC_TAGS.ID = ZRECEIPT.Z_PK
LEFT JOIN ZCATEGORY ON ZRECEIPT.ZCATEGORY = ZCATEGORY.Z_PK
LEFT JOIN ZSUBCATEGORY ON ZRECEIPT.ZSUBCATEGORY = ZSUBCATEGORY.Z_PK
'''
def outputScriptLine(result):
filePath = f'"{os.path.join( BASE_PATH_TO_DOCUMENTS, result[7] )}"'
notes = f'"{result[6].strip()}"' if result[6] is not None else '""'
importDate = f'"{result[1].strip()}"'
tags = result[5].split( '<<[]>>' ) if result[5] is not None else []
tags = [ result[3], result[4], *tags ]
tags = filter( lambda x: x is not None, tags )
tags = [ f'"{x.strip()}"' for x in tags]
docTags = f"{{{','.join( tags )}}}" if len(tags) > 0 else '{}'
title = f'"{result[2].strip()}"'
print( f"my addDocument({filePath}, {notes}, {importDate}, {docTags}, {title})" )
print( "delay 3" )
con = sqlite3.connect( PATH_TO_DB )
cur = con.cursor()
res = cur.execute(QUERY)
results = res.fetchall()
for result in results:
outputScriptLine( result)
Its job is to generate strings of text that can be pasted into an AppleScript script. Those strings point to the file on disk and provide the metadata I needed to preserve. I put the delay 3
in there between import calls because I was having some trouble getting it to work otherwise. The delay seemed to allow the system time to work a bit before the next import was generated. The delay of three seconds is arbitrary otherwise.
The base AppleScript is:
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
on addDocument(filePath, notes, importDate, docTags, title)
tell application id "DNtp"
set dp to import filePath to current group
set comment of dp to notes
set creation date of dp to importDate
set tags of dp to docTags
add custom meta data title for "Title" to dp
end tell
end addDocument
--
-- insert the lines from the python script here
--
Just copy the lines output from the python script into the AppleScript and run. Note, there may be a size limit for AppleScripts, so you may need to divide the lines to page into multiple chunks and run them separately.
If anyone has any questions, let me know. Otherwise, I will be testing out my transfer for the remaining trial period I have for DevonThink.
I do wish I could hide the Name
column, but that does not appear possible.
Enjoy.