Apparently, Mariner Software has gone out of business, so I need to find an alternative.
I was wondering if DevonThink provided any features to make importing information from Paperless easy?
What options do you have in Paperless to export your data and files? In what format? Are the files stored unchanged on your computer, independent of Paperless, so that you could simply drag and drop them? What have you tried that hasn’t worked as expected?
There aren’t any useful export options. There is one menu option to export to a CSV, but that is just a document containing some metadata about what is in the library.
The Paperless document is a package; the format looks fairly straightforward, and all of the documents are contained inside. There would likely be some work to figure out how to associate the metadata with the documents. The metadata appears to be in a SQL document inside of the package.
I was wondering if the Devon software engineers have done any work to determine how to import the Paperless data directly. It does not look like Paperless will do much to help out.
Do you mean a sqlite file?
Yes. I can use a SQLite app like DB Browser for SQLite to open and browse the data, etc.
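For anyone who would rather poke at the package from a script than from a GUI, here is a minimal Python sketch. It walks a package folder, picks out any file whose first 16 bytes are the standard SQLite 3 header string, and lists the tables in it. The package name, the database file name, and the folder layout in the demo are hypothetical stand-ins, not the real Paperless layout; only the header check and the `sqlite_master` lookup are standard SQLite behavior.

```python
import sqlite3
import tempfile
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def find_sqlite_files(package_dir):
    """Return files under package_dir whose header matches the SQLite magic."""
    hits = []
    for path in sorted(Path(package_dir).rglob("*")):
        if path.is_file():
            with path.open("rb") as fh:
                if fh.read(16) == SQLITE_MAGIC:
                    hits.append(path)
    return hits


def list_tables(db_path):
    """List table names, opening the database read-only to avoid changes."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo against a throwaway folder; all names below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "Example.paperless"
    (package / "Documents").mkdir(parents=True)
    (package / "Documents" / "scan.pdf").write_bytes(b"%PDF-1.4 stand-in")

    demo = sqlite3.connect(package / "library.db")
    demo.execute("CREATE TABLE ZRECEIPT (Z_PK INTEGER PRIMARY KEY)")
    demo.commit()
    demo.close()

    found = find_sqlite_files(package)
    found_names = [p.name for p in found]
    tables = list_tables(found[0])

print(found_names, tables)
```

Working on a copy of the library rather than the original is the safer choice, even with the read-only open.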
Welcome @eric_g_97477
Sorry, but no: at this time there is no direct import from Paperless. The only options are whatever export functions are available to you in Paperless.
When I switched, I did an export of my then in-progress Paperless docs and tried to get some of the metadata out. If I recall correctly, it involved the metadata from Paperless being stored in each PDF’s keywords (I think that’s what they’re called) and extracting that using scripting to get it into Custom Metadata, with a TON of help from people here on this forum.
I still have about 4 years of Paperless libraries, that I never bothered with at the time, for old financial information, that would be great to turn into DevonThink databases.
Not crucial as they’re on the way to becoming older than the required record retention limit. But it would be nice to have the personal information hidden in there available to the great DevonThink search.
I’ll follow this thread in case anyone knows how to get this stuff converted.
Just wondering whether the metadata for the old info is that important. It sounds like just getting the source documents across would count as success.
What kind of import features does DevonThink offer?
Can I provide a file to import along with metadata (tags, etc.)? Information I know I need to maintain:
Then one can write a script to extract the data from the database.
Hold the Option key and choose Help > Report bug to start a support ticket. Please do a sample export to CSV and attach it.
PS: Bear in mind, it is the weekend, so it may not be looked at until Monday.
I have been putting that off, assuming at some point I would just bring in the PDFs and that would be that. Hearing that Paperless is gone for good makes it more likely.
I will keep an eye on this thread because there’s still that part of me that doesn’t like to lose (meta)data and the work that went into acquiring it.
Perhaps someone will write the code for you?!
With 30 years of software engineering experience, I am capable of figuring out how Paperless stores the data and getting it into a format that DevonThink can import while preserving the metadata. I just need to know what that is, assuming it is possible.
What what is? Please elaborate.
Well, let us know how you make out.
What what is?
How to supply a file with the metadata I need to preserve when transferring paperless documents.
I don’t think there’s any special file format in DevonThink, it’s more that the import process allows DevonThink to add the new files to its own database with associated metadata and file system.
For me it would be:
PDF with the fields from Paperless (various amounts or splits, VATs and sales tax, currency, tax category, etc.) mapped to
DevonThink Custom Metadata fields, probably with similar field headings.
Allowing for the fact that I know very little about software engineering, I would infer from my limited experience that anything you write would read either directly from the Paperless SQLite source or from an exported CSV, via scripting within DevonThink (JavaScript JXA or AppleScript), and write to new files in DevonThink.
Basically, you’d
You can use JavaScript or AppleScript for that. Not a biggy.
If it helps anyone, as I evaluate how well a transfer from Paperless to DevonThink will go, here is a SQL query for Paperless that grabs all of the information I need.
WITH DOC_TAGS AS (
    SELECT tags.Z_14RECEIPTS1 as ID, ZTAG.ZNAME as NAME
    FROM Z_14TAGS tags
    JOIN ZTAG ON Z_PK = tags.Z_18TAGS
),
GROUPED_DOC_TAGS AS (
    SELECT DOC_TAGS.ID, GROUP_CONCAT( DOC_TAGS.NAME, '<<[]>>' ) as TAGS
    FROM DOC_TAGS
    GROUP BY ID
)
SELECT ZRECEIPT.Z_PK as ID,
       DATETIME(ZRECEIPT.ZIMPORTDATE + 978307200, 'unixepoch') as IMPORTED,
       ZRECEIPT.ZMERCHANT as TITLE,
       ZCATEGORY.ZNAME as CATEGORY,
       ZSUBCATEGORY.ZNAME as SUBCATEGORY,
       GROUPED_DOC_TAGS.TAGS,
       ZRECEIPT.ZNOTES as NOTES,
       ZRECEIPT.ZPATH as PATH
FROM ZRECEIPT
LEFT JOIN GROUPED_DOC_TAGS ON GROUPED_DOC_TAGS.ID = ZRECEIPT.Z_PK
LEFT JOIN ZCATEGORY ON ZRECEIPT.ZCATEGORY = ZCATEGORY.Z_PK
LEFT JOIN ZSUBCATEGORY ON ZRECEIPT.ZSUBCATEGORY = ZSUBCATEGORY.Z_PK
The reason for the 978307200 is that Paperless is using Core Data (CD). CD timestamps count seconds from Jan 1, 2001 (UTC), whereas the Unix epoch starts on Jan 1, 1970. Adding 978307200, the number of seconds between those two dates, converts a CD timestamp to a Unix epoch timestamp.
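As a quick sanity check on that constant, here is a small Python sketch that derives it from the two epoch dates instead of hard-coding it. The names `OFFSET_SECONDS` and `core_data_to_unix` are made up for this illustration.

```python
from datetime import datetime, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
CORE_DATA_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

# 31 years including 8 leap days = 11323 days, times 86400 seconds per day.
OFFSET_SECONDS = int((CORE_DATA_EPOCH - UNIX_EPOCH).total_seconds())


def core_data_to_unix(cd_seconds):
    """Shift a Core Data timestamp (seconds since 2001) to Unix time."""
    return cd_seconds + OFFSET_SECONDS


print(OFFSET_SECONDS)  # 978307200
```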
The one thing I am finding odd is that Paperless is showing the timestamp of a document as 3/12/2019. The CD timestamp in the database is 574130186.540019 which is 3/13/2019. Why it is off by one day, I do not know. The vast majority of the dates shown by Paperless are correct. I am guessing this is a Paperless bug.
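One possible explanation, offered as a guess and not a confirmed diagnosis: SQLite's 'unixepoch' modifier produces UTC unless 'localtime' is also given, and this particular timestamp falls just after midnight UTC. The Python sketch below converts the value both ways. The UTC-4 offset is purely an assumption for illustration, since the thread does not say which timezone the machine is in.

```python
from datetime import datetime, timedelta, timezone

CORE_DATA_OFFSET = 978307200      # seconds between 1970-01-01 and 2001-01-01 UTC
cd_timestamp = 574130186.540019   # value quoted from the database

as_utc = datetime.fromtimestamp(cd_timestamp + CORE_DATA_OFFSET, tz=timezone.utc)

# Assumption: a local zone four hours behind UTC; substitute the real one.
assumed_local = timezone(timedelta(hours=-4))
as_local = as_utc.astimezone(assumed_local)

print(as_utc.strftime("%Y-%m-%d %H:%M"))    # 2019-03-13 00:36
print(as_local.strftime("%Y-%m-%d %H:%M"))  # 2019-03-12 20:36
```

If that is what is happening, any document imported in the evening local time would show the next day in the query output, and adding the 'localtime' modifier to the DATETIME call in the query would be worth trying.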
[UPDATE: initial query wasn’t pulling the local path to the actual document…this one does]
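In case it is useful as a next step, here is a rough Python sketch of how the query's output could be turned into a CSV, with the '<<[]>>' tag separator split back into individual tags. It is only a sketch: `fetch_documents` and `to_csv` are hypothetical helper names, the demo runs a stand-in query with made-up literal rows instead of the real Paperless database, and joining tags with ";" assumes no tag contains a semicolon.

```python
import csv
import io
import sqlite3

TAG_SEPARATOR = "<<[]>>"  # the separator used in GROUP_CONCAT in the query


def fetch_documents(conn, query):
    """Run the query and return one dict per row, with TAGS as a list."""
    conn.row_factory = sqlite3.Row
    docs = []
    for row in conn.execute(query):
        doc = dict(row)
        # Documents without tags come back as NULL from the LEFT JOIN.
        doc["TAGS"] = doc["TAGS"].split(TAG_SEPARATOR) if doc["TAGS"] else []
        docs.append(doc)
    return docs


def to_csv(docs):
    """Serialize the rows as CSV text, joining tags with semicolons."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(docs[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    for doc in docs:
        writer.writerow({**doc, "TAGS": ";".join(doc["TAGS"])})
    return buf.getvalue()


# Stand-in query with invented rows that mimic the column names above.
STAND_IN_QUERY = """
SELECT 1 AS ID, '2019-03-13 00:36:26' AS IMPORTED, 'Example Merchant' AS TITLE,
       'Utilities' AS CATEGORY, NULL AS SUBCATEGORY,
       'tax' || '<<[]>>' || '2019' AS TAGS,
       NULL AS NOTES, 'Documents/example.pdf' AS PATH
UNION ALL
SELECT 2, '2019-04-01 12:00:00', 'Other Merchant', NULL, NULL,
       NULL, NULL, 'Documents/other.pdf'
"""

conn = sqlite3.connect(":memory:")
docs = fetch_documents(conn, STAND_IN_QUERY)
conn.close()

csv_text = to_csv(docs)
print(csv_text)
```

To use it against a real library, the connection would point at a copy of the SQLite file inside the package and `STAND_IN_QUERY` would be replaced by the full query.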