Moving from Mariner Paperless to DevonThink

NoThymeToThink · December 25, 2023, 1:04pm

When does Daylight Saving Time start and end in your region, or in the time zone that macOS is set to (assuming your region observes it)?

Maybe the issue is potentially related to Daylight Savings Time?

I don’t know if you’ve resolved the off-by-one-day issue, but I recall that when I imported vCal files exported from Palm Desktop for the Mac into the Calendar app on macOS, events that fell outside of Daylight Saving Time were off by one hour.
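
One way to check whether a time zone or DST offset is behind a date landing on the wrong day is to look at the same instant in UTC and with an offset applied. A minimal sketch in JavaScript, assuming the stored value is a Core Data style timestamp (seconds since 2001-01-01 UTC, which is what adding 978307200 in the SQL query posted later in this thread implies). The helper names and the sample value are made up for illustration:

```javascript
// Seconds between 1970-01-01 and 2001-01-01 (both UTC).
const CORE_DATA_EPOCH_OFFSET = 978307200;

// Hypothetical helper: convert a Core Data timestamp to a JavaScript Date.
function coreDataToDate(seconds) {
  return new Date((seconds + CORE_DATA_EPOCH_OFFSET) * 1000);
}

// Hypothetical helper: calendar day (YYYY-MM-DD) as seen at a fixed
// offset from UTC, given in hours.
function dayAtOffset(date, offsetHours) {
  const shifted = new Date(date.getTime() + offsetHours * 3600 * 1000);
  return shifted.toISOString().slice(0, 10);
}

// Sample value chosen for illustration: 03:30 UTC on 25 December 2023.
const sample = coreDataToDate(725167800);
console.log(dayAtOffset(sample, 0));  // calendar day in UTC
console.log(dayAtOffset(sample, -5)); // calendar day five hours west of UTC
```

If the two days differ for a document whose date is wrong, the time zone handling is the likely culprit rather than the data itself.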

whazzup · December 28, 2023, 5:25am

I, too, have thousands of documents. How do I get proper dates to import? It takes a long time to do manually.


chrillek · December 28, 2023, 7:39am

What’s wrong with the scripts posted here?

voilaxaxa · January 9, 2024, 10:31am

thanks chrillek for this script. Worked for me after creating a new paperless database and importing data from the old one (tips from nomanisan).

whazzup · January 28, 2024, 3:17am

I tried copying and pasting into Script Editor and got syntax errors. I’m sure I did something wrong.

cgrunenberg · January 28, 2024, 6:55am

What exactly did you paste? There’s Python, AppleScript and JavaScript code in this thread.

RobCSS · January 29, 2024, 7:28pm

Is there a way in DevonThink to add to an existing PDF from the ScanSnap scanner? This was a feature in Paperless that I used all the time. I know I can scan separately and then move the newly scanned page or pages into an existing document, but if I could scan directly into the existing document it would save a step or two.

cgrunenberg · January 30, 2024, 7:26am

View > Sidebar > Import > Image Capture offers such a possibility, but this is limited to scanners, cameras, iPads/iPhones etc. that are compatible with Image Capture, and the ScanSnap isn’t. Therefore the easiest option is to merge the scans, see the Tools menu.

dxr · February 17, 2024, 6:33am

@chrillek thank you so much for this script. It really helped. The one thing I am stuck on is how to get a custom field into DT. I have a “name” field that I believe is stored in ZCUSTOMRECEIPTITEM.ZSTRINGDATA, and with your script it imports into DT’s “Keywords” along with a bunch of other custom fields. Is there a way to map the Paperless ZCUSTOMRECEIPTITEM.ZSTRINGDATA “Name” to the DT custom “Name” field?

chrillek · February 17, 2024, 7:43am

You probably mean the tags when you say keywords?
And do you really have a custom metadata field name as well as a record’s name? Why?
The modification should be straightforward, retrieving the field from the database and adding the value to your custom metadata field with addCustomMetaData()
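
A minimal sketch of that modification, assuming a custom metadata field with the identifier "name" already exists in DT and that the value has already been read from the Paperless database. The helper setCustomName and its parameters are hypothetical; the addCustomMetaData call follows the form used elsewhere in this thread:

```javascript
// Hypothetical helper: store a value from the Paperless database in a DT
// custom metadata field. `app` is the DEVONthink application object and
// `record` is the record returned by app.import().
function setCustomName(app, record, value) {
  const name = (value || "").trim();
  if (name === "") {
    return false; // nothing to store, leave the record untouched
  }
  // "name" is an assumed field identifier; use the one defined in your setup.
  app.addCustomMetaData(name, {to: record, for: "name"});
  return true;
}
```

Inside addDocument this would be called as setCustomName(app, newRecord, title), with title coming from the database query.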

dxr · February 17, 2024, 9:29am

Sorry, very new to DevonThink. Here is how it is importing. I’m trying to get the name and date to import into a useful form. (The category is already listed as a tag, which is great.) If there is an easy way to have the name and date be either the record name or a custom field, that would be great. Really appreciate the help.

chrillek · February 17, 2024, 10:04am

These are PDF metadata, I’d say. And since the script doesn’t do anything with them, these keywords are probably already contained in your original PDF.

Edit: See the post below for a better way, avoiding PDFKit.
For that, you’d have to read the PDF metadata field “Keywords”. That is possible using Apple’s PDFKit framework: You have to read the PDF into a PDFDocument object, get the documentAttributes dictionary and retrieve the value of the PDFDocumentKeywordsAttribute key from it. That should give you an NSArray with the “keywords” as NSStrings. In this case, every keyword is a key-value pair. Looks terrible to me, but at least it’s a lot easier to retrieve than XML data or whatever else people put in their PDFs nowadays.

As I said: It’s possible. But I’m not going to write that script. I suggest searching in the forum for PDFDocument, that might return some posts with sample code.

chrillek · February 18, 2024, 9:39am

In my previous post, I was a bit off target. DT records have a metadata property that should contain the keywords in your case.
So, in the function addDocument, add this at the end (not tested, since I don’t have appropriate documents), replacing newRecord.name = title;

const keywords = newRecord.metaData().kMDItemKeywords.split('\n');
const kwName = keywords.filter(k => k.includes('Name'));
if (kwName && kwName.length) {
  const name = kwName[0].split('=')[1];
  newRecord.name = name;
}

This snippet assumes that

the metaData property contains a kMDItemKeywords key. If that’s not always the case, you must check for its existence before applying split to it.
the kMDItemKeywords entry contains keywords separated by newlines. If that’s not the case, change split('\n') accordingly
the Name entry in kMDItemKeywords contains the name you’re after
The string Name and the name itself are separated by a = sign. If that’s not the case, change split('=') accordingly.
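
The checks listed above can be folded into one small function. This is only a sketch under the same assumptions (a keywords string with one key=value pair per line); keywordValue is a hypothetical name, and the actual shape of kMDItemKeywords in your documents may differ:

```javascript
// Hypothetical helper: find the value for `key` in a keywords string such as
// "Category=Bills\nName=ACME Corp". Returns null if nothing usable is found.
function keywordValue(keywords, key, entrySeparator = "\n", pairSeparator = "=") {
  if (typeof keywords !== "string" || keywords === "") {
    return null; // no kMDItemKeywords entry at all
  }
  for (const entry of keywords.split(entrySeparator)) {
    const position = entry.indexOf(pairSeparator);
    if (position === -1) continue; // not a key-value pair
    if (entry.slice(0, position).trim() === key) {
      // Keep everything after the first separator, so values may contain "=".
      return entry.slice(position + pairSeparator.length).trim() || null;
    }
  }
  return null;
}

console.log(keywordValue("Category=Bills\nName=ACME Corp", "Name"));
console.log(keywordValue(undefined, "Name"));
```

In addDocument, the result would only be assigned to newRecord.name when it is not null, so records without a usable Name keyword keep their original name.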

The Paperless people must have been a bit weird, storing the dates in US format (the most illogical of all possible formats). But then their AppleScript implementation is borken, and they still sell their software in the App Store although the company has disappeared.

dxr · February 18, 2024, 10:03am

Thank you so much. I was actually able to get the name from the SQL database directly (ignoring the PDF metadata, which is in a very weird format) and make it the title as you suggested. Learned some SQL along the way :). I’m on to trying to get the “amount” field from Paperless into DT, which I’m close to… I can get it into the imported file as a random field (i.e., URL) using:

(() => {
const app = Application("DEVONthink 3");
const targetGroup = app.databases['test'].incomingGroup();
function addDocument(path, notes, date, tags, title, amount) {
  const newRecord = app.import(path, {to: targetGroup});
  newRecord.tags = newRecord.tags().concat(tags);
  newRecord.creationDate = date;
  newRecord.name = title; /* that's _different_ from the path in DT! */
  newRecord.url = amount;
}
  
  
const queryString = `WITH DOC_TAGS AS (
		SELECT tags.Z_14RECEIPTS1 as ID, ZTAG.ZNAME as NAME  
		FROM Z_14TAGS tags
			JOIN ZTAG ON Z_PK = tags.Z_18TAGS
			
	),
	GROUPED_DOC_TAGS AS (
		SELECT DOC_TAGS.ID, GROUP_CONCAT( DOC_TAGS.NAME, '<<[]>>'  ) as TAGS 
			FROM DOC_TAGS
			GROUP BY ID
	)


SELECT 	ZRECEIPT.Z_PK as ID, 
		DATETIME(ZRECEIPT.ZDATE + 978307200, 'unixepoch') as Date,
		ZCUSTOMRECEIPTITEM.ZSTRINGDATA as Title, 
		zdatatype.ZNAME as CATEGORY, 
		ZCATEGORY.ZNAME as SUBCATEGORY, 
		GROUPED_DOC_TAGS.TAGS, 
		ZRECEIPT.ZNOTES as NOTES,
		ZRECEIPT.ZPATH as PATH,
		ZRECEIPT.Zamount as amount
		
FROM ZRECEIPT
	LEFT JOIN GROUPED_DOC_TAGS ON GROUPED_DOC_TAGS.ID = ZRECEIPT.Z_PK
	LEFT JOIN ZCATEGORY ON ZRECEIPT.ZCATEGORY = ZCATEGORY.Z_PK
	LEFT JOIN ZSUBCATEGORY ON ZRECEIPT.ZSUBCATEGORY = ZSUBCATEGORY.Z_PK
	LEFT JOIN ZCUSTOMRECEIPTITEM ON ZRECEIPT.Z_PK = ZCUSTOMRECEIPTITEM.ZRECEIPT
	LEFT JOIN zdatatype ON ZRECEIPT.zdatatype = Zdatatype.z_pk
`;
const basePath = '/Users/drenshon/Desktop/test/EMRtest.paperless'
const DBPath = `${basePath}/DocumentWallet.documentwalletsql`;
const curApp = Application.currentApplication();
curApp.includeStandardAdditions = true;
const rawDBResult = curApp.doShellScript(`sqlite3 -tabs "${DBPath}" "${queryString}"`, {alteringLineEndings: false});
const resultLines = rawDBResult.split('\n');
resultLines.forEach(r => {
  const resultColumns = r.split('\t');
  if (resultColumns.length === 9) { /* basic sanity check */
    const plDate = new Date(resultColumns[1]);
    const plTitle = resultColumns[2];
    const plCategory = resultColumns[3];
    const plSubCategory = resultColumns[4];
    const plTags = resultColumns[5].split('<<[]>>');
    plTags.push(plCategory, plSubCategory);
    const plNotes = resultColumns[6];
    const docPath = resultColumns[7];
    const plAmount = resultColumns[8];
    addDocument(`${basePath}/${docPath}`, plNotes, plDate, plTags, plTitle, plAmount);
  }
});
})()

but am stymied by how to add it to the custom metadata field “amount” which is set up in DT as a decimal number.

Appreciate your help. I’m very new to this.
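
The row handling in the script above can also be pulled into a small function, which makes it easier to check without touching DT. A sketch under the same assumptions as the script (nine tab-separated columns, tags joined with '<<[]>>'); parseRow is a hypothetical name, and dropping empty tags is an addition that is not in the script itself:

```javascript
const TAG_SEPARATOR = "<<[]>>";

// Hypothetical helper: turn one line of `sqlite3 -tabs` output into an object,
// or return null if the line does not have the expected nine columns.
function parseRow(line) {
  const columns = line.split("\t");
  if (columns.length !== 9) return null; // basic sanity check
  const [, date, title, category, subCategory, tags, notes, path, amount] = columns;
  return {
    date: new Date(date),
    title,
    // Empty strings (no tags, no category) are dropped so they do not
    // end up as empty tags.
    tags: tags.split(TAG_SEPARATOR).concat(category, subCategory).filter(t => t !== ""),
    notes,
    path,
    amount, // still a string at this point
  };
}
```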

chrillek · February 18, 2024, 10:30am

No harm in that. My SQL days were some years ago, which made me shy away from working my way through that statement.

Setting the custom metadata field amount should work like this:

  app.addCustomMetaData(Number(amount), {to: newRecord, for: "amount"});

Number(amount) converts the string in amount to a number. That is necessary since all values from the Paperless database are returned as strings. Note that 0.0 + amount would not do it: with a string operand, + concatenates and the result is still a string.

I suggest digging around in DT’s scripting dictionary (open it by dragging the DT icon on a running Script Editor). That’ll tell you more than you ever wanted to know about it

dxr · February 18, 2024, 6:54pm

That 0.0 + conversion to a number was what was causing me the trouble! Thanks so much.

chrillek · February 18, 2024, 7:08pm

JavaScript can be mean sometimes. There’s also the other conversion, so that "" + 1.0 results in the string “1”.
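
These conversions are easy to try out in any JavaScript console. A short sketch; toAmount is a hypothetical helper, not part of DT or of the scripts in this thread:

```javascript
// With a string operand, + concatenates instead of adding.
console.log(typeof (0.0 + "12.50")); // the result is a string, not a number
console.log("" + 1.0);               // number to string

// Explicit conversions from string to number.
console.log(Number("12.50"));
console.log(parseFloat("12.50"));

// Hypothetical helper: convert a database string to a number, or return
// null if the string is empty or not numeric.
function toAmount(text) {
  const trimmed = String(text).trim();
  if (trimmed === "") return null;
  const value = Number(trimmed);
  return Number.isNaN(value) ? null : value;
}
```

Returning null for empty or non-numeric input lets the caller skip the custom metadata field instead of storing a misleading 0.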

whazzup · April 13, 2024, 11:08pm

Thanks for the script, but it hasn’t worked for me. I don’t know how to do any of this.

Someone suggested using DevonThink moving forward, and I think I understand how the program works, now. I can import new documents. There are scripts built in to change date to the document’s date. This works – most of the time. For bank statements, it often chooses the date from the start of the period; for example, statement dated 3/31/24 covers 3/1/24-3/31/24, and DT takes 3/1/24 as the document date.

Overall, it’s a very snappy interface and it’s working well for me. Maybe I can get someone to help me do the conversion. As long as Paperless runs, I can search documents. And when it stops running, I can open the package contents and extract my PDFs.

Glad I made the switch.

CBeaud5596 · May 31, 2024, 11:05pm

I wanted to weigh in here as a non-coding lay(woman) who only has a basic understanding of how DEVONthink works, knows no coding outside of basic HTML, is coming from Paperless (and prior to that Neat) and just wants all paperless documents in one program. For a variety of reasons I do not use the full power of these tools, and rely 100% on the search feature provided by Optical Character Recognition (OCR) capabilities to find what I’m looking for. The coding above and the conversations being held are WAY over my head, so I had to find another way to simply get my files into my system as easily as possible, and I wanted to share in case this may help someone else.

Disclaimer…

I have found where the Paperless PDFs live on my computer (I’ll post it down below in case this helps you as well) and will import those files/folders into DEVONthink to solve my issue.
I do have some files that I had added metadata/details to, but I’m not at all worried about that being retained because OCR works so well for me. I’ve already tested importing a folder into DEVONthink and searching for a document, and it locates it without any issue.
My files are in folders by year/month. I’m unsure if they came over that way from my initial “Neat” program or if this is something Paperless did. I’m being extra lazy right now and just importing the folders with all the files in that same structure. Again, if it bothers me later I’ll deal with it then. Your files may be organized differently, but they should be located within the same master folder, which I’ll explain below.

WHERE ARE YOUR FILES WITHIN PAPERLESS?

1 - Locate your Paperless file, which is held in your main “Documents” folder (you can see what mine is named above). Right-click this file and select “Show Package Contents”.

2 - Select the new “Documents” folder that now appears and this is where your files are located. Poke around in there and you will find your PDFs…

3 - See my disclaimer above for how my files read to understand if yours look different.

4 - This image shows my PDF nested within the folders.

NOW WHAT?

1 - I’m copying all the folders sitting directly inside that second Documents folder to a new folder on my desktop. It’s my temporary dumping ground, as we can’t import directly from the Paperless file.

2 - Then go into DEVONthink, select File/Import and navigate to the folder on your desktop, select the files/folders you want to import and follow the import prompts.

3 - Now wait and watch all your files populate. Once it’s done, try searching some of your files to ensure they are there and then delete the temporary folder you put on your desktop. You don’t need to retain those files because the originals are still sitting in the Paperless folder and now you have a new copy within DEVONthink.

Hope this helps someone, I’m off to finish this process. Fingers crossed.

BLUEFROG · June 1, 2024, 1:36am

Welcome @CBeaud5596
Thank you for taking the time to share your thoughts on this.