Migration from Mariner Paperless to Devonthink

wahltho · September 26, 2018, 12:57pm

Dear All,

I am evaluating a switch from Mariner Paperless to DEVONthink.

As I have been using Mariner Paperless for several years, data migration is a big issue for me.

Is it possible to import my Mariner Paperless database into DEVONthink, or is there some other way to migrate the data?

Thx + Best Regards

Thomas

BLUEFROG · September 26, 2018, 2:09pm

Sorry, but Mariner has not made data migration to other apps a priority. There is no direct import of Paperless data and you’d have to check with Mariner regarding export options.

mdubnick · January 26, 2019, 9:40pm

If you want to get access to Paperless records in order to move to DT, it is possible. Warning – I am a true amateur when it comes to doing this…
The process involves getting inside the “package” of the Paperless library. I’d do a backup first just in case…

Find your xxx.paperless file in Finder, right-click it, choose the “Show Package Contents” option and then open the “Documents” folder. There you should find a folder for each year; each contains subfolders for each month, and each of those contains subfolders for each day of your entries. I make a copy of the Documents folder and import from the copy into DT.
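For anyone who would rather script the copy than do it in Finder, here is a minimal sketch in Python. It assumes only what is described above: that the .paperless library is a package (an ordinary directory) with a “Documents” folder inside. The function name and any paths you pass in are hypothetical, so try it on a backup first.

```python
import shutil
from pathlib import Path


def copy_paperless_documents(library: Path, destination: Path) -> list[Path]:
    """Copy the Documents folder out of a Paperless library package.

    Assumption: the library is a macOS package, i.e. a plain directory
    that contains a "Documents" folder, as described in the post above.
    """
    source = library / "Documents"
    if not source.is_dir():
        raise FileNotFoundError(f"No Documents folder inside {library}")
    # Copy rather than move, so the original library stays untouched.
    # copytree raises an error if the destination already exists.
    shutil.copytree(source, destination)
    # Return the copied PDFs, sorted so the order is predictable.
    return sorted(p for p in destination.rglob("*") if p.suffix.lower() == ".pdf")
```

The copy keeps the year/month/day subfolders, so the resulting folder can be imported into DT in one go.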

Each file extracted this way retains its Paperless titles, although each pdf does carry the relevant date (e.g., 78BD8D68-AA84-4E68-AF5F-2 - 12-2-17.pdf). I guess there are ways to use DT to deal with that problem once they are imported.
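One way to get at the date in those file names is to parse it before or after import. The Python sketch below is only an illustration based on the single example name above. It assumes the name ends in “ - month-day-year.pdf” with a two-digit year meaning 20xx. Both are guesses, so check them against your own files (the order could just as well be day-month-year). The function name is made up.

```python
import re
from datetime import date
from typing import Optional

# Matches a trailing " - M-D-YY.pdf", as in the example file name.
_NAME_DATE = re.compile(r" - (\d{1,2})-(\d{1,2})-(\d{2})\.pdf$", re.IGNORECASE)


def date_from_paperless_name(filename: str) -> Optional[date]:
    """Return the date embedded in a Paperless file name, or None."""
    match = _NAME_DATE.search(filename)
    if match is None:
        return None
    month, day, short_year = (int(part) for part in match.groups())
    try:
        # Assumption: month-day-year order, and two-digit years mean 20xx.
        return date(2000 + short_year, month, day)
    except ValueError:
        # Not a real calendar date under that assumption (e.g. month 13).
        return None
```

With the example name, this returns 2 December 2017. Names that do not fit the pattern give None, so they can be set aside and handled by hand.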

Hope that helps…

MickeyT9 · December 24, 2020, 4:22am

I am also attempting to make the ‘jump’ from Mariner Paperless to Devonthink.

I’ve just about ‘had it’ with Mariner Paperless. As good as it is for a lot of things, I find there are so many problems with it, I’m just ‘over’ it.
But as with anything that I’ve invested a LOT of time and effort in creating, I’m both hesitant and confused about ‘how’ to make the change.

I have financial records going back to 2002 in Paperless. I’ve broken them up into years (pic 1), and then the years are broken out into months (pic 2). Our financial year starts in July and ends in June in Australia. It’s sometimes very confusing tracking back to documents because you can get the year wrong as it may be in the first half of one year, or the second half of the same year, in a different file.

(photo deleted)

So, the first thing I’m asking myself, is whether I consolidate ALL financial records in one file, or keep them broken up like this? One problem with Paperless is that I have to ‘guess’ what year I think the record is in, then, open that file and begin my search. I’m hoping that D.T. may be able to help me with this by having one ‘Financials’ database. Then perhaps, year by year, month by month?

Never having used D.T. before I have to figure out how to make my first database in this way and then, how to import the PDF documents stored in Paperless from a ‘sealed’ Paperless file.

For the most part, I have gone through the incredibly lengthy process of ‘Save Individual Receipts’ for each month (pic 3). But because of the mundane process, my records are NOT complete as standalone PDF backups. (1 or 2 years not saved as PDF).

(photo deleted)

The individual PDFs are stored on an external hard drive.

(photo deleted)

But, Paperless has used its own naming and date system to name the PDFs.

I see that mdubnick has suggested a workaround that I will investigate. Thank you.

Well, I just tried to post this reply but hit a forum ‘new user’ brick wall:

“New users can only embed one item”…

I’ll have to delete all but one of my pictures…

Blanc · December 24, 2020, 4:41am

Welcome! In my experience DT is worlds more powerful than Paperless. I made the switch too - basically I did the whole thing manually, renaming and regrouping the files as I went along, cursing myself for not having used DT in the first place.

Regarding your data, if I understand what you are doing correctly, you could e.g. put all financial files in one database, and use smart groups for calendar years (e.g. based on the creation date of the document; that is assuming you even need a collection by calendar year) and financial years (again, using the document creation date your smart group can collect documents dated e.g. July 2019 to June 2020). The same documents can be shown in as many smart groups as you want.
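To make the financial-year range concrete, here is a small Python sketch of the date arithmetic behind such a smart group. It is not DEVONthink code. The function name is made up, and the July start is taken from the Australian financial year mentioned earlier in the thread.

```python
from datetime import date


def financial_year_range(day: date, start_month: int = 7) -> tuple[date, date]:
    """Return the first and last day of the financial year containing `day`."""
    # Before the start month, the date belongs to the year that began last year.
    start_year = day.year if day.month >= start_month else day.year - 1
    start = date(start_year, start_month, 1)
    # The financial year ends the day before the next one starts.
    next_start = date(start_year + 1, start_month, 1)
    end = date.fromordinal(next_start.toordinal() - 1)
    return start, end
```

For a document dated 15 January 2020 this gives 1 July 2019 to 30 June 2020, which are the two dates to use as the creation-date bounds of the smart group for that financial year.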

DT. Is. Infinitely. More. Powerful. (although I have to admit I quit Paperless a few years back, it may have become more powerful since then. But DT is massive - you’ll find things in seconds).

MickeyT9 · December 24, 2020, 7:22am

Thanks!!

So, I’ve started importing the folders as it is keeping the categorization intact.

I’m not sure I fully understand what you mean about the ‘grouping’ just yet.

I’m limited to one picture as a noobie… I wanted to show the naming and dating which seems to be a problem. There are 3 x areas in D.T. that show dates for my PDF. But it seems the only place I can keep the date printed on the invoice, is in the D.T. title. It’s hard to explain in words. I’ll have to create another post to upload a picture.

rmschne · December 24, 2020, 8:01am

Before you go too much farther with DEVONthink, it is worth taking a look at the DEVONthink 3.6.1 Documentation, page 8, about “Groups”. There are Ordinary Groups and Smart Groups. Also skim/read “Take Control of DEVONthink 3”.

The manual might be installed with the app. It’s the same content as in Help but easier for me to point to a page number for you. You can get copies of both these documents at https://www.devontechnologies.com/support/download/extras

Put both into your new DEVONthink database.

Blanc · December 24, 2020, 1:04pm

In addition to @rmschne’s suggestions which I second, I would suggest you do the following (and note I use group, smart group and smart rule; as the manual will tell you, these are different entities):

1. set up a second database - call it test or something along those lines
2. duplicate a number of files from your Tax Records database to the test database
3. close your Tax Records database
4. play with the test database
– try changing the created date of a document to the invoice date printed on the document
– try doing the same with a smart rule (change creation date to document date)
– set up a smart group which only contains documents created from date x to date y

Each of your documents has a number of dates:

– the date printed on the document (the “physical date”)
– the date created (initially the date the file was put on your device)
– the date modified
– the date added (to DEVONthink)
– any custom dates you might set up

I personally change the creation date of every document to the physical date of the document on import. For many documents you can do that with a simple smart rule (using the action above); if that works reliably for the documents you are using, you’re home and dry. For some documents which put the date in an atypical position or which contain many dates, DT does not know which is the “right” physical date; automating those documents is more complex and requires scripting, regex or similar (but is very rewarding).

The advantage of putting similar records in the same group (say all invoices in one group, all tax return forms in another group) is that DT’s AI will pretty reliably suggest where to put the next document of that type which you import. If, however, you have 10 groups with similar invoices, the AI has no chance of knowing which of those groups is the one you want to use. So I would not be putting the documents in groups per year, but in groups per type of document. I would then (and actually have done exactly this) set up smart groups which e.g. look in your group called invoices and then display those invoices whose creation date is between x and y (matched to your tax year); as I pointed out, you can have multiple smart groups displaying the same documents.

Ideally you should decide on a structure before you continue, although I admit that I (and others in this forum) have changed my structure as I went along. That’s relatively simple, especially if you haven’t come too far along the line.

Whilst I’m sure you are aware of this, I really recommend playing with test files before using the same rules & actions on your operative files. There is no way to undo the actions of a smart rule, so testing on a subset is essential before you go and apply a rule to all your files. And that brings me to the importance of backups; this will not be news to you, but I think it bears repeating: it is essential that you have a backup strategy suited to the importance of your documents.

You will find details of backup strategies, smart rules, scripts and so on here on the forum. Do also feel free to post back with questions. You will not “learn” DT in a day - it’s more powerful than that.

SlickSlack · June 21, 2021, 9:53pm

I am biting the bullet and doing this conversion from Paperless to DT, with testing of course.
I have Paperless databases for each reporting year, broken up into folders for each category my accountant requires for each year’s return preparation. I just dragged a folder of one category from the Paperless main window to the DT main window of a new test db and all the docs popped over instantly.
Interestingly the fields I need for reporting and filing are all there in Keywords as value statements (vendor=MediaTemple, paymentType=Visa etc.)
Am I right in guessing that I should be able to script a method to extract those keyword values and automagically enter the values into custom metadata fields I have created?

BLUEFROG · June 21, 2021, 11:38pm

Yes, it’s theoretically possible to do but I have no such documents to test with.

SlickSlack · July 7, 2021, 6:01pm

I’ve tried to figure this out but haven’t got past my non-existent scripting skills.
There’s a PDF of a receipt here
When I bring it into DT it has the data I need to go into custom metadata fields but the data is in keywords.
The keywords field looks like this
(screenshot: Screen Shot 2021-07-07 at 1.57.13 PM)
I imagine that a script would have to strip everything up to and including the equals sign and enter the rest into the metadata field matching the bit before the equals sign.
Anyone want to point me in the right direction?
Thanks

BLUEFROG · July 7, 2021, 6:45pm

Note there is no attribute for Restaurant in your example.

Actually, the forward slash is treating this as a parent and child tag. I’m not sure how you’re getting the tag you’re showing.

What options do you have enabled in Preferences > Import?

Oh wait… is that screencap from Paperless?

SlickSlack · July 7, 2021, 7:12pm

The category is written as Meals/Restaurant in Paperless, which is a holdover from a previous accountant’s spec. I can bulk change all the files with forward (or back) slashes in them before I export from Paperless if that makes it easier.

That’s a screencap from DEVONthink → properties tab of the inspector. I imported that PDF after saving it as an individual file export from Paperless. I get the same result if I drag and drop a record(s) from the main window of Paperless into DEVONthink.

My import prefs are like this

BLUEFROG · July 7, 2021, 7:13pm

A hyphen would be better for this particular situation, provided you want to keep the category custom metadata with Meals and Restaurant.

Also, I would enable Preferences > Import > Tags: Convert Keywords to tags in this scenario.

SlickSlack · July 7, 2021, 7:18pm

That I can do, as it’s within my skills to change all the categories and preferences in Paperless and DEVONthink.

What I don’t know how to do though, is get the Keyword values, or tags for that matter, into custom metadata.

BLUEFROG · July 7, 2021, 7:22pm

Would there be any other tags on the PDFs or just these key:value pairs?

SlickSlack · July 7, 2021, 7:35pm

There are a few sales tax variations that I have to track as well as a subcategory (i.e. category=Software, subcategory=subscription).

This may be wild overconfidence on my part but if I can get a script to do it I can probably figure out how to adjust it for new or different metadata categories.

THAT LOOKS PROMISING!

BLUEFROG · July 7, 2021, 7:38pm

Here’s a smart rule script that processes the tags converted from the keywords. It also should preserve any other tags, if existing.

on performSmartRule(theRecords)
	-- Remember the current delimiters so they can be restored afterwards
	set od to AppleScript's text item delimiters
	
	tell application id "DNtp"
		set AppleScript's text item delimiters to "="
		repeat with thisRecord in theRecords
			-- Start a fresh list per record so plain tags do not carry over to the next record
			set tagList to {}
			set theTags to (tags of thisRecord)
			repeat with theTag in theTags
				set attribs to (text items of theTag)
				if (count attribs) = 2 then
					-- "key=value": store the value in the custom metadata field named by the key
					add custom meta data (item 2 of attribs) for (item 1 of attribs) to thisRecord
				else
					-- Anything else is kept as an ordinary tag
					copy theTag to end of tagList
				end if
			end repeat
			set tags of thisRecord to tagList
		end repeat
		set AppleScript's text item delimiters to od
	end tell
end performSmartRule

And the smart rule setup…

SlickSlack · July 7, 2021, 8:08pm

Holy s**t, that worked! Thanks so much.
I’ll have to very carefully figure out how to get the last few years’ records into DT and keep it all tidy and neat.
Then it’s making sure that adding new data is sticking to the same patterns and the data stays clean.
THANKS, AGAIN!

EDIT:
I’ll change the slashes to hyphens as well to avoid any weirdness.

Correct me if I am wrong but I would sum up the script as:

for each record,
for each tag,
split it at the equal sign,
everything before is 1,
everything after is 2,
make a Metadata field called 1 and set it to a value of 2
next tag,
next record,

BLUEFROG · July 7, 2021, 11:33pm

You’re welcome and yes your dissection of the script is correct. However, it also puts any tag that only has one component, i.e., no equals sign, into a list. That list is applied to the record as tags. This would preserve any potential tags.

Cheers!
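For readers who want to try the splitting logic without DEVONthink, here is a rough Python sketch of the same idea. It is an illustration only, not part of the smart rule. The function name is made up, and it works on a plain list of strings rather than on DT records.

```python
def split_tags(tags: list[str]) -> tuple[dict[str, str], list[str]]:
    """Separate "key=value" tags from ordinary tags.

    Mirrors the rule described above: a tag that splits into exactly two
    parts at "=" becomes metadata; every other tag is kept as a tag.
    """
    metadata: dict[str, str] = {}
    plain_tags: list[str] = []
    for tag in tags:
        parts = tag.split("=")
        if len(parts) == 2:
            key, value = parts
            metadata[key] = value
        else:
            # No "=" (or more than one): leave it as an ordinary tag.
            plain_tags.append(tag)
    return metadata, plain_tags
```

Given the tags vendor=MediaTemple, paymentType=Visa and a plain tag such as 2021, the first two end up as metadata fields and 2021 stays in the tag list. A tag with more than one equals sign is also left alone, which matches the “exactly two parts” test in the script.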