Paperless app Replacement (for Receipts)

Amazon_Rainforest · December 8, 2024, 4:04am

I have been using DEVONthink for about a week and I want to run this by the experts. What prompted me to buy this app was the end of support for the Paperless app and the DT sale price. I want to create a separate thread since the other threads I read didn’t address my implementation. I welcome all feedback and I am not averse to changing my implementation if there is a better one to be had.

Here is my replacement for Paperless:

Ran Paperless and exported the list of receipt column data as a TSV file.
Used DT to create a Receipts database in a specific location on my MBP (under Documents folder).
Copied all the PDF receipt files into the folder associated with the DT Receipts database. Note: I changed the directory structure slightly to use fewer folders and renamed the PDF files to something more meaningful.
I indexed this directory structure so that I could access it within the context of my DT database. Also, I did NOT want these receipt files placed into the database. I want to keep my Finder access since I will be manually putting future receipt files into this structure.
I created a Template for all the data columns. Note: I was disappointed at the lack of documentation on Template creation.
I created an empty Sheet file (Kind: TSV Document) using my new Template.
I opened the Sheet file with Excel and populated it by importing the TSV file exported from Paperless.
I saved the file and all my receipt records miraculously appeared in DT.
I made some modifications to the columns. For example, I created a numerical count column and manually numbered each line starting with the oldest item (item 1). I use this field to reverse sort the receipt records so it shows the newest one first.
I wanted to link each line with its PDF receipt file so I created a new column called “Receipt” and defined it as an “Item Link” field. I was able to use this field to point to a particular file in the PDF directory structure I had previously indexed.
Unfortunately, I could not do anything with this new field, i.e., I could not open the PDF file by right clicking. (It would have been nice to see an “Open” context menu item.)
I decided to change the field type from “Item Link” to “URL” and DT automatically replaced each item link entry with its equivalent URL. Nice!
Now I can single-click an entry in the Receipt column to select it, then right-click and use the “Open Link” context menu item. DT opens the PDF receipt file in a local viewer. From there I can use PDF Expert to further open/edit the file.
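The manual numbering step in the list above could in principle be done on the exported TSV before it is opened in Excel. The following is only a sketch: the column names `Date` and `Count` and the `MM/DD/YYYY` date format are assumptions, not taken from the actual Paperless export.

```python
import csv
import io
from datetime import datetime


def number_rows(tsv_text, date_column="Date", count_column="Count",
                date_format="%m/%d/%Y"):
    """Number records oldest-first, then return them newest-first as TSV.

    The column names and date format are assumptions for illustration.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)

    # Oldest record gets 1, the next oldest 2, and so on.
    rows.sort(key=lambda r: datetime.strptime(r[date_column], date_format))
    for i, row in enumerate(rows, start=1):
        row[count_column] = str(i)

    # Reverse so the newest record (highest count) comes first.
    rows.reverse()

    fieldnames = list(reader.fieldnames)
    if count_column not in fieldnames:
        fieldnames.insert(0, count_column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Sorting on the count column in DT would then give the same newest-first view without numbering each line by hand.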

I am attaching a screenshot of the final result. I have not finished manually linking each record to its PDF file.

I ran into some issues importing from Excel which I feel are bugs:

“Amount” values $1,000+ were munged and did not import correctly. I had to manually correct these. Any amount under $1000 imported OK. Note: could be that it chokes on the comma, e.g., “$1,234.99” fails.
I could not enter the € or £ currency symbols in the Amount field even though I defined it as currency. It would only accept the $ sign.
Most of the “Date” fields imported well but a few did not and required manual correction. For example, in Excel, a date was displayed as “11/23/24”. This was imported as “Nov 23, 24”. I had to manually add “20” to make it show “2024”. Only a few entries were like this, thankfully.
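One possible workaround for the amount and date problems above is to clean the values in the TSV before importing. This is a sketch, not a confirmed fix: it assumes the comma is a thousands separator, the dot is the decimal mark, and dates are in US `MM/DD/YY` order. Whether DT then imports the cleaned values as intended has not been tested here.

```python
import re
from datetime import datetime
from decimal import Decimal


def normalize_amount(text):
    """Strip currency symbols and thousands separators: '$1,234.99' -> '1234.99'.

    Assumes ',' is a thousands separator and '.' is the decimal mark.
    """
    cleaned = re.sub(r"[^\d.\-]", "", text)
    if not cleaned:
        return text  # nothing numeric found; leave the value untouched
    return str(Decimal(cleaned))


def normalize_date(text):
    """Expand a two-digit year: '11/23/24' -> '11/23/2024'.

    Assumes US month/day/year order. Unrecognized values are returned as-is.
    """
    for fmt in ("%m/%d/%y", "%m/%d/%Y"):
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
        return parsed.strftime("%m/%d/%Y")
    return text
```

Note that stripping the symbol discards the currency, so € and £ amounts would need a separate column if the currency matters.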

Other things:

I should not have to define my Receipt field as one type and then change it to another just to set it then use it.
If I click on a field, the line becomes highlighted and I can enter a value for that field. If I press , the field below the current field is opened and I can change it next. However, the line highlight does not follow where the open field is. The line highlight should always follow any field being edited.
There must be some way where I can go directly to any field on a line and right click and get that field’s context menu. Right now, the context menu for the entire file opens. It should not take a two-step process, i.e., select the field then right click.
Using Excel is better than using Apple’s Numbers app. If I make a change in Excel and save it, the change instantly shows up in DT. In Numbers, it asks me to specify a filename and place to save the updated file.
It would be nice to see a tooltip when I hover over a particular field in my file. Some fields are truncated and I would like to view the entire contents.

Questions:

Does anyone know how to display the full contents of a single highlighted line in a separate side pane? Editing there would be nice too.
Is there a way to quickly select and make my file read-only to prevent accidental changes and read-write when I want to add a new receipt?

That is all I have. I welcome all comments. Thanks!

DTLow · December 8, 2024, 4:45am

It would have been useful if the subject title indicated your Receipts subject

I also store my receipts in Devonthink; pdfs, emails, scans, …
tagged as required; date, amount, budget category, vendor, …
generic database, no special folder

I don’t maintain an ongoing spreadsheet
The receipt data in Devonthink is sufficient for my needs

I do maintain a monthly budget spreadsheet (screenshot on Imgur)
using an Applescript to refresh the transaction table from my Devonthink receipts data

In Numbers, it asks me to specify a filename and place to save the updated file

I use Apple Numbers for my budget spreadsheet
Stored in Devonthink
No problem with updates

BLUEFROG · December 8, 2024, 6:35am

Welcome @Amazon_Rainforest
I think you’re coming into the situation with a lot of preconceived notions and expectations, which seem to have led you down a rather circuitous path.

Note DEVONthink is not a replacement for Paperless in that it’s not a bespoke receipt application nor is it Excel/Numbers. However, there are many people who have transitioned to DEVONthink and happily use it for filing their receipts and financial documents.

I would start here in a new database and examine the results. Then proceed from there…

chrillek · December 8, 2024, 11:56am

In my words:

You create a folder ~/Documents/ReceiptsFolder;
You create a database Receipts in this folder (~/Documents/ReceiptsFolder/Receipts.dtBase2);
You copy all PDF receipts into ~/Documents/ReceiptsFolder, preserving the original folder structure;
Then you do something to your folder structure (still in ~/Documents/ReceiptsFolder) and the names of the PDFs in there.

All this begs one question: Why do you want to use DT? What do you expect from it over simply having your PDFs in a self-defined folder structure on disk?

Why? What is the point of having DT at all? (And I think your argument is circuitous: You decide to put everything in Finder, and then you decide that you don’t want to import into DT because you want to add things to your Finder folders – it all boils down to: “I want to put everything in Finder anyway”). But if you imported everything into DT, you could equally well put future receipts into DT. And you’d avoid a lot of the complications indexing entails.

It’s not that indexing is inherently bad, and there are certainly use-cases for it. But yours, as you describe it, isn’t one, IMO.

Also, having your files in the same directory as your database is, again IMO, not optimal. It pollutes the database directory with stuff that should be separated from the database. Moreover, if you “indexed that folder structure”, it is not clear if you indexed ~/Documents/ReceiptsFolder or a folder below that. In the former case, you’d be indexing the database into itself, which is at least weird.

I suggest that you take a step back and think through what you want to do with your receipts, what you need them for. And then figure out how DT might help you to achieve your goals. Trying to replicate the behavior and workflow of another app will probably cause a lot of unnecessary friction. And it’s also probably not even possible.

If you dive deeper into DT, you’ll see that you do not need Excel or a template or Paperless to create a summary sheet. That can be scripted from within DT, which is less convoluted than template, TSV, excel, DT, modify TSV etc. etc.

Amazon_Rainforest · December 8, 2024, 5:59pm

These are all valid points raised, thank you. I’ll try to explain further where I am coming from and the reasoning behind my choices.

I wanted a self-contained solution to replace Paperless, i.e., an app. Support stopped years ago and even though my copy works, it’s buggy and could stop working with the next macOS release. Then it will be too late to switch to something else. I chose DT not solely to replace Paperless but because I can also use it for future projects.

I needed more than the management of PDF receipt files. These files are secondary, necessary baggage, you could say. The more useful part is the receipt records.

A receipt record in DT contains information that is not found anywhere else (including its receipt file). For example: username, license key, references to other data, etc. Though I rarely look at a receipt file, I still need it. For update purposes, proof of purchase, warranty claims, etc. You can’t get that from a database record.

I indexed the PDF receipt files for easy Finder access and because I didn’t want to bloat the database with a lot of PDF files. Perhaps I should reconsider storing them inside the database. This is, after all, one of DT’s uses: document management.

There are other, some would say simpler, ways to manage my receipts. I was never looking for the easiest solution; I was looking for the best solution. Otherwise, I could use the existing receipt file structure and store my receipt records in an Excel spreadsheet. But for me, it is a matter of convenience. I don’t like using spreadsheets, and throughout my career (I’m retired), I used them only when I had to.

I don’t completely understand BLUEFROG’s comments. He’s right, DT–out of the box–is not a replacement for Paperless, but nothing says DT cannot be configured as a close approximation. What caught my attention when I was searching for Paperless alternatives was that DT could be used for a good many other things as well as a Paperless replacement.

BLUEFROG · December 8, 2024, 6:05pm

That’s all correct and nothing contrary to what I said.

Did you read the link I posted and perhaps try the method described there?

Amazon_Rainforest · December 8, 2024, 6:09pm

This could have something to do with whether the file being edited is stored in the database (as you do) vs indexing the file (as I do). Not really sure why it works for you and not me.

chrillek · December 8, 2024, 6:29pm

But then why are you mimicking a spreadsheet with this list of receipts? In DT, you could use custom metadata fields for your “username”, “license no”, “warranty”, whatever data. And you can search for these fields.

What you’re essentially doing is replicating information besides the documents they belong to. That’s error-prone because you must think of updating another something when a document changes. Or when a new document arrives. Or when you delete a document.

It would be nice if you quoted the part of a post you’re referring to – otherwise context gets lost. Here, you’re commenting on @dtlow talking about editing their data in a Numbers sheet. Which works perfectly fine from within DT. That is, you create an empty sheet wherever you want and import it into DT.

Then, when you open it from within DT (right-click, “Open with…” or just set DT’s preference to open the default app on double-click), you can do whatever you want with it in Numbers and save it afterward – it will still be in DT, with all the changes you made. No one will ask you for this file’s name ever again.

Amazon_Rainforest · December 8, 2024, 7:02pm

Yes, I already read this thread in my initial search of your forum. The solution is file based. Import the receipts then OCR scan them. The problem here is that the receipts contain no standardization. For example, price might be called amount, vendor might be called merchant. Then there is the problem of additional metadata, i.e., information which is not contained in a receipt file but is only known to Paperless. Would that data have to be manually entered into DT as a user-defined meta field?

My solution is record based. A record contains all the needed meta data imported from Paperless as well as a link to the receipt itself. Records never change; I only add more records.

I will investigate further this alternative implementation and see whether it will serve my needs. Thanks.

chrillek · December 8, 2024, 7:23pm

That might be scriptable. Paperless stores its data in a SQLite database, and people have published scripts here that read this data and use it to populate DT databases.
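The published scripts are not reproduced in this thread. As a rough illustration of the first step only, the sketch below lists the tables in a SQLite file and dumps one of them to TSV. The Paperless schema and the location of its database file are not described here, so no table or column names are assumed; the function names are made up for this example.

```python
import csv
import io
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def table_to_tsv(conn, table):
    """Dump every row of one table as TSV text, header line first."""
    # Only accept names that really exist, since the name is put into the SQL.
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    header = [column[0] for column in cursor.description]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)
    return out.getvalue()
```

It would be prudent to run this against a copy of the database file, opened with `sqlite3.connect(path_to_copy)`, and to look at the output of `list_tables` first to see which table holds the receipt records.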

And your “record” is something that exists independently of the data it records. It’s like a very basic relational database, not containing the real data, but only references to it and metadata. Which still is something you don’t need DT for. Or rather: You can use it with DT. But you go to great lengths to avoid DT’s advantages.

BLUEFROG · December 8, 2024, 7:32pm

You’re welcome.

The poster’s suggestion wasn’t about doing OCR on the documents. It was copying items to the desktop then importing into DEVONthink.

Will it make custom metadata? No. However, if DEVONthink’s Settings > Files > Import > Tags > Convert keywords to tags is enabled, it should yield tags on the documents if keywords are present.

Also, though it’s not our code, there was a code-based approach someone was messing about with earlier in that thread.

SlickSlack · December 8, 2024, 10:18pm

I’ve lost track of the OP’s reasoning behind their methodology, but when I made the move to DT I went with reproducing pretty closely everything I needed from Paperless in DEVONthink using custom metadata, tags, and groups. Maybe I had too many custom metadata fields at first, but once I got in the swing of it and did my first set of reports for tax purposes, I narrowed it down and it works pretty well.

The metadata from Paperless is also written into the keywords metadata of the PDFs that are exported from Paperless.

BLUEFROG · December 8, 2024, 11:32pm

Do you recall the format of the keywords, e.g., category:entertainment?

Amazon_Rainforest · December 9, 2024, 1:26am

This is interesting. Does DT do anything with PDF metadata?

BLUEFROG · December 9, 2024, 2:13am

Keywords can be converted to tags.

SlickSlack · December 10, 2024, 4:46am

This link is from a thread where you, @BLUEFROG helped me with this same situation. There’s a screen grab in there of the keywords of a PDF exported from Paperless and imported into DT. Although the keywords are visible in any PDF app I looked at, if I remember correctly.
There’s also a script you gave me that I used to turn those keywords into custom metadata.
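That script is not included in this thread. To illustrate the idea only, here is a sketch that splits keywords into metadata fields and plain tags. It assumes the keywords follow the `key:value` pattern of the `category:entertainment` example asked about earlier, which is not confirmed here, and it does not touch DT or the PDF itself.

```python
def keywords_to_metadata(keywords):
    """Split PDF keywords into (metadata dict, leftover tags).

    Assumes 'key:value' keywords, e.g. 'category:entertainment'.
    Keywords without a usable key and value are kept as plain tags.
    """
    metadata = {}
    tags = []
    for keyword in keywords:
        key, separator, value = keyword.partition(":")
        if separator and key.strip() and value.strip():
            metadata[key.strip().lower()] = value.strip()
        else:
            tags.append(keyword.strip())
    return metadata, tags
```

For example, `keywords_to_metadata(["category:entertainment", "2024"])` yields one metadata field, `category`, and one leftover tag, `2024`.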

BLUEFROG · December 10, 2024, 5:57am

Ahh, indeed. Thanks for the refresher. Crazy that was 3.5 years ago already