[help][about zip files]-Is it possible to decompress zip files by double-clicking them in DTP's database window?

BLUEFROG · September 1, 2020, 2:14pm

Threads are generally left open for future commenters. You don’t need to continue replying to the thread if you’re satisfied.

Cheers!

skip108 · February 16, 2023, 3:44pm

I frequently import blocks of nested folders (downloaded, not created by me) that contain a mixture of PDFs, JPGs, and ZIP archives (the latter of which generally contain more PDFs and JPGs).

Currently, I have to either 1) manually go through and expand all the archives before import, or 2) after import, drag out the ZIPs, unarchive them, and drag them back into DT.

As either a feature or a script request – can this be automated? A flag in Settings to expand imported archives would make sense. Alternatively - shouldn’t a script for this be fairly straightforward?

(or is there an option in Smart Rules for this?)

chrillek · February 16, 2023, 3:52pm

Kind of. You have to do the unpacking outside of DT to avoid polluting its database package. In any case, I’d write a simple shell script for that, no need for AppleScript or JavaScript.
Something like:

mkdir tmp
cd tmp
find .. -name "*.zip" -exec unzip {} && rm {} \;

And then import.
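
As written, the one-liner above will likely fail: the shell treats `&&` as its own operator, so `find` receives an `-exec` with no terminating `;` and `rm` never runs. A minimal sketch of one way around that is to wrap both commands in `sh -c`, assuming `unzip` is installed. The function name `expand_zips` and the example paths are hypothetical, not something from this thread.

```shell
#!/bin/sh
# Sketch: extract every ZIP below a folder tree into a scratch directory.
# expand_zips is a hypothetical helper name, not part of DEVONthink or unzip.
expand_zips() {
    src=$1   # top of the downloaded folder tree
    out=$2   # scratch directory that receives the extracted files
    mkdir -p "$out" || return 1
    # find descends into all subfolders by default.
    # sh -c keeps "&&" inside the command that find runs, so the archive
    # is removed only if unzip succeeded. -o overwrites, -q is quiet.
    find "$src" -type f -name '*.zip' \
        -exec sh -c 'unzip -o -q "$1" -d "$2" && rm "$1"' sh {} "$out" \;
}

# Example call (hypothetical paths):
#   expand_zips "$HOME/Downloads/batch" "$HOME/Downloads/batch-expanded"
```

Because everything lands in one scratch directory, files with the same name from different archives overwrite each other, so this fits best when the folder layout does not need to be kept.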

skip108 · February 16, 2023, 3:59pm

Thank you. Although I think (my shell scripting knowledge is only so-so) that this would only unpack the parent directory, right? As I mentioned, these folder trees I’m receiving may have a few levels of nesting.

How do I do that with the shell? Or do I need to use AppleScript for this?

chrillek · February 16, 2023, 4:03pm

Please check out the man page of find. Assuming is not helpful in this context.

Put the commands in a file, make it executable and feed it to the shell with sh file or simply ./file. The net is full of info on that, BTW.
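
A sketch of what such a file might contain, assuming the goal is to unpack each archive next to where it sits so the nested folder layout survives. `find` searches all levels of subfolders by default, so no extra option is needed for nesting. The file name `expand-in-place.sh` and the function name `expand_in_place` are hypothetical.

```shell
#!/bin/sh
# expand-in-place.sh (hypothetical file name)
# Usage: chmod +x expand-in-place.sh && ./expand-in-place.sh /path/to/folder

expand_in_place() {
    # Extract each archive into the folder that contains it, then delete
    # the archive, but only if unzip reported success.
    find "$1" -type f -name '*.zip' \
        -exec sh -c 'unzip -o -q "$1" -d "$(dirname "$1")" && rm "$1"' sh {} \;
}

# Do nothing unless a folder was named on the command line.
if [ "$#" -gt 0 ]; then
    expand_in_place "$1"
fi
```

An archive that itself contains further ZIPs may not be picked up in the same pass, so running the script a second time on the same folder is a reasonable precaution before importing.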

skip108 · February 16, 2023, 4:37pm

Yes, the “net” certainly is full. Of something, anyway.

And I couldn’t agree more in re: assuming. I did, in fact, consult the “net” at a rather high (if potentially unreliable) level. Before responding, I asked ChatGPT to interpret the shell script. The dialogue appears below:

please explain what this shell script does:

mkdir tmp
cd tmp
find .. -name "*.zip" -exec unzip {} && rm {} \;

This shell script creates a new directory called “tmp” in the current directory and changes the working directory to the newly created “tmp” directory.

Then it uses the find command to search the parent directory (indicated by ..) for files with names that end in “.zip”. The -name option specifies the pattern to match and *.zip is a wildcard that matches any string that ends with “.zip”.

The -exec option of the find command is used to execute a command on each file that is found. In this case, the command is unzip {} && rm {}, which unzips the file and then deletes it. The {} is a placeholder that is replaced with the name of each file found by the find command.

The \; at the end of the command is used to terminate the exec command.

So, in summary, this shell script creates a new directory called “tmp”, changes to that directory, finds all the files in the parent directory that end with “.zip”, unzips each file, and then deletes it.

chrillek · February 16, 2023, 5:01pm

Thanks for letting me know. I’ll simply ignore postings referring to ChatGPT from now on.

If people really prefer this software to human interaction and consulting the relevant sources themselves – so be it.

I’m sure ChatGPT will eventually figure it all out.

BLUEFROG · February 16, 2023, 7:03pm

“A flag in Settings to expand imported archives would make sense.”

Technically speaking, it wouldn’t.
And no, there is no built-in decompression function. DEVONthink is not a filesystem like the Finder. It is a database application that presents data in a familiar way, i.e., like you see it in the Finder. Many of the similarities end there.

skip108 · February 17, 2023, 6:02am

I don’t wish to be argumentative, but maybe the purely technical perspective is not the best way to look at this question. I’d like to make the case (as I often have for electronic health records) that the technology should follow human workflows, and not the other way around.

Folks use DEVONthink as a means for capturing and organizing all sorts of data. And in many respects, DT is a terrific tool for this. The web clipper, for instance, is exceedingly useful and ergonomic. And the tools for transforming data are equally robust (e.g., the OCR and PDF-to-text functionality). If DT recognizes a file type, it can generally handle it and manipulate it in a versatile variety of ways…which both mirror the OS file system and improve significantly on it.

What seems oddly out of place, then, is the fact that DT will import ZIP (and other archive format) files, and it recognizes the filetypes correctly…but then is unable to do anything at all with them. The native file system readily packs and unpacks them. And folder automators like Hazel have Archive and Unarchive functions. HoudahSpot can search inside ZIP archives (presumably because Spotlight can, too). Why then would DT, which is such an intelligent file handler for so many different filetypes and extensions, treat them as inoperable objects?

One purely prospective solution (which I just developed, tested, and used all day today) was to create a Hazel folder (a rule) that automatically unarchives everything you put in it, no matter how many levels of children the parent folder has. If I get a complex folder structure with embedded ZIPs, I can “launder” it merely by dropping it into the Hazel folder before dragging it into DT.

I also dragged some groups out of DT and into the Hazel folder to expand their nested archives…then dragged the whole structure back. Obviously, this wouldn’t be hard to automate, as long as the brief lag period for Hazel’s rule activation was accounted for. Or I suppose I could even use this folder in place of the Global Inbox in the Finder, and add a rule to forward files to the Inbox after unarchiving. But this feels like a pretty awkward, non-ergonomic solution to something that DT could do natively.

I realize that DT is, first and foremost, a document manager – but it handles all sorts of filetypes that are “document-adjacent” – emails, bookmarks, images, even MP3s. And documents can certainly live in compressed archives. It sure would be nice for DT to be able to manage these without having to pre-process or “launder” them with other programs.

Thanks for listening. I really appreciate all that you do for this community.

Skip

rmschne · February 17, 2023, 6:22am

Me, I just put DEVONthink in a window on the left, and Finder in a window on the right. The integration “app” is the macOS, not DEVONthink. Trying to keep it simple.

BLUEFROG · February 17, 2023, 12:09pm

Not argumentative at all and thanks for the nice comments!

However, this has to be looked at from a technical perspective.

“Why then would DT, which is such an intelligent file handler for so many different filetypes and extensions, treat them as inoperable objects?”

Again, because DEVONthink isn’t a filesystem and also ZIP files aren’t merely simple files. Compressed files must be decompressed before they’re imported. That decompression has to happen in the filesystem, i.e., the Finder.