This is controlled by the web site. If they send you huge images, then … I use PDFPen to “Create Optimised PDF…” Alternatively, one can use Ghostscript to compress PDFs, a method I hope to automate when I get time.
GS has options to compress files. As it’s command-line driven, it provides an opportunity to write some code to automate it.
I just haven’t gotten around to it other than to test feasibility. I think it is feasible, but other priorities keep me from progressing the idea. In the meantime, I save many things in Markdown, which strips away images and replaces them with links back to the site. If I really want to save the images, I convert the MD to PDF, compress, and save again. I don’t make a big deal of it.
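For anyone who wants to try the Ghostscript route, here is a minimal sketch of what such automation might look like in Python. It is not a finished tool: it assumes the `gs` executable is installed and on your PATH, and the function names `build_gs_command` and `compress_pdf` are made up for illustration. The `-dPDFSETTINGS` presets are Ghostscript's own; `/ebook` is just a middle-of-the-road starting point.

```python
import subprocess

# Quality presets that Ghostscript's pdfwrite device accepts for -dPDFSETTINGS.
PRESETS = ("/screen", "/ebook", "/printer", "/prepress", "/default")


def build_gs_command(src, dst, preset="/ebook"):
    """Return the Ghostscript argument list that rewrites src into dst."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        "gs",                        # assumes Ghostscript is on PATH
        "-sDEVICE=pdfwrite",         # write a new PDF
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS={preset}",   # overall quality/size trade-off
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={dst}",
        str(src),
    ]


def compress_pdf(src, dst, preset="/ebook"):
    """Run Ghostscript; raises CalledProcessError if gs reports a failure."""
    subprocess.run(build_gs_command(src, dst, preset), check=True)
```

Calling `compress_pdf("bloated.pdf", "smaller.pdf")` would then write a compressed copy next to the original. Writing to a separate output file rather than overwriting is deliberate, so you can compare the result before deleting anything.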
As you may already know, the biggest contributor to the file size of a page converted to PDF is the presence of embedded media, such as images. Sometimes it’s not obvious that a page will turn into a very large PDF file. It’s basically impossible to tell at a glance how large the images on a web page really are: the page layout may scale an image to a certain size and it will look small to the eye, but the actual image (the one fetched from the server and sent to your browser) may be very large.
In theory, it would be possible to downscale images during the HTML-to-PDF conversion process, but most converters that I’ve encountered don’t offer control over that. There are some tools that can be used to post-process an existing PDF file. @rmschne mentioned a couple above; as it turns out, Preview on macOS can also do it, and when I tested it with your sample web page, the file went from 7.1 MB to 890 KB – which is pretty darn good!
You still have to open the file manually in Preview, export, choose the Quartz filter for reducing the file size, and save the result (you can overwrite the original file), but maybe that’s not too bad for dealing with occasional bloated PDFs. If you want to control the compression parameters, a posting on the Apple Stack Exchange site from 2020 demonstrates how simple it is to create a new filter using the ColorSync Utility in macOS.
There are online tools for compressing PDFs too; Adobe even offers a free service, but I would not recommend uploading sensitive material to any third-party site like that.
Thanks for that. I used the Preview Quartz Filter for a while, but I found that the results were often fuzzier than I preferred. PDFPen does a pretty good job and gives more control over how the compression is done. The settings I use are as shown below. Yes, it’s an app that one has to pay for. Apple’s Preview is “free” with macOS. As @mhucka says, there are others.
Maybe it’s time to dust off my little project to automate this with Ghostscript. It’s also “free” and can be driven by a script. I’d be looking to get the compression as good as PDFPen’s.
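If fuzziness is the concern, one possible approach with Ghostscript is to set the image resolution explicitly instead of relying on a preset, and to keep the compressed file only when it actually comes out smaller. The sketch below is just that, a sketch: it assumes `gs` is on your PATH, the helper names `downsample_flags` and `compress_in_place` are hypothetical, and the 150/300 dpi values are guesses to tune by eye, not PDFPen's settings.

```python
import subprocess
from pathlib import Path


def downsample_flags(color_dpi=150, mono_dpi=300):
    """Ghostscript pdfwrite flags that set image downsampling explicitly."""
    flags = []
    for kind in ("Color", "Gray"):
        flags += [
            f"-dDownsample{kind}Images=true",
            f"-d{kind}ImageDownsampleType=/Bicubic",
            f"-d{kind}ImageResolution={color_dpi}",
        ]
    flags += [
        "-dDownsampleMonoImages=true",
        "-dMonoImageDownsampleType=/Subsample",
        f"-dMonoImageResolution={mono_dpi}",
    ]
    return flags


def compress_in_place(pdf, color_dpi=150):
    """Replace pdf with a compressed copy only if the copy is smaller.

    Returns True when the original was replaced, False otherwise.
    """
    pdf = Path(pdf)
    tmp = pdf.with_name(pdf.stem + ".tmp.pdf")  # temporary file beside the original
    cmd = [
        "gs", "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dQUIET",
        *downsample_flags(color_dpi),
        f"-sOutputFile={tmp}",
        str(pdf),
    ]
    subprocess.run(cmd, check=True)
    if tmp.stat().st_size < pdf.stat().st_size:
        tmp.replace(pdf)  # compressed copy wins
        return True
    tmp.unlink()          # no gain, keep the original untouched
    return False
```

Raising `color_dpi` should trade file size for sharper images, which is the knob to experiment with when comparing the output against PDFPen.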
I think that might be the result of the default parameters in the Quartz filter for reducing file sizes. With the ability to control them by creating another filter definition (mentioned above), it might be possible to improve on that.
(PDFPen or Ghostscript are also fine. The integration of Quartz and Preview into macOS may offer some advantages, but third-party tools can offer better or different features.)