Search through text of a webarchive impossible in DTTG?

Solar-Glare · November 8, 2022, 11:50am

I might be overlooking a feature somewhere, but how does one search through the text of a webarchive after it’s opened?

Specifically, large, extensive webpages like national laws. Here’s an example of the German Civil Code:

Following the update to iPadOS 16.1, Safari is very - and I mean very - slow at finding words in the text of large webpages and PDFs.

I thought I could work around that by creating a webarchive and importing it into DTTG, only to find out that it appears to be impossible to search through the webarchive in DTTG. When I convert the webarchive to PDF, however, I can find words immediately.

chrillek · November 8, 2022, 12:42pm

Why do you save law texts as web archive? Wouldn’t a pdf be the more natural choice?

Solar-Glare · November 8, 2022, 10:36pm

Because such files can contain more information than the PDF and have a better layout.

But the main question is: why isn’t it possible to search through a webarchive?

It sounds terribly easy to me.

chrillek · November 9, 2022, 7:22am

Web archives are essentially HTML. So, for a sensible search, one has to filter out all HTML elements. Not something I’d consider “terribly easy”, but feasible.
You’re aware that web archives are an Apple-only thing and that Apple announced their deprecation? Though they’re apparently still there…
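
To illustrate what that filtering involves, here is a minimal Python sketch. It is not how DTTG does it (that isn’t described in this thread). It assumes the usual webarchive layout, a property list whose `WebMainResource` entry holds the page HTML in `WebResourceData` and the charset name in `WebResourceTextEncodingName`. The names `TextExtractor`, `html_to_text`, `webarchive_to_text` and `contains` are made up for this example.

```python
import plistlib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside <script>/<style>.
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    """Strip the markup and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def webarchive_to_text(raw):
    """Extract the main page's text from the raw bytes of a .webarchive file.

    Assumption: the archive is a plist with a WebMainResource dictionary.
    """
    archive = plistlib.loads(raw)
    main = archive["WebMainResource"]
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    html = main["WebResourceData"].decode(encoding, errors="replace")
    return html_to_text(html)


def contains(raw, needle):
    """Case-insensitive search in the extracted text."""
    return needle.casefold() in webarchive_to_text(raw).casefold()
```

You would read the file with `open(path, "rb").read()` and pass the bytes to `contains`. Note that text chunks are joined with spaces, so a word that is split by inline tags would not be found. This sketch also ignores subresources and subframes.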

cgrunenberg · November 9, 2022, 7:49am

And Apple even added new APIs for generating web archives in macOS 11.0, i.e. after the deprecation.

chrillek · November 9, 2022, 8:50am

I like how consistent they are.

Camfella · January 30, 2023, 8:11pm

Hi, I have the same problem. I’ve been saving all my Safari webpages as webarchives in DTTG for years, and I’m positive I used to be able to search them in DTTG. Now, not only can I not see how to search within them, but I don’t think they’re even being indexed and included in a global search anymore. Did something change? Did you ever find out anything? I don’t want to have to convert every single file.

eboehnisch · January 31, 2023, 4:24pm

Version 3.6.3 has a bug that might keep you from searching the content of documents. Please try prefixing your search with content: and see if it produces a result. If it does, version 3.6.4 fixes it.

Camfella · January 31, 2023, 8:10pm

Yeah, using the prefix works. Without it I’m only getting results from file names, no matter what format. Thanks for the help.

eboehnisch · January 31, 2023, 9:23pm

Thanks for confirming. Expect an update in due course.