web interface - not for automatically generated docs

EricMc · January 19, 2009, 6:07pm

I use DT Pro Office with one db that contains mostly automatically scanned and OCR'd documents. I don’t organize these documents or change their automatically generated file names. I rely on search to access them based on their contents.

With DT Pro Office 1.5, the web interface started with a search box and then went to a list of results. This paradigm is well suited to accessing these automatically scanned docs. It did have some issues that hampered usability, though: the limited search functionality (addressed with 2.0), the small thumbnails in the results, and the inability to sort the results. Even so, it was usable.

I tried the 2.0 beta. The web interface addresses the search issue, but takes steps back in usability for the automatically generated doc type of db.

First, starting up with a list of all docs is problematic on a db with thousands of docs. This locked up Safari when I tried it. I had to create a new, very small, db to test out the web interface.

Second, the results list shows file names and icons. With a db full of PDFs whose file names are just timestamps, this makes it impossible to distinguish one doc from another unless you happen to know the exact time the doc you are looking for was scanned.

Third, if there is only one db to be shared, why take up ~20% of the available real estate with a list of dbs?

Bill_DeVille · January 19, 2009, 11:23pm

As a test, I just opened, via Web sharing in Safari, a group that contains 9,295 items. I had to tell Safari several times to continue loading, but it eventually (in about a minute) made all the items available. Pretty RAM-intensive, though. With 4 GB RAM installed and about 1.4 GB free when I told Safari to do that, I started getting Virtual Memory pageouts.

On your second point: true, without informative file names, the only information one has about the items displayed in the search results list is that they met the conditions of the search query.

I do a lot of scans, and do not by any means rename all the captured items. But for things that are likely to be most useful, I’ll use the contextual menu option Set Title As to quickly give a meaningful name based on selected text. (I always turn off the Set attributes option in Preferences > OCR, as that lets me get a lot more scans done in a session; there’s a tradeoff.)

As for the list of databases, that’s a fair point, but I find that most of the time I’m sharing 3 or more databases, and it’s useful to identify them.

EricMc · January 20, 2009, 2:34pm

Well, yours worked better than mine. My DB was ~2500 items and I gave it more than a minute. Even so, waiting a minute (while the rest of Safari is non-responsive) and telling it to continue several times is not really usable. For 2) and 3), I’m not making any claim that my usage pattern is the only one, only pointing out that the new interface breaks the way I use the tool.

With 1) and 2) the way they are now, it is unusable for me. We use it as a web document server for automatically scanned docs, a use that, I think, 2.0 is still advertised as supporting. It doesn’t really support this, though, unless you rename your docs, which was not required with 1.5.

eboehnisch · January 28, 2009, 7:10pm

You can also easily use the older interface by adding /search to the web server’s URL. (Try clicking the “Simple” button in the new web interface; it takes you to the search-only one.)
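For anyone who wants to bookmark or script that address, here is a minimal sketch of building it. The host and port below are placeholders, not the real address of any particular setup, and the helper name `simple_search_url` is made up for illustration; the only thing taken from the post above is that /search is appended to the server's URL.

```python
from urllib.parse import urlsplit, urlunsplit


def simple_search_url(base_url: str) -> str:
    """Append /search to the web server's base URL (hypothetical helper)."""
    parts = urlsplit(base_url)
    # Drop any trailing slash so we never produce "//search".
    path = parts.path.rstrip("/") + "/search"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# Placeholder address; substitute the host and port your web server uses.
print(simple_search_url("http://localhost:8080/"))
```

Bookmarking the resulting address should open the search-only page directly, without loading the full item list first.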

EricMc · January 30, 2009, 8:27pm

That’s good to know. I did try beta two, and the switch to showing only 50 docs in the list at a time made it possible for me to use the large database. Thanks!

Now, if it had larger thumbnails and the ability to sort results, it would be nearly perfect. I’ll go ahead and upgrade now and try the scanning and OCR changes. Thanks for all the effort on the new version.