Why are we still reading? Moving from text tags to "Visual Speed-Reading" (The Cover Flow + AI Icon Concept)

Hey everyone,

I’ve been thinking about a massive bottleneck we all face in apps like Obsidian, DEVONthink, Notion, and Scrivener: information overload caused by text. Right now, if you have 10,000+ notes or documents, you are forced to organize them using nested folders, endless text tags, or search queries. When you browse through them, your eyes have to physically read the “hieroglyphs” of fonts. It’s slow, it causes immediate eye strain, and it completely ignores how our brains evolved to process information.

We need to stop thinking like 1990s database engineers and start thinking like animators.

The Core Idea: The “24 Frames-per-Second” Visual Interface

Humans process visual images up to 60,000 times faster than text. The goal is to create or revive a high-speed file browser—similar to Apple’s old Cover Flow—but supercharged with AI-generated dynamic visual anchors.

Instead of reading filenames or small tag lists, a user should be able to scroll through 100–200 file previews in 30 seconds, catching dynamic changes like frames in a cartoon.

Here is how the ergonomic layout works:

  • Static Anchors (The Center/Background): Large, highly recognizable silhouettes or colors that stay identical across a whole project/category. Your eyes don’t analyze them; they serve as a spatial anchor.

  • Dynamic Triggers (The Four Corners): High-contrast, monochromatic, dead-simple icons (ideograms) placed with pixel-perfect consistency.

    • Example: If you scan a legal/investigative archive, you don’t read status updates. One corner flashes a handcuff icon (arrest), another flashes a cash icon (bribery), or a mole silhouette (treason).
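For anyone wondering how that layout might look as data, here is a minimal sketch. It assumes one static anchor per card and four fixed corner slots; the names `Cover`, `CORNERS`, and `set_icon` are made up for illustration and do not come from any existing plugin.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical names throughout; this is not any existing plugin's API.
CORNERS = ("top_left", "top_right", "bottom_left", "bottom_right")


@dataclass
class Cover:
    """One preview card: a static anchor plus up to four corner icons."""
    anchor: str  # identical across a whole project/category
    corners: Dict[str, str] = field(default_factory=dict)  # corner -> icon

    def set_icon(self, corner: str, icon: str) -> None:
        # Fixed corner names keep every icon in the same place on every card.
        if corner not in CORNERS:
            raise ValueError(f"unknown corner: {corner}")
        self.corners[corner] = icon


cover = Cover(anchor="legal-archive")
cover.set_icon("top_left", "handcuffs")  # arrest
cover.set_icon("top_right", "cash")      # bribery
print(cover)
```

Restricting icons to a fixed set of corner names is what would give the "pixel-perfect consistency" part: a given corner always carries the same kind of signal.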

Why AI Makes This Possible Now

We don’t need developers to manually draw 5,000 different icons. Local or integrated AI can scan the data/text inside the note, automatically understand the context, and “stamp” the appropriate visual ideogram onto the file’s preview cover.
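To make the "stamping" step concrete, here is a rough sketch. Note that it swaps the AI for plain keyword matching, purely as a stand-in so the example runs on its own; the rule table `ICON_RULES` and the function `stamp_icons` are hypothetical, and a real version would presumably call a local model instead.

```python
import re
from typing import List

# Hypothetical rule table: ideogram name -> trigger words.
# A keyword lookup stands in for the AI classifier here.
ICON_RULES = {
    "handcuffs": {"arrest", "arrested", "custody"},
    "cash": {"bribe", "bribery", "kickback"},
    "mole": {"treason", "espionage"},
}


def stamp_icons(text: str, max_icons: int = 4) -> List[str]:
    """Return up to max_icons ideograms whose trigger words occur in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = [icon for icon, triggers in ICON_RULES.items() if words & triggers]
    return hits[:max_icons]  # at most one icon per corner


print(stamp_icons("The suspect was arrested after accepting a bribe."))
# -> ['handcuffs', 'cash']
```

The cap of four matches the four corners of the cover; anything beyond that would need a priority rule, which this sketch does not attempt.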

The Tragedy of Modern UI

Apple abandoned Cover Flow years ago (partly due to patent wars that are now long over), and modern UI design fell into the trap of “sterile minimalism”—flat tables, tiny fonts, and chat-bots where you have to type text to find text. It’s a loop of fatigue. We are on the verge of a visual breakthrough where we should be using fast, instinctive symbols (like modern Egyptian hieroglyphs) instead of boring strings of text.

Imagine a plugin or a dedicated media-player for your vault/database that lets you fly through thousands of documents using purely peripheral vision and color-coded reflexes.

Why hasn’t anyone built a modern, AI-driven visual stream plugin for our favorite PKM (Personal Knowledge Management) tools yet? Is anyone else craving this level of speed, or are we just doomed to read text tags forever?

Would love to hear your thoughts or if any developers here see a way to implement this via canvas/3D plugins.

you are forced to organize them using nested folders, endless text tags, or search queries.

You aren’t “forced” to do anything. Those are available options, and one or a combination of them allows for fast and efficient organization and retrieval of your information.

When you browse through them, your eyes have to physically read the “hieroglyphs” of fonts. It’s slow, it causes immediate eye strain, and it completely ignores how our brains evolved to process information.

You are talking about two very different things here: If I am in the savannah, I need movement and edge detection to avoid that lion in the grass. I don’t need to know the species or subspecies of the animal about to eat me. Similarly, I don’t need to know the scientific name of the mushroom that made me sick - only the shape, color, and texture.

Processing documents and information is not a primitive survival skill. It is high-level processing of visual indicators including the text to identify, show relationships and context, signify importance, etc.

You can already apply at-a-glance metadata, e.g., color labels, flagged state, ratings, etc. for these quicker reactions to your data. However, this is a first step in a refining process, not the end of the reaction. I will notice a group I labeled with red but unless I KNOW it’s the only such group, I don’t blindly choose it. I have to confirm or deny it’s the one I want by reading the name.

PS: Unless it’s a foreign character set or in cases of some infirmity, we don’t read character by character, trying hard to piece together each word. We identify silhouettes of words and even phrases to read and our brains are incredibly good at this. Why do you think “speed-reading” is a thing? While some of it is snake oil, some techniques are scientifically shown to be effective.

As a personal example, I don’t “speak German” but I can visually identify many German words or phrases at a glance. I can skim a support ticket in German and my eye picks up on these silhouettes so I get a feel for what’s going on. In some cases, it’s even enough for me to draft a response.

PPS: Not to rain on your parade, as people are welcome to make suggestions and we DO read them. Just noting there are some parts of the argument that aren’t quite accurate. 🙂

Not with any degree of nuance it can’t.

To use your example, it might be able to identify “arrest” items, but that’s only the beginning of any archival search. Who is being arrested? By whom? For what alleged crime?

And then we’d also need more than four corners with divorce, copyright, patent, traffic litigation…
I already don’t grasp the meaning of 90% of the emojis. What would be the advantage of introducing even more icons, regardless of how they are generated? Not to mention cultural differences – red might indicate warning in Western societies, while it stands for luck and/or money elsewhere.

Not to mention all the SF symbols in Tahoe’s menus.

In India, libraries apparently use a classification system based on six degrees, like a hexagon. That, however, requires knowledge of the six dimensions and a good understanding of which point on each axis represents which aspect of the classified item. And it’s certainly more than a simple icon.

Ah, I am still blissfully ignorant of all things Tahoe.

To kind of touch on a little bit of what everyone said and not to rain on your parade like @BLUEFROG said, this is not something I would be personally interested in at all.

You suggest, e.g., a glyph system but glyphs only go so far. So, maybe I understand this certain glyph means it’s a divorce contract underneath, but that glyph tells me literally nothing about what that actual divorce contract is about. I need to then go underneath the glyph to the text to learn more. That initial step was just a waste of my time.

What you’re proposing is just adding another layer of required memory not having anything to do with understanding what it is you’re actually processing. With that glyph system, all you’re doing is mentally and spatially categorizing with “speed.” But the reality is that speed is getting you nowhere fast.

Thank you for such a detailed response and feedback! I completely agree with your P.S. and P.P.S. — we do read word silhouettes and phrases, and the brain does this incredibly fast. It’s also great to know that the team actually reads and analyzes user suggestions.

Let me clarify my point a bit, as the savanna lion metaphor might have sidetracked us.

I am not claiming that information processing is a primitive survival skill. On the contrary, it is a highly sophisticated process. But precisely because it is so complex, the brain always looks for ways to minimize cognitive load.

Here are three points where I feel the current options (folders/tags/search) create unnecessary friction:

  1. Recognition vs. Recall: Search is powerful when I know exactly what I am looking for (a specific keyword or title). However, searching requires me to recall the wording. A visual interface (with good previews, cards, spatial separation) allows me to recognize a document even if I forgot its exact name. The brain expends significantly less energy on “seeing and recognizing” than on “recalling, typing, and filtering.”

  2. Glanceability: Yes, we read word silhouettes. But to understand the context of a text-based list, I still need to scan the lines sequentially (even if quickly). Visual cues (document thumbnails, specific layouts on the screen, unique covers/previews) are processed in parallel and subconsciously. I can instantly assess the structure of a project with 10 documents visually, but I would have to actually read through a list of 10 text strings to achieve the same.

  3. Spatial Memory: Folders and tags are artificial structures. In real life, we excel at remembering that “that specific contract was in the far right corner of the desk, under the blue folder.” If an interface gives elements individuality (unique previews instead of identical icons) and allows spatial arrangement, it leverages our evolutionarily powerful topographical memory.

By using the word ‘forced,’ I only meant that currently, the text-based path is practically the only deep way to interact with the archive. What I am advocating for is synergy: where the high-level text processing (which you mentioned) is supported by powerful visual and spatial cues, reducing cognitive fatigue during long working hours.

What are your thoughts on utilizing spatial memory in document organization?

This is not actually true. I had occasion to test this recently, trying to recover a vaguely-remembered paper from seven years ago from my DT archive. Search got me a list of about 15 candidates, skimming the titles did the rest.

This is an interesting concept. I certainly rely on spatial memory to find things in the Real World, and I’m fond of mind-mapping and concept-mapping tools for navigating digital information. It’s really hard to scale those ideas to hundreds or thousands of items, though, in either the physical or digital world. For smaller quantities, manual organization is both more tractable and more effective: if I construct a map myself, it ties directly to the conceptual structure I’m building in my head. If AI does it, then all I have is the visual artifact, which may or may not match my own concept of the material.

I believe you’re thinking of Borges’ short story, “The Library of Babel”. Fiction, as opposed to Ranganathan’s “Colon Classification” scheme. Real, but not intuitive.

This OP is closer to the Spatial Data Management System built at MIT’s Architecture Machine Group in 1977 — “Dataland” — where users flew over a visual landscape of their documents using a joystick. DARPA funded it. Nobody shipped it.

For spatial relationships, TheBrain (esp. TheBrain 15) is a good approach. Not better than DEVONthink, just different.

You’re right, I was thinking of the faceted classification. We talked about this at university, many moons ago.

Perhaps I am just being unimaginative, but I don’t really see how this would add value to my own life (which isn’t to say it wouldn’t add value for someone else, of course). It seems like the implied assumption in the question (“how to utilise spatial memory in document organisation”) is that we’re not using spatial memory in DT (or digital document organisation more generally) currently, and I don’t agree with this.

For context, my main use-case here is my database as my research library.

I have a formal folder/group structure in my database (by choice!). I know where things are filed. In an example like the one shared where you can remember what a document looks like but not what it is called, I already know how to navigate to where it’s likely to be and which files it’s likely to be “near”. And for me, seeing the title in the list and the front page once I’m in the vaguely right area is enough - and DT is already providing that. And like @kewms I’ve had occasion to test this (I’m sure it happens to all of us at some point!) and the combination of DT’s search, my structure and what I remember was enough to get me to the right document within a reasonable amount of time.

I think I’d also like to challenge the fundamental premise behind the whole concept as well. Why is “speed” necessary, or what is to be gained by improving it? I can navigate to the correct folder in seconds if I know specifically what I’m looking for. If I don’t know what I’m looking for, I can navigate to the area of interest in a few minutes with a search or some clicks (I call this “mooching about”, like a cat 😺). It’s when I’m thinking about a question, but I don’t really know what information I need to answer it. It’s the digital equivalent of browsing along bookshelves in the relevant area of a library. The very act of “mooching” is part of the thinking process. I don’t gain anything by having the right text “magically” placed in my hands, and that’s not what this concept is offering anyway. The point of managing my database is to enable me to do research and think, and speed is very rarely a part of that process, I find.

(It’s probably obvious, but I’m part of Team “But what is lost when we automate everything?”. I consider friction a vital part of our lives and for knowledge careers it can often be where the magic that makes us human starts to happen.)

I think you are probably also talking about the value of finding things that you weren’t expecting while looking for something else. In my research I once casually turned to the bibliography of a book, which led to nine months of work on material I hadn’t known about.

I also think that a lot might depend on the field a person is working in. When I was researching in history it wasn’t just the document I was looking for, it was something buried deep in a document. And I usually didn’t know what I was looking for until I read it.

What you describe does not make much sense to me. First, many documents look similar as previews. Think US court docket documents, invoices, bank statements, tax returns – there is nothing distinctive that would permit “Recognition vs. Recall”.

The same goes for “glanceability” – where would the unique covers/previews come from if the documents look more or less the same? Even scientific articles are very similar now, with many people using (La)TeX to produce them. I can easily recognize that they worked with this software, but in order to understand what the text is about, I have to read at least the abstract.

Perhaps. Perhaps not. How is that important? If the brain needs more energy, it tells me to eat something (preferably something healthy).

They are abstractions of very real structures. And while you may remember that a certain contract was “in the far corner of the desk, under the blue folder”, this kind of memory collapses if you move the blue folder. Or put a red one on top of it, concealing the blue one. This concept just does not scale if you have 100 contracts.

Again: How would the interface do that with documents that look essentially identical?

Unfortunately, I don’t see any point in continuing to communicate… I started writing in Ukrainian to better express my opinion… but the administrator started threatening me with a ban due to international standards, which are very strict… if the language of communication is a problem that annoys creators, then why bother about software progress and visualization proposals… sorry, it’s not my fault that I won’t respond to you anymore… I hope this isn’t a big loss for everyone))) Hello democracy!!!

We didn’t threaten you in any way:

Well, the language of this part of the forum is English. If you have difficulties expressing yourself in it, why not use a translation service like DeepL or Google before posting?

Obviously, it is a lot easier and more sustainable if you translate your text once than having everyone else who wants to understand it translate it on their own. And if you’re afraid that nuances get lost in translation – well, that’s how it is. It will happen to readers translating from Ukrainian just as it will for you.

But forcing everyone else to do your work and then complaining about nonexistent “threats” and linking that issue to “software progress” is taking it a bit too far, imo.

Having been there, since my early years as an Interaction Design master’s student at Carnegie Mellon University (loooong ago), and always interested in what MIT’s Media Lab and many other groups were doing, and having read countless papers on search vs browse vs information visualization, etc., etc…

I am still amazed at how people keep proposing new magical ways of doing things.

I will just say that people are very different, individually, in how they grok things, and that the context and tasks make an enormous difference in terms of how information is presented and how it can be consumed (from perception to cognition to long-term storage).

Why are we still reading?

Because language is still critical. Not to say that non-verbal information isn’t.

The lion(ness) got the prey - but did not print the Gutenberg Bible. Or write the Magna Carta. Or the Universal Declaration of Human Rights.
