Why are we still reading? Moving from text tags to "Visual Speed-Reading" (The Cover Flow + AI Icon Concept)

Just to say, I am totally seeing your point.

All this is the reason some things have been establishing themselves in the PKM world for some time:

  1. canvases and mapping as part of the knowledge gardens
  2. card-systems (basically schematized, condensed information for better cuing/browsing)
  3. semantic object typologies, for scaffolding/structuring vast lakes of disparate information
  4. UI concepts that scaffold all this, and align it with (deep/document-form) textual information

All this also to say: while I understand your point about visual information display and scaffolding in principle, I think with information spaces the question is more complex than the basic sensory modes (text, visual, auditory etc.) and putting them against each other (or “side by side”). Even these basic “modes” are a heuristic system anyway, and as we have long known from UI studies, knowledge studies, psychology, biology, philosophy, literary studies… and all the fields out there: human knowledge structures are intrinsically multimodal.
Thus the – worthwhile – problem/challenge you make a topic really becomes more complex. Which is why – in the context of knowledge systems and their presentation/UI+UX-structures – things like “intermediary/virtual structures” count, which connect the content to the UI. This is the reason semantic structures and hierarchies, structured models of knowledge etc are so important. To give just one example: once you have object/semantic types, you then can think of the appropriate ways to visualize/cue/scaffold them in your system (UI/UX/functional architecture).

So, as you say – it’s not an “either-or”, and you should not let yourself be forced into the “either-or” fallacy, which is plaguing some of the more conceptual discussions here.

The good: there are some ways that DT achieves such helpful structures for navigation and orientation (high-level reading of the knowledge structure/space) – aside from files and folders. There are things like tag clouds, the (now improved) graph, concordance etc.

The caveat: all these things are mostly derived from a text-based paradigm, not a visual or spatial one in particular. Think databases.

Part of the reason: DT comes out of a paradigm of early personal computer use culture, and was initially conceived around text as information and documents as mainly text documents (a little simplified, but true in principle here…:grinning_face_with_smiling_eyes:); plus it is more rooted in 1980s and 1990s programming and scripting cultures than in (newer) UI/UX cultures. So things are read through that lens by the established core of the forum, IME.

Of course some new users often either struggle with this, or bring another cultural experience of UI/UX and using mediated knowledge and information etc.
This is where discussion culture, which you also mention/experience here, enters:
Discussions are approached mostly from the merits and long legacy of what DT brings, represents and indeed achieves. (text based; document database etc). This is defended as you see :smiling_face:.

A mode of “I have a conceptual idea…”, or “I come from a different cultural/conceptual angle, and this is how I approach it/this is my idea…” is not really embraced easily here, esp. when it comes to (conceptual) UI/UX questions. It’s also mostly not treated as a constructive contribution, initial impulse or invitation to some shared thinking (though this obviously also depends on the composition of contributors in a thread). Instead the “onus” of “proof” and/or expected “precision” – measured by those established/entrenched conceptual (older/technical) paradigms – is regularly put on the “disruptors”. So, the DT forum is not really the right place to open up imaginative and open conceptual discussions.
Take it as an entrenched/implicit communication standard of the forum.
If you calibrate your expectation horizon/approach accordingly, this will save you from disappointment. – The forum is still good for “hands-on” help, clarifying things about as-is-DT, minor technical/functional propositions that don’t touch the existing architecture… and sharing very general stuff less related to DT :grinning_face_with_smiling_eyes:

So, just to say, I am sorry for your experience. There is an obvious mismatch here in cultural communication :smiling_face:
But I also value you opening up initial discussion with some contribution and ideas shared in constructive spirit, obviously.
I certainly appreciate(d) it! (As I like this kind of open thinking and constructive speculation etc.)

PS: I also think Korm pointed you in the right direction: if you are looking for PKM apps that focus on flow, easy and more visual navigability, and visual schematization, you have to look in other (cultural) corners, like TheBrain – even though their – ingenious – concept also stems from old (“structural mapping”) approaches of the late 1990s. – For newer PKM systems/apps that go more in the direction you are pointing to (high-level spatio-visual and more symbolic/schematic structuring/UX as primary mode) you might also want to look at these:
Tinderbox (ancient, though), Heptabase, Obsidian (w/ Canvas and Bases), Capacities, XMind, … Or take a look here: https://infinitecanvas.tools

I’ve experimented with several of these, plus a few more. My experience is that the visual metaphor simply doesn’t scale beyond at most a few hundred “nodes” (whatever the particular software calls them). My DT databases have thousands of items and millions of words.

Fortunately, modern computers can have more than one application! Once I’ve found what I need in DT, it’s simple enough to extract those items to a more visual tool.


There’s also the problem of the enormous energy consumption of LLMs, especially relative to the human brain. Whatever the (arguable) benefits of LLMs, energy efficiency is not one of them!


Agreed that scaling works differently in different logics.

But in my experience it all depends on context.

I would personally agree for some contexts, like when talking about canvas, concept maps and the like. Though there are some technologies that try to work on the scaling of this while keeping ergonomics in place (Fibery’s whiteboard approach would be one; also Tinderbox obviously tries to scale while staying in the spatio-visual paradigm, somehow…).

Then, things again depend on what is meant by “visual”, as this is a rather large sphere, or better: mode. E.g. I find that the card mode in Capacities really does scale well into large numbers, giving chances to browse any set of objects (documents/notes), as long as people are interested in a visual browsing mode. Also, there are contexts where it’s especially the duality and “tandem” of both doing the work – take the example of the local/contextual graph (as it exists in DT as well as in Capacities): you arrive by identifying a “node” (document, collection, tag) in any way … then you can visually contextualize it and follow the connections for whatever purpose. Similar in TheBrain (which integrates logics to the max: mapping, documents, auto-schematization, visual cues, metadata etc.) – I don’t know about your experience, but with TheBrain this works well into the thousands for me…; and the logic most probably is a combination of (textual) search, pinning/bookmarking and visually browsing the structure (which is connected to a certain kind of scaffolded comprehension, or “reading relations”).

But what you point to is something real: visual orientation, whatever the interface/environment for it, has some kind of numeric capacity that is not comparable to (machine-)searching text (or automated search of visual features, by now). – But then text has no (really good) way of setting up spatial relationships and leveraging the benefits of visual and spatial orientation, which are especially relevant to human understanding
… so again: things, IMO, shouldn’t be put up against each other every time as a matter of principle, but rather be taken and understood in relation to each other, and often in combination.
Hybridity is much stronger than any “pure” mode or logic… as it can encompass and leverage different modes/logics next to each other, fuse them, use them side by side – also within one application…
– At least that is my understanding and experience after some years of trying things.

And maybe it would be more to the point when we talk about scaling, comparing automation to human individual processing, than to map these things onto the binary of “text vs visual”… I think that is also a lesson that modern multimodal models in “LLMs” (sic! for “language” in here :smiling_face:) teach us, currently…
Because, let’s be honest: how many thousand books would any individual here on the forum really read himself/herself in a lifetime…?

I did, but I forgot most, which is fun, so they are a surprise when I re-read them, and cost no $$$ :joy:


… the many ways of beating the system :smiling_face:

… PS: did they have images and/or elaborate layouts and visual structures? Maybe we can derive something more here?! :grinning_face_with_smiling_eyes:

Many had elaborate layouts and visual structures, yes - (science, art, design). The only thing I can derive is that all Knowledge is Grey :slight_smile:


I really like this.

The trend to see everything as either black or white has been suspicious to me for a long time. The answer to a lot of questions is “grey” or 42 :rofl: .


Instead of writing about a visual idea like this, why not create a mockup? It might express way more.

Right now I’m confused.


To put it simply, that is exactly the problem. The more material you have, the bigger the problem becomes.

This seems unsolvable, no matter which organizational method you use, I’m afraid.


First, this seems to be a step back in redefining the more complex proposal and argument(s) made thus far to “visual or not”, “visual everywhere or nowhere” (related to a certain number/threshold of nodes or data). I think, also given the initial description as well as real arguments thus far, it would be better to approach this as a more “grey” (colorful, mixed, hybrid) issue. Also, just to remember: even DT uses interactive graphs etc. as part of dealing with information digest and management. Side by side. Each module scaffolding the others.

Then, what the OP addressed, or put at the center of his ruminations was

The way to solve this is not just saying/thinking “visual = answer”. Or giving up.

What he addresses and describes is actually a mix of: UI architecturing and cueing systems, some ideas on schematization (icons), and use of more “spatial logic”… And, it’s all formulated as an invitation to think about “Modern UI” (“I’ve been thinking about…”).

So, one can reduce this to the question of user-generated canvases and (certain kinds of) maps. (But then one would also have to address why Tinderbox and TheBrain work for so many, even regarding large scale info pools).

Or one can meet the original post at the more complex framing and formulations the OP and others have given.

-- To give an example of where things are more complex, and also relating to the real working of DT-as-is: the graph already works in a hybrid mode, and as such very effectively: while it processes the full 100% of higher level info entities available in DT (and these are sometimes automatically created from the full corpus, as in the case of (semi-)automated tagging etc.), it also visualizes them in a local context that is useful for users in terms of browsing and orientation (the real theme of the OP). So, one thing to see in this example: it’s all as much about automation (which cannot be mapped to text vs. visual) as it’s about a more “visual” (= good for browsing/orientation) UI and architecture. This also shows: the encompassing aspect is (good) UI/UX, and a way to effectively hybridize text, schemas and visual display for quick browsing and orientation. Again, it is neither about “visual OR text”, nor about “human readable/browsable OR automated”.

Also, it should be acknowledged in this discussion (representing some experienced users) that the graph and the logic described have just recently been upgraded and improved. I guess for a reason. Plus this also shows: beyond black and white, yes and no, there is the vast grey/colorful land of iteration, incremental change/extension, and actual thinking within the potentials of the real UI at hand (considering how to improve UX and orientation.)

So, in a way this discussion so far is also simply disregarding some aspects of DT itself, and ignoring/downplaying/forgetting some developments of it (which I would qualify as valuable progress).

Then: what the OP floats in terms of creating automated icons that qualify and visually label/cue documents for fast browsing is a really interesting idea. For one, it’s kind of possible now, with the progress of the automated – multimodal – models. And then, at its core it’s just a small leap from things like color labels, tag pills, flags etc. So, again, this is just an extension of [a deep running existing discussion in the DT user community](https://discourse.devontechnologies.com/search?q=labeling), … just adding the idea to also consider icons as labels (and use AI for that). – At least it should be a topic that can be appreciated as an idea by imaginative souls in touch with DT and current developments (even if one doesn’t like it in the end).

So, maybe it’s time to get a little more colorful in here. And follow the squirrel. Beyond just “grey knowledge”. At least this is the part where I acknowledge the positive impulse of the OP (while not 100% agreeing with his AI optimism… but then, things are … grey … often…)

PS: aside from Tinderbox and TheBrain being real-world examples of maxing out a visual/schematic/navigable UI/UX with thousands of nodes, there is the interesting take by Fibery on implementing this in an interactive whiteboard/canvas format – one that sits atop a per definitionem large-/company-scale universal database: Fibery
Of course, in the real world there is much more once one understands that operation rooms, bourses etc. are nothing but a highly effective way of building interfaces hybridizing spatio-visual, textual and data systems (with potentially endless numbers of nodes)… but that might lead outside of a software/UI-UX discussion (though there are of course also applications serving domains like trading terminals and charting suites, portfolio or operational dashboards… that all pack endless databases with textual and other modal information into very visual/schematized interfaces optimized for browsing, orientation and (quick) navigation)…

Me too. I don’t understand the proposal. I’m probably too stupid to grasp it.

Unfortunately the OP left, disheartened – a point largely unaddressed…

But if my reading is correct, the proposal had 2 clearly readable dimensions:

  1. Let’s think about ways to make large – largely textual – databases more browsable by thinking about spatio-visual interfaces, cuing and automated data digestion (see prior posts). So, on one dimension it was a general invitation to think about visual navigation and minimal-friction browsing of content, in DT (and generally)…

  2. a plugin to achieve that, attached with some very concrete, if preliminary, ideas: using auto-generated/applied icons as labels, and more extensive use of color coding (which currently is limited to the 6(?) colors of the labels… something that btw also regularly leads to discussions here…)

So, in a way 2) is not really different from Ammonite: Tag visualizer and search utility – that old plugin that – at the time – visualized the tag cloud outside of DT, and was highly appreciated by some power users here. Actually its relevance was that big, it seems, that the principle – pointing to a then missing dimension of visual overview, browsing and interaction in DT – was later taken on board by DT, or at least by now has a functional equivalent in the (fabulous) interactive tag cloud. A good example that external impulses are sometimes worth acknowledgment and openness of mind. (Similar arguments could probably be made about the recent massive evolution of the graph, esp. in light of a similar – complex – discussion about external users craving a graph in DT, and even going in the direction of implementing things as third-party plugins etc. – Node Graph for Document Links )

So, I think it is not really that opaque, if one reads along and into it w/ some (critical) sympathy …
Also what shows again here, is that some internal interlinking of (historic) debates w/in forum would sometimes make things more straightforward and productive – a problem of a lot of unmoderated forums, where collective knowledge or just reference often gets lost in the stream… (and btw something where AI might also be a modest remedy against frail human wetware or information overload)

This is not an unmoderated forum. And moderation doesn’t mean every response must be in agreement with any proposition. Moderation is about decorum and civility and addressing ad hominem attacks. It is a place where people can speak their minds – on both sides of a topic – with no expectation of agreement, only the expectation that no one will demean or defame another participant.

Also, critical thinking is not just saying many words about something. It is not only about the topic. It’s also listening and accepting that (1) a concept may have flaws or weaknesses that weren’t considered, and (2) it may be poorly expressed or explained (beyond language differences).

Thanks Jim.

I might differ on some implicit evaluations here, but acknowledge this as a statement about your understanding of moderation, or your role. That’s fair. And good to have this explicated, of course.

Also: you are right in one respect. “Unmoderated” was too contextually (implicitly) used here. There is – in the world of grey and color – of course more than one form of moderation, and no general concept/hat of “moderation” does justice to all. What I meant, in the given context, was a form of moderation that actively binds together strings of discussion, even if distributed in forum threads, and helping people thereby to arrive at the best possible formulations for their intended topics and motivations – via insight, reference and collective learning. So, I hope that clarifies.
Some forums – and I think this is clever – by now use things like RAG to counter the dispersive force of old-style forum threads… so, clearly, I am thinking of a certain function moderation can take.

“critical thinking is not just saying many words about something”
… this one, admittedly, I do not understand. Does anybody hold that position?

But we agree: listening and accepting are really valuable civic virtues for a forum. (All other details left aside here…)

I believe the idea in the OP is a proposed AI-generated visual metadata layer for large document libraries, where each document gets a recognizable visual card with stable project imagery and simple semantic icons, allowing users to skim large archives visually instead of reading filenames, tags, and search results.
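
To make that reading a bit more concrete, here is a minimal sketch in Python of what such a metadata layer could compute per document. Everything in it is an assumption for illustration: the `Card` fields, the `TYPE_ICONS` table and the idea of deriving a stable project color by hashing the project name are not taken from the OP or from DT. It only shows that "stable" imagery can come from a deterministic function rather than from a model call.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical table: semantic document type -> simple icon name.
TYPE_ICONS = {"invoice": "receipt", "meeting": "calendar", "paper": "page"}
DEFAULT_ICON = "unknown"


@dataclass(frozen=True)
class Card:
    title: str
    project: str
    hue: int   # 0-359 on the color wheel, identical for all cards of a project
    icon: str


def project_hue(project: str) -> int:
    """Derive a hue from the project name so a project always gets the same color."""
    digest = hashlib.sha256(project.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big") % 360


def make_card(title: str, project: str, doc_type: str) -> Card:
    """Build the visual metadata for one document (illustrative only)."""
    icon = TYPE_ICONS.get(doc_type, DEFAULT_ICON)
    return Card(title=title, project=project, hue=project_hue(project), icon=icon)
```

Two documents in the same project then share a hue regardless of their type, and an unknown type falls back to a neutral icon. Whether that helps at 10,000 items is exactly the open question.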

I can grasp that, and even visualize the outcome, but as the domain grows to “10,000+ notes or documents”, my brain gives up. I cannot grasp how 10,000 cards with symbols or colors or icons would assist my understanding of the content of that dataset one bit. I would feel frightened and overloaded, frankly.

I generally have 8 databases open containing ~20K documents overall. Any visual display of all of that would have no meaning to me at all. I don’t think I’m alone. But I have no need to see my documents all-at-once. I search, I read and make notes, I work with the subset of documents that are relevant to my current task or project and ignore the rest until they are needed.

I don’t think DEVONtech needs to add a visual layer feature. I use my own approaches. For example, I can have Claude build a Mermaid visualization of the 50 documents I have in a group hierarchy for a bit of research I am working on. But having the AI do that work is not as useful to me as building the visualization myself based on, ooops, reading and thinking on my own.
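
For anyone curious what such a Mermaid diagram of a group hierarchy involves, a small Python sketch follows. It is not what Claude or DT produce; the nested-dict input format and the function name `to_mermaid` are assumptions for illustration. It simply walks a hierarchy and emits Mermaid flowchart text.

```python
def to_mermaid(tree: dict, root: str = "Research") -> str:
    """Render a nested dict (group -> children, documents -> {}) as a Mermaid flowchart.

    Labels are assumed not to contain double quotes, which would break the
    quoted node-label syntax.
    """
    lines = ["graph TD"]
    counter = 0

    def walk(label: str, children: dict, parent_id):
        nonlocal counter
        node_id = f"n{counter}"
        counter += 1
        lines.append(f'    {node_id}["{label}"]')
        if parent_id is not None:
            lines.append(f"    {parent_id} --> {node_id}")
        for child_label, grandchildren in children.items():
            walk(child_label, grandchildren, node_id)

    walk(root, tree, None)
    return "\n".join(lines)


# Hypothetical hierarchy: two groups, one of them holding two documents.
example = {"Sources": {"paper-a.pdf": {}, "paper-b.pdf": {}}, "Notes": {}}
print(to_mermaid(example))
```

The output can be pasted into any Mermaid renderer. At around 50 documents it stays readable; the generation itself is trivial, so the value still lies in deciding what belongs in the hierarchy.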


FWIW, Eastgate Systems has said that Tinderbox works best with hundreds or a low number of thousands of notes in a single file. Both Tinderbox and TheBrain also offer outline-style interfaces for those who prefer them.

I don’t think anyone is arguing this point. But I also don’t think that’s what the OP was proposing. They explicitly dismissed text-based organization entirely.


This is the foundational problem of library and information science, FWIW.

At least in my own work, the solution is a hybrid approach: use folder hierarchies and search tools to manage the mass of data, use visual tools to organize items related to a particular project.


Thanks for that thoughtful re-take. Appreciate the engagement.

This is also a good clarification.

As

… but as the domain grows to “10,000+ notes or documents”, my brain gives up. I cannot grasp how 10,000 cards with symbols or colors or icons would assist my understanding of the content of that dataset one bit

sounds like a different statement – also in kind – than

I don’t understand the proposal. I’m probably too stupid to grasp it.

(And let’s skip the fact of what it says to someone – OP – explicitly stating he feels (culturally) unwelcomed…)

So, my reading/understanding would be that you re-iterate the scope argument and, if I am reading correctly, align with kewms’ take:

My experience is that the visual metaphor simply doesn’t scale beyond at most a few hundred “nodes” (whatever the particular software calls them). My DT databases have thousands of items and millions of words.

That is of course an argument for some, based on certain individual preferences (as stated by @kewms). But that doesn’t make it a generally/universally valid proposition, and it should remain open to other preferences (e.g. the OP’s).

  1. I still would like to see how such a position relates to known instances where there are quasi-visual/schematizing interfaces, embedded in larger corpus collections, making such things possible (see my prior example; even your own Brain reference can be called up here…; see also Tinderbox – below :backhand_index_pointing_down:; but also DT graph… in a certain way.)

  2. It also projects something into the OP’s position that was never stated, I think. I didn’t understand him (or my own position) as holding that such a browsing interface can a) necessarily generate immediate/full *global* access, or, more importantly, b) *replace* personal understanding/thinking altogether. – So if we read his post closely and attentively, he centrally states his aim/motivation as “When you *browse* through them, your eyes have to physically read the ‘hieroglyphs’ of fonts. It’s slow, it causes immediate eye strain,”… For me, *browsing* and how to make it efficient (UI/UX) is the central concept/requirement given here. And, frankly, here he talks about nothing else than upscaled, intelligent “thumbnails 4.0” (your “cards”), and improving on “cues” (labels, tags etc. – which he also mentions). Nobody would question thumbnails in principle and see them as short-circuiting understanding, I think.

  3. All this silently bypasses the challenges already given in this context - the stated argument (I think: fact), i.e. challenging that the “ontological” difference between “text” and “visual” would have tight, sorted borders, at certain numbers, or between certain applications. I am not getting into whole research departments thinking about the inherent visuality of text. But this is also, I think, undercomplex given the fact that almost all PKM apps nowadays integrate the visual and the textual side by side (modularity, hybridity), and in tandem use. This also goes for DT-as-is itself (graph, tag cloud). Something constantly skipped in large parts of the discussion. This is where the argument, IMO, becomes more demanding than “above x,000 abandon ‘visual’ scaffolding” etc. Because the graph *simultaneously* plugs into the global corpus (in different ways, some also open for automation) *and* gives a handy overview because of local restriction/circumscription. *That* is actually what a good interface allows, and in some structurally similar cases it can be labeled “faceting”, “drilling” or what you like (and nowhere is there a question of it replacing traditional reading or “understanding”). I think this is not included in the first serve of the OP, but it would be more productive to discuss than turning scope limitations into a fundamental criticism of good, worthwhile ideas (the icons, broader color coding, more visual navigation layers/modes etc.)

  4. Also passed over in this discussion (or muted) is the stated fact that DT itself evolved in this respect, giving us – at certain times, and often after people outside flagged a need/want – those very tools as modules in DT (see also the – again skipped over – Ammonite/interactive tag cloud example). So there is a genealogy – and it’s one of progressive incorporation of visual navigation/browsing/orientation modules. And if that is accepted, the natural question is: why should it stop? Or why should it stop with AI? (-- also: “hello ‘MCP’ discussion!” :waving_hand:)

Not agreeing with this approach in terms of personal style (“I don’t want to use that”, “I work differently”) is naturally open to anyone. But one should be careful to make this a global (sounding) argument and statement of “this *is*…”, “I do not understand…” etc, especially given the many working counterexamples.

The last bit, which is strung into this particular discussion regularly now, comes as something I rather regard as rhetorical, shallow:

But having the AI do that work is not as useful to me as building the visualization myself based on, oops, reading and thinking on my own.

I think this is a little suggestive gesturing, and basically tries to dismiss other kinds of thinking by way of a rhetorical device. It’s also inconsistent, as you are saying about your image production: “yes I do it sometimes to get things done” (which was the OP’s motivation/scenario), “but, really the most valuable outcome comes when I do *everything myself*”. First: nobody demanded total replacement of one’s own creativity, reading, understanding in the first place. Then, with that position one couldn’t use a photo camera… and AI – semantic automation in general – should be banned outright… the argument of “thinking for yourself” would actually prohibit you from using augmentation tools like DT (at least with its inbuilt semantic (original) AI).

What I observe as a tendency: the “read everything yourself or you are an AI slop” “argument”/topos not only builds a strawman (misreading “high level reading” – orientation/browsing and UI/UX – as “(close) reading”); it is also brought forward mostly when the real focus is on UI/UX improvements and/or speculations (not “reading” 1 – or 10,000 – documents). But it is notably silent when DT is building LLMs into the very core of its user interface, announcing MCP etc. Makes one wonder where principles are applied.

@kewms

I

FWIW, Eastgate Systems has said…

This is a little bit like “The internet has said…”
At Eastgate, I am reading (after tracing the relevant spot with AI):
“How many notes can I make?
Lots. There’s no fixed limit to the number of notes Tinderbox can handle. You can easily manage thousands, even tens of thousands of notes, in Tinderbox.” ( Tinderbox: FAQ: Tinderbox Notes ).
– Where is your reference from?
Maybe they are also more grey than sometimes appears in their commentaries… :grinning_face_with_smiling_eyes:

II

“Both Tinderbox and TheBrain also offer outline-style interfaces for those who prefer them.”

Exactly. First of all it’s hybrid. Then, as stated above, these are at some points matters of preference – and should not be confused with real arguments. But the way the OP’s points are ruled out is largely a matter of setting one preference against another… knowing (as you say yourself) both have legitimate bases out there in the demography…

III

“I don’t think anyone is arguing this point [of systemic hybridity].”

I can only read you:

“My experience is that the visual metaphor simply doesn’t scale beyond at most a few hundred “nodes” (whatever the particular software calls them).”

I can only read this (esp. in additional contexts quoted above) as “at a certain scope the visual is out of the system,…”.
So what are you saying, if that is a wrong reading?

IV

“But I also don’t think that’s what the OP was proposing. They explicitly dismissed text-based organization entirely.”

Again, I am reading it. I find:
“information overload caused by text.” – that is not a principal dismissal of text. But of information overload.

“They explicitly dismissed text-based organization entirely”.

I see him/her talking about “browsing” the corpus, not dismissing the text base, or other forms of underlying/parallel “organization”. OP also talks about “ergonomic layout”. And he gives special scope/functional context by giving paradigmatic examples like “reading status updates” – which is not meant to cover “reading Hegel”, as I understand it.
The only moment where the OP explicitly talks “organization” is “organize them using nested folders” – so that’s about hierarchical structuring as a means of navigational/cognitive access to the source documents.
Also his “tiredness” of “reading text” here is very specific and contextual (as in contextual to high level reading/orientation/navigation), like: “to read text tags”.

– But as the argument of “reading” and understanding is put by some at the very heart of the matter, you might provide other parts of text (and their context).