I might stop doing my semi-regular "tooling used by the software industry is fundamentally broken on a philosophical level" / "organizing code in plaintext files is incredibly, ridiculously wasteful" rants.

By accident, I found this:

feifan.blog/posts/the-database

...which covers 90% of the things I thought and ranted about over the last ~5 years, but better.

Seriously, go and read it.

(Also: news.ycombinator.com/item?id=2)

And this seems to be a proper attempt to make a programming environment that doesn't suck: gtoolkit.com/

It's Smalltalk, because of course it is.

Gonna push it to the front of my "to play with" list. I can live with learning me some Smalltalk, or any other language for that matter, if it lets me work in an environment that doesn't make me want to stab myself in the eyes with a dull spoon, every single day.

I'm gonna say something blasphemous here: in the context of these fundamental issues, Emacs also sucks hard. So does in general. Yes, they're immensely more ergonomic, malleable and powerful than their more mainstream competition, but they're still hindered by the same fundamental issue: they have the nature of writing code in plaintext files deeply embedded in their DNA.

(And unfortunately, I don't see a way for Emacs to improve here, as long as text buffers are its fundamental concept.)

This is turning into an unexpected thread 🧵, sorry. But there's one idea I couldn't put properly into words until now:

The problem with our tooling isn't plaintext representation per se. The problem is that it's simultaneously:

1) the ultimate, canonical representation of a program - the "single source of truth",

2) the representation we work on directly when creating that program, and

3) usually the *only* representation we work on.

The result is not powerful enough to manage complexity efficiently.

Here's why this is a problem: it makes us commit up front to a single view of a program, emphasizing some concepts, while making different - and often equally important - concepts implicit.

Because we have only one canonical representation of a program, it can support only a single way of understanding it.

The art of writing readable and maintainable code is necessary because of this: we can't express every concept properly at the same time, so we have to pick the ones we do, and let the rest be smeared.

The term "cross-cutting concerns" is used in our industry as an admission of defeat. Data transformation, execution order, security, logging, "happy path" vs. "failure path" - they're all equally valid concerns to focus on, but the "single plaintext representation" problem makes us commit to only *one* of those concerns, up front, and route the rest around it.

This is why I have to keep writing bullshit like:

fn foo(Data&, Logger&) -> Either<Result, Error>

because I have to commit to a single definition.

I have to spend time making bullshit decisions like:

- Exceptions or expected type?
- Pass logger as argument or use a global singleton?
- What to log here and how to get all the data I need for it?
- How many fake-monads can I stack in the return value before the C++ compiler tells me to maybe use Haskell instead?

And I have to deal with decisions made by others when reading or modifying their code - *always* deal, with *all of them*.

Even if the only thing I care about ATM is adding a log statement.
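To make the tangle concrete, here is a minimal Python sketch of the same kind of signature. The names `average`, `Ok` and `Err` are made up for illustration and are not from any real codebase; the point is only that the logging and failure-path decisions are baked into the one definition, while the actual transformation is a single line.

```python
import logging
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Ok:
    value: float


@dataclass
class Err:
    reason: str


def average(values: List[float], logger: logging.Logger) -> Union[Ok, Err]:
    # Logging concern: the logger has to be threaded through the signature.
    logger.info("averaging %d values", len(values))
    # Failure-path concern: encoded in the return type instead of an exception.
    if not values:
        logger.warning("empty input")
        return Err("empty input")
    # The actual data transformation is this one line.
    return Ok(sum(values) / len(values))
```

Anyone who only wants to add a log statement here still has to read past the `Union[Ok, Err]` plumbing, and anyone who only cares about the failure path has to read past the logging.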

The way to look at it is: every one of those "cross-cutting concerns" is a dimension. Like in geometry.

Error handling. Parallelism. Traceability. Transformation pipelines. Being a part of an architectural concept A. And of a concept B. That's 6 dimensions already.

And the single plaintext codebase - that's just a one-dimensional medium. You can map 6D points to 1D - hell, that's effectively what modern software development is - but you do that by focusing on one arbitrary dimension, and mixing the rest.


The solution would be to allow the programmer to view and *edit* the code in multiple different representations - textual, graphical, tabular, whatever fits best. All those representations are just different ways of viewing the same underlying artifact - the program source code.

Of course, there must ultimately be a single, complete definition of the program stored on the computer. It may or may not be plaintext. But as programmers, we shouldn't care about it or look at it for 99% of our time.
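A toy sketch of that idea, assuming a hypothetical in-memory program model (`Function`, `PROGRAM` and the view functions are all invented here, not taken from any existing tool). There is one stored definition, several views derived from it, and an edit made through one view that every other view then sees.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Function:
    name: str
    params: tuple
    can_fail: bool
    logs: bool


# Hypothetical canonical store: the single source of truth, never edited as text.
PROGRAM = [
    Function("parse", ("raw",), can_fail=True, logs=False),
    Function("store", ("record", "db"), can_fail=True, logs=True),
    Function("render", ("record",), can_fail=False, logs=False),
]


def text_view(program):
    """Render the familiar one-signature-per-line view."""
    return [f"def {f.name}({', '.join(f.params)})" for f in program]


def failure_view(program):
    """Show only the error-handling dimension."""
    return [f.name for f in program if f.can_fail]


def logging_view(program):
    """Show only the logging dimension."""
    return [f.name for f in program if f.logs]


def enable_logging(program, name):
    """An edit made through the logging view; it returns an updated model."""
    return [replace(f, logs=True) if f.name == name else f for f in program]
```

Calling `enable_logging(PROGRAM, "parse")` changes what `logging_view` reports without touching what `text_view` shows, which is the kind of separation the thread is asking for.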


The first step to a better environment is thus dropping the requirement for programmers to work with the underlying "single source of truth" representation.

It's not unprecedented - we've already done this for assembly/bytecode. We can do it again, at a higher level.

Second step is probably a new programming "language" - one that isn't fundamentally a linear, human-readable plaintext, but something multi-dimensional. Yes, . So a database.

Third step - taking responsibility for representations.

That last step is a combination of DIY / Lisp /craftsman philosophy of making your own tools (and the tools to make tools), and a reification of .

When I define a domain concept, my tools must make it easy for me to express it in code, but also give me the ability to look at the code through the lens of that concept, without dragging in irrelevant details - *especially* when a single piece of code factors into multiple different concepts simultaneously.

This means the tooling must let me not just encode the concept, but also define tools/representations for efficiently working with that concept. At every abstraction level - whether it's a domain concept or an implementation detail.

Like, imagine looking at "WidgetController", and your tooling telling you it's simultaneously:
- a Widget (domain type)
- a bridge (design pattern)
- a piece of a state machine
- a queue

And you get dedicated tooling/visualisations for each, some of which you added yourself.
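One way this could look, as a rough Python sketch: a registry where code entities are tagged with concepts, and each concept can have a user-defined "lens". Everything here (`lens`, `tag`, `views_for`, the lens outputs) is hypothetical and only meant to show the shape of the idea, not how any existing environment does it.

```python
from collections import defaultdict

# Hypothetical registries: concept -> lens function, entity -> tagged concepts.
LENSES = {}
CONCEPTS = defaultdict(set)


def lens(concept):
    """Decorator that registers a user-defined view for a concept."""
    def register(fn):
        LENSES[concept] = fn
        return fn
    return register


def tag(entity, *concepts):
    """Record that a code entity participates in the given concepts."""
    CONCEPTS[entity].update(concepts)


@lens("state machine")
def state_machine_lens(entity):
    return f"{entity}: states and transitions"


@lens("queue")
def queue_lens(entity):
    return f"{entity}: pending items and consumers"


def views_for(entity):
    """One view per tagged concept that has a lens defined for it."""
    return {c: LENSES[c](entity) for c in sorted(CONCEPTS[entity]) if c in LENSES}


tag("WidgetController", "Widget", "bridge", "state machine", "queue")
```

Here `views_for("WidgetController")` yields views for "queue" and "state machine" only, because nobody has written a lens for "Widget" or "bridge" yet; adding one is a matter of defining another `@lens` function.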

Ok folks, you know what? I'm rambling. So enough for tonight.

But I feel like turning this into a proper (and less ranty) article. If anyone would be interested in reading (or reviewing a draft of it), please let me know.

@temporal you can send it my way I'll try to take a moment to read it.

I feel the same way as you on many points. I think about people who work with data and how they invented the relational database to be able to view their data arranged in different logical ways because of course you need to have that. We programmers don't for some reason.

@temporal yes! let me know if you have managed to get this written down in the meantime.

@woozong So far I didn't manage to turn it into a proper article. I have too many things going on in my life right now, so I don't expect to be able to write about this properly earlier than ~3-4 months from now.

@temporal
no worries, it was a very interesting thread that clarified a nagging feeling I've been having for quite a while.
\m/

@woozong I'm happy my little rant was helpful :). It's definitely not over yet - I'm continuously thinking about this topic, and looking for more interesting references.

In fact, I'm about to add another one to this thread.

@temporal For over a decade, I have claimed that #information representation should only depend on the specific #retrieval situation and not on its storage situation. I usually think of #files and bits of my #KnowledgeManagement.
You have provided an interesting thread on the same point of view but for #sourcecode and #programming.
Thanks!

@publicvoit Your perspective on / is insightful and I didn't realize how it connects - I feel now it's a different perspective on the same thing.

Representation and storage are two orthogonal (modulo efficiency) concerns, and the former should be driven by what you want to actually do with the data. That includes not just retrieval, but also updating. I want to work with and on high-level concepts, not just refer to them as "reports" on the underlying data.

@publicvoit In context of your writing on , in particular tags, making notes/todos anywhere in the system vs. my assertion that when coding, I shouldn't waste time deciding where in a file/filesystem a given function/class is supposed to be stored...

I find myself to be surprisingly resistant to placing tasks in random places. I feel more comfortable with a well-defined hierarchy. But then, I notice I waste a lot of time deciding "where should I put this item?". I'm inconsistent in this.

@publicvoit My current hypothesis is this:

I don't trust search. When searching, I keep having this feeling that results are not complete. That there may be something important the query is excluding.

Conversely, I find a canonical hierarchy reassuring - because I know I can just manually walk over it (or a relevant subtree of it), and either I find what I'm looking for, or I know for sure it doesn't exist in the system.

Searches are open-world, canonical hierarchy is closed-world.

@publicvoit Now the silly bit here is: half of my yesternight's rant was arguing in favor of interactions that are effectively open-world - querying and filtering and pivoting.

But then, thinking about applying the same approach to my todo list, I start to feel claustrophobic - having a thousand local views and no global view makes me feel I might be missing important information that just happens to not be covered by any of the queries.

Not sure how to reconcile it.

@temporal @publicvoit problem is categories aren't mutually exclusive, nor do they have canonical definitions or depth order. Information is a graph. Knowledge objects are multidimensional vectors. Even the animal kingdom taxonomy has loops/overlap and phenotype/genotype discrepancy.

@hobson @temporal Any strict #hierarchy is flawed. If you order things in one way, you're not ordering them in infinite/many minus one possible ways.

@hobson

I agree, and this is what my / rants over the years have been fundamentally about.

The problem I described in this subthread is not about graphs and categories. It's about querying. And it's subjective. I find myself defaulting to working with the fundamental "storage-level" representation, because it's the only one I fully trust. Queries miss results - whether because of a bad search engine, or a wrong query. Storage representation is, by definition, complete.

@publicvoit

@hobson

Things that killed my trust in search are shitty search engines. Like search in Windows Explorer, which misses files I *know* I have and can manually find. Search in Slack, or Google Docs - they also have sometimes missed things for me in the past.

Or social media - Facebook, Twitter, etc. They're all eventually consistent, best-effort searches, making them pretty much useless: if you don't see something in results, it may be just because the search job gave up early.

@publicvoit

@hobson It's a similar story with programming tools, too. IDEs and language servers.

Like, I have clangd working over my work codebase, but I still frequently use (rip)grep for searching. That's because with clangd (and most IDEs I worked with), when I search for a code symbol / callers / callees and don't find what I expect, the most probable reason is... that the underlying engine failed to parse some code or is otherwise confused.

Grep, I trust. Because it walks the filesystem.
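The difference in trust can be sketched in a few lines of Python. This is a simplified illustration, not how clangd or any real indexer works: `walk_search` visits every file under a root, grep-style, while `index_search` can only see whatever made it into a previously built index (here a plain dict, which is an assumption for the example).

```python
import os


def walk_search(root, needle):
    """Closed-world search: read every file under root, so a miss means absent."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as fh:
                if needle in fh.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)


def index_search(index, needle):
    """Open-world search: only as complete as the index that was built earlier."""
    return sorted(path for path, text in index.items() if needle in text)
```

If the indexer choked on one file and left it out of `index`, `index_search` silently returns fewer results, whereas `walk_search` still finds it because it never depended on the index in the first place.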

@publicvoit

@hobson

The irony here is that the main rant in this toot tree was that I'd like to use a million different representations of the same underlying data set (code base), without ever dealing with the canonical storage-level representation explicitly.

Meanwhile, in practice, I don't trust many tools that offer some kind of higher level of querying / classifying of data.

This is inconsistent, so I'm trying to figure it out.

@publicvoit

@temporal @publicvoit yea I was just telling a coworker recently that it's a shame they are too young to remember a world where you could trust search results. We had desktop search. better than grep/rip/find, or any file system tree I could organize. They got us addicted to search, and lazy about organizing information, then sneakily polluted search with misinformation, boiled the frog. Sublime Text ctrl-F & ctrl-P restored my hope for humanity.

@temporal I once worked on an important commercial project written in Smalltalk where very few of the developers knew how to rebuild the Smalltalk image and this needed doing regularly, from time to time. Just saying...

@underlap That's a problem with long-living images, and the reason why I habitually rebuild Lisp images I work on after any significant change.

But it's an unrelated issue - one of ephemeral modifications to the application not being recoverable from the source code.

What I'm after isn't (primarily, at this point) sculpting a running program - it's working on a program model, aka. "source code", just through better means than the code itself.
