Riff: LLMs are Software Diamonds

The making of a diamond is a repeatable, but naturally non-reproducible, process. The exact same input of carbon, subjected to the exact same configuration of pressure, temperature, forge, time, and process control, will never produce the exact same diamond twice. Once made, a diamond is unique. And once made, a diamond is forever.


[Image: Lt. Cdr. La Forge and Dr. Brahms. (image source: screenrant)]

Every Large Language Model is a spectacularly faceted diamond.

LLM training is a repeatable, but naturally non-reproducible process 1. The exact same corpus of data processed in the exact same data center on the exact same hardware and software configuration using the exact same training program will never produce identical LLMs across training runs.
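
To make that concrete, here is a minimal sketch, in plain Python, of one well-known contributor: floating-point addition is not associative, so parallel hardware that sums gradients in whatever order the scheduler happens to pick will accumulate slightly different numbers on every run, and those differences compound across billions of steps. (Purely illustrative; real training runs have many more sources of nondeterminism, such as kernel selection and data-loading order.)

```python
# Toy illustration: the same 100,000 "gradients" summed in two different
# orders give two (slightly) different totals, because float addition
# is not associative. No GPU required to see the effect.
import random

random.seed(0)
grads = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

left_to_right = sum(grads)      # one summation order

shuffled = grads[:]
random.shuffle(shuffled)
reordered = sum(shuffled)       # same numbers, different order

print(left_to_right == reordered)       # almost certainly False
print(abs(left_to_right - reordered))   # tiny, but nonzero -- and it compounds
```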

To continue the analogy, a diamond is a fixed relationship of carbon atoms (tokens). It will refract light in myriad bewildering and surprising ways. However, in its now-set crystalline form, the rearranged carbon atoms will always refract the exact same incident light into the exact same multi-spectral beauty every single time 2.

An LLM too, once created, is a pure function; an impossibly fine crystalline next-token computer. The spectral catalogue we make by studying an unadorned diamond in vacuum will only ever be a partial reflection of that untouched diamond. We can spend entire lifetimes and fail to catalogue all the rearrangements of incident light a single diamond can effect. 3
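
A hedged toy of what "pure function" means here: the bigram table below is a hypothetical stand-in for billions of frozen weights, and greedy decoding over it returns the same completion for the same prompt, every single time.

```python
# Toy sketch of the "crystalline next-token computer": once the weights are
# frozen, greedy decoding is a pure function -- same prompt in, same tokens
# out. FROZEN_WEIGHTS is an invented stand-in for a real model's parameters.
FROZEN_WEIGHTS = {
    "the": {"diamond": 0.6, "rock": 0.4},
    "diamond": {"is": 0.9, "refracts": 0.1},
    "is": {"forever": 1.0},
}

def next_token(token: str) -> str:
    """Pure function: no mutation, no hidden state, no randomness."""
    candidates = FROZEN_WEIGHTS.get(token, {})
    return max(candidates, key=candidates.get) if candidates else "<eos>"

def complete(prompt: str, max_tokens: int = 4) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        tokens.append(next_token(tokens[-1]))
        if tokens[-1] == "<eos>":
            break
    return " ".join(tokens)

print(complete("the diamond"))  # always: "the diamond is forever <eos>"
```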

Can an inert diamond be useful, beautiful, expensive, unique, surprising, delightful, and endlessly entertaining?

Of course it can. Industrially, personally, and aesthetically!

The degree of intelligence and aliveness is a whole other ballgame.

An unadorned LLM can "think" only to the extent that it is a calculating function; it is not itself a stateful process.

Thinking requires the thinker to mutate: record new information, alter its system state and capabilities, adapt and/or respond to the environment. The thought and the change in the thinker are concomitant processes.

This is why, to make diamond-based pattern generators behave, we must prefix and suffix dynamic optical middleware: gates, filters, polarisers, gravity, lenses, surrounding media, vacuum, etc. For more dynamism and new capabilities, we can compose various arrangements of such augmented diamond-based pattern generators.

In the case of LLMs, the LLM hosts, third-party orchestrators, and we the users provide the prefix and postfix middleware… our prompts and tacked-on labels and post-training "fine-tuning" and augmented memories and trampoline API calls etc… Our very minds are the middleware: modulating, interpreting, interrogating, and manipulating the ephemeral stateful mind-context in which an LLM-diamond is submerged and operates, by continually experimenting with, and fine-tuning, prompts and process (LLM chaining etc.). 4
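
A minimal sketch of that middleware idea, assuming a hypothetical `llm` stand-in for any frozen model: the prefix middleware rewrites the prompt on the way in, the postfix middleware mutates state on the way out, and the diamond in the middle stays pure.

```python
# Sketch: composing stateful middleware around a pure, frozen "LLM".
# All the statefulness (memory, system prompts) lives outside the model.
from typing import Callable

LLM = Callable[[str], str]

def llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model API; pure in its input.
    return f"<completion of: {prompt!r}>"

def with_system_prompt(model: LLM, system: str) -> LLM:
    return lambda prompt: model(f"{system}\n\n{prompt}")  # prefix middleware

def with_memory(model: LLM, history: list[str]) -> LLM:
    def wrapped(prompt: str) -> str:
        reply = model("\n".join(history + [prompt]))  # prefix: replay past context
        history.extend([prompt, reply])               # postfix: mutate state OUTSIDE the model
        return reply
    return wrapped

history: list[str] = []
assistant = with_memory(with_system_prompt(llm, "You are terse."), history)
print(assistant("What is a diamond?"))
print(assistant("And an LLM?"))  # second call sees the first exchange via `history`
```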

A reader may object, saying these are human-centric criteria. Why are LLMs not "alive"?

As far as I'm concerned, we are already ubiquitously using reliable AI 5, and LLMs certainly are a kind of mind 6 on the contiguous spectrum of intelligence… surprising, unexpected, novel, shockingly useful…

Just not alive in the sense of even a virus… I find it difficult to equate a stunningly high degree of surprise with emergent behaviour. Even behaviour of the sort we can observe in the simplest arrangement of mechanical parts, governed by universal laws. The three-body problem eludes a general solution because of process… it is, generally, a dynamic system tending to chaos. The best we can do is calculate approximate solutions on a case-by-case basis (there is no one-size-fits-all solution).
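
A hedged numerical sketch of that sensitivity, with invented initial conditions and a deliberately crude integrator: two copies of a planar three-body system, differing by one part in a billion, are stepped forward and the gap between them is measured.

```python
# Two near-identical three-body systems, integrated case-by-case (the only
# way we can), drift apart: sensitive dependence on initial conditions.
import math

def step(pos, vel, dt=0.001, g=1.0, eps=1e-6):
    # Pairwise Newtonian gravity between three unit masses; `eps` is a tiny
    # softening term that keeps close encounters numerically sane.
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(3):
        for j in range(3):
            if i == j:
                continue
            dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy + eps) ** 1.5
            acc[i][0] += g * dx / r3
            acc[i][1] += g * dy / r3
    for i in range(3):
        vel[i][0] += acc[i][0] * dt
        vel[i][1] += acc[i][1] * dt
        pos[i][0] += vel[i][0] * dt
        pos[i][1] += vel[i][1] * dt

def run(nudge):
    pos = [[-1.0, 0.0], [1.0, 0.0], [0.0, 0.5 + nudge]]
    vel = [[0.0, -0.5], [0.0, 0.5], [0.5, 0.0]]
    for _ in range(20_000):
        step(pos, vel)
    return pos

a, b = run(0.0), run(1e-9)
# The initial nudge was 1e-9; after 20 simulated time units the gap is
# typically many orders of magnitude larger.
print(max(math.dist(p, q) for p, q in zip(a, b)))
```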

Meanwhile, LLM technology has been used to remarkably accelerate protein-fold computation, attacks on Fields Medal-level proofs, and other research-hard problems.

What does this mean?

  • Is it a property of the LLM? viz. that the LLM is itself an intelligent problem-solver in the anthropocentric sense, or at least the biological sense (spontaneously auto-introspecting, auto-adapting)?
  • Is it a property of those particular problem spaces? Were they always vulnerable to LLM-style brute-forcing; a sort of "permute through a search space until the output parts fit just so"?
  • Is it a property of those incredibly sophisticated users of LLM technology; veritable Geordi La Forges 7? Is it because they already know how to tell when parts fit: viz. when the completely novel-to-them solution satisfies an internally consistent system of axioms, and/or obeys well-understood rules and laws of chemistry and physics, and/or matches the long-term stable, convergent behaviour of biological evolution itself?

A pure function can only approximately model a process. A process itself is a live, non-fungible phenomenon. It is much more than the sum total of its model-as-equations or functions. Framed this way…

  • Rocks are more alive than diamonds, and contain more intelligence than diamonds do, because rocks mutate over time and encode within themselves a story of whatever was going on: did a river flow, and then become a desert, and then did a test nuclear bomb detonate nearby, which places the rock in Nevada, though now it gives mute company to its nameless, faceless kin in a wall far from its original home? 8 etc. etc. etc.
  • A fly-by-wire system is way smarter than human pilots at governing a control loop in an impossibly dynamic environment. It reconfigures the aircraft itself, compensating for atmospheric effects, passenger movement, and fuel sloshing about, to keep the whole system within the envelope of control. We can trust this evolution of the humble thermostat with our lives (a toy sketch of such a loop follows this list).
  • Aristotle's writing is dead because it's just inert data, same as the computed LLM. A snapshot of a moment frozen in time. Completely devoid of any subtext or context, which we must rely on other people to supply, to varying degrees of detail, completeness, fidelity / truth, etc. The writing and the LLM live through other real-world processes 9.
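
As promised a bullet ago, a toy sketch of the fly-by-wire point, reduced to its thermostat ancestry: measure, compare against a setpoint, actuate, repeat. Gains and disturbances here are invented for illustration; real flight control is redundant, certified, and vastly more sophisticated.

```python
# A closed control loop, the thermostat's descendant: a PD controller
# nudging a disturbed "pitch" back to the setpoint while gusts push back.
import random

random.seed(42)
setpoint = 0.0            # desired pitch angle, degrees
pitch = 5.0               # initial disturbance
kp, kd = 0.4, 0.2         # proportional and derivative gains (invented values)
prev_error = setpoint - pitch

for tick in range(20):
    error = setpoint - pitch
    command = kp * error + kd * (error - prev_error)   # PD control law
    prev_error = error
    gust = random.uniform(-0.2, 0.2)                   # the environment pushes back
    pitch += command + gust                            # actuate, then re-measure next tick
    print(f"t={tick:2d}  pitch={pitch:+.3f}")
```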

Speech patterns (text, audio, visual) generated by LLM tech simultaneously blow my mind and sober me up… They look surprisingly sensible to me, while making it painfully clear that we humans, as a collective, are boringly predictable creatures.

We are tribal, and each tribe and sub-tribe and sub-sub-tribe vocalises in distinct but relatively stable echo chambers. Echo chambers that are littered across the corpus of the Internet. Putting them all inside one giant search construct exposes us to thoughts we never had, but a hell of a lot of other people did. Alive, context-imbued thoughts whose very rough textual approximations were frozen in time, and then number-crunched into a pattern space of set ways of thinking, networked inside the forever-unchanging, pre-computed graph of probabilities inside the LLM.

We are also (predictably) surprised because we get to know what we didn't, because we didn't even know where to look, because we had no way to interpolate connections. What's going on with LLMs, from a user's point of view, is that they are search tech that autocompletes pattern spaces. LLM technology cannot extrapolate into holes outside the training set. It can interpolate novel connections within the training set. 10
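
A hedged sketch of that interpolate-versus-extrapolate distinction, using a polynomial fit as a stand-in for pattern-space autocomplete: inside the training range the fit fills gaps gracefully; outside it, the same model goes wild.

```python
# Fit a polynomial on noisy sin(2*pi*x) samples drawn from [0, 1], then
# query it inside and outside that patch. Interpolation is good;
# extrapolation blows up. Degree and noise level are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 50)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.05, 50)

model = np.poly1d(np.polyfit(x_train, y_train, deg=7))   # fit *inside* [0, 1]

x_in, x_out = 0.37, 2.0
print(f"inside  the patch: f({x_in}) = {model(x_in):+.3f}  "
      f"(true {np.sin(2 * np.pi * x_in):+.3f})")
print(f"outside the patch: f({x_out}) = {model(x_out):+.3e}  "
      f"(true {np.sin(2 * np.pi * x_out):+.3f})")
```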

Further, our degree of surprise and incredulity is heavily tainted by our overwhelmingly anthropocentric conditioning. We are perpetually amazed by the crazy stuff an octopus can do. Or the tool-making felicity of corvids.

Anyway, this has gone on long enough…

Let's just say I just finished reading SB Divya's Meru, and I wouldn't mind that future at all; one where alloys and megaconstructs keep us way too happy to shoot each other any more.

[Image: Soapboxing. (Credit: Culture Club and Getty Images. Image source: grunge.com)]

~Fin~