The Three Rings of AI

We can think of every AI product as having three layers, like rings around a center. While not a rigid engineering blueprint, this is a helpful way to see how the technology actually comes together.

At the core is the model, which is the large language model itself. It is a fixed function: text (and now images or audio) goes in, a prediction comes out. Everything it "knows" was baked in during training; on its own it has no memory, cannot act, and cannot reach anything beyond itself. Think of it as a brilliant, sealed brain in a jar.

Around it is the harness, the software wrapped around the model. The harness has no intelligence of its own; its job is to feed the model the right input, hold the state the model lacks, run it in loops, and connect it to the outside world. If the model is the brain, the harness is the nervous system and the hands.

Around that is the product, which includes everything a person actually touches: the interface, the buttons, the defaults, and how a capability is packaged. This is the ring that determines if a breakthrough is valuable and if it actually reaches users.

A note on how to read this. Each section below covers exactly one improvement and the single ring it changed. The sections are cumulative: each one silently assumes everything before it, the way a building assumes its foundation, so we describe only the new change, never recapping the old. And though the rings are the subject, there is a quiet fourth party they all serve: you, the user. Watch what each improvement quietly takes off your plate.

1. Attention: The Transformer

Model · Google (2017)

The problem. Earlier networks read text one word at a time, in order, which made them slow to train and poor at connecting words that sat far apart. A pronoun and the noun it referred to, ten lines up, might as well have been on different planets.

What changed. Google researchers introduced the Transformer, built around a mechanism called attention. Instead of marching through text in sequence, the model looks at all the words at once. It learns which other words matter to each word, letting a pronoun like "it" reach back across a paragraph to the specific noun it refers to. Just as importantly, this could be computed in parallel, so training could scale across enormous amounts of text and hardware.

Ripple effect on other rings. This is the foundation the other two rings are built on. Without parallel training, there is no large model to wrap or to sell. Everything else quietly assumes this engine is running underneath.

What it unlocked. Machine translation that finally read fluently rather than word-by-word. More importantly, it unlocked the very possibility of training models large enough to be generally useful.

2. Learning from raw text at scale: Pretraining

Model · OpenAI, GPT-3 (2020)

The problem. To make a computer do a language task (like answering questions, translating, or summarizing), you used to build and train a separate model for each one, each needing its own hand-labeled dataset. General language ability did not exist; you built it narrowly, one task at a time.

What changed. OpenAI trained a single very large model on one simple objective: predict the next word. They did this across a massive amount of internet text, repeated billions of times. To predict the next word well, a model must absorb grammar, facts, and reasoning patterns as a side effect. At a certain size (GPT-3 had 175 billion internal values, called parameters), it could suddenly handle tasks it was never built for, like basic translation or math. These are new capabilities that only show up at this scale.

Ripple effect on other rings. A single general engine meant the outer rings no longer needed a different model for each feature; one model could be pointed at almost anything.

What it unlocked. General-purpose text generation from a plain-English description of the task. For the first time, "just ask it" actually worked.

3. Following instructions: Instruction tuning and human feedback

Model · Google FLAN (2021) and OpenAI InstructGPT (2022)

The problem. A model trained only to continue text does not answer you; it continues you. Ask it a question and it might reply with three more questions in the same style, because that is a plausible continuation. It was a brilliant autocomplete, not an assistant.

What changed. A second training stage was added on top of the raw model. Google's FLAN showed that fine-tuning on many tasks phrased as instructions teaches a model to follow instructions it has never seen. OpenAI's InstructGPT added a further step called reinforcement learning from human feedback (RLHF). In this step, people rank the model's answers, those rankings train a scoring system, and the model is tuned to produce higher-scoring answers. The first step teaches the format (respond, don't continue). The second captures qualities too slippery to write as rules, such as being clear, direct, and honest.

Ripple effect on other rings. This is what made a usable product conceivable at all. An assistant you can simply talk to lives or dies on this single change.

What it unlocked. "Write me an email politely declining this meeting" producing exactly that: a direct, useful answer instead of a riff on the prompt.

4. A box anyone can type in: The chat interface

Product · OpenAI, ChatGPT (2022)

The problem. The technology worked, but using it required an engineer. You needed API keys, code, knowledge of tokens and settings, and the craft of formatting prompts. For everyone else, the technology was effectively invisible.

What changed. ChatGPT wrapped the model in a single, familiar surface: a chat box you could open and start typing into, with no manual and no setup. The main breakthrough was not the underlying intelligence, but the easy access. It quickly became the fastest-growing consumer application in history.

Ripple effect on other rings. This is the clearest case of a product-ring breakthrough with no real model change behind it; the leap was entirely about packaging. It also leaned on a clever harness trick. The model forgets everything between messages, so the application re-sends the entire conversation on every turn and the model re-reads it each time. Continuity is an illusion maintained from the outside; the model never actually remembers.

What it unlocked. Ordinary, daily use (like drafting, explaining, brainstorming, and tutoring) by hundreds of millions of people who would never have touched an API.

5. Acting and fetching: Tool use and function calling

Harness · Google's ReAct pattern (2022) and OpenAI function calling (2023)

The problem. A model on its own is sealed in. It cannot look anything up, run a calculation, or check today's date. It can only produce text from what it absorbed in training, which goes stale the moment training ends.

What changed. Google's ReAct showed a model could combine reasoning with actions, such as deciding to look something up, then using the result to reason further. OpenAI then made this reliable for developers with function calling. The model sends a formatted request like "search for X," and the harness performs the action, like running the search, executing code, or querying a database. The harness then pastes the result back for the model's next response. The deciding belongs to the model; the doing belongs to the harness.

Ripple effect on other rings. The harness stops being a passive pipe and becomes an executor that runs real programs on the model's behalf, opening a path to everything the model itself cannot do.

What it unlocked. Live answers (like checking the weather in Bengaluru) and real computation (like running code to analyze a spreadsheet) instead of confident guesses.

6. Plugging in private and current knowledge: Retrieval-augmented generation

Harness · Meta (2020; mainstream by 2023)

The problem. A model knows only what was in its training data. It cannot know your company's internal documents, last week's news, or any private files. Retraining it on fresh data every time is far too slow and expensive.

What changed. Meta researchers introduced retrieval-augmented generation (RAG). Before the model answers, the harness searches a separate knowledge store, such as documents, a wiki, or a database. It pulls the most relevant passages and places them into the model's input alongside the question. The model then answers based on that retrieved text rather than from memory alone, and can even point to where each claim came from.

Ripple effect on other rings. It turns a frozen model into something that feels current and domain-aware without touching the model's weights at all. The knowledge lives entirely in the harness, swappable at any moment.

What it unlocked. The "chat with your documents" category, including support bots grounded in a real help center, assistants that answer from a company handbook, and research tools that cite their sources.

7. Capacity without the cost: Mixture-of-Experts

Model · Mistral, Mixtral (2023)

The problem. Making a model smarter generally meant making it bigger, and a bigger model is slower and more expensive to run for every word it produces. Cutting-edge capability was colliding with extremely high running costs.

What changed. The French lab Mistral released Mixtral, an open-weight model built as a sparse Mixture-of-Experts. Instead of one dense network where every parameter fires for every word, the model holds many specialist sub-networks ("experts") and a small router sends each word to just a couple of them. The model can carry enormous total capacity while using only a fraction of it per word, providing high capability at a fraction of the running cost.

Ripple effect on other rings. Generating answers cheaper and faster is what lets the product ring offer snappy responses and generous free tiers. A hidden architecture choice quietly reshapes price and speed for the user.

What it unlocked. High-quality answers at a cost and speed that made always-on assistants and high-volume automation economically viable. Because Mixtral's weights were open, it was available to anyone.

8. Seeing and hearing: Native multimodality

Model · Google, Gemini (2023)

The problem. The systems were fluent in text but blind and deaf. A photo, a chart, a screenshot, a spoken question, or a video could not be understood directly. It all had to be converted to text first by some separate tool, losing detail along the way.

What changed. Google's Gemini was trained from the ground up on text, images, audio, and video together, rather than having vision bolted on afterward. Because the single model learned these forms jointly, it can take a picture or a sound as naturally as a sentence. It can reason across them in one place, describing an image, reading a graph, or transcribing and understanding speech in a single pass.

Ripple effect on other rings. Every other ring inherits new senses: the harness can feed images and audio into the model, and products can accept a photo or a voice note as input, not just typed words.

What it unlocked. Snap a photo of a broken appliance and ask what's wrong; point a camera at a menu in another language; ask questions about a chart or screenshot directly.

9. Working it out before answering: Reasoning

Model · OpenAI o1 (2024) and DeepSeek R1 (2025)

The problem. The model produced its answer in a single pass, straight away, like blurting the first thing that comes to mind. On a hard, multi-step problem it would commit to an early guess and get it wrong, with no chance to catch itself.

What changed. OpenAI's o1 introduced models that first generate a long private chain of intermediate steps, working the problem out, before committing to a final answer. Those steps are just more text the model writes and then reads back, so producing more of them means more computing power spent on hard problems. The models learn this through reinforcement learning on problems with checkable answers, like math and code, where correct reasoning paths get reinforced. Months later, DeepSeek's R1 showed the same leap could be reached with open weights, putting strong reasoning in anyone's hands.

Ripple effect on other rings. A more dependable engine on hard problems is necessary for trusting the harness to run the model over many steps on its own without it veering off course.

What it unlocked. Genuinely hard math, multi-step logic, and complex coding tasks that single-pass models reliably failed.

10. A universal plug for tools: The Model Context Protocol

Harness · Anthropic (2024)

The problem. Connecting a model to each external tool or data source meant building a custom, brittle integration for every pairing. Ten tools across three AI platforms meant thirty bespoke connectors to build and maintain. This tangled mess limited how much a system could actually reach.

What changed. Anthropic released the Model Context Protocol (MCP), an open standard, often described as "USB-C for AI," for how models connect to tools and data. A tool exposes its capabilities once through MCP; any MCP-aware system can then use it with no custom code. The standard was adopted across the industry within months.

Ripple effect on other rings. By standardizing the connection layer, the harness can suddenly reach a whole ecosystem of tools instead of a hand-built few. Products can now offer "connect your apps" as a simple toggle rather than an engineering project.

What it unlocked. Assistants that plug into your real tools (including calendar, email, code repository, design files, and ticketing) through a single shared interface.

11. Carrying out whole tasks: The agent loop

Harness · AutoGPT (2023), Anthropic computer use (2024), deep-research tools (2025)

The problem. The system could answer a question, but it could not carry out a task. Anything multi-step, like researching a topic, fixing a bug across several files, or planning a trip, still required a person to drive each step by hand. You had to ask, read, decide the next move, and ask again.

What changed. The harness began running the model in a sustained loop. The model proposes a step, the harness executes it and feeds back the result, the model reads that and proposes the next step, and this repeats until the task is done. The open-source AutoGPT made the pattern famous. Anthropic's "computer use" let a model operate a screen directly, and deep-research tools applied the loop to multi-step investigation. The system stops being something you converse with and becomes something you delegate to.

Ripple effect on other rings. Products shift from a simple chat box toward an interface where you delegate tasks. You hand over a goal and watch the progress unfold. The harness quietly takes on the routing as well, deciding how much reasoning each step deserves so you no longer have to choose.

What it unlocked. "Research these ten competitors and draft a comparison," or "find and fix this failing test," completed end-to-end while you do something else.

12. Remembering you across sessions: Persistent memory

Harness · OpenAI's ChatGPT memory (2024) and Anthropic's Claude memory (2026)

The problem. Every conversation started from zero. The system never remembered your name, your preferences, your projects, or anything you told it the day before, so you had to re-explain your context endlessly.

What changed. A memory layer was added around the model. As you talk, the harness quietly distills durable facts, such as your role, your preferences, and ongoing work. It stores them outside the model and slips the relevant ones back into the input on future conversations. The model itself still has no memory of its own; the remembering happens entirely in the harness. OpenAI's ChatGPT memory kept these as injected facts, while Anthropic's Claude later kept them as human-readable files you can open and edit.

Ripple effect on other rings. The product can now promise continuity, acting like an assistant that knows you. This is surfaced as a memory you can view, edit, and switch off, even though nothing in the model itself changed.

What it unlocked. An assistant that remembers your coding style, your writing voice, or your project's details without being re-briefed each time.

What the rings reveal

Step back from the twelve changes and a clear pattern appears. Read the ring labels in order, and the focus of activity shifts outward over time. The earliest improvements are almost all in the model, the core engine being built and shaped. The middle and later ones increasingly live in the harness, the software learning to feed, connect, loop, and remember on the model's behalf. The product ring, strikingly, earns only one improvement of its own: the chat interface. The rest of the time it is transformed indirectly, reshaped each time a change in an inner ring ripples outward into a new button, a new toggle, or a new capability.

Underneath all of it is you: the user. Look back at what each change took off your plate: formatting prompts, re-pasting context, fetching information, running code, picking a model, and driving a task step by step. From the user's seat, the history of AI products is a steady story of delegation. Each improvement absorbs another job you used to do by hand, until your main task is simply to say what you want and judge whether you got it.

That last part is where the story remains unfinished. The model still has no memory of its own, it still states things that aren't true, and giving it the power to act raises questions of trust and oversight that no ring has fully solved. Those are the jobs still sitting with you. If the pattern holds, the next improvements to watch are the ones that try to take those tasks off your plate too.