Five Questions Worth Asking About AI Right Now

June 26, 2026

Because the tools keep moving, and last year’s questions have already gone stale

Three days. That is how long the most capable artificial intelligence model ever released to the public managed to stay switched on. On 9th June 2026, Anthropic launched Claude Fable 5, and by independent benchmarks it was immediately the most powerful AI available to anyone with a subscription. On the 12th, the United States government ordered it shut down. Anthropic complied within hours, disabling it not just for American users but for everyone, everywhere, along with its more powerful sibling, Mythos 5. People who had paid to use it are being refunded. The best model of its kind in the world, gone inside a single week.

You do not need to follow the details of export-control directives and national-security letters to take the point, and the point is this. The ground underneath artificial intelligence is shifting faster than anyone can write about it. The most advanced model on the planet had a public shelf life shorter than a pint of milk, pulled not by a competitor or a technical failure but by a government order that its own maker publicly disagreed with. The companies building these systems, and the governments trying to regulate them, are improvising in real time.

This is the world in which we are all now expected to make confident decisions about AI, and it is exactly why I have very little interest here in telling you what the current models can do. Whatever I write will be quaint by the time you act on it. The shelf life of an answer about AI is now measured in weeks. The shelf life of a good question is measured in years.

So this is not a piece about capabilities. It is about questions, the kind durable enough to survive whatever lands next, because questions endure where answers rot. Underneath the noise, the same move keeps repeating in classrooms, multi-academy trusts, marketing teams and boardrooms across the country. We are all getting quietly, alarmingly good at producing things we do not understand.

I have written a fair bit about AI on this blog already (see The Agent Problem, Are We Efficiently Rushing Nowhere? and What the Algorithm Knows if you want the longer arguments). I am not going to rehash all of that here. What I want to do instead is offer five questions that feel like the right ones to be asking in 2026 specifically, because the ground has shifted again. The first two come from the oddest new corner of the lot: vibe coding.

AI Generated Image. Gemini Prompt: a state-of-the-art machine powered down, the plug just pulled, three days after launch

1. Do we understand what we’re shipping, or just that it works?

If you have not come across the term, vibe coding is what happens when you describe what you want to an AI in plain English and accept the working code it gives you, without necessarily reading or understanding it. Andrej Karpathy named it in early 2025, though he was really just putting a label on something that was already happening everywhere. The honest definition is not “AI writes the code”. It is “you accept output you have not understood”. Hold onto that, because it is the bit that travels.

For a while the vibe was glorious: you could build a working app in a weekend with no engineering background. Then 2026 arrived with the receipts. A security firm called Tenzai built fifteen identical apps using five of the popular vibe coding tools and found sixty-nine vulnerabilities across them, six of them critical. Industry analysts started talking about code churn (the proportion of code rewritten or thrown away shortly after being written) climbing sharply, and code duplication rising several-fold. One write-up described vibe-coded systems beautifully, as behaving “more like an evolving transcript than a deliberately engineered platform”. The dream was speed. The bill, when it arrived six to twelve months later, was quality, security and maintainability.

It is worth noticing what actually went wrong there. The code worked. It demoed brilliantly. The problem only surfaced when real users did unexpected things, or when someone had to change it and discovered that nobody understood how it held together, because nobody had ever understood it. “It works” and “we understand it” had quietly come apart, and we had stopped noticing the difference. One academic paper early in 2026 went so far as to argue that this is hollowing out open source itself, because the careful, unglamorous work of maintaining shared code (the stuff the whole internet quietly runs on) depends on people who actually understand it, and that is exactly the engagement vibe coding strips away.

This is not a software problem. It is a comprehension problem wearing a hi-vis jacket. Swap the code for a multi-academy trust data dashboard that produces a confident analysis of pupil progress nobody can interrogate. Swap it for the policy document generated in four minutes that no one in the building could actually defend to a governor. Swap it for the marketing report whose numbers look authoritative and whose assumptions nobody can name. Same move, same gap, different industry. The question to put to your own organisation is blunt: when we produce something with AI, does anybody understand it, or do we just know that it runs?

AI Generated Image. Gemini Prompt: a beautiful house with no foundations, photographed from below

2. Where does the gap go when the person who made it can’t fix it?

The uncomfortable thing about a comprehension gap is that it does not close. It moves. When you ship something you do not understand, you have not removed the need for understanding; you have simply posted it forward to whoever has to live with the thing next: the colleague who inherits the dashboard, the supply teacher handed the unfamiliar resource, the customer whose data leaks through a hole nobody spotted. The person who generated it got the speed. Somebody else, later, pays for the comprehension that was skipped.

The software world has, to its credit, started naming the discipline that closes this gap. The phrase doing the rounds is “vibe and verify”, and the sharper framing I keep seeing is that vibe coding is “a drafting tool, not a delivery model”. The slogan that sums it up is “judgment over syntax”. The interesting finding underneath all this is that senior developers are reporting enormous productivity gains from these tools, while juniors are at risk of never developing the judgement to catch what the AI gets wrong, because they let it do the very work that would have built that judgement. The tool helps most the people who least need it, and quietly hollows out the people who most do.

I made a version of this argument in The Agent Problem, where I talked about “agency literacy”, the capacity to know when you are using a tool versus delegating to an agent. This is its close cousin, which we might call comprehension literacy: knowing, for anything AI hands you, whether anyone actually understands it well enough to take responsibility when it fails. If the answer is nobody, you do not own that output. You are hosting it. And as I put it in that piece, drawing on the philosopher John Danaher’s idea of algocracy, the trouble with a black box is not that it makes bad decisions but that it makes them for reasons no one can scrutinise or stand behind.

So the second question is really about ownership. For every clever thing AI produces in your organisation, who is the human who could explain it, defend it and fix it when it breaks? Name them. If you cannot, the gap has not gone anywhere. It is sitting in your organisation with the fuse already lit.

And this travels across every sector, not just the ones that write code. The accountancy firm that lets a model do the first pass on a set of figures, the law firm that has AI draft the disclosure review, the NHS trust that hands triage summaries to an algorithm. Each is making the same bet, that the speed gained now will not be paid for later by someone who inherits work they cannot account for. The bet sometimes pays off. The trouble is that you only find out it has not at the precise moment you can least afford to.

AI Generated Image. Gemini Prompt: fast work that has long term consequences

3. Are we using AI to think, or to skip the thinking?

This is where vibe coding stops being a software story and becomes a story about all of us. Because the same trade is on offer everywhere now, not just in code. Let the model draft the email, the lesson, the report, the strategy. The question is whether, in taking the offer, you are doing more thinking or less.

The complexity scientist David Krakauer (president of the Santa Fe Institute) has a distinction I find more useful here than almost anything else, building on the cognitive scientist Donald Norman’s idea of “cognitive artifacts”. Krakauer splits the tools we think with into two kinds. Complementary cognitive artefacts make you better even when they are taken away: the abacus trains a mental model so that, after enough practice, you can do the sums in your head without it. Competitive cognitive artefacts do the opposite. They make you more capable while you hold them and no better, sometimes worse, the moment they are removed. Think of what satnav has done to your sense of direction.

To use a trite, bastardised version of a famous proverb, complementary cognitive artefacts teach us how to fish; competitive cognitive artefacts simply deliver the fish, rendering us dependent.

Krakauer’s own worry, stated years before ChatGPT, was that AI would be the ultimate competitive cognitive artefact: a tool we lean on so completely, for such a wide range of thinking, that we gradually forget how to do the thinking ourselves. Vibe coding is that worry made literal. The junior developer who never struggles through the logic never builds the instinct. But it generalises instantly. The teacher who lets AI plan every lesson unread, the analyst who never wrestles with the data, the leader who outsources the first draft of every decision: all are running the competitive-artefact risk, getting more done today at the cost of being less able tomorrow.

I went into the personal side of this in Are We Efficiently Rushing Nowhere?, leaning on Ethan Mollick’s rule to “be the human in the loop” and Csikszentmihalyi’s work on flow, so I will not repeat it. The test, though, is simple and worth saying plainly. Does the way your people use AI leave them more capable without it next time, or less? Complementary or competitive? That single question tells you whether you are building capability or quietly renting it.

AI Generated Image. Gemini Prompt: a person whose shadow is doing all the work while they stand still

4. Who, or what, is now the authority on what’s true here?

In June 2025 the High Court of England and Wales issued a formal warning to the legal profession. Lawyers had been filing submissions that cited cases which did not exist. In a £90 million dispute involving Qatar National Bank, eighteen of the authorities cited were entirely fabricated; in a separate housing claim against the London Borough of Haringey, a barrister referenced five fictional cases. Dame Victoria Sharp, delivering the ruling, warned that such tools may “make confident assertions that are simply untrue”. What made those filings dangerous was not that they were wrong, because wrong is easy to deal with. It was that they were fluent, confident and indistinguishable on the surface from the real thing. That is the defining feature of what large language models produce: plausibility, generated with no particular stake in whether the result is true. The right answer and the confidently wrong answer arrive in identical packaging, in the same reasonable tone, with the same tidy formatting.

We are not well equipped for this, because we are trained to read confidence and fluency as competence. A nervous, hedging colleague gets second-guessed; a smooth, assured one gets believed. AI breaks that link entirely. The fabricated cases that embarrassed those lawyers were not amateurish or obviously dodgy. They were polished, plausible and formatted exactly like genuine authorities, which is precisely why they slipped through. The sycophants are running the asylum.

I explored the institutional version of this in What the Algorithm Knows, which asked who, in the absence of any deliberate decision, has become the authority on what your community believes to be true. That piece was about feeds and engagement algorithms. The same question now applies inside the organisation, pointed at AI output. When a model drafts the analysis a decision rests on, who in the room is responsible for asking “how do we actually know this is right?”, and do they have the standing to stop the thing on that basis? In a school, the AI-generated feedback on a child’s work can read beautifully and be quietly, confidently wrong about what that child needs next. Someone has to be able to see the gap. If nobody owns that job, fluency will do your believing for you, and fluency does not care whether it is right.

The Post Office Horizon scandal is the cautionary tale that should haunt every leader on this point. “Computer says no” was allowed to override the lived testimony of hundreds of sub-postmasters, because the output of a system carried an air of objective authority that the humans around it felt unable to challenge. The technology has changed; the human failure has not. Plausible, authoritative-sounding output is at its most dangerous exactly when nobody feels entitled to say “hang on, that doesn’t look right to me”.

5. What are we deliberately choosing not to automate?

The first four questions assume you are automating and ask how to do it with your eyes open. The fifth is the one almost nobody reaches, and it is the most strategic of the lot. Where will you decline, on principle?

This is not technophobia. I use this stuff every single day and a good chunk of my living depends on it, so I am hardly burning my laptop in the garden. It is about recognising that some things carry their value precisely because a human did them. The handwritten card. The difficult conversation held in person rather than fired off as a generated email. The pastoral phone call home that matters because a real person chose to pick up the phone. In a school, the marking that counts is not the tick; it is the teacher coming to know the child through their work. Automate the tick and you save time. Automate the knowing and you have removed the entire point.

I made a related case in What Gets Measured Gets Managed, about the difference between legibility and understanding, and about the things that resist measurement because the act of measuring them destroys what made them valuable. Watching a master glassblower in Murano work molten glass into a horse in ten minutes, I was watching something no data process and no model could replicate, built over fifteen or twenty years of practice. The same is true of the judgement of a great teacher or a seasoned leader. Some of that you protect on purpose, even when the machine could produce a passable imitation faster and cheaper.

There is a hard-headed edge to this, not just a sentimental one. As more of what every organisation produces becomes machine-generated and therefore interchangeable, the things that are visibly, reliably human start to acquire scarcity value. The school that genuinely knows its families, the firm whose advice is unmistakably the considered work of a person who understands you: these get harder to commodify, not easier. Restraint, chosen well, is a form of differentiation that no model release can erode. The discipline is to decide your lines in advance, in the cold light of day, not in the heat of a budget meeting when the efficiency case is being pressed on you hard. What do we do that is valuable because a person does it, and are we protecting it on purpose?

AI Generated Image. Gemini Prompt: a craftsperson’s hands beside a switched-off machine

The questions outlast the answers

The thing about writing on AI is that any verdict on what the technology can or cannot do is decaying before the ink is dry. Eighteen months ago, vibe coding was a thrilling novelty. Now it has a hangover and a literature. Eighteen months from now, the specifics in this piece will look just as dated, and I cannot tell you what the model you will be using then will be capable of. Neither, honestly, can the people building it.

What I can tell you is that these five questions will still be worth asking. Whether we understand what we ship. Where the gap goes when we do not. Whether the tool is making us more capable or less. Who gets to say what is true. And what we have decided, deliberately, to keep human. Fable 5 was the most powerful model in the world for roughly seventy-two hours (and will likely come back with a vengeance very soon); the tools will keep arriving and vanishing at that pace. The one thing that holds steady is the quality of the questions you bring to them. That is the whole discipline, the same one the lawyers and the consultants forgot: do not pass on what you do not understand. The answers will keep changing. The questions, if they are the right ones, do not.

Key Takeaways

  1. Separate ‘it works’ from ‘we understand it’. A demo running is not the same as anyone knowing how it holds together, and the difference shows up months later when something breaks.
  2. Comprehension gaps move; they don’t close. Whatever you ship without understanding becomes someone else’s problem downstream, so name the human who can explain and fix it.
  3. Ask whether your tools are complementary or competitive. Following Krakauer, the test is simple: does using AI leave your people more capable without it next time, or less?
  4. Treat fluency as no proof of truth. AI produces plausibility, not accuracy, so make sure someone in the room owns the question of how we actually know this is right.
  5. Decide what you will not automate, in advance. Some things carry value because a person did them, and that restraint is increasingly a competitive edge, not a soft one.
  6. Hold the questions, not the answers. The specifics about what AI can do will date within months; the habit of asking the right questions is what actually lasts.

If you take one thing from all this, let it be the cheapest and most durable tool you own: a good question, asked again and again, of whatever the technology becomes next.
