Date: 2026_05_22
Source: https://www.youtube.com/watch?v=ltbzgzZZmgI
Duration: 1310
Platform: YouTube
Creator: AI News & Strategy Daily | Nate B Jones

The One AI Writing Hack Nobody Talks About.

Overview

Sullivan & Cromwell, one of the most prestigious law firms on the planet, had to write an apology letter to a federal bankruptcy judge after filing court documents with dozens of fabricated or misquoted citations generated by AI. The motion looked legitimate: professionally formatted, correctly structured, heavily cited. But the citations pointed at the wrong things, and nobody on the team caught it before filing.

The model is not the problem. The working environment around the model is the problem. The fix is not a sharper prompt: you cannot tell a language model not to hallucinate any more than you can tell autocomplete not to autocomplete. With Opus 4.7 and GPT-5.5, agents can now walk folder trees, open files, compare dates across documents, and inspect metadata, and that changes how we should think about the problem. The one hack nobody talks about: your first AI prompt is never "do the thing." It's "build me the room."

Why "Don't Hallucinate" Doesn't Work

There is no separate truth-check pass inside the model that a "don't hallucinate" instruction can hook into. Sullivan & Cromwell had access to the best AI tooling money can buy. The wrong details still made it into court.

The failure mode Nate is talking about is not the 2024 kind, where a solo practitioner uses ChatGPT and tries to tell it not to hallucinate. These are organizational, structural hallucinations at the top of AI workflows in 2026. The motion looked legitimate. The structure was correct. The citations were professionally formatted. But dozens of them pointed at the wrong things.

The model did not have a clean working environment. It didn't know which sources mattered. It didn't know what was stale. It didn't know what was missing. It didn't know which file was authoritative. You cannot patch that with a better opening sentence.

The Core Insight: Files Are the Unlock

In the last month, with Opus 4.7 from Anthropic and GPT-5.5 from OpenAI, agents have picked up a capability that changes the way we think about hallucinations: they can run long agentic tasks on your file system. They can walk a folder tree, open files, compare dates across documents, and inspect metadata.

This sounds "unsexy" — "Oh, files. That's all the way back to 1982." But that's exactly why it matters. The workflow around hallucinations has flipped, but most people haven't caught that yet.

The first useful prompt in a serious project is no longer "write the document." It's much more boring than that: "Build me the folder in the file room. Build me the room to do the work in."

The Project Room / Data Room Concept

What It Is

A project room (or data room) is a bounded workspace for one serious job — a project, a deliverable, a source set. It is much smaller than a whole second brain. It is much more specific than a knowledge management system. It is a workspace set up so an agent can do useful work inside it. In most cases, it is a local workspace.

The key principle: You don't have to build a perfect archive to gain a tremendous amount of advantage. The point is just to give the agent a usable work surface — just enough room for it to operate.

Why Local Files

LLMs are being taught to operate computers at the most primitive, root level, because that is the foundation for doing anything else on a computer. When we go back to files, we are going back to what they know really well. Why not lean into it?

Nate's personal preference: Just go to local files, have the agent create a folder, and stick literally anything in there. No file type limitations. If it's a file, it goes in there. If Codex or Claude can read it, you're in good shape.

The Three-Artifact Workflow

Artifact 1: The Source Inventory (Most Important)

Once the room exists, the first thing you ask the agent to produce is a source inventory table. For every file in the room, the agent records:
- Path: where the file lives
- Type: what kind of file it is
- Date: when it was created/modified
- Apparent authority: how authoritative this file is
- Current vs. superseded: is it current or has it been replaced?
- Claims it supports: what arguments does this file support?
- Limitations: what are its weaknesses or gaps?
- How it should be used in the final work: what's the right role for this document?
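As a rough illustration of what such a table could look like on disk, here is a minimal Python sketch. It is not Nate's tooling: the column names, the `build_inventory` and `to_csv` helpers, and the CSV format are all assumptions made for this example. Only the mechanical fields (path, type, modified date) are filled in by the script; the judgment fields are deliberately left blank for the agent to propose and for you to review.

```python
import csv
import io
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column names that mirror the fields listed above.
COLUMNS = [
    "path", "type", "modified", "authority",
    "status", "claims_supported", "limitations", "intended_use",
]


def build_inventory(room: Path) -> list[dict]:
    """Record the mechanical facts for every file under `room`."""
    rows = []
    for path in sorted(p for p in room.rglob("*") if p.is_file()):
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        rows.append({
            "path": path.relative_to(room).as_posix(),
            "type": path.suffix.lstrip(".").lower() or "unknown",
            "modified": modified.date().isoformat(),
            # Judgment fields stay empty here: the agent proposes them, you review them.
            "authority": "",
            "status": "",            # e.g. "current" or "superseded"
            "claims_supported": "",
            "limitations": "",
            "intended_use": "",
        })
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize the inventory so it can be saved next to the sources and reviewed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping the judgment columns empty in the script is a design choice for this sketch: it separates what can be read off the file system from what requires the agent's (and your) assessment.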

Why this matters: This artifact determines whether everything downstream is any good. It makes the agent's judgment visible and legible so you can see it really clearly. If you review the inventory and you can't tell why one file outranks another, that's a problem you can fix before anything gets written.

The second-LLM-check property: The inventory makes it easy for a second LLM to check the first LLM's work. It tells you what the agent thinks the project consists of, which gives you a chance to correct the working set of docs before the final draft inherits a bunch of mistakes that turn into hallucinations.

Artifact 2: The Conflict Log

When the agent reads a serious source set, it will find disagreements:
- The old PDF says one thing, the current plan says another
- The transcript and a doc use different names for the same key stakeholder
- The spreadsheet has a number with no visible assumptions behind it
- Two documents that look adjacent are actually three months apart

A weak workflow lets the agent synthesize and smooth those conflicts over. The output will read confidently, but you don't know what you can trust.

A strong workflow surfaces that disagreement without necessarily resolving it — making it visible so you can have opinions and edit, adjust, tell the agent it's wrong, etc. before you get into building the doc.
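One way to picture a conflict log that surfaces disagreement without resolving it is the small Python sketch below. The `Claim` and `Conflict` types, their fields, and `build_conflict_log` are hypothetical names invented for this example, and it assumes the agent has already extracted claims from each source. The function groups claims by topic and logs any topic where sources disagree, leaving the resolution empty for a human.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    topic: str    # what the claim is about
    value: str    # what the source says
    source: str   # path of the file in the room
    dated: str    # ISO date of the source, e.g. "2026-03-01"


@dataclass
class Conflict:
    topic: str
    claims: list[Claim]
    resolution: str = ""  # left empty on purpose: a human decides


def build_conflict_log(claims: list[Claim]) -> list[Conflict]:
    """Log every topic where sources disagree, without picking a winner."""
    by_topic: dict[str, list[Claim]] = defaultdict(list)
    for claim in claims:
        by_topic[claim.topic].append(claim)

    log = []
    for topic, group in by_topic.items():
        if len({c.value for c in group}) > 1:
            # Oldest first, so the reader can see how the claim changed over time.
            log.append(Conflict(topic, sorted(group, key=lambda c: c.dated)))
    return log
```

Sorting the conflicting claims by date makes the "looks adjacent but is months apart" case easy to spot, while the empty `resolution` field keeps the decision with the reviewer.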

Artifact 3: The Missing Context List

One of the best signs that an agent is helping properly is that it tells you what's not in the room. If the agent is doing good work, it's identifying the gaps in the source material before it tries to write from incomplete information.
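A missing context list can be as simple as comparing what the deliverable needs against what the room actually holds. The Python sketch below is an assumption-laden illustration, not a method from the video: it supposes each inventory row is a dict with hypothetical `kind` and `status` keys, and that you or the agent have written down the kinds of sources the deliverable depends on.

```python
def missing_context(expected: dict[str, str], inventory: list[dict]) -> list[str]:
    """Return one line per expected source kind that has no current file in the room.

    `expected` maps a source kind to the reason the deliverable needs it.
    Superseded files do not count as coverage.
    """
    covered = {
        row["kind"] for row in inventory if row.get("status") == "current"
    }
    return [
        f"{kind}: {reason}"
        for kind, reason in expected.items()
        if kind not in covered
    ]
```

For example, with illustrative data:

```python
expected = {
    "signed contract": "needed to quote terms",
    "current budget": "needed for cost figures",
}
inventory = [
    {"kind": "current budget", "status": "superseded"},
    {"kind": "signed contract", "status": "current"},
]
gaps = missing_context(expected, inventory)
# gaps lists the budget, because the only budget file in the room is superseded
```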

The Critical Rule: First Instruction Is Never "Do the Thing"

Your first instruction should not be "do the thing": write the memo, make the Excel, build the deck, etc. Instead:

"Find the relevant materials on the internet, on my local computer, in my files, in the tools that I have connected to you. Preserve the originals, build me a data inventory, put it in a folder, tell me which files seem authoritative, which are duplicates, which are old, which are missing. Summarize every source before you synthesize anything. And do not write the deliverable yet. We're just learning."

An aside from Nate: Claude and Codex both have a ton of connectors now, so you can tell them to look in their connectors and they will.

This approach is powerful, and it is possible only because these tools can now carry out complex, long-running file manipulation tasks successfully and with very high accuracy. So use them to do that.
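Two parts of that first instruction, preserving the originals and spotting duplicates, are mechanical enough to sketch in Python. This is an illustrative sketch only: the `originals` subfolder, the numeric filename prefix, and the `stage_originals` and `find_duplicates` helpers are assumptions for this example, not something the video prescribes. Files are copied into the room rather than moved, and duplicates are detected by content hash instead of by filename.

```python
import hashlib
import shutil
from pathlib import Path


def stage_originals(sources: list[Path], room: Path) -> dict[str, list[str]]:
    """Copy files into room/originals untouched and group the copies by content hash."""
    originals = room / "originals"
    originals.mkdir(parents=True, exist_ok=True)

    by_hash: dict[str, list[str]] = {}
    for index, source in enumerate(sources):
        # The index prefix keeps same-named files from different folders apart.
        target = originals / f"{index:03d}_{source.name}"
        # copy2 copies the content and tries to keep timestamps; the source is not modified.
        shutil.copy2(source, target)
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        by_hash.setdefault(digest, []).append(target.name)
    return by_hash


def find_duplicates(by_hash: dict[str, list[str]]) -> list[list[str]]:
    """Return groups of staged files whose contents are byte-for-byte identical."""
    return [names for names in by_hash.values() if len(names) > 1]
```

Hashing catches exact copies only. Near-duplicates, such as a re-exported PDF or a lightly edited draft, still need the agent's judgment in the inventory.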

Why This Works: The Two-Job Problem

When you ask an AI to write from messy, disorganized source material, you're asking it to do two jobs at once:
1. Figure out what this is
2. Produce this beautiful artifact

That's a recipe for a really mediocre result — and one of the situations in which hallucinations are likely.

The Sullivan & Cromwell failure: The model didn't have a clean working environment, so the dirt got into the doc. It didn't know which sources mattered, what was stale, what was missing, or which file was authoritative. You cannot patch that with a better opening sentence. And you can't patch it by reading the doc and hand-editing anymore, because the work now happens at a different scale. You have to prevent it from the beginning by cleaning up your data room first.

The Proof: 8 Documents Drafted Simultaneously

Nate has had several real sessions with Codex in which the agent drafted up to eight different documents simultaneously. He hasn't gone past eight yet, though he thinks he could. The only reason he could get eight documents drafting at once in Codex is that he prepared the data room first and knew his outputs, so he could then execute cleanly and consistently.

It saved him so much time. It felt like "the hair was blowing back on my face and I was living in the future."

The Meta Point: Boring Primitives > Clever Tricks

The aha moments come when we think about the boring primitives — when we think about the files. The prompting era required humans to be the organizers: finding the strategy docs, the meeting transcripts, the spreadsheets, the half-finished notes, the follow-up emails, the old deck, the PDF you forgot about, the Slack thread where the actual decision was made. Some of it is current. Some is stale. Some contradicts itself. A few files may be helpful. You're not sure which one is the source of truth.

That human organizing job is now something you can delegate to the agent — but only if you set up the project room properly first.

Summary

The one AI writing hack nobody talks about: your first prompt is never "do the thing" — it's "build me the project room, find the materials, build the source inventory."

The three artifacts that make everything downstream better:
1. Source inventory table: path, type, date, authority, currency, claims, limitations, usage guidance
2. Conflict log: surface disagreements between sources before synthesis, not after
3. Missing context list: the agent tells you what's not in the room

This approach is structurally antagonistic to hallucinations — not by telling the model not to hallucinate (impossible), but by giving it a clean working environment where it can see what it's working with, what outranks what, what's current vs. stale, and what's missing before it produces anything.

The Sullivan & Cromwell failure was not a model failure. It was a working environment failure. The fix is not a sharper prompt. It's building the project room first.

🦐 Summary by Thrawn the Prawn — Strategic Analysis Division