I enjoyed reading this. An enticing case for how we view LLMs.
I don't know if I'd say they've been elevated to colleague level yet. That's granting them more power than my vanity allows for now.
I think my usage of Claude leans more towards constructive cynicism, so I still view it as a tool.
I do like that it tries to be objective and almost brutal in its review and feedback when prompted.
It is an interesting idea to view them as a colleague that can drag me without me getting vexed.
Context Engineering just sounds like another way of describing a discussion with an argumentative debater whom you might respect.
It does seem like the people who will climb the step from prompt to context are those who like to debate, are curious enough, and/or have spent enough time prompting to move up to the next step. That progression seems slower than we might predict.
PS: I had never heard of arXiv until it was referenced in Everything Is Miscellaneous by David Weinberger, a book about the digital future I just finished reading. Now I see it everywhere and spend time looking through what they publish for things I might like or be curious about.
Which is insane when I consider how much reading and research I have done in my life. There continues to be so many worlds out there to explore. I love it.
ese, you know i love you. this comment made me love you more. your perspective on the slowness of the shift from prompt engineering to context engineering is one i overlooked in my characteristic arrogance, but i see it now.
thanks for the book recommendation, everything is miscellaneous. i love it on the title alone and will add it to my library.
arXiv is frankly one of the best things that happened to me on the internet. i hear people complain that all the research there is bad and fake, but to me that's a curation problem: even superficial searches for peer reviews and collaborators introduce some rigor.
Interesting read!
I use Cursor rules to enforce coding standards, but I wonder if these strict guidelines make it feel more like a tool than a collaborative colleague. Any suggestions on balancing rules with a more open approach to get more accurate outputs?
Offtop: I think I saw you working with AirPods on the train yesterday. I wanted to say hi but didn’t want to interrupt your flow.
that's a very good question, and i don't feel confident enough to answer it today. i'll need to experiment to figure out what the right guardrails for an LLM are. i assume that, as with anyone, constraints like cursorrules can help with productivity, as long as they're not stifling.
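for anyone wanting to experiment in the meantime: here is a minimal sketch of what a lighter-touch rules file might look like, assuming a plain-text .cursorrules format (the exact file name and syntax depend on your Cursor version, so treat this as illustrative, not canonical). the idea is to keep hard constraints few and explicitly leave the model room to push back:

```
# Hard constraints (non-negotiable)
- Use TypeScript strict mode; no `any`.
- Every new function gets a unit test.

# Soft guidance (use judgment)
- Prefer small, composable functions, but flag cases where a larger
  unit reads better.
- If a rule above conflicts with a cleaner solution, say so and
  propose the alternative instead of silently complying.
```

the last line is the "colleague" part: the rules invite disagreement rather than enforcing silent compliance.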
also: if you saw me yesterday, then you definitely saw me writing the first draft of this piece lol. thanks for not interrupting me, because i'm already so distracted as it is!
tech bros have invented empathy
💀💀💀💀
genuinely stoked by this piece man! although i'd push back a bit on the framing of prompt engineering as strictly imperative vs. context engineering as declarative, premised entirely on reflections from my own tinkering with LLMs, of course. (for brevity, PE and CE represent those terms.) when you describe PE as treating the LLM like a "database with a weird query language", direct commands like "concisely give me ten creative names" throttle the response space, potentially omitting key details (e.g. the tech angle in your dream-analysis example). but i'd argue that the reverse feels truer: CE often acts more imperatively because it actively commands the LLM's mindset and constraints before any action is prompted. take your market example: the second interaction does more than declaratively set the scene (methinks), it imperatively directs the "colleague" to adopt and foster a specific lens ("there's a woman with a scar... so pork is available"), filling the LLM's "vast unknowable void" and shaping its entire reasoning path. without that upfront command, even a sharp prompt risks vague or off-tangent results, as evidenced by the bland first claude response.
i like to think of context as an umbrella term for both imperative and declarative elements, measured by degree. a light context ("You're a tech journalist") sets a vibe declaratively because it just describes the vibe, but when you ramp it up, as in your collaborative claude chat pushing for 'Oneiros Labs' over 'DreamDecode', it becomes rather fiercely imperative and more dialed-in. that circles back nicely to your point about respecting the LLM's character by giving it room to interpolate creatively, but it also highlights why context probably isn't always the modest, hands-off approach you've tried to portray. it is the dominant lever in complex tasks for sure, though PE can still nail simpler tasks; that i can get behind fully.
i think we fundamentally agree, though we may be seeing the same thing with different lenses.
let me expand my perspective; tell me if we still disagree:
i think the process of llm retrieval is imperative (masked by its stochasticity). prompt engineering alone one-shots the imperative structure and locks it in (only stochasticity makes it brittle). there was a more powerful imperative metaphor that could have been arrived at declaratively (context engineering). once locked in, you can make it the one-shot "prompt" (prompt engineer last, or not at all; sometimes the journey is all we need).
here's a quote from this very article where i may have poorly expressed that sentiment:
"Prompting prematurely constrains the probability space of an LLM (though it makes the LLM more ‘predictable’—it’s like a pen for pigs)".
the key term i was trying to capture was the "premature" component, for complex tasks.
how about this framing?
PE: imperative for human, declarative for machine.
CE: declarative for human, imperative for machine?
CE forces the LLM into a more precise cognitive state?
ummm, that actually clicks. i kinda like the human vs. machine lens as a reconciliation sha *violently*. from the human seat, PE feels like we’re bossing the model around, CE feels like we’re just teeing up the vibe. but under the hood, it’s inverted: PE is just a naked string, while CE is what actually corrals the probability space into a “cognitive state.”
that said, i’d still frame it less as a binary and more as a throttle. both of them carry declarative + imperative force — just at different times and intensities.
you’re right. for complex tasks: the human explores until the probability of output determinism is concrete, *then* you can lock it in as PE?
we’ve achieved parity my liege!
yeah, i think we are on the same wavelength here. where you emphasise “prompting prematurely constrains the space”, that’s exactly the itch i was trying to scratch lol. i just came at it from the angle that context, when dialled up, isn’t really a neutral backdrop anymore but an imperative force in its own right.
i think the cleanest way to reconcile it would be: context declares, then quietly commands. at low intensity it’s descriptive, but once you layer in rich scaffolding, it hardens into an imperative lens that governs everything downstream. PE is just the overt one-shot version of that. this leads back to my point about why CE is the dominant lever for complex tasks: it buys time before collapsing into a single probability path.
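the "context declares, then quietly commands" idea can be sketched in code. below is a minimal illustration using chat-style message lists (no real API call is made; the `build_*` function names are hypothetical, and the startup-naming details are borrowed from the article's example): PE collapses everything into one imperative string up front, while CE layers declarative scaffolding before the final ask.

```python
# Prompt engineering: one-shot. The entire constraint lives in a single
# imperative string, locked in before any exploration happens.
def build_prompt_engineered(task: str) -> list[dict]:
    return [{"role": "user", "content": f"Concisely {task}. Ten options, no fluff."}]

# Context engineering: declarative layers first (persona, world facts,
# exploratory dialogue), then the actual ask. The scaffolding "quietly
# commands" by shaping the probability space before the prompt collapses it.
def build_context_engineered(task: str, persona: str, facts: list[str],
                             dialogue: list[str]) -> list[dict]:
    messages = [{"role": "system", "content": persona}]
    for fact in facts:  # declarative: describes the world, issues no command
        messages.append({"role": "user", "content": f"Context: {fact}"})
    for turn in dialogue:  # the exploratory journey before locking anything in
        messages.append({"role": "user", "content": turn})
    messages.append({"role": "user", "content": task})  # prompt last, or not at all
    return messages

pe = build_prompt_engineered("name my dream-analysis startup")
ce = build_context_engineered(
    task="now, name the startup.",
    persona="You are a tech journalist who names companies for a living.",
    facts=["The product analyzes dreams with an LLM.",
           "The founder prefers mythological references over puns."],
    dialogue=["What makes 'DreamDecode' feel bland to you?"],
)
print(len(pe), len(ce))  # → 1 5
```

the point of the sketch: both transcripts end in an imperative ask, but the CE version spends four messages constraining the model's "cognitive state" before that ask ever lands.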
perfect, we’re in *violent* agreement. i already attempted to “cleanly” define it in another reply—let me know what you think!
yes, exactly😂
also, i just scanned my piece for tone and realized you may have gotten this impression because i seemed to be unnecessarily shitting on prompt engineering with the “pseudo-scientific” thing.
i see it now. thank you.
Just stumbled upon this Substack from a comment on Ezra's tweet, and it is beautiful and interesting to read.
In the past, I often saw context prompting as a waste of time and, ultimately, tokens (compute) that should be channelled towards other tasks, but I'm now going to do away with that thinking and see how much better an output context prompting can get me.
Thank you for this, really. Your writing style is quite captivating.