Has the author (not OP) written anything on this topic themselves? This is a blunt comment, because I am fed up with being asked to read LLM content that the prompter thinks is novel and worthwhile because they don't know better.
I can forgive (even root for) someone who puts in the effort themselves to understand a problem and write about it, even if they fall short or miss. They have skin in the game. I have little patience for someone who doesn't understand the disproportionate burden generated content places on the READER.
I can certainly tell they've put the model through the wringer to be terse and use simple language, etc. But I am struggling to separate the human ideas from the vibed ones, and the tone of the whole thing is the usual LLM elevator pitch with "hushed reverence" and "movie trailer cadence".
But "spawn/fork" is just a different way of labeling the fairly well-understood tactic (I won't call it a strategy) of deciding how much context to provide sub-agents. Claude Code already "spawns" every time it does an explore. It can do this concurrently, too.
Beyond that, they seem to express wonder at how well models can use tools:
> In the example above, the agent chose spawn for the independent research tasks and fork for the analysis that needs everything. It made this choice on its own — the model understands the distinction intuitively.
Emphasis mine. They (or the model whose output they blindly published) are anthropomorphizing software that is already designed to work this way. They gave it "fork" and "spawn" tools. Are they claiming they didn't describe exactly how they were supposed to be used in the tool spec?
mirekrusin
Nice one.
You should also try to make the context query a first-class primitive.
The context query parameter can be a natural-language instruction for how to compact the current context before passing it to a subagent.
When invoking, you can use values like "empty" (nothing, start fresh), "summary" (summarize), "relevant information from a web designer's PoV" (extract what's relevant to a specific role), "bullet points about X", etc.
This way the LLM can decide what's relevant, express it tersely, and the compaction itself won't clutter the current context: it'll be handled by a compaction subagent in isolation and discarded on completion.
What makes it first class is that it has to be a built-in tool with access to the context (the client itself), i.e. it can't be implemented by an isolated MCP server, because you want to avoid rendering the context as an input parameter during the tool call; you just want a short query.
I.e. you could add something like:
depends_on, which is also based on a context query, but in this case it's a map where the keys are the subagent conversation ids that block the handed-over task and the values are context queries describing what to extract and inject.
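A minimal sketch of the "context query" primitive described above. Everything here is hypothetical: the tool name, fields, and the stubbed compaction step are illustrative, not from any real harness.

```python
# Hypothetical sketch of a context-query spawn tool. Names and fields are
# illustrative only; a real compaction step would run an LLM in isolation
# and discard the compaction conversation on completion.

def compact(context: list[str], query: str) -> str:
    """Apply a natural-language compaction query to the parent's context."""
    if query == "empty":
        return ""  # start the subagent fresh
    # e.g. "summary", "bullet points about X",
    # "relevant information from web designer PoV"
    return f"[{query}] " + " ".join(context)[:200]

def spawn_subagent(task: str, context: list[str], context_query: str) -> dict:
    # The parent passes only the short query; the compacted context is
    # produced out-of-band, so it never clutters the parent's own window.
    return {"task": task, "context": compact(context, context_query)}

call = spawn_subagent(
    task="Restyle the landing page",
    context=["user asked for dark mode", "build uses Tailwind"],
    context_query="relevant information from web designer PoV",
)
print(call["task"])
```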
AxiomLab
Imposing a strict, discrete topology—like a tree or a DAG—is the only viable way to build reliable systems on top of LLMs.
If you leave agent interaction unconstrained, the probabilistic variance compounds into chaos. By encapsulating non-deterministic nodes within a rigidly defined graph structure, you regain control over the state machine. Coordination requires deterministic boundaries.
znnajdla
This kind of research is underrated. I have a strong feeling that these kinds of harness improvements will lead to solving whole classes of problems reliably, and matter just as much as model training.
nerdright
This is truly dope.
I've been playing with a closely related idea of treating the context as a graph. Inspired by the KGoT paper - https://arxiv.org/abs/2504.02670
I call this "live context" because it's the living brain of my agents.
athrowaway3z
Every time I see some new orchestrator framework worth more than a few hundred LOC I cringe so hard. Reddit is flooded with them daily and HN has them on the front page occasionally.
My current setup is this:
- `tmux-bash` / `tmux-coding-agent`
- `tmux-send` / `tmux-capture`
- `semaphore_wait`
The other tools all create lockfiles and semaphore_wait is a small inotify wrapper.
They're all you need for 3 levels of orchestration. My recent discovery was that it's best to have one dedicated supervisor that just semaphore_wait's on the 'main' agent spawning subagents. Basically a smart Ralph Wiggum.
https://github.com/offline-ant/pi-tmux if anybody is interested.
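The commenter's semaphore_wait is an inotify wrapper; the gist of it can be sketched portably with polling. This is an illustrative sketch, assuming tools signal completion by removing a lockfile, and the names are made up for the example.

```python
# Portable polling sketch of a lockfile-based semaphore_wait. The author's
# version uses inotify instead of polling; everything here is illustrative.
import os
import time

def semaphore_wait(lockfile: str, timeout: float = 60.0,
                   poll: float = 0.01) -> bool:
    """Block until `lockfile` disappears (i.e. the holding tool finished).

    Returns True on release, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while os.path.exists(lockfile):
        if time.monotonic() > deadline:
            return False
        time.sleep(poll)
    return True

# A dedicated supervisor then needs little more than:
#   while True:
#       semaphore_wait("/tmp/agents/main.lock")
#       respawn_main_agent()   # hypothetical restart hook
```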
Feels very AI written in a way that makes it annoying to read, with all the repetitive short sentences.
Neat concept though; it would be cool to see some tests of performance on some tasks.
Looks like everyone is trying to solve the same problem. Here is another example I've been trying to wrap my head around lately:
Brainfile - An open protocol for agent-to-agent task coordination.
https://brainfile.md/
Well worth a look imo.
Not exactly a surprise Claude did this out of the box with minimal prompting, considering they've presumably been RLing the hell out of it for agent teams: https://code.claude.com/docs/en/agent-teams
sathish316
Historically, Claude Code used sequential planning with linear dependencies via tools like TodoWrite and TodoRead. There are open-source MCP equivalents of TodoWrite.
I've found both the open-source TodoWrite and building your own TodoWrite with a backing store surprisingly effective for planning, and for avoiding the developer-defined roles and developer-defined plans/workflows that the author calls out in the blog for AI-SRE use cases. It also stops the agent from looping indefinitely.
Cord is a clever model and protocol for tree-like dependencies using the Spawn and Fork model for clean context and prior context respectively.
colbyn
I have yet to read this article in full, but as an amateur AST-transformation nerd, I love trees! Kinda related, but I've been trying to figure out how to generalize the lessons learned from this experiment in autogenerating massive bilingual dictionary and phrasebook datasets: https://youtu.be/nofJLw51xSk
into a general-purpose markup language + runtime for multi-step LLM invocations, although efforts so far have gotten nowhere. I have some notes on my GitHub profile readme if anyone's curious: https://github.com/colbyn
Here's a working example: https://github.com/colbyn/AgenticWorkflow
(I really dislike the ‘agentic’ term since in my mind it’s just compilers and a runtime all the way down.)
But that's more serial procedural work; what I want is full-blown recursion, in some generalized way (and without the liquid-templating hacks I keep resorting to): deeply nested LLM invocations akin to how my dataset generation pipeline works.
PS
Also, I really dislike prompt text in source code. I prefer to factor it out into standalone prompt files, using XML format in my case.
kgc
Claude basically does this now (including deciding when to use subagents, tools, and agent teams). I built a similar thing a month ago and saw the writing on the wall.
sriku
We built something like this by hand without much difficulty for a product concept. We'd initially used LangGraph, but we ditched it and built our own, out of revenge for LangGraph wasting our time with what could've simply been an ordinary Python function.
Never again committing to any "framework", especially when something like Claude Code can write one for you from scratch exactly for what you want.
We have code on demand. Shallow libraries and frameworks are dead.
waynenilsen
Cool, I made this thing a while back, but I really like your fork/spawn parallelism:
https://github.com/waynenilsen/crumbler
This uses recursive task decomposition but is single-threaded by design. Honestly it's fast enough for me and makes it easier to reason about.
mikert89
All of these frameworks will go away once the model gets really smart. It will just be tool search, tools, and the model.
In the short run, I've found the OpenAI Agents one to be the best.
vivzkestrel
I still don't see why I need any of this over the LangChain/LangGraph ecosystem.
amelius
Can't the AI just figure out by itself how and when to launch agents?
simianwords
Why can't you just give all subagents access to all tools? That's more general than what you've done. Surely it can figure out how to backtrack or keep context?
But I do like your approach, and I feel this is the next step.
vlmutolo
I wonder if the “spawn” API is ever preferable over “fork”. Do we really want to remove context if we can help it? There will certainly be situations where we have to, but then what you want is good compaction for the subagent. “Clean-slate” compaction seems like it would always be suboptimal.
kimjune01
Whoa, I didn't expect my blog to hit the front page! Hi HN!
energy123
Doesn't codex already do this when it decides whether to use subagents, and what prompt to give each subagent?
dmos62
I love this. I always imagined more capable agent systems that have graph-like qualities.
ramesh31
This is precisely how the newly released Claude agent teams work.
tovej
This is a vibeslop project with a vibeslop write-up.
Trees? Trees aren't expressive enough to capture all dependency structures. You need either directed acyclic graphs or general directed graphs (for iterative problems).
Based on the terminology you use, it seems you've conflated the graphs used in task scheduling with the trees used in OS process management. The only reason process trees are trees is OS-specific (the need for a single initializing root process, and the need to propagate process properties safely). But here you're solving a generic problem, and trees are the wrong data structure.
- You have no metrics for what this can do
- No reason given for why you use trees (the text just jumps from graph to trees at one point)
- None of the concepts are explained, but it's clearly just the UNIX process model applied to task management (and you call this 60-year-old idea "genuinely new"!)
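The commenter's structural point can be made concrete with a diamond dependency, which no tree can represent because the final node has two parents. The task names below are made up for the example.

```python
# A "diamond" task graph: research fans out to two analyses, and the final
# report needs BOTH. A tree allows each node at most one parent, so it
# cannot encode D's two incoming edges; a DAG can. Names are illustrative.
deps = {
    "A_research": [],
    "B_analysis": ["A_research"],
    "C_analysis": ["A_research"],
    "D_report":   ["B_analysis", "C_analysis"],  # two parents -> not a tree
}

def is_tree(deps: dict[str, list[str]]) -> bool:
    # In a tree, exactly one root exists and every other node
    # has exactly one parent.
    roots = [n for n, parents in deps.items() if not parents]
    return len(roots) == 1 and all(len(p) <= 1 for p in deps.values())

print(is_tree(deps))  # False: the diamond needs a DAG
```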
bofadeez
One agent can't even be trusted to think autonomously, much less a tree of them.
sergiomattei
My small agent harness[0] does this as well.
The tasks tool is designed to validate a DAG as input, whose non-blocked tasks become cheap parallel subagent spawns using Erlang/OTP.
It works quite well. The only problem I've faced is getting it to break down tasks using the tool consistently. I guess it might be a matter of experimenting further with the system prompt.
[0]: https://github.com/matteing/opal
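The scheduling idea described above (validate a DAG, then spawn every non-blocked task in parallel) can be sketched with Python's standard-library `graphlib`; the actual harness uses Erlang/OTP, and the task names here are made up.

```python
# Sketch of DAG-driven task dispatch: validate the graph, then repeatedly
# take the set of tasks whose dependencies are all done and run them in
# parallel. The real harness uses Erlang/OTP; this is an illustrative model.
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
tasks = {
    "lint":   set(),
    "test":   {"build"},
    "build":  set(),
    "deploy": {"test", "lint"},
}

ts = TopologicalSorter(tasks)
ts.prepare()  # raises graphlib.CycleError if the input isn't a DAG

waves = []
while ts.is_active():
    ready = list(ts.get_ready())  # non-blocked tasks: spawn these in parallel
    waves.append(sorted(ready))
    ts.done(*ready)               # mark the wave finished, unblocking others

print(waves)  # [['build', 'lint'], ['test'], ['deploy']]
```

Each inner list is one parallel wave of cheap subagent spawns.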
Strong agree about the value of fork.
Opencode getting fork was such a huge win. It's great to be able to build something out, then keep iterating by launching new forks that still have plenty of context space available, but which saw the original thing get built!
mbirth
Not to be confused with:
cord - The #1 AI-Powered Job Search Platform for people in tech