Original Korean article

ReAct planning source page (Korean)

This English page is published as a child page of the English PART 1 page to preserve the original series structure.

If PART 1 used the Akka actor tree to establish order among AI terminals, this part explains how to turn a general LLM into a working agent on top of that runtime.

Date: 2026-04-14
Audience: .NET and server developers who have read PART 1

0. Introduction

In PART 1, we built the control room. StageActor, WorkspaceActor, TerminalActor, and AgentBotActor each got their own place, and multiple AI terminals stopped crashing into each other.

But one thing was still missing. The moderator LLM could talk well, yet it could not wait.

When it sent a command and the result came back late, it grew impatient, called the same tool again, re-checked work that was still in progress, and once the loop had ended it could no longer receive the eventual completion signal. In other words, it had become a clever chatterbox, but not a working agent.

What this part covers

  • Why general-purpose LLMs are weak at waiting
  • Why a realistic Waiting state had to be added to the ReAct pattern
  • Why Akka Become() is a natural fit for turning that flow into a state machine

1. Frozen's Closed Door - Why the Synchronous Loop Gets Stuck

The old RunFunctionCallLoopAsync looks plausible at first glance.

User input
  -> call the LLM
  -> execute tool calls
  -> feed results back to the LLM
  -> return when there is nothing left to do

The problem is that the door closes inside that loop. Imagine the moderator sending npm test to a terminal.

Action: term_send("npm test")
Observation: "Sent to term-0"

At that point it looks like success, but the real test run is still happening. The result may arrive 20 seconds later. During that gap, the loop only has two options.

| Choice  | Why it is a problem |
|---------|---------------------|
| Polling | You keep rereading the same state and waste context |
| Exit    | You lose the path that would receive the later completion signal |

The synchronous loop works in a world where answers come back immediately. It struggles in the real tool world where the answer is often, "I will come back later with the result."
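The dead end can be reproduced with a small sketch. It is written in Python for brevity and is not the actual RunFunctionCallLoopAsync code. The `impatient_llm` stand-in, the fake `term_send` and `term_read` tools, and the reply dictionary shape are all assumptions made for illustration. The round budget of 10 mirrors the MaxRounds value listed later in this article.

```python
from typing import Callable

MAX_ROUNDS = 10  # assumed round budget, mirroring the MaxRounds value in this article


def run_function_call_loop(llm: Callable[[list], dict], tools: dict, user_input: str) -> str:
    """Synchronous loop: every tool result is treated as final and immediate."""
    history = [("user", user_input)]
    for _ in range(MAX_ROUNDS):
        reply = llm(history)
        if reply.get("tool") is None:
            # Nothing left to do: returning here also ends the only receive path.
            return reply["text"]
        result = tools[reply["tool"]](*reply.get("args", ()))
        history.append(("tool", result))  # every poll result grows the context
    return "round budget exhausted"


# Fake terminal tools: the test run never finishes within the loop's lifetime.
def term_send(cmd: str) -> str:
    return "Sent to term-0"


def term_read() -> str:
    return "(still running)"


def impatient_llm(history: list) -> dict:
    """Hypothetical model stand-in: send the command once, then keep polling."""
    if len(history) == 1:
        return {"tool": "term_send", "args": ("npm test",)}
    return {"tool": "term_read"}


outcome = run_function_call_loop(
    impatient_llm,
    {"term_send": term_send, "term_read": term_read},
    "run the tests",
)
print(outcome)
```

The loop spends its whole budget on rereading the same state and then returns, so a completion that arrives afterwards has nowhere to go.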

2. Inside Out's Control Room - ReAct in Plain Language

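ReAct stands for Reasoning + Acting, but it is easier to understand as a scene before it is a paper term. Picture the control room from Inside Out: whoever is at the console thinks about what is happening, works a control, watches the result on the screen, and then thinks again.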

That is the basic rhythm of ReAct.

Thought -> Action -> Observation -> Thought -> ...

In AgentZero terms it becomes even simpler.

Think
-> call a tool
-> decide whether the result is immediate or delayed
-> wait or think again
-> report back to the user when the work is done

The original ReAct paper mostly focuses on scenes where observations come back immediately. Real applications are different. Tests can run for seconds or minutes. Another AI terminal may finish at an unpredictable time. File operations and external calls are also asynchronous. That is why AgentZero added a realistic intermediate state called Waiting to the original ReAct pattern.

3. Doctor Strange's Time Loop - How Become() Turns a Loop into a State Machine

This is where Akka Become() starts to shine. One actor can swap its handler set depending on the current state.

| State    | What it does |
|----------|--------------|
| Thinking | Decides what should happen next |
| Acting   | Executes a tool action |
| Waiting  | Waits for an asynchronous result |
| Complete | Produces the final response |

The flow is straightforward.

StartReAct
  -> Thinking
  -> Acting
  -> Waiting
  -> Thinking
  -> Complete

The difference from a plain loop is survival: a normal loop is gone once it hits return, but an actor stays alive in the Waiting state. When an external signal arrives, it can wake back up into Thinking. We did not make the LLM itself "smarter." We wrapped it in an execution environment that knows how to wait.
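The handler-swapping idea can be sketched without any actor library. The toy class below is Python, not Akka.NET, and it only imitates what Become() does: it replaces the function that receives the next message. The `Idle` start state and the message strings `tool_call`, `immediate`, `delayed`, `final_answer`, and `timeout` are assumptions introduced for the example. Only StartReAct, CompletionSignal, and the four state names come from the article.

```python
class ReActActor:
    """Toy actor: become() swaps the handler that receives the next message."""

    def __init__(self):
        self.state = "Idle"          # assumed start state, not in the article's table
        self.transitions = []
        self._handler = self._idle

    def become(self, name, handler):
        self.transitions.append(f"{self.state}->{name}")
        self.state, self._handler = name, handler

    def tell(self, message):
        # Whatever handler is current decides what this message means.
        self._handler(message)

    def _idle(self, message):
        if message == "StartReAct":
            self.become("Thinking", self._thinking)

    def _thinking(self, message):
        if message == "tool_call":
            self.become("Acting", self._acting)
        elif message == "final_answer":
            self.become("Complete", self._complete)

    def _acting(self, message):
        if message == "delayed":
            self.become("Waiting", self._waiting)
        elif message == "immediate":
            self.become("Thinking", self._thinking)

    def _waiting(self, message):
        # While waiting, only a completion signal or a timeout wakes the actor.
        if message in ("CompletionSignal", "timeout"):
            self.become("Thinking", self._thinking)

    def _complete(self, message):
        pass  # terminal state: further messages are ignored


actor = ReActActor()
for msg in ["StartReAct", "tool_call", "delayed", "tool_call",
            "CompletionSignal", "final_answer"]:
    actor.tell(msg)
print(actor.state, actor.transitions)
```

Note the second `tool_call`: it arrives while the actor is in Waiting and is simply ignored, which is the impatience problem handled by state rather than by prompt wording.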

4. Zootopia Traffic Lights - What the Waiting State Changes

4.1 Before: the synchronous loop

term_send("npm test")
-> "sent"
-> term_read
-> term_read
-> term_read
-> ...
-> consume 10 rounds
-> exit

4.2 After: ReActActor

term_send("npm test")
-> "sent"
-> Become(Waiting)
-> receive completion signal
-> Become(Thinking)
-> "Now read the result"
-> term_read
-> Complete

The difference becomes obvious in a side-by-side table.

| Scene | Synchronous loop | ReActActor |
|-------|------------------|------------|
| Right after sending a command | It keeps checking again | It transitions into waiting |
| Asynchronous completion | The receive path is weak | It arrives through CompletionSignal |
| Context usage | Grows because of polling | Only thinks again when necessary |
| Failure recovery | The loop dies when it ends | Can re-evaluate after a timeout |

Waiting is not just a pause. It is the mechanism that lets the agent choose order over impatience.
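Waiting without polling, with a timeout as the fallback, can be illustrated with Python's asyncio. This is an analogy, not the Akka implementation: an `asyncio.Queue` stands in for the actor mailbox, and the function names, the signal text, and the short delays are assumptions chosen so the demo finishes quickly.

```python
import asyncio


async def wait_for_completion(inbox: asyncio.Queue, timeout_s: float) -> str:
    """Suspend until a signal arrives, or fall back to re-evaluation on timeout."""
    try:
        signal = await asyncio.wait_for(inbox.get(), timeout=timeout_s)
        return f"think again with: {signal}"
    except asyncio.TimeoutError:
        return "think again: timed out, re-evaluate"


async def demo():
    inbox = asyncio.Queue()

    async def terminal():
        # Simulated terminal that finishes its work a little later.
        await asyncio.sleep(0.05)
        await inbox.put("term-0 finished npm test")

    task = asyncio.create_task(terminal())
    got_signal = await wait_for_completion(inbox, timeout_s=1.0)
    await task

    # Nothing is pending now, so the second wait ends through the timeout path.
    timed_out = await wait_for_completion(inbox, timeout_s=0.05)
    return got_signal, timed_out


got_signal, timed_out = asyncio.run(demo())
print(got_signal)
print(timed_out)
```

In both cases the waiter does no work and consumes no model context until something actually happens.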

5. Toy Story Walkie-Talkies - Which Messages Need to Flow

A state machine only works when the message contract is explicit. AgentZero defined separate messages for its ReAct runtime.

| Message          | Role |
|------------------|------|
| StartReAct       | Starts a session |
| ReActProgress    | Sends intermediate progress to the UI |
| CompletionSignal | Delivers an external completion signal |
| CancelReAct      | Lets the user cancel |

The actor-tree placement also mattered.

ActorSystem("AgentZero")
\-- /user/stage
    |-- /bot
    |   \-- /react   <- ReActActor
    \-- /ws-{name}
        \-- /term-{id}

Why place ReActActor under bot? Because it stays close to bot session memory, completion signals can be forwarded to it naturally, and the UI can bind to its progress cleanly. If the names and positions of messages become confusing, the agent loses its way very quickly.
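One way to keep such a contract explicit is to give each message its own immutable type. The Python sketch below uses the four message names from the table, but every field (`user_request`, `state`, `detail`, `source`, `summary`, `reason`) and the `route` helper are hypothetical. The actor path is taken from the tree above, and `"ui"` is just a placeholder label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartReAct:
    user_request: str          # assumed field


@dataclass(frozen=True)
class ReActProgress:
    state: str                 # assumed field, e.g. "Waiting"
    detail: str                # assumed field


@dataclass(frozen=True)
class CompletionSignal:
    source: str                # assumed field, e.g. "term-0"
    summary: str               # assumed field


@dataclass(frozen=True)
class CancelReAct:
    reason: str = "user"       # assumed field


REACT_PATH = "/user/stage/bot/react"


def route(message) -> str:
    """Return the destination for a message (illustrative routing only)."""
    if isinstance(message, (StartReAct, CompletionSignal, CancelReAct)):
        return REACT_PATH      # inbound: drives the state machine
    if isinstance(message, ReActProgress):
        return "ui"            # outbound: progress for display
    raise TypeError(f"not part of the ReAct contract: {type(message).__name__}")


print(route(CompletionSignal(source="term-0", summary="npm test finished")))
print(route(ReActProgress(state="Waiting", detail="waiting on term-0")))
```

Anything outside the contract fails loudly instead of being silently dropped, which makes a misplaced message easy to spot.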

6. Up's Adventure Book - Why Session Memory and Logs Must Work Together

The ReAct state machine alone did not solve everything. On-device LLMs especially tended to forget where the work currently stood. That is why two supporting devices were added.

6.1 Session memory

Current state: Waiting
Completed work: meeting_create, term_send("npm test")
Pending item: test result from term-0
Expected next step: read the result and report it to the user

This memory is generated from actor state and prepended before each LLM call. In effect, the agent opens its own notebook every turn before deciding what to do next.
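A minimal sketch of that notebook, assuming the memory is a small record rendered to text and placed in front of the next prompt. The `SessionMemory` class, its field names, and `build_prompt` are illustrative names, not AgentZero's actual types. The rendered lines follow the four-line example above.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMemory:
    state: str
    completed: list = field(default_factory=list)
    pending: str = ""
    next_step: str = ""

    def render(self) -> str:
        """Render the memory in the four-line form shown in this section."""
        return "\n".join([
            f"Current state: {self.state}",
            f"Completed work: {', '.join(self.completed) or 'none'}",
            f"Pending item: {self.pending or 'none'}",
            f"Expected next step: {self.next_step or 'none'}",
        ])


def build_prompt(memory: SessionMemory, user_turn: str) -> str:
    # The memory goes first so the model reads its own status before the new turn.
    return memory.render() + "\n\n" + user_turn


memory = SessionMemory(
    state="Waiting",
    completed=["meeting_create", 'term_send("npm test")'],
    pending="test result from term-0",
    next_step="read the result and report it to the user",
)
prompt = build_prompt(memory, "Any news on the tests?")
print(prompt)
```

Because the text is rebuilt from state on every call, it cannot drift away from what the actor actually believes.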

6.2 Diagnostic logs

| Log tag       | Meaning |
|---------------|---------|
| [REACT-THINK] | What judgment was made |
| [REACT-ACT]   | Which tool was executed |
| [REACT-WAIT]  | When waiting started and what was received |
| [REACT-STATE] | Which state transition occurred |

The pairing matters because memory keeps the agent from forgetting its own work, while logs let the developer trace the agent's mistakes. One is memory for the agent. The other is memory for the human.
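As a rough illustration of how such tags help, the Python sketch below formats log lines with the four tags and then filters a trace by tag. Only the tag names come from the table. The `react_log` helper and the message text after each tag are assumptions.

```python
TAGS = {"THINK", "ACT", "WAIT", "STATE"}


def react_log(kind: str, message: str) -> str:
    """Format one diagnostic line; reject tags outside the agreed set."""
    if kind not in TAGS:
        raise ValueError(f"unknown log kind: {kind}")
    return f"[REACT-{kind}] {message}"


trace = [
    react_log("THINK", "tests must run before reporting"),
    react_log("ACT", 'term_send("npm test")'),
    react_log("STATE", "Acting -> Waiting"),
    react_log("WAIT", "waiting on term-0"),
    react_log("WAIT", "received completion from term-0"),
    react_log("STATE", "Waiting -> Thinking"),
]

# A developer chasing a stuck session can pull out only the waiting lines.
waits = [line for line in trace if line.startswith("[REACT-WAIT]")]
print("\n".join(waits))
```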

7. Marvel Team-Up Scene - Why the Actor Model Fits an Agent Runtime So Well

There were many frameworks in 2025 and 2026: LangGraph, OpenAI Agents SDK, AutoGen, CrewAI, Semantic Kernel, and more. So why did AgentZero still hold on to the actor model?

| Viewpoint | Typical agent framework | Akka actors |
|-----------|-------------------------|-------------|
| State | Often hidden in dicts, history objects, or runner internals | State is explicit through Become() |
| Failure recovery | Usually centered on manual retry logic | Supervision is built in |
| Concurrency | Depends heavily on how well async code was designed | Actor boundaries isolate concurrent work naturally |
| Distributed scaling | Often needs extra design | Connects naturally to the Akka family |
| Timeouts and waiting | Must be implemented per use case | Core tools like ReceiveTimeout already exist |

Put simply, a general framework may help you start an agent quickly, but Akka gives you a stronger shape for an agent that has to keep living. When many terminals move at once, external completion signals arrive unpredictably, and both recovery and state observation matter, that difference becomes large.

8. Doctor Strange's Sling Ring - Autonomy Still Needs Reins

Saying an agent can move on its own also means it can run out of control on its own. That is why the runtime was built with safety devices from the start.

| Device | Meaning | Value |
|--------|---------|-------|
| MaxRounds | Prevents infinite loops | 10 |
| ReceiveTimeout | Forces a re-evaluation after waiting too long | 30 seconds |
| Repeated-call cap | Blocks the same tool from being spammed | 3 times |
| MaxCallsPerRound | Prevents a single round from exploding | 30 |
| CancelReAct | Immediate user stop | Immediate |

The goal is not to remove autonomy. It is to retain control so that when autonomy leaves the track, the system can bring it back. As PART 3 will show, these safety devices later grow into more concrete forms such as the DONE handshake, _pendingDone, ESC control, and UI cards.
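The counting limits can be sketched as a small guard object. This is Python for illustration, with the three numeric values taken from the table above. The `Guard` class and the rule that a "repeated call" means the same tool with the same arguments anywhere in the session are assumptions, since the article does not specify how repeats are counted.

```python
from collections import Counter
from dataclasses import dataclass, field

MAX_ROUNDS = 10            # from the table
MAX_CALLS_PER_ROUND = 30   # from the table
MAX_REPEATED_CALLS = 3     # from the table


@dataclass
class Guard:
    rounds: int = 0
    calls_this_round: int = 0
    repeats: Counter = field(default_factory=Counter)

    def start_round(self) -> bool:
        """Begin a new round; False means the round budget is spent."""
        self.rounds += 1
        self.calls_this_round = 0
        return self.rounds <= MAX_ROUNDS

    def allow_call(self, tool: str, args: tuple) -> bool:
        """Count the call and report whether it stays within both caps."""
        self.calls_this_round += 1
        self.repeats[(tool, args)] += 1   # assumed definition of "same call"
        return (self.calls_this_round <= MAX_CALLS_PER_ROUND
                and self.repeats[(tool, args)] <= MAX_REPEATED_CALLS)


guard = Guard()
guard.start_round()
decisions = [guard.allow_call("term_read", ("term-0",)) for _ in range(4)]
print(decisions)
```

The fourth identical `term_read` is refused, which is the point where a runtime could push the agent into Waiting instead of letting it poll again.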

9. Closing - Preview of PART 3

This part was not about flashy implementation. It was about why that implementation had to exist in the first place.

The next question follows naturally.

Once that design entered real code, where did it break and what else had to be added?

That is where PART 3 begins. It examines how ReActActor was implemented in code, which protocol reconnected it to terminal AIs, why the DONE(...) handshake became necessary, and why devices such as the card UI, ESC control, and queueing had to be added.

NEXT - PART 3

If PART 2 was the design episode that teaches the moderator how to wait, PART 3 is the implementation episode that shows what happened when that moderator finally stepped onto the real stage.

-> How Does ReAct Come Alive in Real Code? - AgentZero Implementation Deep Dive [PART 3]

