Cockpit architecture and cross-platform Agent collaboration model
📌 Core summary
As large language models grow more capable, single-agent setups expose inherent limitations on complex, long-running tasks: agent laziness, self-preference bias, and goal drift.
The dynamic workflows introduced in Claude Code address these problems through multi-instance isolation and task-specific orchestration, but their reliance on a single model family and stateless orchestration limits practical application.
This paper proposes the Cockpit architecture, an adaptive Agent orchestration system based on a shared workspace. The architecture introduces:
🎯 Centralized status management (Cockpit)
🧠 Intelligent Coordinator (PM)
🤖 Heterogeneous Agent Pool (Worker Pool)
While retaining the core advantages of dynamic workflows, it achieves cross-platform Agent collaboration and lays the groundwork for adaptive optimization based on historical performance.
Practice has shown that the Cockpit architecture exhibits higher task completion rates and better engineering controllability in complex tasks such as code migration and in-depth research.
Keywords: dynamic workflow · Agent orchestration · Shared workspace · Adaptive system · Cross-platform collaboration
01 Introduction: From Dilemma to Breakthrough
🔴 Three major dilemmas of a single context
In the actual application of AI Agent, developers usually adopt the most direct method: let Claude, GPT or other large language models complete the task in a single conversation window.
This model works well for simple scenarios, but when the task becomes complex (reviewing 50 files, migrating an entire code base, conducting in-depth research), the single-context model starts to expose systemic problems.
Anthropic clearly pointed out three major failure modes in the release documentation of Claude Code dynamic workflow:
💤 Agentic Laziness
The agent prematurely declares task completion after completing some work.
Typical scenario: Process 20 out of 50 items in a security review, then mark the remainder as "processed."
🎭 Self-Preference Bias
When an agent is asked to verify its own output, it will tend to favor its own results.
Core problem: A validator with a stake in the outcome cannot be an impartial judge.
🌊 Goal Drift
Over multiple rounds of interaction, especially after context compression, the agent will gradually deviate from the original goal.
Real case: The constraint "Don't do X" quietly disappears in the 47th round of dialogue.
🟢 The promise of dynamic workflow
To solve these problems, Anthropic launched the Dynamic Workflows feature in May 2026.
The core idea: Let Claude automatically generate a customized coordination framework for a specific task - a JavaScript file that generates and coordinates multiple sub-agents through special functions. Each sub-agent has an independent context window and focused goal.
Three key capabilities
✅ Isolation per Agent: each sub-Agent has an independent context and does not interfere with the others.
✅ Model selection per Agent: use Opus for complex reasoning and Haiku for low-cost exploration.
✅ Isolation level per Agent: a work tree (standalone Git checkout) or a remote repository.
Six core patterns
Anthropic engineers identified six recurring orchestration patterns:
🔀 Classified routing
🌟 Fanout-Synthesis
⚔️ Adversarial verification
🎯 Generate-Filter
🏆 Tournament Ranking
🔄 Loop until complete
These patterns structurally address single-context failure modes.
▲ Three major failure modes of a single context: agent laziness, self-preference bias, and goal drift
🟡 The gap from theory to engineering practice
However, dynamic workflow faces two key limitations in practical engineering applications:
⚠️ Single model family restriction
Dynamic workflows can only use Claude family models (Opus/Sonnet/Haiku).
In actual scenarios, Agents on different platforms have their own strengths:
Claude Code is good at code refactoring
Codex performs well in algorithm implementation
Gemini has advantages in multi-modal tasks
A single model family cannot fully leverage the expertise of each platform.
⚠️ Stateless orchestration
A new workflow script is generated for each task, and there is no historical memory between agents.
Problems:
The Agent selection strategy cannot be optimized based on past performance
Knowledge cannot accumulate between tasks
Every task starts "from scratch"
💡 Cockpit Architecture: A solution to bridge the gap
The Cockpit architecture proposed in this article aims to bridge this gap.
We retain the core benefits of dynamic workflows:
✅ Multi-instance isolation
✅ Dynamic orchestration
We also introduce new capabilities:
🆕 Shared workspace
🆕 Adaptive mechanism
🆕 Cross-platform collaboration
Together, these enable a more flexible and intelligent Agent collaboration model.
02 Review of Dynamic Workflow Theory
Static vs Dynamic: A Comparison of Two Paradigms
Before understanding dynamic workflow, we need to clarify the concept of static workflow.
🔵 Static workflow: predefined fixed process
Whether you use visual automation platforms such as N8N and Zapier, or coordination scripts written with the Claude Agent SDK, the characteristics are:
Example: "Code Review Workflow" designed in N8N
No matter what code is being reviewed, the process is the same.
🟣 Dynamic workflow: customized execution plan for tasks
Claude's execution plan tailored for the current task:
Example: For the same code review, a dynamic workflow might:
First scan the code base and identify that this is a React project
Decide whether to use Haiku or Opus based on component complexity
Generate specialized review agents for Hooks usage
Add TypeScript type checking step
Parallel processing rather than sequential execution
Detailed explanation of the six core patterns
Anthropic engineers have identified six recurring orchestration patterns in practice:
1️⃣ Classify-and-Route
Use the classification agent to determine the task type and then route it to different processing agents.
Scenario: "Explain how the authentication module works"
The classification Agent first evaluates complexity
Simple modules are routed to Sonnet
Complex modules are routed to Opus
2️⃣ Fan-out & Synthesize
Decompose the task into multiple independent subtasks, execute them in parallel, and finally summarize the results.
Core value: Solve the problem of "too many things to handle at the same time". Each child agent only sees its own part and is not distracted by 50 irrelevant details.
💡 This is the most commonly used mode
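The fan-out and synthesize pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `review_chunk` is a hypothetical stand-in for a sub-agent call, and the chunk size and thread count are arbitrary choices, not values from Anthropic's documentation.

```python
from concurrent.futures import ThreadPoolExecutor


def review_chunk(chunk):
    """Hypothetical stand-in for a sub-agent: it sees only its own chunk."""
    return [f"{name}: reviewed" for name in chunk]


def fan_out_and_synthesize(items, chunk_size, worker=review_chunk):
    """Split items into chunks, process them in parallel, merge the results."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the input chunks.
        partials = list(pool.map(worker, chunks))
    # Synthesis step: here simply a flat merge of all partial results.
    return [finding for partial in partials for finding in partial]


files = [f"file_{i}.py" for i in range(50)]
summary = fan_out_and_synthesize(files, chunk_size=10)
```

In a real system the synthesis step would itself be an agent that reads the partial results; the flat merge here only shows where that step sits.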
3️⃣ Adversarial Verification
Create an independent verification agent for each generated result. This verifier has never seen the original work, so it has no self-preference to protect.
Structural fix: this addresses self-preference bias at its root.
4️⃣ Generate-and-Filter
Generate multiple candidate solutions and then filter them with a validator.
Key difference: Instead of directly asking for the “best answer,” this pattern lets the agent delay commitment until all options have been challenged before making a decision.
5️⃣ Tournament Ranking
Let multiple agents compete on the same task and determine the winner through pairwise comparison.
Applicable scenarios: taste-oriented work
Design choices
Naming schemes
UI decisions
Core advantage: Comparative judgments are more reliable than absolute ratings.
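A minimal sketch of ranking by pairwise comparison, assuming a round-robin schedule and a hypothetical judge function `prefer_shorter`. In practice the judge would be an agent call; the toy rule below exists only so the example runs.

```python
from itertools import combinations


def rank_by_pairwise(candidates, prefer):
    """Rank candidates by how many head-to-head comparisons each one wins."""
    wins = {candidate: 0 for candidate in candidates}
    for a, b in combinations(candidates, 2):
        wins[prefer(a, b)] += 1
    return sorted(candidates, key=lambda c: wins[c], reverse=True)


def prefer_shorter(a, b):
    """Toy judge (an assumption for illustration): prefer the shorter name."""
    return a if len(a) <= len(b) else b


names = ["fetch_user_profile_data", "get_user", "load_user_profile"]
ranking = rank_by_pairwise(names, prefer_shorter)
# ranking == ["get_user", "load_user_profile", "fetch_user_profile_data"]
```

The judge never has to produce an absolute score; it only answers "which of these two is better", which is the kind of judgment the pattern relies on.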
6️⃣ Loop Until Done
Keep spawning Agents until the stop conditions are met.
Stop condition examples:
No new findings
No errors in logs
The theory is verified
Guarantee: the work is "actually done", not merely "claimed to be done".
▲ Six core orchestration patterns: classify-and-route, fan-out and synthesize, adversarial verification, generate-and-filter, tournament ranking, loop until done
Limitations of existing solutions
Although dynamic workflows are elegant in theory, they suffer from four major shortcomings in engineering practice:
Core question: Can we design an architecture that not only retains the advantages of dynamic orchestration but also has engineering controllability?
03 Cockpit architecture design
System Overview: Three-tier Architecture
The Cockpit architecture adopts a three-layer design:
▲ Cockpit three-layer architecture: shared workspace layer, PM coordination layer, Worker execution layer
Core design concept: All Agents work around the same "whiteboard" (Cockpit) instead of collaborating through messaging.
💡 Similar to software teams collaborating around Git Repository + Project Board instead of emailing each other.
Cockpit component design: six core components
The Cockpit is the nerve center of the system and contains six core components.
Here is the Cockpit interface in action:
▲ Cockpit plan view - shows project goals and milestone progress
▲ Cockpit task view - real-time tracking of task completion status
▲ Cockpit Timeline View - Worker Utilization Analysis and Dispatch Trends
📋 Plan (goal anchoring)
Function:
Stores the core goals and constraints of the project
All Agents must read the Plan to align on the goal before execution
Value: Prevents goal drift—even after multiple rounds of interactions, the original intent remains clearly visible
Actual data: From the screenshot, you can see that the HippoTeam project progress is 89% (187/209), including a total of 6 milestones M1-M6, each with a clear completion status.
✅ Tasks (Progress Tracking)
Function:
Record the status of all subtasks: pending, in progress, completed
Worker updates status after completing tasks
PM adjusts subsequent arrangements based on real-time status
Value: Solves "agent laziness": task completion status is visible at a glance and cannot be falsely reported
Actual data: There are 408 tasks in actual operation, with a completion rate of 401/408. You can see detailed dispatch records.
🔬 Research (research accumulation)
Function:
Store information collected during research
Accessible to all Agents, so the same research is not repeated
Value: Support knowledge reuse and iterative deepening
Actual data: There are currently 71 research records in the system.
📊 Reports (deliverable management)
Function:
Store the output results of each stage
Support version tracking and backtracking
Value: Facilitates final roll-up and quality checks
Actual data: 78 reports have been accumulated in the system.
⚠️ Issues (issue management)
Function:
Record problems found during execution
Any Agent can add an Issue
Value: PM adjusts strategies or assigns repair tasks based on Issues
📚 Knowledge Base
Function:
Knowledge accumulation across tasks
Record Worker running statistics
Value: Provide data foundation for manual analysis and future adaptive optimization
Actual implementation: Worker historical performance is recorded through the Timeline view. The screenshot shows detailed data for Guan Yu (55 dispatches, average 12m), Zhao Yun (21 dispatches, average 10m), Dian Wei (20 dispatches, average 10m), and Zhang Fei (4 dispatches, average 7m), as well as the dispatch trend chart from 05-20 to 05-25. This data is currently used for monitoring and manual analysis and could be used to build automated feedback loops in the future.
💡 Supplementary components: The actual system also includes auxiliary modules such as Ideas (idea pool, 4 awaiting evaluation) and Decisions (decision records, 24), and supports advanced patterns such as generate-and-filter.
Data flow and interaction mechanism
Before diving into the PM orchestration mechanism, we first understand the data flow between Agent and Cockpit.
🔄 Agent-Cockpit data flow diagram
▲ Complete data flow interaction between Agent and Cockpit
Core interaction path:
Key design:
✅ One-way dependency: Worker depends on Cockpit, but does not communicate directly with PM or other Workers
✅ Status centralization: All status changes pass through Cockpit to ensure global consistency
✅ Asynchronous decoupling: Worker can update the status after completing the task, without waiting for PM response
🔒 State synchronization mechanism for concurrent access
How to ensure data consistency when multiple Workers access Cockpit concurrently?
▲ State synchronization mechanism for concurrent access by multiple Workers
Three-tier guarantee mechanism:
1️⃣ Optimistic Lock
Each Cockpit component maintains a version number:
Advantages: lock-free in most cases, high performance
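The version check can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical `VersionedComponent` class and `VersionConflict` exception; neither name comes from the actual Cockpit implementation. The short internal guard only makes the compare-and-write step atomic; no Worker holds a lock while it is doing its work.

```python
import threading


class VersionConflict(Exception):
    """Raised when a writer's view of the component is stale."""


class VersionedComponent:
    """Hypothetical Cockpit component guarded by a version number."""

    def __init__(self, data=None):
        self.version = 0
        self.data = dict(data or {})
        # Held only for the instant of a read or a compare-and-write.
        self._guard = threading.Lock()

    def read(self):
        """Return the current version together with a copy of the data."""
        with self._guard:
            return self.version, dict(self.data)

    def write(self, expected_version, new_data):
        """Apply new_data only if nobody has written since expected_version."""
        with self._guard:
            if expected_version != self.version:
                raise VersionConflict(
                    f"expected v{expected_version}, found v{self.version}"
                )
            self.data = dict(new_data)
            self.version += 1
            return self.version


tasks = VersionedComponent({"completed": 0})
version, snapshot = tasks.read()
snapshot["completed"] += 1
new_version = tasks.write(version, snapshot)  # succeeds: no write since the read
```

A second writer that still holds version 0 would now get a `VersionConflict` instead of silently overwriting the update.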
2️⃣ Transaction Queue
All write operations are queued and executed in order:
Guarantee: atomicity and order of write operations
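One way to obtain ordered, one-at-a-time writes is a single consumer thread draining a FIFO queue. The sketch below assumes a hypothetical `WriteQueue` class and represents each write as a plain Python callable; the real system may persist its queue quite differently.

```python
import queue
import threading


class WriteQueue:
    """Hypothetical serial writer: one thread applies queued writes in order."""

    def __init__(self, state):
        self.state = state
        self._pending = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, operation):
        """Enqueue a write; operation is a callable that mutates the state."""
        self._pending.put(operation)

    def _run(self):
        while True:
            operation = self._pending.get()
            if operation is None:  # sentinel: stop the consumer
                break
            # Each operation finishes before the next one starts.
            operation(self.state)

    def close(self):
        """Flush all queued writes, then stop the consumer thread."""
        self._pending.put(None)
        self._thread.join()


state = {"events": []}
writes = WriteQueue(state)
for i in range(5):
    writes.submit(lambda s, i=i: s["events"].append(f"task-{i} done"))
writes.close()
```

Because only one thread ever touches the state, no two writes interleave, and they are applied in submission order.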
3️⃣ Conflict detection and automatic retry
When a version conflict is detected:
Rollback: Discard current updates
Reread: Get the latest status
Recompute: Regenerate updates based on new state
Resubmit: Try writing again
Actual case:
Worker A and Worker B complete Task-001 and Task-002 at the same time, and both try to update the completion rate statistics of the Tasks component.
- Worker A submits first; Tasks is updated from v5 to v6, and the completion rate becomes 400/408
- Worker B detects on submission that the version is now v6 (not the v5 it read)
- The system automatically has Worker B re-read v6 and recalculate the completion rate as 401/408
- Worker B submits successfully, and Tasks is updated to v7
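The case above can be replayed step by step in Python. This is a sketch under assumptions: `TasksComponent`, `try_commit`, and `complete_one_task` are hypothetical names, and the retry limit of 3 is an arbitrary choice. The starting values (v5, 399 completed of 408) are chosen to reproduce the numbers in the case.

```python
class TasksComponent:
    """Hypothetical Tasks component with a version and a completion counter."""

    def __init__(self, version, completed, total):
        self.version = version
        self.completed = completed
        self.total = total

    def read(self):
        return self.version, self.completed

    def try_commit(self, read_version, completed):
        """Commit only if the version is unchanged since the read."""
        if read_version != self.version:
            return False  # conflict: the caller must roll back and retry
        self.completed = completed
        self.version += 1
        return True


def complete_one_task(tasks, max_retries=3):
    """Reread, recompute, resubmit until the commit succeeds."""
    for attempt in range(1, max_retries + 1):
        version, completed = tasks.read()              # reread
        if tasks.try_commit(version, completed + 1):   # recompute and resubmit
            return attempt
    raise RuntimeError("too many version conflicts")


tasks = TasksComponent(version=5, completed=399, total=408)
a_version, a_completed = tasks.read()   # Worker A reads v5
b_version, b_completed = tasks.read()   # Worker B reads v5 as well

a_ok = tasks.try_commit(a_version, a_completed + 1)        # v5 -> v6, 400/408
b_first_try = tasks.try_commit(b_version, b_completed + 1)  # stale v5: rejected
b_attempts = complete_one_task(tasks)                       # v6 -> v7, 401/408
```

Worker B's stale update is discarded rather than applied, so the counter ends at 401 instead of being overwritten with a second 400.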
Performance optimization:
🟢 Read operations are lock-free: multiple workers can read concurrently without blocking each other
🟡 Lightweight write operations: most updates are append operations (adding a Report or an Issue), so the probability of conflict is low
🔴 Conflicts are rare: they occur only when the same state is updated at the same time, and the observed rate is below 2%
PM adaptive orchestration mechanism
PM (Project Manager) is the brain of the system and is responsible for dynamic orchestration.
Unlike the stateless orchestration of Claude's dynamic workflows, Cockpit's PM has memory: it reads the state and history persisted in Cockpit before making each decision.
🧩 Task breakdown
Process:
After receiving user requirements, PM analyzes task characteristics
Read historical data and current context in Cockpit
Break tasks into subtasks that can be parallelized or serialized
Update Plan and Tasks components
🎯 Role-based Worker selection
PM performs intelligent allocation based on task type and Worker role:
Decision-making process:
Actual running case:
It can be seen from the actual operating data of HippoTeam:
Code refactoring task → Workers assigned to the coder role (Guan Yu, Zhao Yun, Dian Wei)
Code review task → assigned to an independent reviewer role (Zhong Kui) to ensure adversarial verification
Algorithm implementation task → Choose the appropriate coder worker based on complexity
Timeline monitoring: The system records the number of dispatches and average completion time of each Worker through the Timeline view (such as Guan Yu 55 times/average 12 minutes, Zhao Yun 21 times/average 10 minutes), which facilitates manual analysis and adjustment of role configuration.
💡 Future direction: The current Timeline data is for display only. A feedback loop could be established in the future so that the PM automatically optimizes its Worker selection strategy based on historical performance.
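The role-based assignment described in this section can be sketched as a simple lookup. The Worker names and roles are taken from the HippoTeam case; the `ROLE_FOR_TASK` table, the `busy` flag, and the `exclude` parameter are assumptions made for illustration, not the PM's actual decision logic.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    role: str
    busy: bool = False


# Hypothetical mapping from task type to the role that should handle it.
ROLE_FOR_TASK = {
    "refactor": "coder",
    "algorithm": "coder",
    "review": "reviewer",
}


def select_worker(task_type, pool, exclude=()):
    """Return the first idle Worker with the matching role, or None."""
    role = ROLE_FOR_TASK[task_type]
    for worker in pool:
        if worker.role == role and not worker.busy and worker.name not in exclude:
            return worker
    return None


pool = [
    Worker("Guan Yu", "coder"),
    Worker("Zhao Yun", "coder"),
    Worker("Zhong Kui", "reviewer"),
]
coder = select_worker("refactor", pool)
# Exclude the author so the review stays independent.
reviewer = select_worker("review", pool, exclude={coder.name})
```

A performance-aware version would replace "first idle Worker" with a choice informed by the Timeline statistics, which is the feedback loop described as future work.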
📈 Progress monitoring and dynamic adjustment
Real-time capabilities:
Read Tasks status in real time
If a Worker is unresponsive for a long time, reassign its task
If a blocking problem appears in Issues, adjust the execution plan
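A sketch of the reassignment check, assuming each task records the time of its last update. The `Task` fields, the `reassign_stale` helper, and the 600-second timeout are hypothetical; the article does not state how long "a long time" is.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    task_id: str
    status: str                    # "pending", "in_progress", or "completed"
    worker: Optional[str] = None
    last_update: float = 0.0       # seconds on any consistent clock


def reassign_stale(tasks, now, timeout_s):
    """Return in-progress tasks with no recent update to the pending state."""
    released = []
    for task in tasks:
        if task.status == "in_progress" and now - task.last_update > timeout_s:
            task.status = "pending"
            task.worker = None
            released.append(task.task_id)
    return released


tasks = [
    Task("T1", "in_progress", "Guan Yu", last_update=100.0),
    Task("T2", "in_progress", "Zhao Yun", last_update=900.0),
    Task("T3", "completed", "Dian Wei", last_update=50.0),
]
released = reassign_stale(tasks, now=1000.0, timeout_s=600.0)
```

Released tasks go back to pending, so the PM's normal selection step can hand them to another Worker.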
Worker Pool Design
Worker Pool is the execution layer of the system and contains multiple heterogeneous Agents.
🌐 Cross-platform heterogeneous Agent
Unlike Claude's dynamic workflows, which can only use the Claude family, Cockpit supports Agents from any platform:
Each platform can have multiple instances (such as Claude Code #1, #2, #3) to achieve true parallel processing.
⚖️ Fixed roles vs dynamic responsibilities
This is a critical engineering trade-off.
Cockpit adopts the model of "fixed role pool + dynamic responsibility allocation":
✅ Fixed roles: each Worker's capability boundaries are predefined (Claude Code is a code expert, Gemini is a multi-modal expert)
✅ Dynamic responsibilities: specific tasks are assigned by the PM dynamically according to the situation
Design advantages:
🔄 Status Update Protocol
After the Worker completes its task, the Cockpit must be updated:
✅ Update task status in Tasks
📄 Write results to Reports
⚠️ Add an Issue when a problem is found
📚 Write the accumulated knowledge into Research
This ensures consistency and traceability of system status.
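The four update steps can be bundled into one helper so that a Worker cannot forget any of them. This is a sketch with an in-memory `Cockpit` dataclass; the field layout and the `finish_task` function are assumptions, and a real Cockpit would persist these records.

```python
from dataclasses import dataclass, field


@dataclass
class Cockpit:
    """Hypothetical in-memory stand-in for four of the Cockpit components."""
    tasks: dict = field(default_factory=dict)
    reports: list = field(default_factory=list)
    issues: list = field(default_factory=list)
    research: list = field(default_factory=list)


def finish_task(cockpit, task_id, result, problems=(), notes=()):
    """Apply the status update protocol for one completed task."""
    cockpit.tasks[task_id] = "completed"                         # Tasks
    cockpit.reports.append({"task": task_id, "result": result})  # Reports
    for problem in problems:                                     # Issues
        cockpit.issues.append({"task": task_id, "problem": problem})
    for note in notes:                                           # Research
        cockpit.research.append({"task": task_id, "note": note})


cockpit = Cockpit(tasks={"T1": "in_progress"})
finish_task(
    cockpit,
    "T1",
    result="auth module refactored",
    problems=["legacy test suite still failing"],
    notes=["session tokens are cached in two places"],
)
```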
▲ Cross-platform heterogeneous Agents collaborate around shared workspaces
Implementation of six major patterns in Cockpit
The Cockpit architecture is fully compatible with the six patterns of Claude's dynamic workflows and enhances their implementation:
🔀 Classified routing
Implementation method:
PM serves as a classifier and selects appropriate Workers based on task characteristics.
Enhancement points:
Unlike the original pattern, the PM's classification decisions can draw on the historical data in Cockpit rather than starting from scratch each time
🌟 Fanout-synthesis
Implementation method:
PM splits the tasks and assigns them to multiple Workers for parallel execution.
All Workers write results to Cockpit's Reports
PM reads all results and summarizes them
⚔️ Adversarial verification
Implementation method:
PM assigns an independent verification Worker to each generation task
The verification Worker reads only the results in Reports and does not know who generated them
The verification results are written into Issues, and PM decides whether to redo the work based on those Issues.
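A sketch of blind verification, assuming each report is a dictionary with an `author` field that is stripped before the verifier sees it. The `blind` and `verify_reports` helpers and the toy check are hypothetical; a real verifier would be a separate Worker.

```python
def blind(report):
    """Copy of a report without its author, so the verifier cannot tell who wrote it."""
    return {key: value for key, value in report.items() if key != "author"}


def verify_reports(reports, check):
    """Run check on each blinded report and collect the problems as Issues."""
    issues = []
    for report in reports:
        problem = check(blind(report))
        if problem is not None:
            issues.append({"task": report["task"], "problem": problem})
    return issues


def toy_check(report):
    """Toy verifier (an assumption for illustration): flag unfinished work."""
    assert "author" not in report  # the verifier never sees authorship
    return "unfinished work left in result" if "TODO" in report["result"] else None


reports = [
    {"task": "T1", "author": "Guan Yu", "result": "refactored auth module"},
    {"task": "T2", "author": "Zhao Yun", "result": "TODO: migrate tests"},
]
issues = verify_reports(reports, toy_check)
```

The PM can then read the resulting Issues and decide which tasks to send back for rework.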
🎯 Generate-Filter
Implementation method:
PM assigns multiple Workers to generate candidate solutions
PM then assigns verification Workers to filter and score the candidates
The best solution is written to Reports
🏆 Tournament Ranking
Implementation method:
PM organizes pairwise comparisons and assigns two candidates to a Worker for comparison each time
Comparison results are recorded in Cockpit, where PM maintains the rankings
The final winner is written to Reports
🔄 Loop until complete
Implementation method:
PM checks the status of Tasks and Issues
Continue assigning workers as long as there are unfinished tasks or unresolved issues
Until all Tasks are marked complete and Issues are empty
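The stop condition can be expressed directly as a loop over the Tasks and Issues state. In this sketch `dispatch` is a hypothetical callback standing in for the PM assigning a Worker, and the `max_rounds` cap is an added safety assumption that is not part of the original description.

```python
def run_until_done(tasks, issues, dispatch, max_rounds=20):
    """Keep dispatching until every task is completed and no issue remains."""
    rounds = 0
    while any(status != "completed" for status in tasks.values()) or issues:
        if rounds == max_rounds:
            raise RuntimeError("stop condition not reached")
        dispatch(tasks, issues)
        rounds += 1
    return rounds


def toy_dispatch(tasks, issues):
    """Toy PM step: resolve one issue if any exist, otherwise finish one task."""
    if issues:
        issues.pop()
        return
    for task_id, status in tasks.items():
        if status != "completed":
            tasks[task_id] = "completed"
            return


tasks = {"T1": "completed", "T2": "pending", "T3": "in_progress"}
issues = ["flaky test in T1"]
rounds = run_until_done(tasks, issues, toy_dispatch)
# rounds == 3: one round for the issue, one each for T2 and T3
```

The loop checks recorded state rather than any Worker's own claim of being finished, which is the point of the pattern.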
04 Key Design Decisions
Why choose a fixed role pool?
When designing Cockpit, we faced a core question:
Should Agents be spawned ad hoc for each task, as in Claude's dynamic workflows, or should we maintain a fixed Agent pool?
We chose the latter for the following reasons:
💰 Cost controllability
Spawning agents on an ad-hoc basis can cause costs to get out of hand.
Risk scenario: In a complex task, if no restrictions are imposed, the system may generate dozens or even hundreds of Agent instances.
Solution: The fixed role pool sets a concurrency limit and the cost is predictable.
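A concurrency cap of this kind can be sketched with a bounded thread pool. The pool size of 8 matches the 8 fixed workers reported for HippoTeam; `ConcurrencyMeter` and `run_task` are hypothetical helpers that only exist to show that the cap holds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 8  # fixed role pool size, as in the HippoTeam data


class ConcurrencyMeter:
    """Hypothetical helper that records the peak number of concurrent tasks."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)

    def __exit__(self, *exc_info):
        with self._lock:
            self.active -= 1


meter = ConcurrencyMeter()


def run_task(task_id):
    """Stand-in for one Agent run; the sleep simulates work."""
    with meter:
        time.sleep(0.01)
        return task_id


# No matter how many tasks are queued, at most POOL_SIZE run at once.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(run_task, range(40)))
```

With 40 queued tasks the peak concurrency never exceeds the pool size, so the worst-case spend per unit of time is bounded by the number of fixed roles.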
Engineering stability
Fixed roles mean that each Agent's capabilities have clear boundaries, which facilitates:
Monitoring
Debugging
Optimization
By comparison, ad hoc Agents are hard to track, and problems are hard to localize when they occur.
Cross-platform advantages
The fixed role pool allows us to integrate Agents from different platforms and leverage their respective expertise.
Limitation: ad hoc generation is difficult to coordinate across platforms.
📊 The foundation of adaptive learning
Only when roles are fixed can the historical performance data of each Agent accumulate, which makes performance-based intelligent allocation possible.
This does not mean loss of flexibility
The PM can still decide dynamically:
✅ Who a task is assigned to
✅ How many Workers to use in parallel
✅ Whether adversarial verification is needed
✅ When to stop the loop
💡 What is fixed is the roles; what is dynamic is the orchestration strategy
Shared workspace vs messaging
In the field of Agent collaboration, the mainstream solution is the message passing model:
This model is simple and intuitive, but there are problems:
❌ Three major issues in messaging
✅ Cockpit’s shared workspace mode
Advantages:
Analogy: Paradigm Shifts in Software Development
The latter significantly improves collaboration efficiency.
Advantages of Cross-Platform Agent
One of the most significant advantages of the Cockpit architecture is its support for cross-platform Agent hybrid orchestration.
🎯 Leverage the expertise of each platform
🛡️ Reduce the risk of platform dependence
The system is not bound to a single platform. When a platform fails or is limited, it can quickly switch to an alternative.
💰 Cost optimization
Choose an appropriate model based on task complexity:
Simple tasks → low-cost models
Complex tasks → powerful models
PM's adaptive mechanism will gradually find the optimal cost-quality balance point.
Actual cases
Scenario: Code base migration task
💡 This kind of hybrid orchestration cannot be achieved in a single platform solution
Comprehensive comparison of the three modes
▲ The evolution of three workflow paradigms: from static to dynamic to collaborative
Suggestions for applicable scenarios
🔵 Use static workflow (N8N/Zapier) when:
✅ The task process is very fixed and requires almost no changes
✅ No complicated Agent collaboration required
✅ Pursuing ultimate simplicity and visualization
🟣 Use Claude dynamic workflow when:
✅ The task is complex and requires multiple Agent isolation
✅ Only use Claude platform
✅ No need to accumulate knowledge across tasks
✅ Can accept higher token consumption
🟢 Use Cockpit architecture when:
✅ Requires cross-platform Agent hybrid orchestration
✅ There is a need for knowledge reuse between tasks
✅ Requires fixed role pool and role-based smart allocation
✅ Requirements for cost control and traceability
✅ Willing to invest engineering resources to build the system
Conclusion
The Cockpit architecture proposed in this article achieves an engineering breakthrough based on the theory of dynamic workflow by introducing a shared workspace and a role-based orchestration mechanism:
✅ Retains the core advantages of dynamic workflow
Multi-Agent instance isolation to solve agent laziness and goal drift
Adversarial verification to solve self-preference bias
Dynamic orchestration, optimized for specific tasks
🚀 Breaking through the limitations of the original solution
Cross-platform Agent pool to leverage the expertise of each platform
Intelligent allocation based on roles, matching tasks to abilities
Shared workspace to achieve state consistency and knowledge reuse
Fixed role pool to ensure cost controllability and engineering stability
Practical verification
The actual operation data of the HippoTeam project (408 tasks, 8 fixed workers, 71 survey records, 78 reports) shows that the Cockpit architecture demonstrates:
✅ Better engineering controllability
✅ Higher collaboration efficiency
✅ Complete traceability
Future outlook
With the continuous improvement of large language model capabilities and the deepening of Agent applications, we believe:
The shared workspace model will become the standard paradigm for complex agent collaboration systems
References
Anthropic. (2026). "How to master Dynamic Workflows in Claude Code: 6 patterns and 14 steps Anthropic engineers actually use"
AutoGPT Project. "Autonomous AI Agent Framework"
LangChain Documentation. "Agent and Chain Orchestration"
CrewAI. "Role-based Agent Collaboration Framework"
Author: Huangserva · Date: June 2026