Workflow Design

Overview

NebulaFlow workflows are directed acyclic graphs (DAGs) of nodes connected by edges, executed in a parallel, streaming fashion. The workflow engine uses Kahn’s topological sort with ordered edges to determine execution order, enabling efficient parallel execution while respecting dependencies.

Workflow Structure

Nodes and Edges

A workflow consists of:

Each node has:

Edges define:
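
A minimal sketch of these structures in TypeScript, under the assumption of field names that are not spelled out here (the node type strings, node titles, and the edge order field come from later sections; config is an assumption):

interface WorkflowNode {
  id: string                        // unique node identifier
  type: string                      // e.g. 'llm', 'cli', 'loop-start', 'preview'
  title?: string                    // descriptive node title (see Best Practices)
  config?: Record<string, unknown>  // node-type-specific configuration (assumed)
}

interface WorkflowEdge {
  source: string  // upstream node id
  target: string  // downstream node id
  order: number   // used to sort ready nodes deterministically
}

interface Workflow {
  nodes: WorkflowNode[]
  edges: WorkflowEdge[]
}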

Execution Context

The workflow maintains several runtime contexts:
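
Judging from the state and resume structures described later (node outputs, IF/ELSE decisions, and variables), a minimal sketch of such a runtime context might look like this; the exact shape is an assumption:

interface RuntimeContext {
  outputs: Record<string, string>              // latest output per node id
  variables: Record<string, string>            // values set by variable and accumulator nodes
  decisions: Record<string, 'true' | 'false'>  // branch taken by each IF/ELSE node
}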

Execution Model

Parallel Scheduler

The workflow engine uses a parallel scheduler that executes nodes concurrently when dependencies are satisfied. The scheduler implements Kahn’s algorithm with the following optimizations:

  1. In-Degree Tracking: Each node tracks how many dependencies must complete before it can start
  2. Ready Queue: Nodes with zero in-degree are queued for execution
  3. Priority Sorting: Ready nodes are sorted by edge order for deterministic execution
  4. Concurrency Control: Per-node-type caps prevent resource exhaustion (e.g., at most 8 concurrent LLM nodes and 8 concurrent CLI nodes)

Execution Flow

1. Initialize in-degree for all nodes
2. Populate ready queue with nodes having zero in-degree
3. While ready queue is not empty:
   a. Sort ready queue by edge order
   b. Start nodes up to concurrency limits
   c. Wait for node completion
   d. Decrement in-degree of children
   e. Add newly ready nodes to queue
4. Handle completion, errors, and pauses
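
The two lists above can be combined into a single scheduling loop. The sketch below is a simplified illustration of Kahn’s algorithm with edge-order sorting and per-type concurrency caps; it is not the NebulaFlow implementation, all names are illustrative, and streaming, approval, pause, and error handling are omitted:

interface SchedulableNode { id: string; type: string }
interface OrderedEdge { source: string; target: string; order: number }

async function runWorkflow(
  nodes: SchedulableNode[],
  edges: OrderedEdge[],
  execute: (node: SchedulableNode) => Promise<void>,
  caps: Record<string, number> = { llm: 8, cli: 8 },
): Promise<void> {
  // Step 1: initialize in-degree, adjacency, and lookup tables
  const inDegree = new Map<string, number>()
  const children = new Map<string, OrderedEdge[]>()
  const byId = new Map<string, SchedulableNode>()
  const orderOf = new Map<string, number>()  // sort key: smallest incoming edge order
  for (const n of nodes) {
    inDegree.set(n.id, 0)
    children.set(n.id, [])
    byId.set(n.id, n)
  }
  for (const e of edges) {
    inDegree.set(e.target, (inDegree.get(e.target) ?? 0) + 1)
    children.get(e.source)?.push(e)
    orderOf.set(e.target, Math.min(orderOf.get(e.target) ?? Infinity, e.order))
  }

  // Step 2: seed the ready queue with zero in-degree nodes
  let ready = nodes.filter(n => inDegree.get(n.id) === 0)
  const running = new Map<string, number>()  // running count per node type

  // Step 3: schedule until the ready queue drains
  while (ready.length > 0) {
    // 3a: sort by edge order for deterministic execution
    ready.sort((a, b) => (orderOf.get(a.id) ?? 0) - (orderOf.get(b.id) ?? 0))
    // 3b: start nodes up to the per-type concurrency caps
    const batch: SchedulableNode[] = []
    for (const n of ready) {
      const used = running.get(n.type) ?? 0
      if (used < (caps[n.type] ?? Infinity)) {
        running.set(n.type, used + 1)
        batch.push(n)
      }
    }
    ready = ready.filter(n => !batch.includes(n))
    // 3c: wait for the batch (a real scheduler overlaps batches and reacts per completion)
    await Promise.all(batch.map(n => execute(n)))
    for (const n of batch) {
      running.set(n.type, (running.get(n.type) ?? 1) - 1)
      // 3d/3e: decrement children's in-degree and queue newly ready nodes
      for (const e of children.get(n.id) ?? []) {
        const d = (inDegree.get(e.target) ?? 1) - 1
        inDegree.set(e.target, d)
        if (d === 0) ready.push(byId.get(e.target)!)
      }
    }
  }
  // Step 4: completion, errors, and pauses are omitted from this sketch
}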

Hybrid Execution for Loops

Loops require special handling because they create cycles in the execution graph:

  1. Pre-Loop Nodes: Nodes that execute before the loop starts
  2. Loop Body: Nodes inside the loop (between LOOP_START and LOOP_END)
  3. Post-Loop Nodes: Nodes that execute after the loop completes

The scheduler uses a hybrid approach:
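
While the details of that hybrid approach are not spelled out here, the partition itself can be sketched as a simple structure (all names are illustrative):

interface LoopPlan {
  preLoop: string[]   // node ids scheduled by the normal DAG pass before the loop starts
  body: string[]      // node ids between loop-start and loop-end, re-executed each iteration
  postLoop: string[]  // node ids scheduled once the loop completes
}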

Node Types

LLM Node (llm)

Executes language model calls with streaming output.

Configuration:

Execution:

CLI Node (cli)

Executes shell commands with streaming output.

Configuration:

Execution:

Loop Nodes

Loop Start (loop-start)

Initiates a loop block.

Configuration:

Execution:

Loop End (loop-end)

Marks the end of a loop block.

Execution:

Conditional Nodes

IF/ELSE Node (if-else)

Branches execution based on a condition.

Configuration:

Execution:

Data Nodes

Variable Node (variable)

Sets a variable value.

Configuration:

Execution:

Accumulator Node (accumulator)

Appends to a variable across iterations.

Configuration:

Execution:

Input Node (text-format)

Provides text input to the workflow.

Configuration:

Execution:

Utility Nodes

Preview Node (preview)

Displays text content without executing anything.

Configuration:

Execution:

Subflow Node (subflow)

Executes a reusable workflow.

Configuration:

Execution:

State Management

Workflow State Persistence

Workflow state is persisted across save/load operations:

interface WorkflowStateDTO {
  nodeResults: Record<string, NodeSavedState>                    // saved result per node id
  ifElseDecisions?: Record<string, 'true' | 'false'>             // branch taken by each IF/ELSE node
  nodeAssistantContent?: Record<string, AssistantContentItem[]>  // assistant content items per node
  nodeThreadIDs?: Record<string, string>                         // thread id associated with each node
}

Resume Support

Workflows can be resumed from any point:

interface ResumeDTO {
  fromNodeId?: string
  seeds?: {
    outputs?: Record<string, string>
    decisions?: Record<string, 'true' | 'false'>
    variables?: Record<string, string>
  }
}
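
For example, resuming from a specific node while seeding an earlier node’s output, a branch decision, and a variable might look like this (all ids and values are illustrative):

const resume: ResumeDTO = {
  fromNodeId: 'llm-2',
  seeds: {
    outputs: { 'llm-1': 'summary produced in the previous run' },
    decisions: { 'if-1': 'true' },
    variables: { topic: 'release notes' },
  },
}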

Resume Behavior:

Approval Workflow

Nodes can require user approval before execution:

  1. Node enters pending_approval status
  2. User reviews command in webview
  3. User approves or rejects
  4. Execution continues or aborts
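
Only the pending_approval status is named here; one way such a status might fit into a node lifecycle type is sketched below (every other status name is an assumption):

type NodeStatus =
  | 'pending'           // assumed: waiting on dependencies
  | 'pending_approval'  // documented: waiting for the user to approve the command
  | 'running'           // assumed: currently executing
  | 'completed'         // assumed: finished successfully
  | 'failed'            // assumed: errored or rejected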

Flow Control

Dependency Resolution

Nodes execute when all dependencies are satisfied:

Error Handling

Fail-Fast Mode (default):

Continue-Subgraph Mode:

Pause/Resume

Workflows can be paused:

Subflows

Definition

Subflows are reusable workflow components with:

Execution

Subflows execute as single nodes in parent workflows:

Port Mapping

interface SubflowPortDTO {
  id: string
  name: string
  index: number
}

Ports are ordered by index for consistent mapping.
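
For instance, consistently ordering a subflow’s ports before mapping them to the parent node’s connections could look like this (only the ordering by index is documented; the rest is illustrative):

// Sort ports by index so that the Nth connection on the parent node
// always maps to the Nth port of the subflow.
function orderPorts(ports: SubflowPortDTO[]): SubflowPortDTO[] {
  return [...ports].sort((a, b) => a.index - b.index)
}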

Best Practices

Design Principles

  1. Keep Nodes Small: Single responsibility per node
  2. Use Meaningful Names: Descriptive node titles
  3. Organize Visually: Group related nodes
  4. Document with Preview Nodes: Add notes to workflows

Performance

  1. Limit Concurrency: Use per-type caps for resource-intensive nodes
  2. Batch Operations: Combine related CLI commands
  3. Cache Results: Use variables to avoid recomputation

Safety

  1. Require Approval: Enable needsUserApproval for dangerous commands
  2. Set Timeouts: Configure timeouts for LLM nodes
  3. Validate Inputs: Use IF/ELSE nodes to validate data
  4. Limit Iterations: Set maxSafeIterations for loops

Debugging

  1. Use Preview Nodes: Document expected behavior
  2. Check Outputs: Inspect node results after execution
  3. Use Single-Node Execution: Test nodes individually
  4. Monitor Token Counts: Track LLM usage

Protocol

Message Flow

Webview → Extension:

Extension → Webview:

Streaming

LLM and CLI nodes stream output in chunks:
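
The exact message shape is not documented here; a chunk message presumably carries at least the originating node id and a fragment of output, roughly like this assumed sketch:

interface StreamChunkMessage {
  type: 'stream-chunk'  // hypothetical message tag
  nodeId: string        // node producing the output
  chunk: string         // incremental LLM or CLI output text
}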

Configuration

Environment Variables

Execution Settings

Examples

Simple Linear Workflow

Input → LLM → Preview
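
Using the node and edge sketch from Workflow Structure, this workflow might be described as follows (ids are illustrative):

const simpleWorkflow = {
  nodes: [
    { id: 'input-1', type: 'text-format' },  // Input node
    { id: 'llm-1', type: 'llm' },            // LLM node
    { id: 'preview-1', type: 'preview' },    // Preview node
  ],
  edges: [
    { source: 'input-1', target: 'llm-1', order: 0 },
    { source: 'llm-1', target: 'preview-1', order: 1 },
  ],
}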

Branching Workflow

Input → IF/ELSE → LLM (true) → Preview
                → CLI (false) → Preview

Loop Workflow

Input → Loop Start → LLM → Accumulator → Loop End → Preview
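
The same sketch for the loop example, with an assumed maxSafeIterations setting on the loop-start node (where that setting actually lives is not documented here):

const loopWorkflow = {
  nodes: [
    { id: 'input-1', type: 'text-format' },
    { id: 'loop-1', type: 'loop-start', config: { maxSafeIterations: 10 } },
    { id: 'llm-1', type: 'llm' },
    { id: 'acc-1', type: 'accumulator' },
    { id: 'loop-end-1', type: 'loop-end' },
    { id: 'preview-1', type: 'preview' },
  ],
  edges: [
    { source: 'input-1', target: 'loop-1', order: 0 },
    { source: 'loop-1', target: 'llm-1', order: 1 },
    { source: 'llm-1', target: 'acc-1', order: 2 },
    { source: 'acc-1', target: 'loop-end-1', order: 3 },
    { source: 'loop-end-1', target: 'preview-1', order: 4 },
  ],
}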

Subflow Workflow

Input → Subflow (Process Data) → Output

Troubleshooting

Common Issues

Node Never Starts:

Loop Doesn’t Iterate:

LLM Timeout:

CLI Command Fails:

Debug Mode

  1. Single Node Execution: Test nodes individually
  2. Preview Nodes: Add checkpoints in workflow
  3. Variable Inspection: Check variable values after execution
  4. Token Counts: Monitor LLM usage

See Also