Module Format

Every stato module is a Python file containing a single class. The compiler infers the module type from the class name, fields, and methods, then validates the class against the schema for that type.
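To make the inputs to inference concrete, here is a minimal sketch of how the class name, fields, and methods could be read from a module file using the standard-library `ast` module. The function name `summarize_module` and the returned dict layout are hypothetical and are not part of stato's API; the real compiler may collect this information differently.

```python
import ast


def summarize_module(source: str) -> dict:
    """Return the class name, class-level field names, and method names.

    Hypothetical helper for illustration only; not stato's implementation.
    """
    tree = ast.parse(source)
    classes = [node for node in tree.body if isinstance(node, ast.ClassDef)]
    if len(classes) != 1:
        # The format requires exactly one class per module file.
        raise ValueError(f"expected exactly one class, found {len(classes)}")

    cls = classes[0]
    fields, methods = [], []
    for node in cls.body:
        if isinstance(node, ast.Assign):
            # Plain assignments such as: name = "qc_filtering"
            fields.extend(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Annotated assignments such as: name: str = "qc_filtering"
            fields.append(node.target.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            methods.append(node.name)
    return {"class_name": cls.name, "fields": fields, "methods": methods}


EXAMPLE = '''
class QualityControl:
    """Single-cell RNA-seq quality control filtering."""
    name = "qc_filtering"
    tags = ["qc", "scrna"]

    def run(self, adata, params=None):
        ...
'''

summary = summarize_module(EXAMPLE)
print(summary)
```

Parsing the source rather than importing it means the module's code is never executed just to inspect it, which is one plausible reason a compiler might take this route.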

Module Types

Type	File Location	Purpose
`skill`	`.stato/skills/<name>.py`	Reusable procedures with parameters
`plan`	`.stato/plan.py`	Project roadmap with steps and dependencies
`memory`	`.stato/memory.py`	Working memory (phase, tasks, issues)
`context`	`.stato/context.py`	Project-level metadata
`protocol`	`.stato/protocol.py`	Multi-agent handoff schemas
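The file locations in the table can be expressed as a small lookup. This is only a sketch: `module_path` and `SINGLETON_FILES` are hypothetical names, and the default `.stato` root is taken from the table above.

```python
from pathlib import PurePosixPath

# Module types that live in a single fixed file (from the table above).
SINGLETON_FILES = {
    "plan": "plan.py",
    "memory": "memory.py",
    "context": "context.py",
    "protocol": "protocol.py",
}


def module_path(module_type, name=None, root=".stato"):
    """Return the expected file location for a module (hypothetical helper)."""
    base = PurePosixPath(root)
    if module_type == "skill":
        if not name:
            raise ValueError("skill modules need a name")
        # Skills are the only type with one file per module.
        return base / "skills" / f"{name}.py"
    if module_type in SINGLETON_FILES:
        return base / SINGLETON_FILES[module_type]
    raise ValueError(f"unknown module type: {module_type!r}")


print(module_path("skill", "qc_filtering"))  # .stato/skills/qc_filtering.py
print(module_path("plan"))                   # .stato/plan.py
```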

Skill Module

Skills represent reusable procedures. They require a name field and a run() method.

class QualityControl:
    """Single-cell RNA-seq quality control filtering."""
    name = "qc_filtering"
    description = "Filter cells by gene count and mitochondrial percentage"
    version = "1.2.0"
    depends_on = ["normalize"]
    default_params = {
        "min_genes": 200,
        "max_genes": 5000,
        "max_pct_mito": 20,
    }
    lessons_learned = """
    Mouse uses lowercase mt- prefix for mitochondrial genes.
    FFPE samples need max_pct_mito=40.
    """
    tags = ["qc", "scrna"]

    def run(self, adata, params=None):
        ...

Schema

Field	Type	Required	Description
`name`	`str`	Yes	Unique skill identifier
`description`	`str`	No	Short description
`version`	`str`	No	Semantic version (e.g. `"1.2.0"`)
`depends_on`	`list`	No	Dependencies (other skill names or packages)
`input_schema`	`dict`	No	Expected input format
`output_schema`	`dict`	No	Expected output format
`default_params`	`dict`	No	Default parameter values
`lessons_learned`	`str`	No	Markdown-formatted lessons from experience
`tags`	`list`	No	Categorization tags
`context_requires`	`list`	No	Required context fields

Required methods: run() (execution entry point)
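As an illustration of the two requirements, the sketch below checks a class for a `name` string and a callable `run()`. It also shows one way `run()` might combine `default_params` with caller-supplied `params`. Both `check_skill` and the merge behavior are assumptions made for this example; the source does not specify how stato applies defaults.

```python
class QualityControl:
    """Single-cell RNA-seq quality control filtering."""
    name = "qc_filtering"
    default_params = {"min_genes": 200, "max_genes": 5000, "max_pct_mito": 20}

    def run(self, adata, params=None):
        # Assumption: caller-supplied values override the class-level defaults.
        merged = {**self.default_params, **(params or {})}
        # A real skill would filter `adata` here; this sketch only returns
        # the effective parameters so the merge is visible.
        return merged


def check_skill(cls):
    """Return a list of problems with a skill class (hypothetical helper)."""
    problems = []
    if not isinstance(getattr(cls, "name", None), str):
        problems.append("missing required field: name (str)")
    if not callable(getattr(cls, "run", None)):
        problems.append("missing required method: run()")
    return problems


print(check_skill(QualityControl))  # []
print(QualityControl().run(adata=None, params={"max_pct_mito": 40}))
```

Overriding `max_pct_mito` this way matches the FFPE note in `lessons_learned` without editing the defaults stored on the class.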

Plan Module

Plans define a project roadmap with ordered steps.

class AnalysisPlan:
    """scRNA-seq analysis pipeline plan."""
    name = "cortex_analysis"
    objective = "Complete scRNA-seq analysis of mouse cortex P14"
    version = "1.0.0"
    steps = [
        {"id": 1, "action": "load_data", "status": "complete", "depends_on": []},
        {"id": 2, "action": "qc_filtering", "status": "complete", "depends_on": [1]},
        {"id": 3, "action": "normalize", "status": "complete", "depends_on": [2]},
        {"id": 4, "action": "find_hvg", "status": "pending", "depends_on": [3]},
        {"id": 5, "action": "cluster", "status": "pending", "depends_on": [4]},
    ]
    decision_log = """
    Chose SCTransform over log-normalize for better variance stabilization.
    """

Schema

Field	Type	Required	Description
`name`	`str`	Yes	Plan identifier
`objective`	`str`	Yes	What this plan achieves
`steps`	`list[dict]`	Yes	Step dicts with `id`, `action`, and `status`, plus optional `depends_on` and `output`
`version`	`str`	No	Plan version
`decision_log`	`str`	No	Record of key decisions
`constraints`	`list`	No	Constraints on execution
`created_by`	`str`	No	Author attribution

Step Format

Each step in the steps list is a dict:

Field	Type	Required	Description
`id`	`int`	Yes	Unique step identifier
`action`	`str`	Yes	What this step does
`status`	`str`	Yes (auto-set if missing)	One of: `pending`, `running`, `complete`, `failed`, `blocked`
`depends_on`	`list[int]`	No	IDs of prerequisite steps
`output`	`str`	No	Result or notes from completed step

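The compiler validates that step IDs are unique, that every depends_on entry refers to an existing step, that each status is one of the allowed values, and that the dependency graph is acyclic (a DAG).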
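The four checks can be sketched with the standard-library `graphlib` module (Python 3.9+). This is an illustration under stated assumptions, not stato's implementation: `validate_steps` and the error message wording are hypothetical, every step is assumed to carry an `id`, and a missing `status` is not reported because the table above says it is auto-set.

```python
from graphlib import CycleError, TopologicalSorter

ALLOWED_STATUSES = {"pending", "running", "complete", "failed", "blocked"}


def validate_steps(steps):
    """Return a list of error strings for a plan's steps (hypothetical helper)."""
    errors = []
    ids = [step["id"] for step in steps]
    known = set(ids)

    # 1. Step IDs must be unique.
    for step_id in sorted(known):
        if ids.count(step_id) > 1:
            errors.append(f"duplicate step id: {step_id}")

    graph = {}
    for step in steps:
        deps = step.get("depends_on", [])
        # 2. Every dependency must refer to an existing step.
        for dep in deps:
            if dep not in known:
                errors.append(f"step {step['id']} depends on unknown step {dep}")
        # 3. A status, when present, must be one of the allowed values.
        status = step.get("status")
        if status is not None and status not in ALLOWED_STATUSES:
            errors.append(f"step {step['id']} has invalid status: {status!r}")
        graph[step["id"]] = {dep for dep in deps if dep in known}

    # 4. The dependency graph must be acyclic.
    try:
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        errors.append("dependency cycle detected")
    return errors


steps = [
    {"id": 1, "action": "load_data", "status": "complete", "depends_on": []},
    {"id": 2, "action": "qc_filtering", "status": "complete", "depends_on": [1]},
    {"id": 3, "action": "normalize", "status": "pending", "depends_on": [2]},
]
print(validate_steps(steps))  # []
```

Collecting all errors in a list, rather than raising on the first one, lets a single pass report every problem in the plan.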

Memory Module

Memory tracks the current working state of the project.

class ProjectState:
    """Current working memory for the analysis."""
    phase = "analysis"
    tasks = [
        "Run HVG selection with flavor='seurat_v3'",
        "Test different resolution parameters for clustering",
    ]
    known_issues = {
        "batch_effect": "Samples 3 and 7 cluster separately, may need Harmony",
    }
    reflection = "QC removed 15% of cells, consistent with expectations for cortex."
    last_updated = "2025-01-15"

Schema

Field	Type	Required	Description
`phase`	`str`	Yes	Current work phase
`tasks`	`list`	No	Active task list
`known_issues`	`dict`	No	Known problems and descriptions
`reflection`	`str`	No	Agent’s current understanding
`error_history`	`list`	No	Record of errors encountered
`decisions`	`list`	No	Decisions made
`metadata`	`dict`	No	Arbitrary metadata
`last_updated`	`str`	No	Timestamp
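A schema table like the one above lends itself to a table-driven check. The sketch below encodes the memory schema as a dict of `(type, required)` pairs and checks a class against it. `MEMORY_SCHEMA` and `check_fields` are hypothetical names for illustration; stato's actual validator may be structured differently.

```python
# Field name -> (expected type, required?), copied from the table above.
MEMORY_SCHEMA = {
    "phase": (str, True),
    "tasks": (list, False),
    "known_issues": (dict, False),
    "reflection": (str, False),
    "error_history": (list, False),
    "decisions": (list, False),
    "metadata": (dict, False),
    "last_updated": (str, False),
}


def check_fields(cls, schema):
    """Return a list of schema problems for a module class (hypothetical helper)."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if not hasattr(cls, field):
            if required:
                problems.append(f"missing required field: {field}")
            continue
        value = getattr(cls, field)
        if not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


class ProjectState:
    """Current working memory for the analysis."""
    phase = "analysis"
    tasks = ["Run HVG selection with flavor='seurat_v3'"]
    last_updated = "2025-01-15"


print(check_fields(ProjectState, MEMORY_SCHEMA))  # []
```

The same function would work for the other module types given a matching schema dict.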

Context Module

Context stores project-level metadata that rarely changes.

class ProjectContext:
    """Project metadata for cortex scRNA-seq analysis."""
    project = "cortex_scrna"
    description = "Mouse cortex P14 scRNA-seq analysis, 10x Chromium"
    datasets = [
        {"name": "cortex_p14", "path": "data/cortex_p14.h5ad", "cells": 12000},
    ]
    environment = {
        "python": "3.11",
        "scanpy": "1.9.6",
        "anndata": "0.10.3",
    }
    conventions = [
        "Use scanpy naming conventions",
        "Store results in adata.obs and adata.uns",
    ]

Schema

Field	Type	Required	Description
`project`	`str`	Yes	Project name
`description`	`str`	Yes	Project description
`datasets`	`list`	No	Dataset entries, such as dicts with a name and file path
`environment`	`dict`	No	Tool versions
`conventions`	`list`	No	Project conventions
`tools`	`list`	No	Required tools
`pending_tasks`	`list`	No	Incomplete tasks
`completed_tasks`	`list`	No	Finished tasks
`team`	`list`	No	Team members
`notes`	`str`	No	Free-form notes

Protocol Module

Protocols define schemas for multi-agent handoffs.

class AnalysisProtocol:
    """Handoff protocol for multi-agent analysis pipeline."""
    name = "analysis_handoff"
    handoff_schema = {
        "required": ["adata_path", "completed_steps"],
        "optional": ["parameters", "warnings"],
    }
    description = "Schema for passing analysis state between agents"
    validation_rules = ["adata_path must exist", "completed_steps must be non-empty"]

Schema

Field	Type	Required	Description
`name`	`str`	Yes	Protocol identifier
`handoff_schema`	`dict`	Yes	Schema for agent-to-agent handoff
`description`	`str`	No	Protocol description
`validation_rules`	`list`	No	Rules for validating handoff data
`error_handling`	`str`	No	Error handling strategy
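To show how a `handoff_schema` like the one in the example might be applied, the sketch below checks a handoff payload against its `required` and `optional` lists. The function `check_handoff` is hypothetical, and the interpretation of those two keys is inferred from the example rather than stated in the source. Whether stato rejects keys outside both lists is not specified, so this sketch only reports them.

```python
HANDOFF_SCHEMA = {
    "required": ["adata_path", "completed_steps"],
    "optional": ["parameters", "warnings"],
}


def check_handoff(payload, schema):
    """Compare a handoff payload with a schema (hypothetical helper).

    Assumes the schema uses "required" and "optional" key lists, as in the
    AnalysisProtocol example.
    """
    required = schema.get("required", [])
    allowed = set(required) | set(schema.get("optional", []))
    missing = [key for key in required if key not in payload]
    unexpected = sorted(key for key in payload if key not in allowed)
    return {"missing": missing, "unexpected": unexpected}


payload = {"adata_path": "data/cortex_p14.h5ad", "warnings": []}
print(check_handoff(payload, HANDOFF_SCHEMA))
# {'missing': ['completed_steps'], 'unexpected': []}
```

The free-text `validation_rules` (for example "adata_path must exist") are not evaluated here; they would need separate handling.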

Type Inference

The compiler infers the module type using these rules (in order):

1. Class name ends with Context → context
2. Class name ends with State → memory
3. Class name ends with Protocol → protocol
4. Has steps and objective fields → plan
5. Has handoff_schema field → protocol
6. Has phase field (no run() method) → memory
7. Has project and description fields (no run() method) → context
8. Has run() method → skill
9. Fallback → skill (with low-confidence warning W006)

You can override inference by passing expected_type to the compiler.
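The ordered rules and the `expected_type` override can be sketched as a single function. This is an illustration of the rule order only: the name `infer_type`, its signature, and the `(type, warning)` return value are assumptions, not the compiler's real interface.

```python
def infer_type(class_name, fields, methods, expected_type=None):
    """Return (module_type, warning_code) following the documented rule order.

    Hypothetical sketch; the real compiler interface may differ.
    """
    if expected_type is not None:
        # An explicit expected_type bypasses inference entirely.
        return expected_type, None

    fields = set(fields)
    has_run = "run" in methods

    # Rules 1-3: class-name suffixes take priority over fields and methods.
    if class_name.endswith("Context"):
        return "context", None
    if class_name.endswith("State"):
        return "memory", None
    if class_name.endswith("Protocol"):
        return "protocol", None
    # Rules 4-7: distinguishing fields.
    if {"steps", "objective"} <= fields:
        return "plan", None
    if "handoff_schema" in fields:
        return "protocol", None
    if "phase" in fields and not has_run:
        return "memory", None
    if {"project", "description"} <= fields and not has_run:
        return "context", None
    # Rule 8: a run() method marks a skill.
    if has_run:
        return "skill", None
    # Rule 9: low-confidence fallback.
    return "skill", "W006"


print(infer_type("QualityControl", ["name", "tags"], ["run"]))      # ('skill', None)
print(infer_type("AnalysisPlan", ["name", "objective", "steps"], []))  # ('plan', None)
print(infer_type("Misc", [], []))                                    # ('skill', 'W006')
```

Because the name-based rules come first, a class whose name ends in Context is classified as context even if it defines run(). Passing `expected_type` is the way to force a different result in such cases.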