Runtime — SOFIA's first implementation uses Claude Code and CLAUDE.md files. The concepts described here (persona, isolation, artifacts) are provider-agnostic — only the runtime layer is specific to a provider. Some sections describe Claude Code specifics. Latter versions implement other providers.

Pipeline overview Output structure Usage Static build Contributing external data Complete workflow

Analysis guide — Data pipeline

Generating, sanitizing, and publishing H2A friction data. This guide covers analysis.py and the data workflow (binding/filesystem/). For conformity checks, see audit-guide.md. For the dashboard, see dashboard-guide.md.

Pipeline overview

analysis.py <instances>
    │
    ├── scan.py     → records.json     (raw friction + contribution records)
    ├── mirror.py   → mirror.json      (health diagnostics: KPIs, radars, trajectory)
    ├── lens.py     → lens.json        (time series: markers, directions, flux)
    └── probe.py    → probe.json       (conformity checks)

Two independent pipelines:

Data pipeline (scan → mirror + lens): reads session summaries and artifacts, extracts friction/contribution records, builds aggregated views
Conformity pipeline (probe): reads instance structure directly, runs checks

Output structure

analysis.py writes per-instance and aggregated JSON:

<output>/
├── methodes/           per-instance
│   ├── records.json
│   ├── mirror.json
│   ├── lens.json
│   └── probe.json
├── produits/           per-instance
│   └── ...
├── records.json        aggregated (all instances)
├── mirror.json         aggregated
├── lens.json           aggregated
├── probe.json          aggregated
└── index.json          instance list

Default output: binding/filesystem/analysis/data/ (gitignored).

Usage

Basic

# Full pipeline — all views, all instances
python binding/filesystem/analysis.py /path/to/instance1 /path/to/instance2

# Single view
python binding/filesystem/analysis.py /path/to/instance --only mirror
python binding/filesystem/analysis.py /path/to/instance --only lens
python binding/filesystem/analysis.py /path/to/instance --only probe

# Launch dashboard after analysis
python binding/filesystem/analysis.py /path/to/instance --serve

Custom output directory

# Write to a specific directory
python binding/filesystem/analysis.py /path/to/instance -o /path/to/output

When -o is set, per-instance subdirectories and aggregated files are written to the specified directory instead of the default analysis/data/.

Sanitization

# Strip sensitive fields for open-source publication
python binding/filesystem/analysis.py /path/to/inst1 /path/to/inst2 \
    --sanitize -o /path/to/h2a-data/data

The --sanitize flag strips:

Friction/contribution descriptions — qualitative content
Source filenames — session paths with timestamps
Probe check details — filenames in warn/fail messages
Persona role descriptions — real role text
Non-persona identifiers — replaced with "orchestrator" in matrices

What is preserved: markers, resolutions, directions, types, dates, personas, counters, trajectories, radars.

The standalone script binding/filesystem/sanitize.py can also be used independently on existing JSON files:

python binding/filesystem/sanitize.py input_dir/ output_dir/

Static build

Generate a standalone dashboard (HTML + CSS + JS + JSON) that can be served by any static file server:

# Self-contained dist/
python binding/filesystem/build_dist.py

# For h2a-data repository structure
python binding/filesystem/build_dist.py \
    --output /path/to/h2a-data/dashboard \
    --data-dir /path/to/h2a-data/data

The static build excludes the Probe tab and the refresh button (no server to call).

Contributing external data

When a colleague uses SOFIA and wants to contribute their friction data:

# 1. Run analysis on their instance (with them, sanitized)
python binding/filesystem/analysis.py /path/to/their-instance \
    --sanitize -o /path/to/h2a-data/data

# 2. Aggregate all instances (regenerates index.json + aggregated JSON)
python binding/filesystem/aggregate.py /path/to/h2a-data/data

# 3. Rebuild the dashboard
python binding/filesystem/build_dist.py \
    --output /path/to/h2a-data/dashboard \
    --data-dir /path/to/h2a-data/data

aggregate.py is idempotent — it scans all subdirectories in the data directory containing lens.json or mirror.json, merges them, and regenerates the root-level aggregated files. Adding an instance = creating a new subdirectory + re-running.

aggregate.py is only needed when adding an external instance without re-running analysis.py on all instances. When you run analysis.py with multiple instances, it already produces per-instance + aggregated output.

Complete workflow

Internal use (audit + dashboard)

python binding/filesystem/analysis.py \
    /path/to/methodes /path/to/produits /path/to/oxynoe \
    --serve

Publication (h2a-data)

# Generate sanitized data
python binding/filesystem/analysis.py \
    /path/to/methodes /path/to/produits /path/to/oxynoe \
    --sanitize -o /path/to/h2a-data/data

# Build static dashboard
python binding/filesystem/build_dist.py \
    --output /path/to/h2a-data/dashboard \
    --data-dir /path/to/h2a-data/data

← Conformity Dashboard →