episteme

License: AGPL-3.0-or-later

epistemekernel.com

episteme is a way to think (생각의 틀, "a frame for thought"): an epistemic engine that makes AI-assisted decisions earn their confidence before they land.

A five-stage cognitive practice (Frame → Decompose → Execute → Verify → Handoff) anchored in Kahneman's System-2 forcing, Dalio's Radical Transparency, Boyd's OODA Orientation, and Munger's Latticework of Mental Models. v2.0 delivers it as three layers.

Cognition: the senior-researcher interrogation. Decompose a load-bearing decision into tiered claims (measured / cited / inferred / assumed), verify the load-bearing ones in a fresh context against external evidence, argue the strongest opposition, name the weakest link, and pre-commit a disconfirmation.

Structure: deterministic hooks that route decision shapes, validate the verdict artifact (a stop verdict fails closed), and hard-block only genuinely destructive operations. The operator-signed Reasoning Surface (Ed25519, structurally out of the agent's reach) remains the operator-side framing artifact.

Memory: lessons from verified interrogations become hash-chained, context-scoped protocols that resurface at the next matching decision.

The division of labor is the research record's, not ours: models judging their own drafts get worse, form-checks are gameable by reasoning-shaped tokens, and only architectural constraint converts epistemic awareness into behavior.

The MIRROR benchmark (arXiv 2604.19809) settled the empirical question: across 16 models from 8 labs and ~250,000 instances, "providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective." Confident Failure Rate drops from 0.60 to 0.14 under external architectural constraint. The practice itself is the product. The artifacts under core/ and src/episteme/ — the typed Reasoning Surface, the Append-Only Hash Chain, the Active-Guidance loop — are the Sovereign Cognitive Kernel that keeps the practice alive at frontier model strength, when vigilance-as-willpower fails. Posture over prompt.

docs/THE_WAY_TO_THINK.md — the practice, operationalized.

See it in 60 seconds ↓ · Install ↓ · Why the file-system, not the prompt ↓ · Architecture & philosophy ↓ · Does it work? ↗

[Image: episteme, the Thinking Framework in motion]


Why prompts can't enforce a way to think

The practice in docs/THE_WAY_TO_THINK.md names six cognitive moves per high-impact decision — Core Question, distinction map, signal-vs-noise filter, because-chain, hypothesis-as-bet, disconfirmation conditions. Each move counters a specific named System-1 failure (question substitution, WYSIATI, anchoring, narrative fallacy, planning fallacy, overconfidence — Kahneman). A prompt can request these moves, but prompts are advisory: they live for one call, get skipped at deadline, and disappear from context. Frontier models comply on the surface and skip the moves underneath — fluently, confidently, and the operator stops checking. That is exactly the failure mode the practice is for.

Concrete example. You ask the agent: "Evaluate whether our retrieval-augmented memory system is actually improving response quality."

The agent treats your prompt as a measurement task. It pulls metrics from the last 30 days, compares with-memory vs without-memory response samples, finds a 7% positive lift on thumbs-up rate, writes a memo concluding "memory helps; keep shipping." You read it.

The agent didn't ask the questions you would have asked, if you weren't tired:

A naive agent gives a measured-sounding answer because the prompt asked for a measurement. episteme forces the agent to write down — on disk, before the memo lands — what the measurement actually measures, what mechanism is being claimed, and what observable outcome would prove the claim wrong. The act of writing surfaces that the proxy wasn't the question.

Recent academic work calls the cumulative gap between what the agent knows in context, what you intended, and what your system actually requires Epistemic Drift. episteme closes that gap by structurally requiring the agent to reason — what · why · how — before it acts. Enforcement is structural, not advisory. Prompts can be skipped; a file-system hook that exits non-zero cannot.


The ABCD architecture — four blueprints, one cortex

episteme acts as a prefrontal cortex for AI agents: it sits between intent and action, and it refuses to let an action proceed until the reasoning behind it is explicit. Four Cognitive Blueprints — each keyed to a specific failure class — decide what "explicit enough" means for a given op:

Every blueprint firing — and every decision it validates — is committed to a tamper-evident hash chain. That chain is not a log; it is how the kernel gives you Active Guidance later: at the next matching decision, the relevant synthesized protocol is surfaced proactively, before the agent defaults to its training distribution.

The result is a project-specific thinking framework that compounds. The agent gets sharper on your codebase every time it resolves a conflict, not because you trained it — because the chain did the remembering.


The problem · the solution

The problem — conflicting sources, averaged answers, no durable know-how

The internet is full of contradictory how-to. Docs say one thing; a senior engineer says another. Two libraries recommend opposite patterns for the same bug. Modern agents, being auto-regressive pattern engines, cannot tell which answer fits this specific context — because fit is a causal-world-model judgment, not a pattern match over token frequency. So they average. The output sounds authoritative, fits no specific context, and misleads by omission.

Prompts cannot fix this:

The solution — a Thinking Framework at the file-system level

episteme intercepts the moment intent meets state change. Before any high-impact op (git push, npm publish, terraform apply, DB migrations, lockfile edits), the agent must project its reasoning onto a structured surface on disk:

| Field | What the agent must commit to |
| --- | --- |
| Core Question | The one question this action is actually trying to answer (counters question substitution). |
| Knowns | Verified facts, citations, measurements; not plausible-sounding guesses. |
| Unknowns | Named, classifiable gaps; not vague "there might be risks." |
| Assumptions | Load-bearing beliefs, flagged so they can be falsified. |
| Disconfirmation | The observable event that would prove this plan wrong, pre-committed before action. |

Validity is checked structurally: minimum content length, no lazy-token placeholders (none, n/a, tbd, 해당 없음), normalized command scanning so bypass shapes like subprocess.run(['git','push']) and os.system('git push') are caught. Agent-written shell scripts are deep-scanned via a stateful interceptor across calls. If the surface is absent or invalid, the op is refused (exit 2). Default is strict; advisory mode (warn-don't-block) is opt-in per-project: touch .episteme/advisory-surface.
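The structural checks described above can be sketched in a few lines. This is a minimal illustration, not the code in core/hooks/reasoning_surface_guard.py: the field names and the 15-character floor are taken from the architecture diagram later in this README, while the function names, the list of high-impact ops, and the normalization rule are assumptions made for the example.

```python
import re
from typing import Optional

# Illustrative constants; the real guard may use different values and a longer op list.
REQUIRED_FIELDS = ("core_question", "knowns", "unknowns", "assumptions", "disconfirmation")
LAZY_TOKENS = {"none", "n/a", "tbd", "해당 없음"}
MIN_CHARS = 15
HIGH_IMPACT = ("git push", "npm publish", "terraform apply")


def _as_text(value) -> str:
    """Flatten a string or list-of-strings field into one string."""
    if isinstance(value, (list, tuple)):
        return " ".join(str(v) for v in value)
    return str(value or "")


def surface_errors(surface: dict) -> list:
    """Return structural problems; an empty list means the surface passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        text = _as_text(surface.get(field)).strip()
        if not text:
            errors.append(f"{field}: missing")
        elif text.lower() in LAZY_TOKENS:
            errors.append(f"{field}: placeholder")
        elif len(text) < MIN_CHARS:
            errors.append(f"{field}: shorter than {MIN_CHARS} chars")
    return errors


def normalize(command: str) -> str:
    """Strip quotes, brackets, and commas so wrapped calls read like plain shell."""
    return " ".join(re.sub(r"[\[\]()'\",]", " ", command).split())


def is_high_impact(command: str) -> bool:
    flat = normalize(command)
    return any(op in flat for op in HIGH_IMPACT)


def gate(command: str, surface: Optional[dict]) -> int:
    """Exit code a hook could return: 0 admits the op, 2 refuses it."""
    if not is_high_impact(command):
        return 0
    if surface is None or surface_errors(surface):
        return 2
    return 0
```

With this sketch, `gate("subprocess.run(['git','push'])", None)` is refused because normalization reduces the call to text containing `git push` and no surface is present. A substring match like this is deliberately crude; it shows the shape of the check, not its full coverage.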

v2.0 adds the deeper satisfier. Structure can compel an artifact into existence; it cannot tell thinking from theater — reasoning-shaped tokens defeat both regex validators and LLM judges. So the gate now accepts a second artifact: the interrogation verdict (.episteme/interrogation.json), produced by the epistemic-interrogation skill. The decision is decomposed into atomic claims; each load-bearing claim is verified by a fresh context that never sees the draft reasoning, using external evidence (file reads, execution, search — because self-review without an external signal measurably makes models worse); the strongest opposition is argued, the weakest link named, a disconfirmation pre-committed. The hook validates only what determinism can validate — freshness, floors, and verdict consistency (a refuted load-bearing claim with a proceed verdict is rejected as a contradiction; a stop verdict admits nothing). Substance lives in the protocol; structure keeps it practiced.
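A hedged sketch of the deterministic part of that check follows. The schema of .episteme/interrogation.json is not shown in this README, so every field name here (`verdict`, `created_at`, `claims`, `load_bearing`, `status`) is hypothetical, as are the freshness window and the reading of "floors" as a minimum count of load-bearing claims.

```python
import time
from typing import Optional, Tuple

MAX_AGE_SECONDS = 3600   # assumed freshness window
MIN_LOAD_BEARING = 1     # assumed floor: at least one load-bearing claim


def admits(verdict: dict, now: Optional[float] = None) -> Tuple[bool, str]:
    """Decide whether a verdict artifact admits the op; anything unexpected fails closed."""
    now = time.time() if now is None else now
    decision = verdict.get("verdict")
    if decision == "stop":
        return False, "stop verdict admits nothing"
    if decision != "proceed":
        return False, "unrecognized verdict"
    # A missing timestamp defaults to 0, which reads as stale.
    if now - verdict.get("created_at", 0) > MAX_AGE_SECONDS:
        return False, "verdict is stale"
    load_bearing = [c for c in verdict.get("claims", []) if c.get("load_bearing")]
    if len(load_bearing) < MIN_LOAD_BEARING:
        return False, "too few load-bearing claims"
    if any(c.get("status") == "refuted" for c in load_bearing):
        return False, "refuted load-bearing claim contradicts a proceed verdict"
    return True, "ok"
```

Note what the function does not attempt: it never judges whether the reasoning is good. It only checks properties that can be decided mechanically, which is the split the paragraph above describes.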

The Disconfirmation field in particular is not a risk checklist — it is the mechanical enforcement of Robust Falsifiability: the requirement that a plan commit, in advance, to a concrete observable event that would prove it wrong. Strict Mode rejects conditional-but-observable-less phrasing ("if issues arise") and admits only specific falsification conditions ("p95 latency > 500ms for 5 consecutive minutes, Grafana dashboard api-latency"). A plan that cannot be falsified is not episteme; it is doxa wearing episteme's vocabulary.
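To make the contrast concrete, here is a deliberately crude heuristic that separates the two example phrasings above. It is an assumption for illustration, not the kernel's Strict Mode rule: it treats a disconfirmation as observable only if it states a numeric threshold with a comparison operator.

```python
import re

# A comparison operator followed by a number, e.g. "> 500" or "<= 0.14".
THRESHOLD = re.compile(r"(>=|<=|==|!=|>|<|≥|≤)\s*\d")


def is_observable(disconfirmation: str) -> bool:
    """True when the text names a measurable threshold rather than a vague condition."""
    return bool(THRESHOLD.search(disconfirmation))


examples = [
    "if issues arise",
    "p95 latency > 500ms for 5 consecutive minutes, Grafana dashboard api-latency",
]
for text in examples:
    print(is_observable(text), "|", text)
```

A real check would need to accept observables that are not thresholds (a named test failing, a specific log line appearing), so a regex like this would under-admit. It is meant only to show that "specific and observable" can be tested mechanically at all.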

This is the difference between a prompt reminder and a compiler: one asks nicely, the other refuses to proceed.


Protocol Synthesis & Active Guidance — the ultimate vision

episteme is not just a blocker. The framework's real job is to turn every conflict it resolves into durable know-how that the agent re-applies automatically at the next matching decision.

Here is the loop (v1.0.0 GA shipped · CP1–CP10 green; v1.4.0-rc1 cut 2026-05-23 · 1170 tests + 54 subtests green — see docs/DESIGN_V1_0_SEMANTIC_GOVERNANCE.md):

  1. Detect conflict. The agent encounters two valid-looking but incompatible approaches for a context it hasn't fully resolved before.
  2. Decompose, don't average. The Thinking Framework refuses the "average" answer. It forces the agent to extract why the sources conflict and which feature of the context tips the decision.
  3. Synthesize a context-fit protocol. The resolved "in context X, do Y" rule is committed to an append-only, hash-chained knowledge base — tamper-evident, so the agent cannot silently rewrite the lesson.
  4. Guide actively. At the next matching decision — even weeks later, even across sessions or tools — the kernel surfaces the protocol proactively. You don't have to remember to ask.
  5. Self-maintain. When the agent discovers drift (stale config, deprecated API, core-logic mismatch), it is forced to evaluate patch vs. refactor honestly and synchronize the cascade across the full blast radius — CLI, config, schemas, docs, tests, external surfaces — before moving on.
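Step 3 depends on an append-only, hash-chained store. The sketch below shows the general technique with SHA-256 from the standard library. The entry layout and the `append` helper are assumptions; `verify_chain` is a name this README uses elsewhere, but its real signature is not shown here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload},
                      sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> dict:
    """Add an entry that commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edit to an earlier entry breaks the links after it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append(chain, {"rule": "in context X, do Y"})
append(chain, {"rule": "in context Z, do W"})
print(verify_chain(chain))                 # intact chain
chain[0]["payload"]["rule"] = "rewritten"  # silent edit to an old lesson
print(verify_chain(chain))                 # the edit is detected
```

Tamper-evidence is the point, not tamper-prevention: the file can still be edited, but the edit cannot go unnoticed once the chain is verified.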

Current status — self-measured, 2026-06-10. The gate (steps 1–2) is operational and battle-tested. The compounding arm (steps 3–4) fired its own falsifiability condition (kernel/FALSIFIABILITY_CONDITIONS.md § E1): 49 days of framework activity, zero synthesized protocols — the only emit path was attached to the rarest operation class. Event 137 made the kernel measure this itself (episteme report § Protocol Synthesis, SessionStart digest); Event 138 gave the loop a real source — every verified interrogation whose lesson is non-null synthesizes a context-scoped protocol on success. The aspirational label stays until lesson-sourced protocols demonstrably bind at future decisions (§ E4); a kernel that enforces disconfirmation on your decisions owes you the same on its own claims.

The knowledge base is not a vector store pretending to be memory. It is a structural, human-readable, version-controlled artifact you can read, edit, fork, and migrate between adapters (Claude Code, Cursor, Hermes, future tools).

Synthesized protocols are not cache entries — they are Knowledge Sanctuaries: tamper-evident (Pillar 2 hash-chained), context-scoped (each protocol carries a context signature so it only reactivates in matching situations), and supersession-respecting (a newer chain entry can override an older one, but cannot silently rewrite it). "Sanctuary" because the space is protected from the entropic LLM-average that surrounds it: only rules locally validated against this project's evidence occupy the space. The kernel outlives the tooling; the sanctuaries are how it carries know-how forward.
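Context scoping and supersession can be illustrated with a small lookup. This is a sketch under assumptions: the README does not specify how a context signature is represented, so it is modeled here as a dict of key/value pairs that must all match, and `active_protocol` is a hypothetical helper name.

```python
from typing import Optional


def matches(signature: dict, context: dict) -> bool:
    """A protocol applies only when every key in its signature matches the context."""
    return all(context.get(key) == value for key, value in signature.items())


def active_protocol(entries: list, context: dict) -> Optional[dict]:
    """Return the newest matching entry; older entries stay in the list untouched."""
    for entry in reversed(entries):  # entries are in append order
        if matches(entry["context"], context):
            return entry
    return None


entries = [
    {"context": {"tool": "terraform", "env": "prod"}, "rule": "plan before apply"},
    {"context": {"tool": "npm"}, "rule": "pin the lockfile"},
    {"context": {"tool": "terraform", "env": "prod"}, "rule": "plan, then apply with approval"},
]
current = {"tool": "terraform", "env": "prod", "branch": "main"}
print(active_protocol(entries, current)["rule"])
```

The newer terraform entry wins for a matching context, while the older one remains readable in history, which is the "override but not silently rewrite" property described above.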

This architecture also counters Cognitive Deskilling — the erosion of the human operator's own reasoning capacity that follows from uncritical reliance on AI output. Because the Reasoning Surface forces declaration of Unknowns and Disconfirmation on every high-impact move, the operator cannot outsource thinking without the gaps being surfaced. See Human prompt debugging below for the specific mechanism.


I want to… → do this

| Goal | Command / pointer |
| --- | --- |
| See the bidirectional symbiosis loop (agent and human debug each other's intent) | `demos/04_symbiosis/` · `scripts/demo_symbiosis.sh` |
| See the Thinking Framework off vs on on the same prompt | `demos/03_differential/` · `scripts/demo_posture.sh` |
| See what the framework produces end-to-end | `demos/01_attribution-audit/` · `demos/02_debug_slow_endpoint/` |
| Install as a Claude Code plugin (one line) | `/plugin marketplace add junjslee/episteme` |
| Install on my machine (CLI + editable kernel) | `pip install -e . && episteme init` (see INSTALL.md) |
| Understand what this installs in 3 minutes | `kernel/SUMMARY.md` |
| Draft a reasoning surface from a Slack thread | `episteme capture --input thread.txt --output surface.json` |
| Sync identity to every AI tool I use | `episteme sync` |
| Encode working style + reasoning posture | `episteme setup . --interactive` |
| Apply the right harness for my project type | `episteme detect . && episteme harness apply <type> .` |
| Know when not to use this kernel | `kernel/KERNEL_LIMITS.md` |
| Find attribution for any borrowed concept | `kernel/REFERENCES.md` |
| Audit my setup | `episteme doctor` |

See it in 60 seconds

Live site + visual dashboard — both rendered against the kernel's own cp7-chained-v1 hash chain. See web/README.md for the Vercel deploy guide.

Four demos, increasing in what they prove:

Open any of them. You will know what episteme produces before reading any philosophy.


Quick start

Option A — install via Claude Code plugin marketplace

The fastest path if you use Claude Code. This repo ships a marketplace manifest (.claude-plugin/marketplace.json), so you can add it as a marketplace and install the plugin in two commands.

Inside Claude Code:

/plugin marketplace add junjslee/episteme
/plugin install episteme@episteme

Then from any shell:

episteme init     # one-shot: seed personal memory files from examples
episteme setup    # score workstyle + cognition profile
episteme sync     # propagate into Claude Code and Hermes
episteme doctor   # verify wiring

For authoritative command syntax and update semantics, see Claude Code's plugin marketplace documentation.

Option B — clone the kernel directly

For contributors, forkers, or if you want the full source tree locally:

git clone https://github.com/junjslee/episteme ~/episteme
cd ~/episteme
pip install -e .

episteme init              # generate personal memory files from templates
episteme setup . --write   # score working style + reasoning posture
episteme sync              # push identity to every adapter
episteme doctor            # verify wiring

Project-type harness:

episteme detect .                         # analyze repo, recommend a harness
episteme harness apply ml-research .      # apply it
episteme new-project . --harness auto     # scaffold + auto-detect

Deep-dive onboarding modes, scored dimensions, and defaults: docs/SETUP.md. Full command reference: docs/COMMANDS.md.


How episteme compares

Most tools in this space either build agent runtimes or provide memory APIs for applications. episteme augments the developer tools you already use.

| Axis | episteme | Memory APIs (mem0, OpenMemory) | Agent runtimes (Agno, opencode, omo) |
| --- | --- | --- | --- |
| What it is | Identity + governance layer across dev tools | Memory API embedded in an app | A runtime that executes agents |
| Where identity lives | Governed markdown + JSON, cross-tool, versioned | Vector/graph store, per app | System prompt per session |
| Sync | One command, all tools | N/A | N/A (per-project config) |
| Know-how extraction | Enforced at file-system boundary; hash-chained | Opaque retrieval | Prompt-tuned, per session |

The gap episteme fills: no other project syncs a governed cognitive contract across multiple developer AI tools in one command, and no other project forces context-fit protocol extraction at the point of state mutation. Runtimes and memory APIs own different lanes; episteme sits above them and makes them aware of who you are, how you think, and what your project has already learned.


Why isn't this just contract testing?

A reasonable critique: if the kernel exists to keep generated behavior aligned with declared intent, why not just write more contract tests (OpenAPI conformance, Hurl scripts, DDL diffs) and let CI reject the commit when behavior diverges? No human reads anything; the machine judges.

The answer is a layer distinction.

Contract tests catch behavioral regressions: did the code do what the spec says? They are deterministic, need no human in the loop, and are strictly stronger than review for the class of properties they can express. The Reasoning Surface catches a different failure class, epistemological regressions: did we write the right spec, did we frame the right question, did we silently rule out an alternative, did we substitute a comfortable question for the real one? A passing Hurl suite cannot tell you that you're solving the wrong problem fluently; that failure happens before the spec exists.

The two layers compose. Contract tests pin the output (behavior boundary). Reasoning Surfaces pin the framing (decision boundary). Same constraint-system meta, different target.

Episteme ships both:

When tooling lets you skip either, the gap re-opens at that layer. The kernel's job is to make both default-on for projects that opt in.


Zero-trust execution

The OWASP Top 10 for Agentic Applications (2026) — peer-reviewed by 100+ industry experts — identifies prompt injection, goal hijacking, overreach, memory poisoning, and unbounded action as the primary risk classes for autonomous agents. The Knowns / Unknowns / Assumptions / Disconfirmation structure is a structural counter to each:

| OWASP Agentic Risk (2026) | episteme counter |
| --- | --- |
| Direct goal manipulation / prompt injection | Core Question declared before execution begins; deviations surface as Unknowns |
| Indirect instruction injection | Knowns / Disconfirmation separate trusted state from prompt content; agent commits to a falsifiable outcome before acting on retrieved input |
| Overreach / unbounded action | Constraint regime declared in Frame; reversible-first policy enforced |
| Fluent hallucination | Unknowns field cannot be blank; assumptions must be named before acting on them |
| Memory poisoning | Pillar 2 hash-chained protocols: append-only, tamper-evident; silent rewrites of prior state are detected by verify_chain |
| Infinite planning loops | Disconfirmation condition required; loop exits when evidence fires |

No assumption is trusted unless named. No action is taken unless the precondition (Knowns) and constraint regime are declared. The kernel is the verification layer between intent and execution.

Industry convergence — 2025–2026

Major frameworks and academic papers in the same window converge on the same architectural patterns the kernel ships: file-system-level pre-invocation checkpoints (Capsule Security ClawGuard, 2026), hash-chained tamper-evident memory (SSGM; Lam et al., 2026), reason-based alignment over rule-lists (Anthropic's Claude Constitution, 2026-01-22), five-phase cognitive loop with governance layer (SCL R-CCAM; Kim, 2025), and five-pillar agent integrity (Proofpoint Agent Integrity Framework 2026). The kernel was developed independently of these publications (CP1 shipped 2026-04-21; v1.0.0 GA 2026-04-28), some of which carry earlier dates; the convergence is independent validation, not lineage. Full attribution map in kernel/REFERENCES.md under Convergent contemporary work.


Human prompt debugging

episteme doesn't just govern the agent — it debugs the human's intent. When the agent maps Knowns vs. Unknowns against a user request, it exposes logical gaps in the original prompt before executing flawed assumptions. The Unknowns field is often where the human realizes their question was underspecified. The Disconfirmation field is often where they realize they haven't thought about falsification at all.

This is not a side effect. It is a design property: a system that forces the agent to declare what it does not know forces the human to confront what they did not specify.


Repository layout

episteme/
├── kernel/                     philosophy (markdown; travels across runtimes)
├── demos/                      end-to-end reference deliverables
├── core/
│   ├── memory/global/          operator memory (gitignored; personal)
│   ├── hooks/                  deterministic safety + workflow hooks
│   ├── harnesses/              per-project-type operating environments
│   └── schemas/                memory + evolution contract schemas
├── adapters/                   kernel delivery layers (Claude Code, Hermes, …)
├── skills/                     reusable operator skills
├── templates/                  project scaffolds, example answer files
├── docs/                       runtime docs, architecture, contracts
├── src/episteme/               CLI + core library
└── tests/

Repo operating contract (for any agent working here): AGENTS.md. LLM sitemap: llms.txt.


CLI surface

episteme init
episteme doctor
episteme sync [--governance-pack minimal|balanced|strict]
episteme new-project [path] --harness auto
episteme detect [path]
episteme harness apply <type> [path]
episteme profile [survey|infer|hybrid] [path] [--write]
episteme cognition [survey|infer|hybrid] [path] [--write]
episteme setup [path] [--interactive] [--write] [--sync] [--doctor]
episteme bridge anthropic-managed --input <events.json> [--dry-run]
episteme bridge substrate [list-adapters|describe|verify|push|pull] ...
episteme capture [--input <file>] [--output <file>] [--by <name>]
episteme viewer [--host 127.0.0.1] [--port 37776]
episteme evolve [run|report|promote|rollback] ...

Full reference: docs/README.md.


Why this architecture

The product is a Thinking Framework; the rest of this list is what falls out when that framework is taken seriously.

Memory model, Memory Contract v1, Evolution Contract v1, and managed-runtime coexistence: docs/SYNC_AND_MEMORY.md.


Architecture & philosophy

Full diagram with node annotations and cross-references: docs/ARCHITECTURE.md.

The Thinking Framework above is the product surface. Beneath it sits a structural vocabulary borrowed from ancient Greek epistemology and Korean aesthetics — a spine that every diagram, demo, and artifact in this repository renders onto.

The triad — doxa · episteme · praxis

The grain — 결 · gyeol

The Korean word 결 (gyeol) names the grain of wood or stone: the latent pattern-structure inside matter that, when followed, yields coherent form and, when cut against, fractures. The Reasoning Surface's field ordering (Knowns → Unknowns → Assumptions → Disconfirmation) is the 결 of epistemic discipline: settled → open → provisional → falsification-condition. The calibration loop (prediction + outcome joined by correlation_id, analyzed by episteme evolve friction) is the grain refining itself across cycles.
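The join at the heart of the calibration loop is simple enough to sketch. The record layout below is an assumption for illustration: `correlation_id`, `exit_code`, and `stderr` appear in this README's diagram, but the `kind` field and both function names are hypothetical, and this is not the implementation behind episteme evolve friction.

```python
import json


def pair_records(lines):
    """Join prediction and outcome records that share a correlation_id."""
    predictions, outcomes = {}, {}
    for line in lines:
        record = json.loads(line)
        bucket = predictions if record["kind"] == "prediction" else outcomes
        bucket[record["correlation_id"]] = record
    # Predictions without an outcome yet are left unpaired.
    return [(predictions[cid], outcomes[cid]) for cid in predictions if cid in outcomes]


def friction(pairs):
    """Correlation ids of admitted ops whose observed outcome was a failure."""
    return [pred["correlation_id"] for pred, out in pairs if out["exit_code"] != 0]


audit_log = [
    '{"kind": "prediction", "correlation_id": "a1"}',
    '{"kind": "outcome", "correlation_id": "a1", "exit_code": 0, "stderr": ""}',
    '{"kind": "prediction", "correlation_id": "b2"}',
    '{"kind": "outcome", "correlation_id": "b2", "exit_code": 1, "stderr": "rejected"}',
]
print(friction(pair_records(audit_log)))
```

Each flagged id points back to a Reasoning Surface that passed the gate and still met a failing outcome, which is where an under-named unknown is most likely to be hiding.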

Lifecycle

┌─────────────────────────────────────────────────────────────────────┐
│                         operator (you)                              │
│           ├── cognitive preferences   ├── working style             │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
                    episteme sync
                               │
      ┌────────────────────────┼────────────────────────┐
      ▼                        ▼                        ▼
 Claude Code             Hermes (OMO)            future adapter
 (CLAUDE.md)             (OPERATOR.md)           (same kernel)
      │                        │                        │
      └────────────────────────┼────────────────────────┘
                               │
                       per-session loop
                               │
      ┌────────┬────────┬──────┴─────┬────────┬────────┐
      ▼        ▼        ▼            ▼        ▼        ▼
    FRAME → DECOMPOSE → EXECUTE → VERIFY → HANDOFF → (next session)
      │                                        │
      │ Reasoning Surface                      │ docs/PROGRESS.md
      │ (Knowns / Unknowns /                   │ docs/NEXT_STEPS.md
      │  Assumptions / Disconfirmation)        │ decision artifact
      │                                        │
      └────────────── feedback ────────────────┘

Four strata, one loop

graph TD
    subgraph SG1["① The Agentic Mind — Intention"]
        A["Agent\nGenerating intent for a high-impact op"]
        B["Reasoning Surface\ncore_question · knowns · unknowns\nassumptions · disconfirmation"]
        D["Doxa\nFluent hallucination\nnone / n/a / tbd / 해당 없음\n< 15 chars · missing fields"]
        E["Episteme\nJustified true belief\nconcrete knowns · named unknowns\ndisconfirmation ≥ 15 chars · no placeholders"]
    end

    subgraph SG2["② The Sovereign Kernel — Interception"]
        F["Stateful Interceptor\ncore/hooks/reasoning_surface_guard.py\nnormalises cmd · deep-scans agent-written files\ncross-call stateful memory"]
        G["Hard Block · exit 2\nExecution denied\nAgent forced to re-author surface"]
        H["PASS · exit 0\nPrecondition satisfied\nExecution admitted to Praxis"]
    end

    subgraph SG3["③ Praxis & Reality — Execution"]
        I["Tool Execution\ngit push · bash script.sh · npm publish\nterraform apply · DB migrations · lockfile edits"]
        J["Observed Outcome\ncore/hooks/calibration_telemetry.py\nexit_code 0 or non-zero · stderr captured"]
    end

    subgraph SG4["④ 결 · Gyeol — Cognitive Texture & Evolution"]
        K["Prediction Record\ncorrelation_id stamped at PASS\n~/.episteme/telemetry/YYYY-MM-DD-audit.jsonl"]
        L["Outcome Record\ncorrelation_id · exit_code · stderr\n~/.episteme/telemetry/YYYY-MM-DD-audit.jsonl"]
        M["episteme evolve friction\nsrc/episteme/cli.py · _evolve_friction\npairs prediction ↔ outcome by correlation_id\nranks under-named unknowns · flags exit_code ≠ 0"]
        N["결 · Gyeol\nRefined cognitive grain\nfriction hotspots · calibrated profile axes"]
        O["Operator Profile\ncore/memory/global/operator_profile.md\nlast_elicited axes updated · confidence rescored"]
        P["kernel/CONSTITUTION.md\nFour principles recalibrated\nfailure-mode counters sharpened"]
    end

    A --> B
    B --> D
    B --> E
    D --> F
    E --> F
    F --> G
    F --> H
    G -.->|"cognitive retry"| A
    H --> I
    I --> J
    E -.->|"correlation_id stamped at PASS"| K
    J --> L
    K --> M
    L --> M
    M --> N
    N --> O
    N --> P
    O -.->|"posture loop closed"| A
    P -.->|"posture loop closed"| A

    classDef doxaStyle fill:#c0392b,stroke:#922b21,color:#fff
    classDef episteStyle fill:#1e8449,stroke:#145a32,color:#fff
    classDef passStyle fill:#27ae60,stroke:#1e8449,color:#fff
    classDef praxisStyle fill:#2ecc71,stroke:#27ae60,color:#000
    classDef gyeolStyle fill:#1a5276,stroke:#154360,color:#fff
    classDef kernelStyle fill:#6c3483,stroke:#512e5f,color:#fff
    classDef neutralStyle fill:#2c3e50,stroke:#1a252f,color:#fff

    class D,G doxaStyle
    class E episteStyle
    class H,I passStyle
    class J praxisStyle
    class K,L,M,N,O,P gyeolStyle
    class F kernelStyle
    class A,B neutralStyle

Four subgraphs, one lifecycle. Doxa (red) — fluent-but-unvalidated output or a hard block — is the failure state the kernel exists to prevent. Episteme (green) — a validated Reasoning Surface — is the precondition for execution. Praxis (light green) — the admitted tool execution and its observed outcome. 결 · Gyeol (blue) — the calibration loop that refines the framework across cycles, feeding back into the operator profile and the kernel constitution.

Works with any stack. episteme operates independently of the LLM runtime — LangChain, CrewAI, Claude Code, Cursor, MCP. Kernel is pure markdown; operator profile is plain JSON; workflow loop is vendor-neutral. Adapter layer (Claude Code, Hermes, OMO/OMX) is pluggable.

Cognitive Arms — v1.1+

The four blueprints (above) and three pillars — Cognitive Blueprints · Append-Only Hash Chain · Framework Synthesis & Active Guidance — are the v1.0 unchanging structural foundation. Pillars do not move. v1.1 adds 3 Cognitive Arms operating on top: fluid active engines that refactor the kernel's own knowledge over time.

The distinction is load-bearing — pillars are settled vocabulary; arms are how the system audits and refines its own outputs across time. Status: v1.4.0-rc1 cut 2026-05-23, 1170 tests + 54 subtests green. Arm A substrate shipped (supersede-with-history infrastructure + auto-instrumentation hooks that record operator profile + policy edits to chain streams); Arm A residue resumes opportunistically. Arm B substrate-facing form formally SUNSET at Event 129 — its premise (a stable model-capability gap) was falsified by the Event 119–120 saturation finding; its operator-facing residue (core/ptsp/ typed Fact/Inference promotion gate) is retained-and-reachable via episteme practice trace. Arm C scoped for a future cycle pending evidence the substrate-gap claim survives.


The kernel files

Start at kernel/. Pure markdown. No code. No vendor lock-in.

| File | What it defines |
| --- | --- |
| SUMMARY.md | 30-line operational distillation |
| CONSTITUTION.md | Root claim, four principles, six reasoner failure modes |
| FAILURE_MODES.md | Full 12-mode taxonomy (6 reasoner + 3 governance v0.11 + 2 v1.0 RC + 1 v1.2 RC) ↔ counter artifacts |
| ARTIFACT_TAXONOMY.md | Four-tier mutation discipline (frozen-purpose · authoritative-living · working-execution · ephemeral) |
| PATTERN_GOVERNANCE.md | Novel-decision vs mechanical-implementation; pattern-declaration artifact + implementation-of reference |
| CALIBRATION_TELEMETRY.md | Brier score + calibration curve + base-rate-aware metrics from signed-surface outputs |
| REASONING_SURFACE.md | Knowns / Unknowns / Assumptions / Disconfirmation protocol |
| OPERATOR_PROFILE_SCHEMA.md | Schema for encoding an operator's cognitive preferences |
| MEMORY_ARCHITECTURE.md | Five memory tiers (working / episodic / semantic / procedural / reflective) |
| KERNEL_LIMITS.md | When the kernel is the wrong tool; declared gaps |
| REFERENCES.md | Attribution for every load-bearing borrowed concept; convergent contemporary work; regulator-recognizable standards |
| CHANGELOG.md | Versioned kernel history |

Authority hierarchy: project docs > operator profile > kernel defaults > runtime defaults. Specific beats general.
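One way to picture this precedence is a layered lookup in which the most specific source is consulted first. The sketch uses `collections.ChainMap` from the standard library; the setting names are hypothetical and the kernel may resolve authority differently.

```python
from collections import ChainMap


def resolve(project_docs, operator_profile, kernel_defaults, runtime_defaults):
    """Layer the four sources so lookups hit the most specific one first."""
    return ChainMap(project_docs, operator_profile, kernel_defaults, runtime_defaults)


settings = resolve(
    {"governance_pack": "strict"},                                # project docs
    {"governance_pack": "balanced", "review_style": "terse"},     # operator profile
    {"governance_pack": "minimal", "review_style": "verbose",
     "advisory": False},                                          # kernel defaults
    {"advisory": True, "color": True},                            # runtime defaults
)

for key in ("governance_pack", "review_style", "advisory", "color"):
    print(key, "=", settings[key])
```

Each key resolves to the value from the highest-authority layer that defines it, and lower layers only fill in what the layers above leave unset.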


| Topic | Where |
| --- | --- |
| The v1.0 RC direction | docs/DESIGN_V1_0_SEMANTIC_GOVERNANCE.md |
| Kernel distillation (30 lines) | kernel/SUMMARY.md |
| What the kernel produces | demos/01_attribution-audit/ · demos/02_debug_slow_endpoint/ |
| Same prompt, framework off vs. on | demos/03_differential/ |
| Bidirectional symbiosis: agent and human debug each other's intent | demos/04_symbiosis/ |
| Install paths (marketplace, CLI, dev) | INSTALL.md |
| Benchmark with disconfirmation target | benchmarks/kernel_v1/ |
| Substrate bridge (mem0, memori, noop) | docs/SUBSTRATE_BRIDGE.md |
| Profile + cognition setup | docs/SETUP.md |
| Sync matrix, memory model, contracts | docs/SYNC_AND_MEMORY.md |
| Architecture diagram + cross-references | docs/ARCHITECTURE.md |
| Behavioral-drift complement (Contract Gate) | docs/CONTRACT_GATE.md |
| Harness system | docs/HARNESSES.md |
| Hook reference + governance packs | docs/HOOKS.md |
| Skills + agent personas + provenance | docs/SKILLS_AND_PERSONAS.md |
| Personal customization (memory/hooks/skills) | docs/CUSTOMIZATION.md |
| Agent repo operating contract | AGENTS.md |
| Layer model + adapter matrix | docs/LAYER_MODEL.md |

Push-readiness checklist

PYTHONPATH=. pytest -q tests/test_profile_cognition.py
python3 -m py_compile src/episteme/cli.py
episteme doctor
git status && git rev-list --left-right --count @{u}...HEAD

Commercial licensing

For commercial licensing or consulting, contact me.