diff --git a/work-items/open/20260529-205844-session-pod-state-boundary/artifacts/.gitkeep b/work-items/open/20260529-205844-session-pod-state-boundary/artifacts/.gitkeep
new file mode 100644
index 00000000..e69de29b
diff --git a/work-items/open/20260529-205844-session-pod-state-boundary/item.md b/work-items/open/20260529-205844-session-pod-state-boundary/item.md
new file mode 100644
index 00000000..d9ca3521
--- /dev/null
+++ b/work-items/open/20260529-205844-session-pod-state-boundary/item.md
@@ -0,0 +1,77 @@
+---
+id: 20260529-205844-session-pod-state-boundary
+slug: session-pod-state-boundary
+title: Clarify session log and Pod metadata persistence boundaries
+status: open
+kind: task
+priority: P2
+labels: [session-store, pod, persistence, architecture]
+created_at: 2026-05-29T20:58:44Z
+updated_at: 2026-05-29T20:58:44Z
+assignee: null
+legacy_ticket: null
+---
+
+## Background
+
+The current persistence design intentionally has two durable surfaces:
+
+- append-only session/segment logs, which are the authority for conversation/history state and segment lineage;
+- name-keyed Pod metadata, which is supposed to be a thin pointer layer for Pod-name attach/restore and spawned-child bookkeeping.
+
+That boundary has become blurry. The `session-store` crate is named and documented primarily as session persistence, but it also owns Pod metadata types, the `PodMetadataStore` trait, validation of Pod names, and the filesystem layout `{sessions_root}/pods/{pod_name}/metadata.json`. In addition, Pod metadata currently stores `spawned_children` and `resolved_manifest_snapshot`, while session logs also store Pod scope snapshots as `LogEntry::Extension` entries. This creates a risk that session-log authority, Pod-state authority, and runtime mirrors drift or become hard to reason about.
+
+Observed code points:
+
+- `crates/session-store/src/lib.rs` documents session persistence via append-only JSONL logs, but also exports `pod_metadata` types.
+- `crates/session-store/src/fs_store.rs` stores segment logs under `{root}/{session_id}/{segment_id}.jsonl` and Pod metadata under `{root}/pods/{pod_name}/metadata.json` in the same `FsStore`.
+- `crates/session-store/src/pod_metadata.rs` says metadata is a lightweight name-keyed pointer, but `PodMetadata` also includes `spawned_children` and `resolved_manifest_snapshot`.
+- `crates/pod/src/pod.rs` writes Pod metadata from run/restore/fork/compact paths (`write_pod_metadata_active`, `write_pod_metadata_pending`) and preserves existing `spawned_children` via a read-modify-write helper.
+- `crates/pod/src/spawn/registry.rs` treats durable spawned-child state as living in Pod metadata and runtime `spawned_pods.json` as a live mirror, while scope snapshots for resume live in the session log.
+- `crates/tui/src/pod_list.rs` reads `{store_dir}/pods/*/metadata.json` directly in some paths rather than using only the `PodMetadataStore` trait.
+
+## Goal
+
+Audit and clarify the architectural boundary between session logs, Pod metadata/state, and runtime mirrors. If the current placement is acceptable, document the boundary precisely. If it is not, refactor toward clearer ownership without changing user-visible restore/attach semantics.
+
+## Desired boundary
+
+The resulting design should make these responsibilities explicit:
+
+- Session log authority:
+  - conversation history and system prompt replay;
+  - segment lineage (`forked_from`, `compacted_from`);
+  - request config / usage / metrics / memory extension records;
+  - Pod runtime scope snapshots required to restore the same session without silently reclaiming delegated writes.
+- Pod metadata authority:
+  - name-keyed active `(SessionId, SegmentId)` pointer;
+  - resolved manifest snapshot needed for Pod-name restore when the source profile/manifest should not be re-evaluated;
+  - spawned-child registry state, if retained here, with a documented reason why it is Pod state rather than session state.
+- Runtime mirrors:
+  - sockets, lock-file allocations, and `spawned_pods.json` are live runtime views, not durable authority.
+
+## Acceptance criteria
+
+- Audit every public type/function in `session-store` related to Pod metadata and classify whether it is genuinely session-store responsibility or Pod-state responsibility.
+- Decide whether Pod metadata should remain inside `session-store`, move to a separate crate/module, or be renamed/split so the session-store boundary is less misleading.
+- Document the decision in code comments and/or crate/module docs.
+- If keeping Pod metadata in `session-store`, update docs so the crate is explicitly described as the durable store for both session logs and Pod name metadata, and explain why they share a root/backend.
+- If moving/splitting, introduce a clear API boundary so session log APIs do not need to expose Pod metadata concepts unnecessarily.
+- Remove or justify direct filesystem reads of `pods/*/metadata.json` outside the store abstraction, especially in TUI Pod list/discovery paths.
+- Clarify the authority of `resolved_manifest_snapshot`: whether it belongs in Pod metadata, session log, or another Pod-state record, and ensure restore paths follow the documented authority.
+- Clarify the authority of `spawned_children`: whether it belongs in Pod metadata, session log, or runtime registry, and ensure restore/prune/reclaim behavior follows the documented authority.
+- Ensure read-modify-write preservation of unrelated Pod metadata fields does not silently lose data when active pointer updates and spawned-child updates occur near each other; either make the update semantics explicit or add a safer merge/update API.
+- Preserve the current durable behavior unless deliberately changed:
+  - Pod-name restore resolves active metadata then restores the session log;
+  - session restore uses session log state and scope snapshots;
+  - runtime `spawned_pods.json` remains a mirror;
+  - stopped or unreachable child Pod metadata is not deleted merely because its socket is gone.
+- Add focused tests for whichever boundary is chosen, including active pointer updates preserving spawned children / manifest snapshot, spawned-child updates preserving active pointer / manifest snapshot, and discovery/restore behavior when one surface exists without the other.
+- Update any relevant docs or workflow notes if the persistence model changes.
+
+## Non-goals
+
+- Do not redesign the session-log schema unless the audit proves it is necessary.
+- Do not add backward compatibility for obsolete persistence layouts unless explicitly required by a chosen migration plan.
+- Do not change live Pod registry lock semantics except where necessary to align with the clarified durable authority.
+- Do not implement broader database storage or transactional storage in this ticket; if the boundary audit reveals a need for transactions, record it as a follow-up unless a minimal update API suffices.
diff --git a/work-items/open/20260529-205844-session-pod-state-boundary/thread.md b/work-items/open/20260529-205844-session-pod-state-boundary/thread.md
new file mode 100644
index 00000000..46d7b0f6
--- /dev/null
+++ b/work-items/open/20260529-205844-session-pod-state-boundary/thread.md
@@ -0,0 +1,7 @@
+
+
+## Created
+
+Created by tickets.sh create.
+
+---