Hare/yoi

Fork 0

Hare 6083121574

audit: record crate boundary findings

2026-05-28 22:25:54 +09:00

22 KiB

Raw Blame History

Crate boundary audit

Date: 2026-05-28

summary

The workspace dependency graph is broadly acyclic and mostly layered in the expected direction: protocol / lint-common / proc-macros sit at the bottom, llm-worker / manifest / tools / provider / session-store provide shared infrastructure, and pod / tui are orchestration or UI layers. I did not find a hard Cargo-level cycle or an obvious UI crate being depended on by a lower crate.

The main boundary problems are subtler:

protocol exposes several public wire payloads as serde_json::Value while documenting them as the JSON form of session_store::* types. This avoids a Rust dependency edge but creates a hidden schema dependency from protocol/clients to session-store.
workflow depends on memory for WorkspaceLayout, and memory::WorkspaceLayout owns workflow paths. This makes memory a cross-domain workspace-layout hub rather than only the memory subsystem.
Several lower/shared crates have comments/doc-comments explaining Pod, TUI, controller, prompt-catalog, or downstream orchestration behavior. Most are acceptable integration-contract notes, but a few are implementation knowledge that should move upward or be generalized.

No code, formatting, commits, merges, or project-record files outside artifacts/ were changed.

inspected commands / files

Commands run

cargo metadata --no-deps --format-version 1 | jq ... > artifacts/deps.txt
- Extracted workspace-internal normal/dev dependency edges.
cargo metadata --no-deps --format-version 1 | jq ... > artifacts/reverse-deps.txt
- Extracted reverse dependency summary.
rg -n 'pub (struct|enum|fn|mod|trait|type|use) ...' crates --glob '*.rs' > artifacts/public-concept-hits.txt
- Searched public APIs for boundary-relevant terms (Pod, TUI, Workflow, Manifest, Memory, Session, etc.).
rg -n '(^\s*(//!|///|//)\s?.*(...))' crates --glob '*.rs' > artifacts/comment-concept-hits.txt
- Searched comments/doc-comments for crate names and upper-layer concepts.
rg -n 'TUI / GUI|session_store::|parent Controller|Pod treats|Pod side|...' ... > artifacts/suspicious-excerpts.txt
- Narrowed suspicious comment excerpts.
rg -n 'use (session_store|pod_registry|llm_worker|manifest)::|...' crates/tui/src
- Checked why TUI depends on lower internal crates.
rg -n 'WorkspaceLayout|memory::' crates/workflow/src
- Checked workflow -> memory dependency use.

Failed exploratory commands:

python / python3 parse attempts failed because Python was not available in the environment; switched to cargo metadata + jq.

Supplemental raw outputs left in the artifact directory:

deps.txt, reverse-deps.txt
deps-numbered.txt, reverse-deps-numbered.txt
public-concept-hits.txt
comment-concept-hits.txt
suspicious-excerpts.txt

Main files inspected directly

Root/workspace:
- Cargo.toml
- work-items/open/20260528-131317-crate-boundary-audit/item.md
Cargo manifests:
- crates/protocol/Cargo.toml
- crates/manifest/Cargo.toml
- crates/llm-worker/Cargo.toml
- crates/pod/Cargo.toml
- crates/client/Cargo.toml
- crates/tui/Cargo.toml
- crates/memory/Cargo.toml
- crates/workflow/Cargo.toml
- crates/provider/Cargo.toml
- crates/session-store/Cargo.toml
- crates/pod-registry/Cargo.toml
- crates/session-metrics/Cargo.toml
- crates/tools/Cargo.toml
- crates/daemon/Cargo.toml
- crates/lint-common/Cargo.toml
- crates/llm-worker-macros/Cargo.toml
Public/API and suspicious source files:
- crates/protocol/src/lib.rs
- crates/manifest/src/lib.rs
- crates/manifest/src/model.rs
- crates/llm-worker/src/lib.rs
- crates/llm-worker/src/interceptor.rs
- crates/llm-worker/src/llm_client/types.rs
- crates/pod/src/lib.rs
- crates/pod/src/pod.rs (grep/read excerpts)
- crates/pod/src/spawn/comm_tools.rs (grep excerpts)
- crates/client/src/lib.rs
- crates/client/src/spawn.rs
- crates/tui/src/app.rs (grep excerpts)
- crates/tui/src/spawn.rs (grep excerpts)
- crates/tui/src/picker.rs (grep excerpts)
- crates/memory/src/lib.rs
- crates/memory/src/scope.rs
- crates/memory/src/workspace.rs
- crates/memory/src/extract/mod.rs (grep excerpts)
- crates/memory/src/consolidate/mod.rs (grep excerpts)
- crates/memory/src/resident.rs (grep excerpts)
- crates/workflow/src/lib.rs
- crates/workflow/src/linter.rs (grep excerpts)
- crates/workflow/src/scope.rs (grep excerpts)
- crates/workflow/src/workflow.rs (grep excerpts)
- crates/session-store/src/lib.rs
- crates/session-store/src/segment.rs
- crates/session-store/src/segment_log.rs (grep excerpts)
- crates/session-store/src/system_item.rs
- crates/session-store/src/pod_metadata.rs
- crates/pod-registry/src/lib.rs
- crates/provider/src/lib.rs
- crates/tools/src/lib.rs

dependency graph overview

Internal dependency edges from cargo metadata --no-deps:

client -> manifest, protocol
daemon -> manifest, protocol
lint-common -> (none)
llm-worker -> llm-worker-macros
llm-worker-macros -> (none)
manifest -> llm-worker, protocol
memory -> lint-common, llm-worker, manifest
pod -> llm-worker, manifest, memory, pod-registry, protocol, provider, session-metrics, session-store, tools, workflow
pod-registry -> manifest, session-store
protocol -> (none)
provider -> llm-worker, manifest
session-metrics -> session-store
session-store -> llm-worker, protocol
tools -> llm-worker, manifest
tui -> client, llm-worker, manifest, pod-registry, protocol, session-store; dev-dep tools
workflow -> lint-common, manifest, memory

Reverse summary:

client <- tui
lint-common <- memory, workflow
llm-worker <- manifest, memory, pod, provider, session-store, tools, tui
manifest <- client, daemon, memory, pod, pod-registry, provider, tools, tui, workflow
memory <- pod, workflow
pod-registry <- pod, tui
protocol <- client, daemon, manifest, pod, session-store, tui
provider <- pod
session-metrics <- pod
session-store <- pod, pod-registry, session-metrics, tui
tools <- pod, tui
workflow <- pod

This is directionally reasonable for orchestration-heavy code: pod is the main integrator; tui sits above client but also reads lower schemas; protocol has no Rust workspace dependencies.

dependency/interface findings grouped by severity

Severity: actual problem / should ticket

1. `protocol` public API has hidden `session-store` schema coupling through `serde_json::Value`

Evidence:

crates/protocol/src/lib.rs:237 documents Event::SystemItem.item as the JSON form of session_store::SystemItem.
crates/protocol/src/lib.rs:394 documents Event::Snapshot.entries as the JSON form of session_store::LogEntry.
crates/protocol/src/lib.rs:419 documents Event::SegmentRotated.entry as the JSON form of session_store::LogEntry::SegmentStart.
crates/tui/src/app.rs:1236 and nearby lines deserialize snapshot entries back into session_store::LogEntry.
crates/tui/src/app.rs:1277 and nearby lines deserialize Event::SystemItem.item into session_store::SystemItem.

Why this is a boundary issue:

protocol is dependency-free at the Cargo level, but its wire contract is not actually self-owned: clients must know session-store schemas to reconstruct state correctly.
The type system cannot enforce compatibility between protocol and session-store because the public protocol type is only serde_json::Value.
This explains why tui depends directly on session-store despite also depending on client/protocol.

Recommended direction:

Extract the wire-stable log/system-item DTOs into a neutral crate, or move protocol-facing DTOs into protocol and have session-store convert to/from them.
Avoid public protocol docs that say “this is session_store::X JSON” unless session-store is intentionally part of the protocol contract and typed as such.

2. `workflow -> memory` dependency exists for shared workspace layout

Evidence:

crates/workflow/Cargo.toml:11 depends on memory.
crates/workflow/src/linter.rs:5, crates/workflow/src/scope.rs:6, crates/workflow/src/workflow.rs:17 use memory::WorkspaceLayout.
crates/memory/src/workspace.rs:8 includes <root>/.insomnia/workflow/<slug>.md in memory's layout documentation.
crates/memory/src/workspace.rs:16-18 says workflows are human-managed and live one level up under .insomnia/workflow/.
crates/memory/src/workspace.rs:127-165 exposes workflow_dir() / workflow_path() from the memory crate.

Why this is a boundary issue:

workflow is conceptually a sibling subsystem, not a consumer of generated memory state.
The current dependency is only for path layout. That makes memory own cross-subsystem workspace conventions and forces workflow to import a memory-domain crate for non-memory concerns.
This is not severe yet, but it will make future workflow growth pull against crate ownership.

Recommended direction:

Extract WorkspaceLayout / .insomnia path conventions into a neutral crate or a neutral module under manifest/new workspace-layout crate.
Then make memory and workflow both depend on that neutral layout instead of depending on each other.

Severity: suspicious but currently acceptable

3. `session-store` owns Pod metadata and spawned-child metadata

Evidence:

crates/session-store/src/pod_metadata.rs:1-6 defines “Pod metadata persistence API”.
crates/session-store/src/pod_metadata.rs:42-60 defines PodSpawnedScopeRule / PodSpawnedChild, including delegated scope and callback_address.
crates/session-store/src/pod_metadata.rs:62-88 exposes PodMetadata and PodMetadataStore publicly.

Assessment:

This is Pod/orchestration-specific state inside a crate named session-store.
It is acceptable if session-store is intentionally “insomnia persistence primitives”, not a generic conversation-log crate. Current project decisions appear to lean that way.
If the intended boundary is “session-store only stores sessions/segments/logs”, this should be split or renamed. If the intended boundary is “session-store stores all durable Pod state”, the naming/docs should say that explicitly.

Recommended direction:

No immediate refactor unless the ownership goal changes.
Clarify crate-level docs: either broaden session-store's stated responsibility to durable Pod/session persistence, or split Pod metadata into a pod-state/pod-metadata crate.

4. TUI directly depends on persistence/registry crates

Evidence:

crates/tui/Cargo.toml depends on session-store, pod-registry, manifest, llm-worker, and protocol in addition to client.
crates/tui/src/picker.rs uses pod_registry::{LockFileGuard, default_registry_path} and session_store::{...}.
crates/tui/src/app.rs:1236-1298 parses session_store::LogEntry / session_store::SystemItem.
crates/tui/src/spawn.rs:408-409 uses session_store::FsStore and restore_by_segment for resume-related paths.

Assessment:

TUI is a top-level crate, so dependency direction is allowed.
The direct session-store parse dependency is largely a symptom of finding #1: protocol sends untyped JSON whose real schema lives in session-store.
Direct pod-registry access for picker/runtime discovery may be acceptable for a local-first TUI, but it bypasses a cleaner “TUI talks protocol/client only” boundary.

Recommended direction:

Fix protocol DTO ownership first.
After that, re-evaluate whether TUI still needs direct session-store and pod-registry dependencies or whether picker/discovery can move behind client/protocol APIs.

5. `manifest -> llm-worker` dependency is acceptable but should remain one-way

Evidence:

crates/manifest/Cargo.toml depends on llm-worker and protocol.
crates/manifest/src/model.rs:17-19 re-exports llm_worker::llm_client::capability::{ModelCapability, ReasoningControl, ReasoningEffort}.

Assessment:

This is a reasonable tradeoff to avoid duplicate model-capability types.
It does mean manifest is not a pure data crate independent of worker runtime types.
The boundary remains acceptable as long as llm-worker does not depend back on manifest, and provider-level resolution stays in provider.

Severity: no issue found

No Rust workspace dependency cycle was found in the inspected graph.
I did not find lower crates depending on tui or client implementation crates.
client -> protocol/manifest and pod -> provider/tools/session-store/memory/workflow are directionally appropriate.
provider -> llm-worker/manifest is appropriate: provider constructs concrete LlmClient implementations from resolved model configuration.
tools -> llm-worker/manifest is appropriate: tools expose ToolDefinitions and enforce manifest scopes.
pod-registry -> session-store is acceptable if registry entries need session/segment identity and durable state coordination.

comment/doc-comment findings

Problematic or should be generalized

`protocol` describes parent controller and pod-registry side effects

crates/protocol/src/lib.rs:65-70
- PodEvent docs say the “parent Controller applies variant-specific side effects (registry / pod-registry updates)”.
- This is implementation knowledge from the pod crate inside a dependency-free protocol crate.
- Better: state the wire contract (“event is delivered to the parent; receiver is responsible for handling lifecycle effects”) and keep registry-specific behavior in pod docs.

`protocol` documents `session_store::*` JSON shapes as protocol payloads

crates/protocol/src/lib.rs:237
crates/protocol/src/lib.rs:394
crates/protocol/src/lib.rs:419

This is the comment-level manifestation of the public-interface issue in finding #1.

`llm-worker` public request docs mention Pod-specific cache-key choice

crates/llm-worker/src/llm_client/types.rs:523-526
- Request::cache_key doc says pod side is expected to pass SegmentId.
- llm-worker should expose the generic concept: a stable caller-provided conversation/cache namespace key.
- Pod's choice of SegmentId belongs in pod docs/tests, not in the generic request type.

`memory` docs prescribe Pod assembly details

crates/memory/src/lib.rs:3-7
- Says generic CRUD tools must not touch memory/knowledge and Pod is responsible for denying them.
crates/memory/src/scope.rs:4-8
- Says Pod is expected to call deny_write_rules and pass the result to tools::ScopedFs.
crates/memory/src/extract/mod.rs:3-14
- Explains Pod post-run hook, PromptCatalog, PodPrompt::MemoryExtractSystem, and pointer persistence responsibility.
crates/memory/src/consolidate/mod.rs:5-15
- Explains Pod assembling a disposable Worker and using PodPrompt::MemoryConsolidationSystem.
crates/memory/src/resident.rs:3-11
- Says surfaces are used by the Pod system-prompt assembler and Pod IPC layer for TUI # completion.

Assessment:

These are understandable because memory is currently a helper subsystem consumed by pod.
They nevertheless make memory read like it is documenting Pod orchestration rather than memory-owned contracts.
Prefer caller-neutral wording: “the orchestrator/caller registers these tools”, “the caller persists the pointer”, “completion consumers may use ...”. Keep Pod-specific sequence docs in pod.

Suspicious but acceptable integration-contract comments

`protocol::Segment` docs mention TUI/GUI and Pod parsing behavior

crates/protocol/src/lib.rs:116-126
- Mentions richer clients (TUI/GUI) producing typed atoms and Pod not re-parsing flattened strings.
crates/protocol/src/lib.rs:143-153
- Mentions Pod resolving FileRef and treating unknown variants as unresolved input.
crates/protocol/src/lib.rs:222-231
- Mentions additional TUI/GUI instances rendering user messages.

Assessment:

Mentioning client classes can be acceptable in protocol docs when it explains wire semantics.
The Pod behavior details are more debatable; they should be limited to required protocol semantics, not specific controller implementation.

`session-store::SystemItem` mentions TUI typed rendering

crates/session-store/src/system_item.rs:27-35
crates/session-store/src/system_item.rs:49-52

Assessment:

It is valid to document why typed payload exists.
“so the TUI can render” should probably be generalized to “so clients can render” because session-store is lower than tui.

`session-store::segment` mentions Pod as typical caller

crates/session-store/src/segment.rs:3-5
crates/session-store/src/segment.rs:38-40
crates/session-store/src/segment.rs:175-180
crates/session-store/src/segment.rs:252-254

Assessment:

Mostly acceptable because Pod is currently the primary writer.
Better wording would say “caller/orchestrator” first and optionally “e.g. Pod” only where it clarifies current integration.

`client` docs mention TUI/GUI/E2E

crates/client/src/lib.rs:9
crates/client/src/spawn.rs:4-10
crates/client/src/spawn.rs:92-96

Assessment:

Acceptable: client is explicitly a library for UI/GUI/E2E callers to speak Pod protocol. These are consumer examples rather than lower-layer implementation leakage.

`llm-worker::Interceptor` docs mention Pod as an upper layer

crates/llm-worker/src/interceptor.rs:3-6
crates/llm-worker/src/interceptor.rs:122-126
crates/llm-worker/src/interceptor.rs:140-146

Assessment:

Mostly acceptable: the docs explicitly say Worker does not know higher-level concepts and Pod is only an example upper layer.
For stricter boundary hygiene, prefer “upper layers/orchestrators” and avoid naming Pod except in examples.

acceptable dependency-aware comments criteria

I treated a comment as acceptable when it met at least one of these criteria:

It explains a public wire or file-format contract that consumers must honor, without prescribing one consumer's private implementation.
It names a higher layer only as an example (e.g. Pod) while the API remains generic and caller-owned.
It documents an intentional direction-of-control boundary, such as “the lower crate exposes a hook; upper layers implement policy”.
It references another crate that the current crate actually depends on and whose type or function is part of the local API.
It appears in tests/examples whose purpose is cross-crate contract verification.

I treated a comment as problematic when it did any of the following:

A lower crate explains what a dependent higher crate currently does internally.
A lower crate's public docs define a payload as another higher crate's private or semi-private JSON schema.
A shared subsystem describes its API mainly as a sequence of Pod/TUI orchestration steps, rather than a caller-neutral contract.
The comment reveals a hidden dependency that Cargo cannot type-check.

recommended follow-up tickets

Typed protocol snapshot/system-item payloads
- Goal: remove protocol public serde_json::Value payloads whose real schemas are session_store::*.
- Candidate implementation directions:
  - Move wire DTOs for log entries/system items into protocol, with session-store converting to/from them; or
  - Extract a neutral session-log-schema / wire-log crate used by both protocol and session-store.
- Success condition: TUI/client code can parse snapshots/system items using protocol-owned typed structures, not session_store::LogEntry hidden behind JSON.
Extract neutral workspace layout from memory
- Goal: remove workflow -> memory when the only need is .insomnia path layout.
- Candidate implementation directions:
  - New neutral crate/module for WorkspaceLayout; or
  - Move .insomnia path layout into manifest if that crate is intended to own workspace configuration.
- Success condition: workflow and memory are siblings depending on a neutral layout owner.
Boundary-comment hygiene pass
- Goal: replace reverse-knowledge comments in lower/shared crates with caller-neutral wording.
- Scope:
  - protocol/src/lib.rs controller/session-store JSON wording.
  - llm-worker/src/llm_client/types.rs Pod SegmentId cache-key wording.
  - memory/src/{scope,extract,consolidate,resident}.rs Pod/TUI orchestration wording.
  - session-store/src/{system_item,segment}.rs TUI/Pod-specific wording where not required.
- Success condition: comments explain local contracts and extension points; dependent-crate implementation details live in the dependent crate.
Clarify session-store crate responsibility
- Goal: decide whether session-store is only session/segment log storage or the broader durable Pod-state persistence crate.
- If broader: update crate docs/naming comments to say so.
- If narrower: split pod_metadata into a Pod-owned persistence crate/module.

unresolved questions

Is protocol intended to be the sole owner of all stable wire DTOs, or is session-store intentionally part of the protocol contract despite the current serde_json::Value indirection?
Is session-store deliberately the durable state crate for all Pod metadata, or should it be constrained to conversation/session logs?
Should WorkspaceLayout be considered a memory-domain concept, or a repository/workspace-domain concept shared by memory, knowledge, and workflow?
Should TUI remain allowed to inspect local registry/session files directly for picker and restore UX, or should those capabilities move behind client/protocol APIs?
Are comments allowed to name the primary current consumer (Pod) when documenting a generic lower-layer extension point, or should comments avoid such names unless the type itself is Pod-specific?

22 KiB Raw Blame History

Crate boundary audit

summary

inspected commands / files

Commands run

Main files inspected directly

dependency graph overview

dependency/interface findings grouped by severity

Severity: actual problem / should ticket

1. protocol public API has hidden session-store schema coupling through serde_json::Value

2. workflow -> memory dependency exists for shared workspace layout

Severity: suspicious but currently acceptable

3. session-store owns Pod metadata and spawned-child metadata

4. TUI directly depends on persistence/registry crates

5. manifest -> llm-worker dependency is acceptable but should remain one-way

Severity: no issue found

comment/doc-comment findings

Problematic or should be generalized

protocol describes parent controller and pod-registry side effects

protocol documents session_store::* JSON shapes as protocol payloads

llm-worker public request docs mention Pod-specific cache-key choice

memory docs prescribe Pod assembly details

Suspicious but acceptable integration-contract comments

protocol::Segment docs mention TUI/GUI and Pod parsing behavior

session-store::SystemItem mentions TUI typed rendering

session-store::segment mentions Pod as typical caller

client docs mention TUI/GUI/E2E

llm-worker::Interceptor docs mention Pod as an upper layer

acceptable dependency-aware comments criteria

recommended follow-up tickets

unresolved questions

22 KiB

Raw Blame History

1. `protocol` public API has hidden `session-store` schema coupling through `serde_json::Value`

2. `workflow -> memory` dependency exists for shared workspace layout

3. `session-store` owns Pod metadata and spawned-child metadata

5. `manifest -> llm-worker` dependency is acceptable but should remain one-way

`protocol` describes parent controller and pod-registry side effects

`protocol` documents `session_store::*` JSON shapes as protocol payloads

`llm-worker` public request docs mention Pod-specific cache-key choice

`memory` docs prescribe Pod assembly details

`protocol::Segment` docs mention TUI/GUI and Pod parsing behavior

`session-store::SystemItem` mentions TUI typed rendering

`session-store::segment` mentions Pod as typical caller

`client` docs mention TUI/GUI/E2E

`llm-worker::Interceptor` docs mention Pod as an upper layer