Compare commits

12 Commits

46 changed files with 2172 additions and 404 deletions

@ -5,6 +5,15 @@
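- All prompts are consolidated under resources/prompts. This improves manageability and also gives users a single place to override them.
- E2E tests (tests that spawn real processes) have not been designed yet.
### Principles for shaping LLM context
Interventions on the context sent to the LLM fall into two broad kinds. **The former is allowed; the latter is forbidden.**
- **Allowed**: transformations that can be reproduced purely from the existing history (pruning, summarization by compaction, truncating tool result content, attaching a prompt cache anchor, and so on). These are deterministic: the same history in gives the same result out. They do not rewrite the history itself and bring in no new information from outside.
- **Forbidden**: inserting new input into the context alone, without committing it to history, based on the Pod's current state (received notifications, active internal queues, time of day, external events, and so on). The LLM reacts to that input and changes the history, but the trigger is not left in worker.history, so from the next turn on the basis for "why did I say that / make that tool call" is gone. On resume the run becomes even more plainly unreproducible. The prompt cache prefix also shifts on every request.
To put new input into the context, always append it to `worker.history` and commit it first. Persistence to `history.json` follows automatically from there. Notify / PodEvent / `<system-reminder>` inputs are handled under this principle (→ `tickets/notify-history-persist.md`).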
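
The rule above can be illustrated with a minimal sketch. It assumes a simplified `Item` enum and two hypothetical helpers, `project` and `commit_then_project`, none of which are the real llm-worker types. The point is only that a context derived purely from committed history can be rebuilt after a resume.

```rust
// Hypothetical, simplified stand-in for the real history item type.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    User(String),
    ToolResult(String),
    Notification(String),
}

/// Allowed: a pure function of the history. The same history always
/// yields the same context, and the history itself is not modified.
fn project(history: &[Item], max_chars: usize) -> Vec<Item> {
    history
        .iter()
        .map(|item| match item {
            // Truncate long tool results in the per-request copy only.
            Item::ToolResult(text) if text.chars().count() > max_chars => {
                Item::ToolResult(text.chars().take(max_chars).collect())
            }
            other => other.clone(),
        })
        .collect()
}

/// New input is committed to history first; the context is then
/// derived from the committed history and nothing else.
fn commit_then_project(
    history: &mut Vec<Item>,
    pending: Vec<Item>,
    max_chars: usize,
) -> Vec<Item> {
    history.extend(pending);
    project(history.as_slice(), max_chars)
}
```

Because the notification lands in `history` before the context is built, running `project` over a reloaded copy of the history gives the same context again.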
---
As a rule, Git is operated entirely by the user. Do not perform any operation that writes unless explicitly permitted.

11
Cargo.lock generated
@ -2144,6 +2144,7 @@ dependencies = [
"schemars",
"serde",
"serde_json",
"session-metrics",
"session-store",
"tempfile",
"thiserror 2.0.18",
@ -2926,6 +2927,15 @@ dependencies = [
"syn 2.0.117",
]
[[package]]
name = "session-metrics"
version = "0.1.0"
dependencies = [
"serde",
"serde_json",
"session-store",
]
[[package]]
name = "session-store"
version = "0.1.0"
@ -2941,6 +2951,7 @@ dependencies = [
"tempfile",
"thiserror 2.0.18",
"tokio",
"tracing",
"uuid",
]

@ -10,6 +10,7 @@ members = [
"crates/protocol",
"crates/provider",
"crates/pod-registry",
"crates/session-metrics",
"crates/tools",
"crates/tui",
"crates/memory",
@ -28,6 +29,7 @@ memory = { path = "crates/memory" }
pod-registry = { path = "crates/pod-registry" }
protocol = { path = "crates/protocol" }
provider = { path = "crates/provider" }
session-metrics = { path = "crates/session-metrics" }
session-store = { path = "crates/session-store" }
tools = { path = "crates/tools" }

@ -2,7 +2,6 @@
- Turn internal Workers / internal Pods into Workflows → [tickets/internal-worker-workflow.md](tickets/internal-worker-workflow.md)
- Ingest Agent Skills as Workflows → [tickets/agent-skills.md](tickets/agent-skills.md)
- Permissions: pattern-based tool execution control → [tickets/permission-extension-point.md](tickets/permission-extension-point.md)
- Improve the Scope claim on resume → [tickets/resume-scope-claim.md](tickets/resume-scope-claim.md)
- Pod CLI: tidy up manifest-related flags → [tickets/pod-cli-manifest-flags.md](tickets/pod-cli-manifest-flags.md)
- llm-worker error resilience
- Retry on transient HTTP errors → [tickets/llm-worker-transient-retry.md](tickets/llm-worker-transient-retry.md)
@ -12,10 +11,10 @@
- Input queueing during a Run → [tickets/tui-input-queue.md](tickets/tui-input-queue.md)
- Model setup wizard for the user manifest → [tickets/tui-user-model-setup.md](tickets/tui-user-model-setup.md)
- Turns originating from auto-kick are not rendered → [tickets/tui-pod-event-render.md](tickets/tui-pod-event-render.md)
- Pod stderr is not shown in the TUI when spawn fails → [tickets/tui-spawn-error-surface.md](tickets/tui-spawn-error-surface.md)
- Manifest: separate the Tool Output / File Upload limits and relax the defaults → [tickets/manifest-output-upload-limits.md](tickets/manifest-output-upload-limits.md)
- Memory mechanism
- Usage-frequency metrics + report of Knowledge candidates → [tickets/memory-usage-metrics.md](tickets/memory-usage-metrics.md)
- In-session TODO tool (with attention mechanism) → [tickets/session-todo.md](tickets/session-todo.md)
- Session metrics: general-purpose measurement lane via Extension (the first user is Prune) → [tickets/session-metrics.md](tickets/session-metrics.md)
- Headless CLI that lints the workspace memory
- Generalize the system-reminder injection mechanism (revisit when a second user appears; the tag format and the "do not pollute history" principle were established first in session-todo)
- Generalize the system-reminder injection mechanism (revisit when a second user appears; the `<system-reminder>...</system-reminder>` tag convention was established first in session-todo-reminder; the policy is that injected Items are appended to worker.history)

@ -120,8 +120,35 @@ pub trait Interceptor: Send + Sync {
PromptAction::Continue
}
/// Called before each LLM request. The context can be modified
/// (e.g. for context compaction).
/// Items that should be **committed to `worker.history`** just
/// before the next LLM request. Returned items are `extend`ed into
/// the persistent history (and therefore picked up by the per-turn
/// clone that backs the LLM request, plus the usual
/// history-persistence path).
///
/// Use this for inputs that arrive from outside the LLM and need
/// to be reflected in the on-disk history — notifications,
/// cross-Pod events, system reminders. Do **not** use
/// [`Self::pre_llm_request`] for that purpose: it mutates a
/// per-request clone, so any committed assistant response that
/// reacts to the injection would have no visible trigger on the
/// next turn (or after resume / compaction).
///
/// `pre_llm_request` remains the right place for purely
/// reproducible per-request transformations (pruning, content
/// trimming, cache anchors) that depend only on the existing
/// history.
async fn pending_history_appends(&self) -> Vec<Item> {
Vec::new()
}
/// Called before each LLM request. The context starts as a clone
/// of `worker.history` (after `pending_history_appends` and the
/// Worker's own prune projection have been applied) and can be
/// further modified for that single request only — mutations here
/// are **not** persisted back to history. Use
/// [`Self::pending_history_appends`] for inputs that need to land
/// in history.
async fn pre_llm_request(&self, _context: &mut Vec<Item>) -> PreRequestAction {
PreRequestAction::Continue
}
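
As an illustration of how the two hooks divide the work, here is a minimal synchronous sketch. The real trait methods are `async` and return a `PreRequestAction`; the `Item` enum, `ReminderInterceptor`, and `build_request` below are hypothetical simplifications, not part of llm-worker.

```rust
use std::mem;
use std::sync::Mutex;

// Hypothetical stand-in for llm-worker's `Item`.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    User(String),
    SystemReminder(String),
}

/// Synchronous simplification of the two hooks.
trait Interceptor {
    fn pending_history_appends(&self) -> Vec<Item> {
        Vec::new()
    }
    fn pre_llm_request(&self, _context: &mut Vec<Item>) {}
}

/// Queues inputs that arrive from outside the LLM until the next request.
struct ReminderInterceptor {
    queue: Mutex<Vec<Item>>,
}

impl ReminderInterceptor {
    fn new() -> Self {
        Self { queue: Mutex::new(Vec::new()) }
    }

    fn notify(&self, text: &str) {
        self.queue
            .lock()
            .unwrap()
            .push(Item::SystemReminder(text.to_string()));
    }
}

impl Interceptor for ReminderInterceptor {
    // External inputs are handed over for commit, then forgotten.
    fn pending_history_appends(&self) -> Vec<Item> {
        mem::take(&mut *self.queue.lock().unwrap())
    }

    // Reproducible per-request trimming: keep only the last 4 items.
    fn pre_llm_request(&self, context: &mut Vec<Item>) {
        let keep_from = context.len().saturating_sub(4);
        context.drain(..keep_from);
    }
}

/// Mirrors the ordering described above: commit, clone, then transform.
fn build_request(history: &mut Vec<Item>, interceptor: &dyn Interceptor) -> Vec<Item> {
    history.extend(interceptor.pending_history_appends());
    let mut context = history.clone();
    interceptor.pre_llm_request(&mut context);
    context
}
```

Draining with `mem::take` means a reminder is committed exactly once; a second request with no new notifications leaves the history unchanged.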

@ -30,6 +30,43 @@ use crate::llm_client::types::Item;
/// Must return savings that match the actual projection.
pub type SavingsEstimator = Box<dyn Fn(&[Item], &[usize]) -> u64 + Send + Sync>;
/// Result of one prune evaluation pass, surfaced to the optional
/// [`PruneObserver`] for instrumentation.
///
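/// The Worker evaluates prune once per LLM request and, if an observer
/// is registered, reports the result through this value. It carries the
/// fire/skip decision together with the inputs behind it: candidate
/// count, estimated savings, and the boundary turn position.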
#[derive(Debug, Clone)]
pub struct PruneEvaluation {
/// Length of `prunable_indices`. 0 for `PruneDecision::SkippedNoCandidates`.
pub candidate_count: usize,
/// Estimated savings (tokens). 0 for `PruneDecision::SkippedNoCandidates`.
pub estimated_savings: u64,
/// Index of the turn-start item at the `protected_turns` boundary.
/// `None` when the number of turns is at most `protected_turns`, so no
/// boundary can be determined.
pub border_turn: Option<usize>,
/// The decision.
pub decision: PruneDecision,
}
/// Outcome of one prune evaluation. Each variant is one branch of the
/// "fire vs skip" decision tree the Worker walks before each LLM request.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PruneDecision {
/// `prunable_indices` is empty → do nothing.
SkippedNoCandidates,
/// There were candidates, but the estimated savings were below
/// `min_savings` → do nothing.
SkippedBelowMinSavings,
/// There were candidates and savings >= min_savings → the projection
/// was applied. `pruned_count` is the number of items `project()`
/// actually rewrote (candidates whose content was already None count as 0).
Fired { pruned_count: usize },
}
/// Optional observer invoked after each prune evaluation, regardless of
/// branch. Upper layers such as Pod install it to emit metrics.
pub type PruneObserver = Box<dyn Fn(&PruneEvaluation) + Send + Sync>;
/// Configuration for the Prune algorithm.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PruneConfig {
@ -100,12 +137,20 @@ pub fn project(items: &mut [Item], indices: &[usize]) -> usize {
/// Returns an empty vector when there are too few turns or no prunable
/// candidates.
pub fn prunable_indices(items: &[Item], protected_turns: usize) -> Vec<usize> {
evaluate_candidates(items, protected_turns).0
}
/// Same as [`prunable_indices`] but also returns the index of the
/// `protected_turns` boundary (the turn-start item whose tail is
/// protected). `None` when too few turns exist for a boundary to be
/// defined.
pub fn evaluate_candidates(items: &[Item], protected_turns: usize) -> (Vec<usize>, Option<usize>) {
let turn_starts = find_turn_starts(items);
if turn_starts.len() <= protected_turns {
return Vec::new();
return (Vec::new(), None);
}
let boundary = turn_starts[turn_starts.len() - protected_turns];
items[..boundary]
let candidates = items[..boundary]
.iter()
.enumerate()
.filter_map(|(i, item)| match item {
@ -114,7 +159,8 @@ pub fn prunable_indices(items: &[Item], protected_turns: usize) -> Vec<usize> {
} => Some(i),
_ => None,
})
.collect()
.collect();
(candidates, Some(boundary))
}
#[cfg(test)]
@ -239,6 +285,30 @@ mod tests {
assert_eq!(project(&mut items, &candidates), 0);
}
#[test]
fn evaluate_candidates_returns_boundary_index() {
let big = "x".repeat(64);
let items = make_history(&[
("turn1", vec![("s1", Some(&big))]),
("turn2", vec![("s2", Some(&big))]),
("turn3", vec![("s3", Some("keep"))]),
("turn4", vec![("s4", Some("keep too"))]),
]);
let (candidates, border) = evaluate_candidates(&items, 2);
assert_eq!(candidates.len(), 2);
// protected_turns=2 → the boundary is the position of turn3's user message.
// turn1: u/a/c/r (4) + turn2: u/a/c/r (4) = index 8 (turn3's user).
assert_eq!(border, Some(8));
}
#[test]
fn evaluate_candidates_no_boundary_when_too_few_turns() {
let items = make_history(&[("only", vec![("s", Some("x"))])]);
let (candidates, border) = evaluate_candidates(&items, 2);
assert!(candidates.is_empty());
assert!(border.is_none());
}
#[test]
fn protected_turns_boundary_exact() {
// 3 turns with protected_turns=2: only turn 1 is a candidate.
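
The boundary arithmetic in `evaluate_candidates` can be checked in isolation with a minimal sketch. It assumes a simplified `Item` enum in which a user message starts a turn and a tool result with content is prunable; `turn` is a hypothetical helper standing in for `make_history`, and treating `protected_turns == 0` as "nothing protected" is this sketch's own choice.

```rust
// Hypothetical, simplified item type: one turn is user / assistant /
// tool call / tool result.
#[derive(Debug, Clone)]
enum Item {
    User(String),
    Assistant(String),
    ToolCall(String),
    ToolResult { content: Option<String> },
}

/// Indices of the items that open a turn (here: every user message).
fn find_turn_starts(items: &[Item]) -> Vec<usize> {
    items
        .iter()
        .enumerate()
        .filter_map(|(i, item)| matches!(item, Item::User(_)).then_some(i))
        .collect()
}

/// Returns the prunable indices plus the index of the first protected turn.
fn evaluate_candidates(items: &[Item], protected_turns: usize) -> (Vec<usize>, Option<usize>) {
    let turn_starts = find_turn_starts(items);
    if turn_starts.len() <= protected_turns {
        return (Vec::new(), None);
    }
    // With protected_turns == 0 nothing is protected, so the boundary
    // falls at the end of the history (an assumption of this sketch).
    let boundary = turn_starts
        .get(turn_starts.len() - protected_turns)
        .copied()
        .unwrap_or(items.len());
    let candidates = items[..boundary]
        .iter()
        .enumerate()
        .filter_map(|(i, item)| match item {
            Item::ToolResult { content: Some(_) } => Some(i),
            _ => None,
        })
        .collect();
    (candidates, Some(boundary))
}

/// Builds one four-item turn with the given tool-result content.
fn turn(result: Option<&str>) -> Vec<Item> {
    vec![
        Item::User("u".to_string()),
        Item::Assistant("a".to_string()),
        Item::ToolCall("c".to_string()),
        Item::ToolResult { content: result.map(str::to_string) },
    ]
}
```

With four such turns and `protected_turns = 2`, the turn starts are at indices 0, 4, 8, and 12, so the boundary is index 8 and the candidates are the tool results at indices 3 and 7.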

@ -184,6 +184,9 @@ pub struct Worker<C: LlmClient, S: WorkerState = Mutable> {
/// by higher layers that own usage measurements. `None` disables
/// the prune projection.
savings_estimator: Option<crate::prune::SavingsEstimator>,
/// Optional observer fired once per prune evaluation (regardless of
/// whether projection actually fired). `None` disables instrumentation.
prune_observer: Option<crate::prune::PruneObserver>,
/// Index of the last stable cache prefix item, set by higher layers.
/// Plumbed into [`Request::cache_anchor`] at request build time.
cache_anchor: Option<usize>,
@ -384,6 +387,16 @@ impl<C: LlmClient, S: WorkerState> Worker<C, S> {
self.savings_estimator = estimator;
}
/// Install an observer notified after each prune evaluation pass.
///
/// Fires once per outgoing LLM request (the same point as the
/// `prune_config` / `savings_estimator` pair), regardless of whether
/// projection actually applied. Intended for upper layers that want
/// to instrument fire/skip rates without owning the prune logic.
pub fn set_prune_observer(&mut self, observer: Option<crate::prune::PruneObserver>) {
self.prune_observer = observer;
}
/// Mark an index into the current history as a stable, cacheable
/// prefix boundary. The value is included in each outgoing
/// [`Request`] via [`Request::cache_anchor`] — caching-aware
@ -843,6 +856,16 @@ impl<C: LlmClient, S: WorkerState> Worker<C, S> {
cb(current_turn);
}
// Drain interceptor-side inputs that are meant to land in
// history (notifications, cross-Pod events, system
// reminders). These are committed *before* the per-request
// clone so they participate in the LLM request below and
// get persisted by the upper layer that owns history.json.
let pending = self.interceptor.pending_history_appends().await;
if !pending.is_empty() {
self.history.extend(pending);
}
// Clone the history into a per-request context. Everything
// below (prune projection, interceptor hooks) mutates only
// this clone, so the persistent `self.history` stays intact.
@ -854,9 +877,16 @@ impl<C: LlmClient, S: WorkerState> Worker<C, S> {
// threshold. Worker does not own usage history itself; the
// estimator is injected by the layer that does.
if let (Some(config), Some(estimator)) = (&self.prune_config, &self.savings_estimator) {
let candidates =
crate::prune::prunable_indices(&request_context, config.protected_turns);
if !candidates.is_empty() {
let (candidates, border_turn) =
crate::prune::evaluate_candidates(&request_context, config.protected_turns);
let evaluation = if candidates.is_empty() {
crate::prune::PruneEvaluation {
candidate_count: 0,
estimated_savings: 0,
border_turn,
decision: crate::prune::PruneDecision::SkippedNoCandidates,
}
} else {
let savings = estimator(&request_context, &candidates);
if savings >= config.min_savings {
let pruned = crate::prune::project(&mut request_context, &candidates);
@ -867,7 +897,25 @@ impl<C: LlmClient, S: WorkerState> Worker<C, S> {
"Projected old tool-result content out of request context"
);
}
crate::prune::PruneEvaluation {
candidate_count: candidates.len(),
estimated_savings: savings,
border_turn,
decision: crate::prune::PruneDecision::Fired {
pruned_count: pruned,
},
}
} else {
crate::prune::PruneEvaluation {
candidate_count: candidates.len(),
estimated_savings: savings,
border_turn,
decision: crate::prune::PruneDecision::SkippedBelowMinSavings,
}
}
};
if let Some(observer) = &self.prune_observer {
observer(&evaluation);
}
}
@ -1077,6 +1125,7 @@ impl<C: LlmClient> Worker<C, Mutable> {
tool_output_limits: None,
prune_config: None,
savings_estimator: None,
prune_observer: None,
cache_anchor: None,
cache_key: None,
_state: PhantomData,
@ -1334,6 +1383,7 @@ impl<C: LlmClient> Worker<C, Mutable> {
tool_output_limits: self.tool_output_limits,
prune_config: self.prune_config,
savings_estimator: self.savings_estimator,
prune_observer: self.prune_observer,
cache_anchor: self.cache_anchor,
cache_key: self.cache_key,
_state: PhantomData,
@ -1414,6 +1464,7 @@ impl<C: LlmClient> Worker<C, Locked> {
tool_output_limits: self.tool_output_limits,
prune_config: self.prune_config,
savings_estimator: self.savings_estimator,
prune_observer: self.prune_observer,
cache_anchor: self.cache_anchor,
cache_key: self.cache_key,
_state: PhantomData,
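
The fire/skip branching in the Worker loop can be sketched as a standalone function. The types below mirror the names in the diff but are re-declared locally; `evaluate_prune` and `recording_observer` are hypothetical helpers, and the observer type drops the `Send + Sync` bounds so the sketch can use `Rc`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Clone, PartialEq, Eq)]
enum PruneDecision {
    SkippedNoCandidates,
    SkippedBelowMinSavings,
    Fired { pruned_count: usize },
}

#[derive(Debug, Clone)]
struct PruneEvaluation {
    candidate_count: usize,
    estimated_savings: u64,
    border_turn: Option<usize>,
    decision: PruneDecision,
}

// Simplified: the real alias also requires `Send + Sync`.
type PruneObserver = Box<dyn Fn(&PruneEvaluation)>;

/// One evaluation pass: decide, optionally project, then notify.
fn evaluate_prune(
    candidates: &[usize],
    border_turn: Option<usize>,
    min_savings: u64,
    estimate: &dyn Fn(&[usize]) -> u64,
    project: &mut dyn FnMut(&[usize]) -> usize,
    observer: Option<&PruneObserver>,
) -> PruneEvaluation {
    let (estimated_savings, decision) = if candidates.is_empty() {
        (0, PruneDecision::SkippedNoCandidates)
    } else {
        let savings = estimate(candidates);
        if savings >= min_savings {
            let pruned_count = project(candidates);
            (savings, PruneDecision::Fired { pruned_count })
        } else {
            (savings, PruneDecision::SkippedBelowMinSavings)
        }
    };
    let evaluation = PruneEvaluation {
        candidate_count: candidates.len(),
        estimated_savings,
        border_turn,
        decision,
    };
    // The observer sees every branch, not only `Fired`.
    if let Some(observer) = observer {
        observer(&evaluation);
    }
    evaluation
}

/// Observer that records each decision, for counting fire/skip rates.
fn recording_observer() -> (PruneObserver, Rc<RefCell<Vec<PruneDecision>>>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&log);
    let observer: PruneObserver = Box::new(move |eval: &PruneEvaluation| {
        sink.borrow_mut().push(eval.decision.clone());
    });
    (observer, log)
}
```

Building the evaluation in one place keeps the three branches symmetric, which is one way to avoid repeating the struct literal per branch.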

@ -32,7 +32,7 @@ pub(crate) fn rules_overlap(a: &ScopeRule, b: &ScopeRule) -> bool {
}
/// Does `cover` fully contain `inner`'s claimed paths?
fn covers_fully(cover: &ScopeRule, inner: &ScopeRule) -> bool {
pub(crate) fn covers_fully(cover: &ScopeRule, inner: &ScopeRule) -> bool {
if cover.permission < inner.permission {
return false;
}
@ -44,8 +44,9 @@ fn covers_fully(cover: &ScopeRule, inner: &ScopeRule) -> bool {
}
/// Check whether `rule` is contained in `parent`'s effective write
/// scope: its allow set covers `rule`, and no child of `parent` has
/// already taken a piece that would overlap `rule`.
/// scope: its allow set covers `rule`, no deny rule caps it, and no
/// child of `parent` has already taken a piece that would overlap
/// `rule`.
pub fn is_within_effective_write(lock: &LockFile, parent: &str, rule: &ScopeRule) -> bool {
let Some(alloc) = lock.find(parent) else {
return false;
@ -61,6 +62,14 @@ pub fn is_within_effective_write(lock: &LockFile, parent: &str, rule: &ScopeRule
if !covered {
return false;
}
let denied = alloc
.scope_deny
.iter()
.filter(|r| r.permission == Permission::Write)
.any(|r| rules_overlap(r, rule));
if denied {
return false;
}
let child_conflict = lock
.allocations
.iter()
@ -71,7 +80,14 @@ pub fn is_within_effective_write(lock: &LockFile, parent: &str, rule: &ScopeRule
!child_conflict
}
/// Find the Pod that actually owns a write scope overlapping `rule`.
/// The Pod and rule that actually own a conflicting write scope.
#[derive(Debug, Clone)]
pub struct ConflictOwner {
pub pod_name: String,
pub rule: ScopeRule,
}
/// Find the Pod/rule that actually owns a write scope overlapping `rule`.
///
/// Walks the delegation tree: if an allocation overlaps `rule`, we
/// descend into its children and return the deepest overlapping node
@ -82,38 +98,47 @@ pub fn find_conflict_owner(
lock: &LockFile,
rule: &ScopeRule,
exempt: Option<&str>,
) -> Option<String> {
) -> Option<ConflictOwner> {
find_conflict_owners(lock, rule, exempt).into_iter().next()
}
/// Find every top-level delegation tree owner that conflicts with `rule`.
pub fn find_conflict_owners(
lock: &LockFile,
rule: &ScopeRule,
exempt: Option<&str>,
) -> Vec<ConflictOwner> {
if rule.permission != Permission::Write {
return None;
return Vec::new();
}
for alloc in lock
.allocations
lock.allocations
.iter()
.filter(|a| a.delegated_from.is_none())
{
if let Some(owner) = find_conflict_in_subtree(lock, alloc, rule) {
if Some(owner.as_str()) == exempt {
continue;
}
return Some(owner);
}
}
None
.filter_map(|alloc| find_conflict_in_subtree(lock, alloc, rule))
.filter(|owner| Some(owner.pod_name.as_str()) != exempt)
.collect()
}
fn find_conflict_in_subtree(
lock: &LockFile,
alloc: &Allocation,
rule: &ScopeRule,
) -> Option<String> {
let overlaps_here = alloc
) -> Option<ConflictOwner> {
let overlapping_rule = alloc
.scope_allow
.iter()
.filter(|r| r.permission == Permission::Write)
.any(|r| rules_overlap(r, rule));
if !overlaps_here {
.find(|r| rules_overlap(r, rule))?;
let fully_denied_here = alloc
.scope_deny
.iter()
.filter(|r| r.permission == Permission::Write)
.any(|r| covers_fully(r, rule));
if fully_denied_here {
return None;
}
for child in lock
.allocations
.iter()
@ -123,14 +148,17 @@ fn find_conflict_in_subtree(
return Some(owner);
}
}
Some(alloc.pod_name.clone())
Some(ConflictOwner {
pod_name: alloc.pod_name.clone(),
rule: overlapping_rule.clone(),
})
}
#[cfg(test)]
mod tests {
use super::*;
use crate::test_util::*;
use crate::{ScopeLockError, delegate_scope, register_pod};
use crate::{ScopeLockError, delegate_scope, register_pod, register_pod_with_deny};
use tempfile::TempDir;
#[test]
@ -200,4 +228,69 @@ mod tests {
other => panic!("expected WriteConflict, got {other:?}"),
}
}
#[test]
fn denied_write_region_is_not_claimed_by_restored_parent() {
let dir = TempDir::new().unwrap();
let path = dir.path().join("pods.json");
let mut g = open_empty(&path);
register_pod_with_deny(
&mut g,
"parent".into(),
std::process::id(),
sock("parent"),
vec![write_rule("/src", true)],
vec![write_rule("/src/core", true)],
sid(),
)
.unwrap();
register_pod(
&mut g,
"child".into(),
std::process::id(),
sock("child"),
vec![write_rule("/src/core", true)],
sid(),
)
.unwrap();
}
#[test]
fn partial_deny_does_not_hide_parent_conflict() {
let dir = TempDir::new().unwrap();
let path = dir.path().join("pods.json");
let mut g = open_empty(&path);
register_pod_with_deny(
&mut g,
"parent".into(),
std::process::id(),
sock("parent"),
vec![write_rule("/src", true)],
vec![write_rule("/src/core", true)],
sid(),
)
.unwrap();
let err = register_pod(
&mut g,
"other".into(),
std::process::id(),
sock("other"),
vec![write_rule("/src", true)],
sid(),
)
.unwrap_err();
match err {
ScopeLockError::WriteConflict {
competitor,
competitor_rule,
..
} => {
assert_eq!(competitor, "parent");
assert_eq!(competitor_rule.target, std::path::PathBuf::from("/src"));
}
other => panic!("expected WriteConflict, got {other:?}"),
}
}
}
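
The "all conflicts denied" check used at registration can be sketched with plain paths. This assumes every rule is a recursive write rule identified only by its path, which is a simplification of `ScopeRule`; `Owner` and `first_blocking_conflict` are hypothetical names, and the delegation-tree walk is left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `ConflictOwner`: a Pod and the path it owns.
struct Owner {
    pod_name: String,
    rule: PathBuf,
}

/// Two recursive rules overlap when one path is a prefix of the other.
fn overlaps(a: &Path, b: &Path) -> bool {
    a.starts_with(b) || b.starts_with(a)
}

/// `cover` fully contains `inner` when `inner` lies under `cover`.
fn covers_fully(cover: &Path, inner: &Path) -> bool {
    inner.starts_with(cover)
}

/// Returns the first conflicting owner, unless every conflicting
/// owner's rule is fully covered by one of the requester's deny rules.
fn first_blocking_conflict<'a>(
    requested: &Path,
    deny: &[PathBuf],
    existing: &'a [Owner],
) -> Option<&'a Owner> {
    let conflicts: Vec<&Owner> = existing
        .iter()
        .filter(|owner| overlaps(&owner.rule, requested))
        .collect();
    let all_denied = !conflicts.is_empty()
        && conflicts
            .iter()
            .all(|owner| deny.iter().any(|d| covers_fully(d, &owner.rule)));
    if all_denied {
        None
    } else {
        conflicts.into_iter().next()
    }
}
```

A restored parent that requests `/src` while denying `/src/core` does not collide with a child that owns `/src/core`, but still collides with an unrelated Pod that owns `/src/lib`.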

@ -13,8 +13,12 @@ pub enum ScopeLockError {
Io(#[from] io::Error),
#[error("pod name `{0}` is already registered")]
DuplicatePodName(String),
#[error("requested scope `{}` conflicts with pod `{competitor}`", .rule.target.display())]
WriteConflict { competitor: String, rule: ScopeRule },
#[error("requested scope `{}` conflicts with pod `{competitor}` rule `{}`", .rule.target.display(), .competitor_rule.target.display())]
WriteConflict {
competitor: String,
rule: ScopeRule,
competitor_rule: ScopeRule,
},
#[error(
"requested scope `{}` is not within spawner `{spawner}`'s effective scope",
.rule.target.display()

@ -22,11 +22,16 @@ mod table;
#[cfg(test)]
mod test_util;
pub use conflict::{find_conflict_owner, is_within_effective_write};
pub use conflict::{
ConflictOwner, find_conflict_owner, find_conflict_owners, is_within_effective_write,
};
pub use error::ScopeLockError;
pub use lifecycle::{
ScopeAllocationGuard, SessionLockInfo, adopt_allocation, install_top_level, lookup_session,
update_session,
ScopeAllocationGuard, SessionLockInfo, adopt_allocation, install_top_level,
install_top_level_with_deny, lookup_session, update_session,
};
pub use mutate::{
delegate_scope, reclaim_stale, reclaim_stale_with, register_pod, register_pod_with_deny,
release_pod,
};
pub use mutate::{delegate_scope, reclaim_stale, reclaim_stale_with, register_pod, release_pod};
pub use table::{Allocation, LockFile, LockFileGuard, default_registry_path};

@ -8,7 +8,7 @@ use manifest::ScopeRule;
use session_store::SessionId;
use crate::error::ScopeLockError;
use crate::mutate::{register_pod, release_pod};
use crate::mutate::release_pod;
use crate::table::{LockFileGuard, default_registry_path};
/// Owned allocation: on drop, opens the lock file and releases this
@ -46,15 +46,30 @@ pub fn install_top_level(
socket: PathBuf,
scope_allow: Vec<ScopeRule>,
session_id: SessionId,
) -> Result<ScopeAllocationGuard, ScopeLockError> {
install_top_level_with_deny(pod_name, pid, socket, scope_allow, Vec::new(), session_id)
}
/// Open the default lock file, register a top-level Pod with explicit
/// deny rules, and return a guard that will release the allocation on
/// drop.
pub fn install_top_level_with_deny(
pod_name: String,
pid: u32,
socket: PathBuf,
scope_allow: Vec<ScopeRule>,
scope_deny: Vec<ScopeRule>,
session_id: SessionId,
) -> Result<ScopeAllocationGuard, ScopeLockError> {
let lock_path = default_registry_path()?;
let mut guard = LockFileGuard::open(&lock_path)?;
register_pod(
crate::mutate::register_pod_with_deny(
&mut guard,
pod_name.clone(),
pid,
socket,
scope_allow,
scope_deny,
session_id,
)?;
Ok(ScopeAllocationGuard {
@ -176,6 +191,7 @@ mod tests {
pid: placeholder_pid,
socket: sock(pod_name),
scope_allow: vec![write_rule("/tmp/child", true)],
scope_deny: Vec::new(),
delegated_from: None,
session_id: None,
});

@ -7,7 +7,7 @@ use std::path::PathBuf;
use manifest::{Permission, ScopeRule};
use session_store::SessionId;
use crate::conflict::{find_conflict_owner, is_within_effective_write};
use crate::conflict::{find_conflict_owner, find_conflict_owners, is_within_effective_write};
use crate::error::ScopeLockError;
use crate::table::{Allocation, LockFileGuard};
@ -25,6 +25,38 @@ pub fn register_pod(
socket: PathBuf,
scope_allow: Vec<ScopeRule>,
session_id: SessionId,
) -> Result<(), ScopeLockError> {
register_pod_with_deny(
guard,
pod_name,
pid,
socket,
scope_allow,
Vec::new(),
session_id,
)
}
/// Register a top-level Pod with explicit deny rules that reduce the
/// claimed effective write scope.
///
/// Conflict semantics: if every Pod overlapping a requested allow rule
/// is fully covered by one of `scope_deny`, the conflict is suppressed
/// and the registration proceeds. The check is structural (deny ⊇
/// competitor.rule), not relational — it does not verify that the
/// competitor actually descends from this Pod's prior delegations.
/// In practice this is safe because the canonical caller is `restore`,
/// which derives `scope_deny` from the session's own snapshot, so any
/// covered competitor is guaranteed to be a descendant of the original
/// allocation. Direct callers must uphold the same invariant.
pub fn register_pod_with_deny(
guard: &mut LockFileGuard,
pod_name: String,
pid: u32,
socket: PathBuf,
scope_allow: Vec<ScopeRule>,
scope_deny: Vec<ScopeRule>,
session_id: SessionId,
) -> Result<(), ScopeLockError> {
reclaim_stale(guard);
if guard.data().find(&pod_name).is_some() {
@ -41,10 +73,22 @@ pub fn register_pod(
.iter()
.filter(|r| r.permission == Permission::Write)
{
if let Some(competitor) = find_conflict_owner(guard.data(), rule, None) {
let conflicts = find_conflict_owners(guard.data(), rule, None);
let all_denied = !conflicts.is_empty()
&& conflicts.iter().all(|owner| {
scope_deny
.iter()
.filter(|r| r.permission == Permission::Write)
.any(|deny| crate::conflict::covers_fully(deny, &owner.rule))
});
if all_denied {
continue;
}
if let Some(competitor) = conflicts.into_iter().next() {
return Err(ScopeLockError::WriteConflict {
competitor,
competitor: competitor.pod_name,
rule: rule.clone(),
competitor_rule: competitor.rule,
});
}
}
@ -53,6 +97,7 @@ pub fn register_pod(
pid,
socket,
scope_allow,
scope_deny,
delegated_from: None,
session_id: Some(session_id),
});
@ -88,8 +133,9 @@ pub fn delegate_scope(
if rule.permission == Permission::Write {
if let Some(competitor) = find_conflict_owner(guard.data(), rule, Some(spawner)) {
return Err(ScopeLockError::WriteConflict {
competitor,
competitor: competitor.pod_name,
rule: rule.clone(),
competitor_rule: competitor.rule,
});
}
}
@ -99,6 +145,7 @@ pub fn delegate_scope(
pid,
socket,
scope_allow,
scope_deny: Vec::new(),
delegated_from: Some(spawner.into()),
// Pre-reservation. The child fills in its own session_id when
// it calls `adopt_allocation` after the worker is built.

@ -35,6 +35,11 @@ pub struct Allocation {
pub socket: PathBuf,
/// Allow rules granted to this Pod (write + read).
pub scope_allow: Vec<ScopeRule>,
/// Deny rules that cap this Pod's effective scope. Normally empty for
/// fresh allocations; restored Pods use this to avoid reclaiming
/// previously delegated write regions.
#[serde(default)]
pub scope_deny: Vec<ScopeRule>,
/// Name of the Pod that delegated scope to this one, or `None` for
/// a top-level Pod started directly by a human.
pub delegated_from: Option<String>,

@ -28,6 +28,7 @@ libc = { workspace = true }
schemars = { workspace = true }
memory = { workspace = true }
uuid = { workspace = true, features = ["v7"] }
session-metrics = { workspace = true }
[dev-dependencies]
dotenv = "0.15.0"

@ -0,0 +1,50 @@
//! Sync buffer for `session_metrics::Metric` values queued from inside
//! Worker callbacks (which run synchronously and cannot themselves
//! perform `async` store writes).
//!
//! Pod drains this buffer in `persist_turn` and writes each metric via
//! `session_metrics::record_metric`, alongside the regular `LlmUsage`
//! entries.
use std::sync::Mutex;
use session_metrics::Metric;
pub(crate) struct MetricsTracker {
pending: Mutex<Vec<Metric>>,
}
impl MetricsTracker {
pub(crate) fn new() -> Self {
Self {
pending: Mutex::new(Vec::new()),
}
}
/// Queue a metric for the next `persist_turn` flush.
pub(crate) fn push(&self, metric: Metric) {
self.pending.lock().unwrap().push(metric);
}
/// Drain all queued metrics. Called by Pod after a run completes.
pub(crate) fn drain(&self) -> Vec<Metric> {
std::mem::take(&mut *self.pending.lock().unwrap())
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn push_then_drain_returns_in_order_and_clears() {
let t = MetricsTracker::new();
t.push(Metric::now("a"));
t.push(Metric::now("b"));
let drained = t.drain();
assert_eq!(drained.len(), 2);
assert_eq!(drained[0].name, "a");
assert_eq!(drained[1].name, "b");
assert!(t.drain().is_empty());
}
}

@ -1,3 +1,4 @@
pub(crate) mod metrics_tracker;
pub(crate) mod prune;
pub(crate) mod state;
pub(crate) mod token_counter;

@ -5,10 +5,16 @@
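//! immediately after. The Worker does not know the usage history, so the
//! savings estimate used for the `min_savings` check is injected from
//! outside through a callback. This module provides the `impl Pod` that
//! builds that callback and plugs it into the Worker.
//!
//! The same path also installs a `PruneObserver`, which pushes a
//! `prune.fire` / `prune.skip` metric onto `MetricsTracker` on every
//! evaluation. On `Fired` it also stashes a uuid in `UsageTracker` so that
//! a `prune.post_request` can be emitted paired with the following `LlmUsage`.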
use llm_worker::Item;
use llm_worker::llm_client::client::LlmClient;
use llm_worker::prune::{PruneConfig, SavingsEstimator};
use llm_worker::prune::{PruneConfig, PruneDecision, PruneObserver, SavingsEstimator};
use session_metrics::Metric;
use session_store::Store;
use crate::Pod;
@ -24,6 +30,12 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
/// Measurement-less estimates (before the first LLM call, or immediately
/// after a compact) return `0` from the estimator, which naturally
/// prevents the prune projection from firing until usage data exists.
///
/// Also installs a [`PruneObserver`] that pushes `prune.fire` /
/// `prune.skip` metrics into the shared [`MetricsTracker`]. On `Fired`
/// the observer additionally stashes a fresh correlation_id in
/// [`UsageTracker`] so the next `LlmUsage` can be paired with a
/// `prune.post_request` metric carrying the same id.
pub fn attach_prune(&mut self, config: PruneConfig) {
let usage = self.usage_history_handle();
let estimator: SavingsEstimator = Box::new(move |history: &[Item], indices| {
@ -34,9 +46,43 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
_ => est.tokens,
}
});
let metrics = self.metrics_tracker_handle();
let usage_tracker = self.usage_tracker_handle();
let observer: PruneObserver = Box::new(move |eval| {
match &eval.decision {
PruneDecision::Fired { .. } => {
let correlation_id = uuid::Uuid::now_v7().to_string();
let mut metric = Metric::now("prune.fire")
.with_value(eval.estimated_savings as f64)
.with_correlation_id(&correlation_id)
.with_dimension("candidate_count", eval.candidate_count.to_string());
if let Some(border) = eval.border_turn {
metric = metric.with_dimension("border_turn", border.to_string());
}
metrics.push(metric);
usage_tracker.note_correlation_id(correlation_id);
}
PruneDecision::SkippedNoCandidates => {
metrics.push(
Metric::now("prune.skip").with_dimension("reason", "no_candidates"),
);
}
PruneDecision::SkippedBelowMinSavings => {
metrics.push(
Metric::now("prune.skip")
.with_dimension("reason", "below_min_savings")
.with_dimension("candidate_count", eval.candidate_count.to_string())
.with_value(eval.estimated_savings as f64),
);
}
}
});
let worker = self.worker_mut();
worker.set_prune_config(Some(config));
worker.set_savings_estimator(Some(estimator));
worker.set_prune_observer(Some(observer));
}
/// If the manifest has a `[compaction]` section, build a `PruneConfig`

@ -19,19 +19,35 @@ use std::sync::Mutex;
use llm_worker::UsageRecord;
use llm_worker::timeline::event::UsageEvent;
/// One drained measurement: the underlying `UsageRecord` plus an optional
/// `correlation_id` stamped by the prune projection (or any other future
/// upstream observer) so that downstream metrics emitted alongside this
/// record can be joined to it after the fact.
#[derive(Debug, Clone)]
pub(crate) struct RecordedUsage {
pub(crate) record: UsageRecord,
pub(crate) correlation_id: Option<String>,
}
/// Shared between the pre-request hook, the `on_usage` callback, and Pod.
pub(crate) struct UsageTracker {
/// `history.len()` captured at the most recent `pre_llm_request`.
/// Cleared when paired with an incoming `on_usage` event.
pending_history_len: Mutex<Option<usize>>,
/// Optional `correlation_id` set by an upstream observer (currently
/// the prune projection on `Fired`). Paired into the next
/// `RecordedUsage` and cleared. Skips that don't fire leave this
/// `None`, so the resulting record carries no correlation.
pending_correlation_id: Mutex<Option<String>>,
/// Records accumulated during the current run; drained by Pod.
pending_records: Mutex<Vec<UsageRecord>>,
pending_records: Mutex<Vec<RecordedUsage>>,
}
impl UsageTracker {
pub(crate) fn new() -> Self {
Self {
pending_history_len: Mutex::new(None),
pending_correlation_id: Mutex::new(None),
pending_records: Mutex::new(Vec::new()),
}
}
@ -41,16 +57,29 @@ impl UsageTracker {
*self.pending_history_len.lock().unwrap() = Some(history_len);
}
/// Stash a `correlation_id` to be paired into the next `RecordedUsage`.
/// Currently invoked by the prune observer on `Fired` so that the
/// `prune.fire` metric and the `prune.post_request` metric (emitted
/// alongside the resulting `LlmUsage`) carry the same join key.
///
/// Overwrites any previous unconsumed value — by construction the
/// observer fires at most once per outgoing LLM request, immediately
/// before the pre-request hook captures `history_len`.
pub(crate) fn note_correlation_id(&self, id: String) {
*self.pending_correlation_id.lock().unwrap() = Some(id);
}
/// Called from the `on_usage` callback with the aggregated final
/// UsageEvent. If a `history_len` was previously stashed via
/// `note_request`, builds a `UsageRecord` and pushes it onto the buffer.
/// If not (e.g. test code that fires Usage outside a request), drops
/// the event.
/// `note_request`, builds a `RecordedUsage` and pushes it onto the
/// buffer. If not (e.g. test code that fires Usage outside a request),
/// drops the event.
pub(crate) fn record_usage(&self, event: &UsageEvent) {
let history_len = match self.pending_history_len.lock().unwrap().take() {
Some(n) => n,
None => return,
};
let correlation_id = self.pending_correlation_id.lock().unwrap().take();
// Assumes UsageEvent.input_tokens has already been normalised by the
// scheme layer to "occupancy" (the full prompt length); Anthropic
// emits it with cache_read + cache_creation added in.
@ -58,18 +87,21 @@ impl UsageTracker {
let cache_read = event.cache_read_input_tokens.unwrap_or(0);
let cache_write = event.cache_creation_input_tokens.unwrap_or(0);
let output = event.output_tokens.unwrap_or(0);
self.pending_records.lock().unwrap().push(UsageRecord {
history_len,
input_total_tokens: input_total,
cache_read_tokens: cache_read,
cache_write_tokens: cache_write,
output_tokens: output,
self.pending_records.lock().unwrap().push(RecordedUsage {
record: UsageRecord {
history_len,
input_total_tokens: input_total,
cache_read_tokens: cache_read,
cache_write_tokens: cache_write,
output_tokens: output,
},
correlation_id,
});
}
/// Drain accumulated records. Called by Pod after a run completes,
/// before persisting the turn.
pub(crate) fn drain(&self) -> Vec<UsageRecord> {
pub(crate) fn drain(&self) -> Vec<RecordedUsage> {
std::mem::take(&mut *self.pending_records.lock().unwrap())
}
}
@ -96,11 +128,12 @@ mod tests {
let records = tracker.drain();
assert_eq!(records.len(), 1);
assert_eq!(records[0].history_len, 5);
assert_eq!(records[0].input_total_tokens, 1000);
assert_eq!(records[0].cache_read_tokens, 800);
assert_eq!(records[0].cache_write_tokens, 100);
assert_eq!(records[0].output_tokens, 42);
assert_eq!(records[0].record.history_len, 5);
assert_eq!(records[0].record.input_total_tokens, 1000);
assert_eq!(records[0].record.cache_read_tokens, 800);
assert_eq!(records[0].record.cache_write_tokens, 100);
assert_eq!(records[0].record.output_tokens, 42);
assert!(records[0].correlation_id.is_none());
}
#[test]
@ -129,8 +162,25 @@ mod tests {
let records = tracker.drain();
assert_eq!(records.len(), 2);
assert_eq!(records[0].history_len, 5);
assert_eq!(records[1].history_len, 10);
assert_eq!(records[1].cache_read_tokens, 50);
assert_eq!(records[0].record.history_len, 5);
assert_eq!(records[1].record.history_len, 10);
assert_eq!(records[1].record.cache_read_tokens, 50);
}
#[test]
fn correlation_id_pairs_with_next_record_only() {
let tracker = UsageTracker::new();
// Stash an ID, then run a request → the ID should land on this record.
tracker.note_correlation_id("abc".into());
tracker.note_request(5);
tracker.record_usage(&make_event(100, 0, 0, 20));
// Next request without a fresh stash → no correlation_id.
tracker.note_request(10);
tracker.record_usage(&make_event(200, 50, 0, 30));
let records = tracker.drain();
assert_eq!(records.len(), 2);
assert_eq!(records[0].correlation_id.as_deref(), Some("abc"));
assert!(records[1].correlation_id.is_none());
}
}
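
The stash-then-pair behaviour exercised by `correlation_id_pairs_with_next_record_only` can be sketched in isolation. This is a minimal illustration using only `std`, not the real `UsageTracker`: `PairingTracker` and `Paired` are hypothetical names, and the token fields are omitted.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for `RecordedUsage`: one record per LLM request.
#[derive(Debug, PartialEq)]
pub struct Paired {
    pub history_len: usize,
    pub correlation_id: Option<String>,
}

/// Hypothetical, simplified tracker (assumption: not the crate's API).
#[derive(Default)]
pub struct PairingTracker {
    pending_len: Mutex<Option<usize>>,
    pending_id: Mutex<Option<String>>,
    records: Mutex<Vec<Paired>>,
}

impl PairingTracker {
    /// Stash an id; overwrites any previous unconsumed value.
    pub fn note_correlation_id(&self, id: String) {
        *self.pending_id.lock().unwrap() = Some(id);
    }

    /// Stash the history length of the outgoing request.
    pub fn note_request(&self, history_len: usize) {
        *self.pending_len.lock().unwrap() = Some(history_len);
    }

    /// Pair the stashed values into one record. `take()` clears both
    /// slots, so an id can only ever land on the next record.
    pub fn record_usage(&self) {
        let history_len = match self.pending_len.lock().unwrap().take() {
            Some(n) => n,
            // No stashed request: drop the event, keep any pending id.
            None => return,
        };
        let correlation_id = self.pending_id.lock().unwrap().take();
        self.records.lock().unwrap().push(Paired {
            history_len,
            correlation_id,
        });
    }

    /// Hand all accumulated records to the caller and reset the buffer.
    pub fn drain(&self) -> Vec<Paired> {
        std::mem::take(&mut *self.records.lock().unwrap())
    }
}
```

Because each slot is consumed with `Option::take`, a second request without a fresh stash produces a record whose `correlation_id` is `None`, which is the property the test above asserts.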

View File

@ -134,6 +134,8 @@ impl PodController {
// `PodFsView` to the shared state once the latter exists.
let fs_for_view: tools::ScopedFs;
let scope_change_sink = pod.scope_change_sink();
// Register event bridge callbacks on the worker
{
let worker = pod.worker_mut();
@ -257,7 +259,8 @@ impl PodController {
// worker) reads from it, and any future scope mutation
// (SpawnPod-style revoke, future GrantScope) propagates
// through it.
let fs = tools::ScopedFs::with_shared_scope(scope_handle.clone(), pwd_for_tools.clone());
let fs =
tools::ScopedFs::with_shared_scope(scope_handle.clone(), pwd_for_tools.clone());
let tracker = tools::Tracker::new();
// The same ScopedFs also powers the IPC `ListCompletions`
// query — keep a clone for the FS view we attach below,
@ -293,6 +296,7 @@ impl PodController {
self_parent_socket.clone(),
spawner_model.clone(),
scope_handle.clone(),
scope_change_sink.clone(),
));
worker.register_tool(send_to_pod_tool(spawned_registry.clone()));
worker.register_tool(read_pod_output_tool(spawned_registry.clone()));
@ -448,6 +452,9 @@ impl PodController {
}
Method::Notify { message } => {
let _ = event_tx.send(Event::Notify {
message: message.clone(),
});
pod.push_notify(message);
if shared_state.get_status() != PodStatus::Idle {
// RUNNING / Paused: the buffer push is the
@ -609,6 +616,10 @@ impl PodController {
Method::GetHistory | Method::ListCompletions { .. } => {}
Method::PodEvent(event) => {
// Echo the received event to all subscribers so
// every client sees the input that drove any
// following auto-kicked turn.
let _ = event_tx.send(Event::PodEvent(event.clone()));
// (1) system side effects — idempotent and
// tolerant of out-of-order delivery (e.g.
// `TurnEnded` arriving after `ShutDown`).
@ -809,12 +820,16 @@ where
});
}
Some(Method::Notify { message }) => {
let _ = event_tx.send(Event::Notify {
message: message.clone(),
});
// Route into the buffer; the in-flight turn will
// drain it at its next pre_llm_request.
notify_buffer.push(message);
}
Some(Method::GetHistory | Method::ListCompletions { .. }) => {}
Some(Method::PodEvent(event)) => {
let _ = event_tx.send(Event::PodEvent(event.clone()));
// mpsc is consume-once, so we cannot defer this
// to the next main-loop iteration — drop here
// would lose the event entirely (children fire

View File

@ -17,6 +17,7 @@ use llm_worker::interceptor::{
Interceptor, PostToolAction, PreRequestAction, PreToolAction, PromptAction, ToolCallInfo,
ToolResultInfo, TurnEndAction,
};
use tracing::warn;
use llm_worker::tool::ToolOutput;
use tracing::info;
@ -28,7 +29,6 @@ use crate::hook::{
use crate::ipc::notify_buffer::{NotifyBuffer, format_notify};
use crate::prompt::catalog::PromptCatalog;
use llm_worker::token_counter::total_tokens;
use tracing::warn;
/// Maximum number of bytes copied into `TurnEndInfo::final_text_preview`.
const FINAL_TEXT_PREVIEW_LIMIT: usize = 512;
@ -40,8 +40,10 @@ pub(crate) struct PodInterceptor {
/// per-request `context` to estimate current occupancy for threshold
/// checks. `None` when compaction is disabled (both thresholds unset).
usage_history: Option<Arc<Mutex<Vec<UsageRecord>>>>,
/// Pending-notification buffer drained into the per-request
/// context at the head of `pre_llm_request`.
/// Pending-notification buffer drained into `worker.history`
/// via [`Self::pending_history_appends`] just before the next LLM
/// request. The Worker `extend`s these into its persistent history
/// so the LLM has a visible trigger for any reaction it commits.
pending_notifies: NotifyBuffer,
/// Submit-scoped stash of resolver-produced system messages.
/// Drained inside `on_prompt_submit` and returned via
@ -122,6 +124,27 @@ impl Interceptor for PodInterceptor {
}
}
async fn pending_history_appends(&self) -> Vec<Item> {
let drained = self.pending_notifies.drain();
if drained.is_empty() {
return Vec::new();
}
let mut items = Vec::with_capacity(drained.len());
for n in drained {
match format_notify(&n, &self.prompts) {
Ok(item) => items.push(item),
Err(e) => {
// A render failure here would starve the LLM of
// the notify text. Fall back to the raw message
// so the trigger still lands in history.
warn!(error = %e, "failed to render notify_wrapper; using raw message");
items.push(Item::system_message(n.message.clone()));
}
}
}
items
}
async fn pre_llm_request(&self, context: &mut Vec<Item>) -> PreRequestAction {
let current_tokens = self.estimated_tokens(context);
@ -140,24 +163,6 @@ impl Interceptor for PodInterceptor {
}
}
// Internal mechanism: drain pending `Method::Notify` notifications
// into the per-request context as transient system messages.
// These are not persisted to the Worker history; they exist only
// for this single LLM request.
for n in self.pending_notifies.drain() {
match format_notify(&n, &self.prompts) {
Ok(item) => context.push(item),
Err(e) => {
// A render failure here would starve the LLM of the
// notify text. Fall back to the raw message —
// it still carries the intent, just without the
// wrapper phrasing.
warn!(error = %e, "failed to render notify_wrapper; using raw message");
context.push(Item::system_message(n.message.clone()));
}
}
}
let info = PreRequestInfo {
item_count: context.len(),
estimated_tokens: current_tokens,
@ -406,7 +411,7 @@ mod tests {
}
#[tokio::test]
async fn pre_llm_request_drains_pending_notifies_into_context() {
async fn pending_history_appends_drains_buffer_into_items() {
let registry = Arc::new(HookRegistryBuilder::new().build());
let buffer = NotifyBuffer::new();
buffer.push("first".into());
@ -420,49 +425,52 @@ mod tests {
Arc::new(Mutex::new(Vec::new())),
PromptCatalog::builtins_only().unwrap(),
);
let mut ctx: Vec<Item> = vec![Item::user_message("hi")];
let action = interceptor.pre_llm_request(&mut ctx).await;
assert!(matches!(action, PreRequestAction::Continue));
// Original user message preserved, two notifications appended in order.
assert_eq!(ctx.len(), 3);
let second = ctx[1].as_text().unwrap_or_default();
let third = ctx[2].as_text().unwrap_or_default();
let items = interceptor.pending_history_appends().await;
assert_eq!(items.len(), 2);
let first = items[0].as_text().unwrap_or_default();
let second = items[1].as_text().unwrap_or_default();
assert!(first.contains("[Notification]"));
assert!(first.contains("first"));
assert!(second.contains("[Notification]"));
assert!(second.contains("first"));
assert!(third.contains("[Notification]"));
assert!(third.contains("second"));
// Buffer is drained after a single pre_llm_request call.
assert!(buffer.is_empty());
assert!(second.contains("second"));
assert!(
buffer.is_empty(),
"buffer must be drained after pending_history_appends"
);
// Empty buffer → empty Vec (no synthesised items).
let again = interceptor.pending_history_appends().await;
assert!(again.is_empty());
}
#[tokio::test]
async fn pre_llm_request_skips_notification_injection_when_yielding() {
// When compaction yields, notifications remain in the buffer for
// the next pre_llm_request (after compaction + resume).
async fn pre_llm_request_does_not_touch_pending_notifies() {
// The drain lane has moved to `pending_history_appends`;
// `pre_llm_request` must leave the buffer alone and not inject
// anything itself.
let registry = Arc::new(HookRegistryBuilder::new().build());
let buffer = NotifyBuffer::new();
buffer.push("msg".into());
let state = Arc::new(CompactState::new(None, Some(100), 2));
let ctx_items = vec![Item::user_message("hi")];
let history = usage_handle_with(ctx_items.len(), 200);
let interceptor = PodInterceptor::new(
registry,
Some(state),
Some(history),
None,
None,
buffer.clone(),
Arc::new(Mutex::new(Vec::new())),
PromptCatalog::builtins_only().unwrap(),
);
let mut ctx = ctx_items;
let mut ctx: Vec<Item> = vec![Item::user_message("hi")];
let action = interceptor.pre_llm_request(&mut ctx).await;
assert!(matches!(action, PreRequestAction::Yield));
// Notifications were not drained (still held for post-compact resume).
assert_eq!(ctx.len(), 1);
assert_eq!(buffer.len(), 1);
assert!(matches!(action, PreRequestAction::Continue));
assert_eq!(ctx.len(), 1, "pre_llm_request must not append notifies");
assert_eq!(
buffer.len(),
1,
"pre_llm_request must not drain the notify buffer"
);
}
#[tokio::test]

View File

@ -1,9 +1,22 @@
//! Pending-notify buffer for `Method::Notify`.
//! Pending-notify buffer for `Method::Notify` and `Method::PodEvent`.
//!
//! Notify entries are queued here by the Controller and drained by
//! `PodInterceptor::pre_llm_request` into the per-request context
//! (never into the Worker's persistent history). Each queued entry
//! becomes one `Item::system_message` in the outgoing request.
//! Entries are queued here by the Controller (on receipt of the
//! corresponding IPC method) and drained by
//! `PodInterceptor::pending_history_appends`, which the Worker calls
//! at the head of each turn loop iteration to `extend` them into the
//! persistent `worker.history`. Each queued entry becomes one
//! `Item::system_message`.
//!
//! This is the **single lane** for "system messages produced by Pod
//! state that should land in the next LLM request": Notify, PodEvent,
//! and any future `<system-reminder>` injection all ride this queue
//! (or a sibling queue with the same lifecycle). Per
//! `tickets/notify-history-persist.md` and `AGENTS.md` (section "LLM
//! コンテキストの加工原則", i.e. the LLM context processing principles),
//! there is **no** "transient, history-skipping" lane:
//! everything injected into a request is also committed to history so
//! that any LLM reaction has a visible trigger across turns, resume,
//! and compaction, and so the Anthropic prompt cache prefix stays
//! stable across requests.
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
@ -68,9 +81,10 @@ impl NotifyBuffer {
}
/// Format a single pending notify entry into the `Item::system_message`
/// that gets injected into the per-request context. The wrapper body
/// comes from `PodPrompt::NotifyWrapper` so the surrounding phrasing
/// can be customised via a prompt pack (translation, tone, ...).
/// that gets appended to `worker.history` just before the next LLM
/// request. The wrapper body comes from `PodPrompt::NotifyWrapper` so
/// the surrounding phrasing can be customised via a prompt pack
/// (translation, tone, ...).
pub(crate) fn format_notify(
n: &PendingNotify,
prompts: &PromptCatalog,

View File

@ -7,12 +7,12 @@ use llm_worker::llm_client::RequestConfig;
use llm_worker::llm_client::client::LlmClient;
use llm_worker::state::Mutable;
use llm_worker::{ToolOutputLimits, UsageRecord, Worker, WorkerError, WorkerResult};
use session_store::{EntryHash, SessionId, SessionStartState, Store, StoreError};
use session_store::{EntryHash, PodScopeSnapshot, SessionId, SessionStartState, Store, StoreError};
use tracing::{info, warn};
use manifest::{
PodManifest, PodManifestConfig, ResolveError, Scope, ScopeError, ScopeRule, SharedScope,
WorkerManifest,
PodManifest, PodManifestConfig, ResolveError, Scope, ScopeConfig, ScopeError, ScopeRule,
SharedScope, WorkerManifest,
};
use crate::compact::state::CompactState;
@ -77,6 +77,11 @@ pub struct Pod<C: LlmClient, St: Store> {
/// Captures `(history_len, UsageEvent)` pairs during a run; drained
/// in `persist_turn` and persisted as `LogEntry::LlmUsage` entries.
usage_tracker: Arc<UsageTracker>,
/// Sync-side buffer for `Metric` values queued from inside Worker
/// callbacks (currently the prune observer). Drained in `persist_turn`
/// and written via `session_metrics::record_metric` alongside
/// `LogEntry::LlmUsage`. Always present after construction.
metrics_tracker: Arc<crate::compact::metrics_tracker::MetricsTracker>,
/// Cumulative Usage measurement timeline, one entry per LLM call.
/// Restored from session log on `restore`, appended on each persist.
/// Read by token-accounting APIs (`Pod::total_tokens`, etc.).
@ -143,6 +148,10 @@ pub struct Pod<C: LlmClient, St: Store> {
/// Phase 2 (consolidation) workers set this to false so the
/// agentic worker pulls knowledge through the search tools instead.
inject_resident_knowledge: bool,
/// Latest runtime scope snapshot queued by dynamic scope changes.
/// Drained into the session log before the next turn result is
/// persisted, so resume never silently reclaims delegated writes.
pending_scope_snapshot: Arc<Mutex<Option<PodScopeSnapshot>>>,
/// Phase 1 (memory.extract) reentry guard. `true` while an extract
/// worker is running; subsequent triggers are skipped per spec
/// (`docs/plan/memory.md` §Phase 1 並走防止, i.e. concurrent-run prevention). `Arc<AtomicBool>` so
@ -203,6 +212,7 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
interceptor_installed: false,
compact_state: None,
usage_tracker: Arc::new(UsageTracker::new()),
metrics_tracker: Arc::new(crate::compact::metrics_tracker::MetricsTracker::new()),
usage_history: Arc::new(Mutex::new(Vec::<UsageRecord>::new())),
tracker: None,
system_prompt_template: None,
@ -216,6 +226,7 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
workflow_registry: memory::WorkflowRegistry::empty(),
memory_layout: None,
inject_resident_knowledge: true,
pending_scope_snapshot: Arc::new(Mutex::new(None)),
extract_in_flight: Arc::new(AtomicBool::new(false)),
consolidation_in_flight: Arc::new(AtomicBool::new(false)),
extract_pointer: Mutex::new(None),
@ -307,6 +318,55 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
.update(|cur| cur.with_added_deny_rules(revoke.clone()))
}
/// Snapshot the current runtime scope in the session log. The entry
/// is intentionally appended as soon as a session head exists: if the
/// process later exits while children keep their allocations, resume
/// can restore the narrowed scope instead of reclaiming delegated
/// writes.
pub async fn persist_scope_snapshot(&mut self) -> Result<(), StoreError> {
if self.head_hash.is_none() {
return Ok(());
}
let snapshot = {
let scope = self.scope.snapshot();
PodScopeSnapshot {
allow: scope.allow_rules(),
deny: scope.deny_rules(),
}
};
session_store::save_pod_scope(&self.store, self.session_id, &mut self.head_hash, &snapshot)
.await
}
/// Cloneable callback handed to dynamic-scope tools. A sync tool
/// callback cannot append to the async store directly, so the sink only
/// records the latest snapshot; `flush_pending_scope_snapshot` writes
/// it to the session log when the turn is persisted.
pub fn scope_change_sink(&self) -> Arc<dyn Fn(PodScopeSnapshot) + Send + Sync> {
let pending = self.pending_scope_snapshot.clone();
Arc::new(move |snapshot| {
*pending.lock().expect("pending_scope_snapshot poisoned") = Some(snapshot);
})
}
async fn flush_pending_scope_snapshot(&mut self) -> Result<(), StoreError> {
let snapshot = self
.pending_scope_snapshot
.lock()
.expect("pending_scope_snapshot poisoned")
.take();
if let Some(snapshot) = snapshot {
session_store::save_pod_scope(
&self.store,
self.session_id,
&mut self.head_hash,
&snapshot,
)
.await?;
}
Ok(())
}
/// Direct access to the underlying Worker.
pub fn worker(&self) -> &Worker<C, Mutable> {
self.worker.as_ref().expect("worker taken during run")
@ -391,6 +451,26 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
self.usage_history.clone()
}
/// Handle to the per-LLM-request `UsageTracker`.
///
/// Sibling modules (e.g. the prune observer) clone this `Arc` to stash
/// per-request side state (e.g. a `correlation_id`) that pairs with
/// the next `LlmUsage`.
pub(crate) fn usage_tracker_handle(&self) -> Arc<UsageTracker> {
self.usage_tracker.clone()
}
/// Handle to the synchronous `MetricsTracker` buffer.
///
/// Worker callbacks (e.g. the prune observer) clone this `Arc` and
/// `.push(metric)` into it; Pod drains it in `persist_turn` and
/// writes each metric via `session_metrics::record_metric`.
pub(crate) fn metrics_tracker_handle(
&self,
) -> Arc<crate::compact::metrics_tracker::MetricsTracker> {
self.metrics_tracker.clone()
}
/// Attach the session-scoped file-operation tracker from the builtin
/// `tools` crate. Called by the Controller immediately after it
/// registers the builtin tools on the Worker. Overwrites any
@ -428,6 +508,28 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
}
}
/// Append a metric, swallowing errors so observability writes never
/// fail the surrounding turn. On failure the head hash stays put
/// (the entry is dropped) and a `Warn` alert + `tracing::warn!` are
/// emitted so the failure isn't completely silent.
async fn try_record_metric(&mut self, metric: &session_metrics::Metric) {
if let Err(err) = session_metrics::record_metric(
&self.store,
self.session_id,
&mut self.head_hash,
metric,
)
.await
{
warn!(name = %metric.name, error = %err, "failed to record session metric; dropping");
self.alert(
AlertLevel::Warn,
AlertSource::Pod,
format!("failed to record metric `{}`: {}", metric.name, err),
);
}
}
/// Broadcast a typed `Event` to connected clients. No-op when no
/// `event_tx` is attached (tests / direct `Pod::new` usage) or when
/// no clients are currently subscribed.
@ -437,11 +539,13 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
}
}
/// Push a `Method::Notify` entry onto the pending buffer.
/// Push a `Method::Notify` (or rendered `Method::PodEvent`) entry
/// onto the pending buffer.
///
/// The notification will be injected as an `Item::system_message`
/// into the next outgoing LLM request context (not into history).
/// See [`NotifyBuffer`] for overflow behaviour.
/// The notification will be appended to `worker.history` as an
/// `Item::system_message` just before the next LLM request, via
/// `PodInterceptor::pending_history_appends`. See [`NotifyBuffer`]
/// for overflow behaviour and the lane-of-record rationale.
pub fn push_notify(&self, message: String) {
self.pending_notifies.push(message);
}
@ -903,6 +1007,7 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
let hash =
session_store::create_session_with_id(&self.store, self.session_id, state).await?;
self.head_hash = Some(hash);
self.persist_scope_snapshot().await?;
return Ok(());
}
let prev_session_id = self.session_id;
@ -1059,6 +1164,8 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
session_store::save_delta(&self.store, self.session_id, &mut self.head_hash, new_items)
.await?;
self.flush_pending_scope_snapshot().await?;
let turn_count = self.worker.as_ref().unwrap().turn_count();
session_store::save_turn_end(
&self.store,
@ -1068,13 +1175,37 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
)
.await?;
// Flush any sync-buffered metrics from this run first
// (currently `prune.fire` / `prune.skip` from the prune observer).
// Ordered before LlmUsage so that a `prune.fire` and the
// `prune.post_request` derived from the matching usage record
// appear in the log close together.
//
// Metric writes are intentionally non-fatal: a failure here
// surfaces as a `Warn` alert + `tracing::warn!` and the loop
// continues. Metrics are observability data, not load-bearing
// for run correctness, so a transient FS error must not poison
// the turn record (`save_delta` / `save_turn_end` already landed
// by this point, and `save_run_completed` still needs to land).
let pending_metrics = self.metrics_tracker.drain();
for metric in pending_metrics {
self.try_record_metric(&metric).await;
}
// Persist any LLM Usage measurements collected during this run.
// One LogEntry::LlmUsage per LLM call (the tool loop may have run
// many calls within a single Pod::run). Each is also appended to
// the in-memory `usage_history` so token-accounting APIs see it
// before the next run.
// before the next run. Records carrying a `correlation_id` (set
// by an upstream observer such as the prune projection) also get
// a paired `prune.post_request` metric so cache_read/write can be
// joined back to the originating event.
let usage_records = self.usage_tracker.drain();
for record in usage_records {
for recorded in usage_records {
let crate::compact::usage_tracker::RecordedUsage {
record,
correlation_id,
} = recorded;
session_store::save_usage(
&self.store,
self.session_id,
@ -1086,6 +1217,14 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
record.output_tokens,
)
.await?;
if let Some(id) = correlation_id {
let metric = session_metrics::Metric::now("prune.post_request")
.with_correlation_id(&id)
.with_value(record.cache_read_tokens as f64)
.with_dimension("cache_write_tokens", record.cache_write_tokens.to_string())
.with_dimension("history_len", record.history_len.to_string());
self.try_record_metric(&metric).await;
}
self.usage_history
.lock()
.expect("usage_history poisoned")
@ -1365,6 +1504,7 @@ impl<C: LlmClient, St: Store> Pod<C, St> {
.lock()
.expect("usage_history poisoned")
.clear();
self.persist_scope_snapshot().await?;
// Reset Phase 1 pointer alongside usage_history: the compacted
// session has a fresh log with no `LogEntry::Extension` entries
// yet, so a cold restore here would set extract_pointer to None
@ -1895,6 +2035,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
interceptor_installed: false,
compact_state: None,
usage_tracker: Arc::new(UsageTracker::new()),
metrics_tracker: Arc::new(crate::compact::metrics_tracker::MetricsTracker::new()),
usage_history: Arc::new(Mutex::new(Vec::new())),
tracker: None,
system_prompt_template: common.system_prompt_template,
@ -1908,6 +2049,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
workflow_registry: common.workflow_registry,
memory_layout: common.memory_layout,
inject_resident_knowledge: true,
pending_scope_snapshot: Arc::new(Mutex::new(None)),
extract_in_flight: Arc::new(AtomicBool::new(false)),
consolidation_in_flight: Arc::new(AtomicBool::new(false)),
extract_pointer: Mutex::new(None),
@ -1956,6 +2098,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
interceptor_installed: false,
compact_state: None,
usage_tracker: Arc::new(UsageTracker::new()),
metrics_tracker: Arc::new(crate::compact::metrics_tracker::MetricsTracker::new()),
usage_history: Arc::new(Mutex::new(Vec::new())),
tracker: None,
system_prompt_template: common.system_prompt_template,
@ -1969,6 +2112,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
workflow_registry: common.workflow_registry,
memory_layout: common.memory_layout,
inject_resident_knowledge: true,
pending_scope_snapshot: Arc::new(Mutex::new(None)),
extract_in_flight: Arc::new(AtomicBool::new(false)),
consolidation_in_flight: Arc::new(AtomicBool::new(false)),
extract_pointer: Mutex::new(None),
@ -2006,8 +2150,20 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
if state.head_hash.is_none() {
return Err(PodError::SessionEmpty { session_id });
}
let scope_snapshot = state
.pod_scope
.clone()
.ok_or(PodError::SessionScopeMissing { session_id })?;
let common = prepare_pod_common(&manifest, &loader, /* parse_template */ false)?;
let common = prepare_pod_common_with_scope(
&manifest,
&loader,
/* parse_template */ false,
ScopeConfig {
allow: scope_snapshot.allow,
deny: scope_snapshot.deny,
},
)?;
// Atomic: register_pod inside install_top_level rejects when
// another live allocation already holds `session_id`. Wrapping
@ -2018,11 +2174,12 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
.map_err(ScopeLockError::from)?
.join(&manifest.pod.name)
.join("sock");
let scope_allocation = pod_registry::install_top_level(
let scope_allocation = pod_registry::install_top_level_with_deny(
manifest.pod.name.clone(),
std::process::id(),
socket_path,
common.scope.allow_rules(),
common.scope.deny_rules(),
session_id,
)?;
@ -2067,6 +2224,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
interceptor_installed: false,
compact_state: None,
usage_tracker: Arc::new(UsageTracker::new()),
metrics_tracker: Arc::new(crate::compact::metrics_tracker::MetricsTracker::new()),
usage_history: Arc::new(Mutex::new(state.usage_history)),
tracker: None,
// Restore replays the saved system_prompt verbatim — no
@ -2082,6 +2240,7 @@ impl<St: Store> Pod<Box<dyn LlmClient>, St> {
workflow_registry: common.workflow_registry,
memory_layout: common.memory_layout,
inject_resident_knowledge: true,
pending_scope_snapshot: Arc::new(Mutex::new(None)),
extract_in_flight: Arc::new(AtomicBool::new(false)),
consolidation_in_flight: Arc::new(AtomicBool::new(false)),
extract_pointer: Mutex::new(extract_pointer),
@ -2296,6 +2455,11 @@ pub enum PodError {
#[error("session {session_id} has no entries to restore")]
SessionEmpty { session_id: SessionId },
#[error(
"session {session_id} has no persisted scope snapshot; refusing resume without explicit scope"
)]
SessionScopeMissing { session_id: SessionId },
}
/// Bundle of resources that every high-level Pod constructor needs:
@ -2329,6 +2493,27 @@ fn prepare_pod_common(
) -> Result<PodCommon, PodError> {
let pwd = current_pwd()?;
let scope = build_scope_with_memory(manifest, &pwd)?;
prepare_pod_common_from_scope(manifest, loader, parse_template, pwd, scope)
}
fn prepare_pod_common_with_scope(
manifest: &PodManifest,
loader: &PromptLoader,
parse_template: bool,
scope_config: ScopeConfig,
) -> Result<PodCommon, PodError> {
let pwd = current_pwd()?;
let scope = Scope::from_config(&scope_config).map_err(PodError::Scope)?;
prepare_pod_common_from_scope(manifest, loader, parse_template, pwd, scope)
}
fn prepare_pod_common_from_scope(
manifest: &PodManifest,
loader: &PromptLoader,
parse_template: bool,
pwd: PathBuf,
scope: Scope,
) -> Result<PodCommon, PodError> {
if !scope.is_readable(&pwd) {
return Err(PodError::PwdOutsideScope { pwd });
}

View File

@ -20,6 +20,7 @@ use manifest::{
use protocol::Method;
use protocol::stream::JsonLineWriter;
use serde::Deserialize;
use session_store::PodScopeSnapshot;
use tokio::net::UnixStream;
use tokio::process::Command;
use tokio::time::sleep;
@ -127,6 +128,9 @@ pub struct SpawnPodTool {
/// `effective_write` semantics: Write is the only permission
/// tracked across Pods, so revocation only touches Write.
spawner_scope: SharedScope,
/// Called after the spawner scope has been updated so the new
/// effective scope can be persisted to the session log.
scope_changed: Arc<dyn Fn(PodScopeSnapshot) + Send + Sync>,
}
impl SpawnPodTool {
@ -139,6 +143,7 @@ impl SpawnPodTool {
parent_socket: Option<PathBuf>,
spawner_model: ModelManifest,
spawner_scope: SharedScope,
scope_changed: Arc<dyn Fn(PodScopeSnapshot) + Send + Sync>,
) -> Self {
Self {
spawner_name,
@ -149,6 +154,7 @@ impl SpawnPodTool {
parent_socket,
spawner_model,
spawner_scope,
scope_changed,
}
}
}
@ -243,9 +249,12 @@ impl Tool for SpawnPodTool {
if !revoke_write.is_empty() {
self.spawner_scope
.update(|cur| cur.with_added_deny_rules(revoke_write.clone()))
.map_err(|e| {
ToolError::ExecutionFailed(format!("revoke spawner scope: {e}"))
})?;
.map_err(|e| ToolError::ExecutionFailed(format!("revoke spawner scope: {e}")))?;
let current = self.spawner_scope.snapshot();
(self.scope_changed)(PodScopeSnapshot {
allow: current.allow_rules(),
deny: current.deny_rules(),
});
}
send_run(&predicted_socket, &input.task).await?;
@ -488,6 +497,7 @@ pub fn spawn_pod_tool(
parent_socket: Option<PathBuf>,
spawner_model: ModelManifest,
spawner_scope: SharedScope,
scope_changed: Arc<dyn Fn(PodScopeSnapshot) + Send + Sync>,
) -> ToolDefinition {
Arc::new(move || {
let schema = schemars::schema_for!(SpawnPodInput);
@ -504,6 +514,7 @@ pub fn spawn_pod_tool(
parent_socket.clone(),
spawner_model.clone(),
spawner_scope.clone(),
scope_changed.clone(),
));
(meta, tool)
})
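
The `scope_changed` callback threaded through `SpawnPodTool` follows a "latest value wins, flush later" shape: the sync side only overwrites a shared slot, and the owner drains it at a point where it can write to the log. A minimal sketch of that shape, assuming plain `std` types; `Snapshot`, `Owner`, and the `Vec`-backed `log` are hypothetical stand-ins for `PodScopeSnapshot`, `Pod`, and the session store.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `PodScopeSnapshot`.
#[derive(Clone, Debug, PartialEq)]
pub struct Snapshot {
    pub allow: Vec<String>,
    pub deny: Vec<String>,
}

/// Same callback shape as the `scope_changed` field in the diff.
pub type Sink = Arc<dyn Fn(Snapshot) + Send + Sync>;

/// Hypothetical owner; `log` stands in for the async session store.
pub struct Owner {
    pending: Arc<Mutex<Option<Snapshot>>>,
    pub log: Vec<Snapshot>,
}

impl Owner {
    pub fn new() -> Self {
        Self {
            pending: Arc::new(Mutex::new(None)),
            log: Vec::new(),
        }
    }

    /// Cloneable callback for sync code: overwrites the pending slot,
    /// so only the most recent snapshot survives until the next flush.
    pub fn sink(&self) -> Sink {
        let pending = Arc::clone(&self.pending);
        Arc::new(move |snapshot| {
            *pending.lock().expect("pending poisoned") = Some(snapshot);
        })
    }

    /// Drain the slot into the log. Returns `true` if something was written.
    pub fn flush(&mut self) -> bool {
        let taken = self.pending.lock().expect("pending poisoned").take();
        match taken {
            Some(snapshot) => {
                self.log.push(snapshot);
                true
            }
            None => false,
        }
    }
}
```

Storing a single `Option` rather than a queue means intermediate snapshots are discarded when several scope changes occur before a flush; that matches a design in which only the final effective scope needs to be restored.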

View File

@ -532,12 +532,16 @@ async fn notify_while_idle_auto_starts_turn_and_injects_system_message() {
.unwrap();
// Wait for the auto-started turn to complete.
let mut saw_notify_echo = false;
let mut saw_turn_end = false;
let deadline = tokio::time::Instant::now() + std::time::Duration::from_secs(2);
loop {
tokio::select! {
event = rx.recv() => {
match event {
Ok(Event::Notify { ref message }) if message == "turn finished" => {
saw_notify_echo = true;
}
Ok(Event::TurnEnd { .. }) => { saw_turn_end = true; break; }
Err(_) => break,
_ => {}
@ -546,28 +550,47 @@ async fn notify_while_idle_auto_starts_turn_and_injects_system_message() {
_ = tokio::time::sleep_until(deadline) => break,
}
}
assert!(
saw_notify_echo,
"Method::Notify on idle Pod should be echoed as Event::Notify"
);
assert!(saw_turn_end, "auto-triggered turn should complete");
// Status flips back to Idle on the controller thread after RunEnd.
tokio::time::sleep(std::time::Duration::from_millis(50)).await;
assert_eq!(handle.shared_state.get_status(), PodStatus::Idle);
// Exactly one request was made; it must contain the formatted
// notification as the last item (injected into request_context by
// PodInterceptor::pre_llm_request).
// notification as one of the items (committed to history by
// PodInterceptor::pending_history_appends and cloned into the
// request context for that turn).
let requests = client_for_assert.captured_requests();
assert_eq!(requests.len(), 1, "one LLM call expected");
let last_item_text = requests[0]
let notify_in_request = requests[0]
.items
.last()
.and_then(|i| i.as_text())
.unwrap_or_default()
.to_string();
.iter()
.any(|i| i.as_text().is_some_and(|t| t.contains("[Notification]") && t.contains("turn finished")));
assert!(
last_item_text.contains("[Notification]"),
"injected system message missing, got: {last_item_text:?}"
notify_in_request,
"injected system message missing from request, got items: {:?}",
requests[0]
.items
.iter()
.filter_map(|i| i.as_text())
.collect::<Vec<_>>()
);
// The notification must also be persisted into the Worker history
// (and therefore eventually into history.json), per
// tickets/notify-history-persist.md.
let history = handle.shared_state.history();
let notify_in_history = history
.iter()
.any(|i| i.as_text().is_some_and(|t| t.contains("[Notification]") && t.contains("turn finished")));
assert!(
notify_in_history,
"notify must be committed to worker.history, got items: {:?}",
history.iter().filter_map(|i| i.as_text()).collect::<Vec<_>>()
);
assert!(last_item_text.contains("turn finished"));
assert!(last_item_text.contains("not a blocking request"));
}
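The assertions above check one ordering rule: a notification is appended to the worker history first, and the request context is derived from that history, so the two cannot disagree. A minimal sketch of that ordering follows. `SketchWorker` and `build_request` are hypothetical names used only for illustration, and plain strings stand in for the real item type.

```rust
/// Hypothetical, simplified worker: history items are plain strings here.
/// This is not the real `Worker` type from `llm_worker`.
#[derive(Debug, Default)]
pub struct SketchWorker {
    history: Vec<String>,
}

impl SketchWorker {
    /// Commits pending notifications to history, then derives the request
    /// context from the committed history. Nothing reaches the request
    /// without also being present in `history`.
    pub fn build_request(&mut self, pending: Vec<String>) -> Vec<String> {
        self.history.extend(pending);
        self.history.clone()
    }

    /// Read-only view of the committed history.
    pub fn history(&self) -> &[String] {
        &self.history
    }
}
```

Because the request is a clone of the committed history, a resumed session that replays the same history would rebuild the same request in this simplified model.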
#[tokio::test]
@ -585,12 +608,18 @@ async fn pod_event_turn_ended_while_idle_auto_starts_turn_and_injects_system_mes
.await
.unwrap();
let mut saw_pod_event_echo = false;
let mut saw_turn_end = false;
let deadline = tokio::time::Instant::now() + std::time::Duration::from_secs(2);
loop {
tokio::select! {
event = rx.recv() => {
match event {
Ok(Event::PodEvent(protocol::PodEvent::TurnEnded { ref pod_name }))
if pod_name == "child" =>
{
saw_pod_event_echo = true;
}
Ok(Event::TurnEnd { .. }) => { saw_turn_end = true; break; }
Err(_) => break,
_ => {}
@ -599,6 +628,10 @@ async fn pod_event_turn_ended_while_idle_auto_starts_turn_and_injects_system_mes
_ = tokio::time::sleep_until(deadline) => break,
}
}
assert!(
saw_pod_event_echo,
"Method::PodEvent on idle Pod should be echoed as Event::PodEvent"
);
assert!(
saw_turn_end,
"PodEvent::TurnEnded on idle Pod should auto-start a turn"
@ -612,19 +645,33 @@ async fn pod_event_turn_ended_while_idle_auto_starts_turn_and_injects_system_mes
1,
"auto-kick should issue exactly one LLM request"
);
let last_item_text = requests[0]
.items
.last()
.and_then(|i| i.as_text())
.unwrap_or_default()
.to_string();
let event_in_request = requests[0].items.iter().any(|i| {
i.as_text().is_some_and(|t| {
t.contains("[Notification]") && t.contains("child") && t.contains("finished a turn")
})
});
assert!(
last_item_text.contains("[Notification]"),
"injected system message missing, got: {last_item_text:?}"
event_in_request,
"rendered TurnEnded text missing from request, got items: {:?}",
requests[0]
.items
.iter()
.filter_map(|i| i.as_text())
.collect::<Vec<_>>()
);
// Same item must be present in worker.history (persisted lane),
// not just the per-request clone — see tickets/notify-history-persist.md.
let history = handle.shared_state.history();
let event_in_history = history.iter().any(|i| {
i.as_text().is_some_and(|t| {
t.contains("[Notification]") && t.contains("child") && t.contains("finished a turn")
})
});
assert!(
last_item_text.contains("child") && last_item_text.contains("finished a turn"),
"rendered TurnEnded text missing, got: {last_item_text:?}"
event_in_history,
"PodEvent must be committed to worker.history, got items: {:?}",
history.iter().filter_map(|i| i.as_text()).collect::<Vec<_>>()
);
}
@ -644,6 +691,8 @@ async fn notify_while_running_does_not_emit_already_running_error() {
.unwrap();
// Drain events until the run ends; AlreadyRunning must never appear.
// The in-flight branch must still echo the Notify as a log element.
let mut saw_notify_echo = false;
let deadline = tokio::time::Instant::now() + std::time::Duration::from_secs(2);
loop {
tokio::select! {
@ -652,6 +701,9 @@ async fn notify_while_running_does_not_emit_already_running_error() {
Ok(Event::Error { code, .. }) if code == pod::ErrorCode::AlreadyRunning => {
panic!("Notify while running must not produce AlreadyRunning");
}
Ok(Event::Notify { ref message }) if message == "ping" => {
saw_notify_echo = true;
}
Ok(Event::TurnEnd { .. }) => break,
Err(_) => break,
_ => {}
@ -660,6 +712,10 @@ async fn notify_while_running_does_not_emit_already_running_error() {
_ = tokio::time::sleep_until(deadline) => break,
}
}
assert!(
saw_notify_echo,
"in-flight Notify must still be echoed as Event::Notify"
);
}
#[tokio::test]
@ -751,6 +807,7 @@ async fn socket_pod_event_turn_ended_while_idle_auto_starts_turn() {
.await
.unwrap();
let mut saw_pod_event_echo = false;
let mut saw_turn_start = false;
let mut saw_turn_end = false;
@ -759,6 +816,11 @@ async fn socket_pod_event_turn_ended_while_idle_auto_starts_turn() {
tokio::select! {
event = reader.next::<Event>() => {
match event {
Ok(Some(Event::PodEvent(protocol::PodEvent::TurnEnded { pod_name })))
if pod_name == "child" =>
{
saw_pod_event_echo = true;
}
Ok(Some(Event::TurnStart { .. })) => saw_turn_start = true,
Ok(Some(Event::TurnEnd { .. })) => {
saw_turn_end = true;
@ -772,6 +834,10 @@ async fn socket_pod_event_turn_ended_while_idle_auto_starts_turn() {
}
}
assert!(
saw_pod_event_echo,
"PodEvent::TurnEnded via socket should be echoed as Event::PodEvent"
);
assert!(
saw_turn_start,
"PodEvent::TurnEnded via socket should auto-start a turn"


@ -80,3 +80,31 @@ async fn restore_from_manifest_rejects_empty_session_log() {
Ok(_) => panic!("expected empty session log to fail"),
}
}
#[tokio::test]
async fn restore_from_manifest_rejects_session_without_scope_snapshot() {
let _lock = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
let store_tmp = tempfile::tempdir().unwrap();
let store = FsStore::new(store_tmp.path()).await.unwrap();
let manifest = pod::PodManifest::from_toml(MINIMAL_MANIFEST_TOML).unwrap();
let id = session_store::new_session_id();
let state = session_store::SessionStartState {
system_prompt: None,
config: &Default::default(),
history: &[],
};
session_store::create_session_with_id(&store, id, state)
.await
.unwrap();
let result =
Pod::restore_from_manifest(id, manifest, store, pod::PromptLoader::builtins_only()).await;
match result {
Err(PodError::SessionScopeMissing { session_id }) => assert_eq!(session_id, id),
Err(other) => panic!("expected SessionScopeMissing, got {other:?}"),
Ok(_) => panic!("expected missing scope snapshot to fail"),
}
}


@ -0,0 +1,467 @@
//! End-to-end coverage for the prune-projection metrics path.
//!
//! Drives a Pod with a scripted mock LLM client and a custom tool that
//! returns a long `ToolOutput.content`, then inspects the persisted
//! session log to verify:
//!
//! - `prune.skip { reason: "no_candidates" }` lands when the protected-turn
//! window covers the entire history.
//! - `prune.fire` lands once enough turns + usage measurements exist for
//! the projection to actually apply.
//! - The fire metric and the immediately-following `prune.post_request`
//! metric share the same `correlation_id`, so cache_read / cache_write
//! from the LlmUsage that triggered the projection can be joined back
//! to the originating event.
//! - `prune.skip { reason: "below_min_savings" }` lands when candidates
//! exist but their estimated savings are below the configured floor.
use std::pin::Pin;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
use async_trait::async_trait;
use futures::Stream;
use llm_worker::Worker;
use llm_worker::llm_client::event::{
Event as LlmEvent, ResponseStatus, StatusEvent, UsageEvent,
};
use llm_worker::llm_client::{ClientError, LlmClient, Request};
use llm_worker::tool::{Tool, ToolDefinition, ToolError, ToolMeta, ToolOutput};
use session_metrics::{DOMAIN, Metric, metrics_from_extensions};
use session_store::{
EntryHash, FsStore, HashedEntry, LogEntry, SessionId, Store, StoreError, TraceEntry,
};
use pod::{Pod, PodManifest};
#[derive(Clone)]
struct MockClient {
responses: Arc<Vec<Vec<LlmEvent>>>,
call_count: Arc<AtomicUsize>,
}
impl MockClient {
fn new(responses: Vec<Vec<LlmEvent>>) -> Self {
Self {
responses: Arc::new(responses),
call_count: Arc::new(AtomicUsize::new(0)),
}
}
}
#[async_trait]
impl LlmClient for MockClient {
fn clone_boxed(&self) -> Box<dyn LlmClient> {
Box::new(self.clone())
}
async fn stream(
&self,
_request: Request,
) -> Result<Pin<Box<dyn Stream<Item = Result<LlmEvent, ClientError>> + Send>>, ClientError>
{
let count = self.call_count.fetch_add(1, Ordering::SeqCst);
if count >= self.responses.len() {
return Err(ClientError::Config("mock client exhausted".into()));
}
let events = self.responses[count].clone();
let stream = futures::stream::iter(events.into_iter().map(Ok));
Ok(Box::pin(stream))
}
}
/// Tool that returns a fixed `ToolOutput { summary, content: Some(big) }`.
/// `content` is long enough for prune savings to comfortably clear small
/// `min_savings` thresholds.
struct BigContentTool {
summary: &'static str,
content: String,
}
#[async_trait]
impl Tool for BigContentTool {
async fn execute(&self, _input: &str) -> Result<ToolOutput, ToolError> {
Ok(ToolOutput {
summary: self.summary.into(),
content: Some(self.content.clone()),
})
}
}
fn big_content_tool_definition(name: &'static str) -> ToolDefinition {
Arc::new(move || {
let summary = "tool result summary";
let content = "x".repeat(2048);
(
ToolMeta::new(name)
.description("test tool that returns a long content")
.input_schema(serde_json::json!({"type": "object"})),
Arc::new(BigContentTool { summary, content }) as Arc<dyn Tool>,
)
})
}
fn usage_event(input_total: u64, cache_read: u64, cache_write: u64, output: u64) -> LlmEvent {
LlmEvent::Usage(UsageEvent {
input_tokens: Some(input_total),
output_tokens: Some(output),
total_tokens: Some(input_total + output),
cache_read_input_tokens: Some(cache_read),
cache_creation_input_tokens: Some(cache_write),
})
}
/// Tool-call response from the assistant: emits a `tool_use` block then a
/// usage event so usage_history gains a measurement on this turn.
fn tool_use_response(call_id: &str, tool_name: &str) -> Vec<LlmEvent> {
vec![
LlmEvent::tool_use_start(0, call_id, tool_name),
LlmEvent::tool_input_delta(0, "{}"),
LlmEvent::tool_use_stop(0),
usage_event(500, 0, 0, 10),
LlmEvent::Status(StatusEvent {
status: ResponseStatus::Completed,
}),
]
}
/// Plain text response with explicit cache_read/cache_write so that
/// `prune.post_request` can carry meaningful values when this is the
/// LLM call that follows a `prune.fire` event.
fn text_response_with_cache(text: &str, cache_read: u64, cache_write: u64) -> Vec<LlmEvent> {
vec![
LlmEvent::text_block_start(0),
LlmEvent::text_delta(0, text),
LlmEvent::text_block_stop(0, None),
usage_event(800, cache_read, cache_write, 5),
LlmEvent::Status(StatusEvent {
status: ResponseStatus::Completed,
}),
]
}
fn manifest_toml(prune_protected_turns: usize, prune_min_savings: u64) -> String {
format!(
r#"
[pod]
name = "test-pod"
pwd = "./"
[model]
scheme = "anthropic"
model_id = "test-model"
[worker]
max_tokens = 100
[compaction]
prune_protected_turns = {prune_protected_turns}
prune_min_savings = {prune_min_savings}
[[scope.allow]]
target = "./"
permission = "write"
"#
)
}
async fn make_pod(
manifest_toml: String,
client: MockClient,
tool_name: &'static str,
) -> (Pod<MockClient, FsStore>, tempfile::TempDir, tempfile::TempDir) {
let manifest = PodManifest::from_toml(&manifest_toml).unwrap();
let store_tmp = tempfile::tempdir().unwrap();
let store = FsStore::new(store_tmp.path()).await.unwrap();
let pwd_tmp = tempfile::tempdir().unwrap();
let pwd = pwd_tmp.path().to_path_buf();
let scope = pod::Scope::writable(&pwd).unwrap();
let mut worker = Worker::new(client);
worker.register_tool(big_content_tool_definition(tool_name));
let pod = Pod::new(manifest, worker, store, pwd, scope).await.unwrap();
(pod, store_tmp, pwd_tmp)
}
/// Drive Pod through enough runs to exercise both skip-no_candidates and
/// fire branches, then read the session log back and assert the metric
/// stream.
#[tokio::test]
async fn prune_metrics_emit_skip_then_fire_with_post_request_join() {
// Run 1 (request 0): tool_use → triggers tool execution → request 1
// on the second iteration to produce the assistant reply.
// Run 2 (request 2): plain assistant text. Prune evaluation here
// sees user1's tool_result outside the 1-protected-turn window and
// should fire.
let client = MockClient::new(vec![
tool_use_response("call-1", "big_tool"),
text_response_with_cache("ok", 0, 200),
text_response_with_cache("done", 1234, 50),
]);
let (mut pod, _store_tmp, _pwd_tmp) =
make_pod(manifest_toml(1, 1), client, "big_tool").await;
let session_id = pod.session_id();
// Cloning the store handle to read the session log back after the
// runs complete — the Pod retains its own copy.
let store = pod.store().clone();
pod.run_text("first").await.unwrap();
pod.run_text("second").await.unwrap();
let state = session_store::restore(&store, session_id).await.unwrap();
let metrics = metrics_from_extensions(&state.extensions);
// Run 1 has 2 LLM iterations (tool loop), each evaluates prune with
// only one user-message turn → 2x skip{no_candidates}.
// Run 2 has 1 LLM iteration with enough turns → 1x fire +
// 1x post_request paired by correlation_id.
let names: Vec<&str> = metrics.iter().map(|m| m.name.as_str()).collect();
assert!(
names.contains(&"prune.skip"),
"expected prune.skip in {names:?}"
);
assert!(
names.contains(&"prune.fire"),
"expected prune.fire in {names:?}"
);
assert!(
names.contains(&"prune.post_request"),
"expected prune.post_request in {names:?}"
);
// All skips in run 1 must record reason=no_candidates.
for m in metrics.iter().filter(|m| m.name == "prune.skip") {
assert_eq!(
m.dimensions.get("reason").map(String::as_str),
Some("no_candidates"),
"skip metric should be no_candidates here, got {m:?}"
);
assert!(m.correlation_id.is_none());
}
// The fire metric carries dimensions and correlation_id.
let fire = metrics
.iter()
.find(|m| m.name == "prune.fire")
.expect("prune.fire missing");
assert!(
fire.dimensions.contains_key("candidate_count"),
"fire missing candidate_count: {fire:?}"
);
assert!(
fire.dimensions.contains_key("border_turn"),
"fire missing border_turn: {fire:?}"
);
assert!(
fire.value.is_some(),
"fire missing estimated_savings value"
);
let fire_id = fire
.correlation_id
.as_ref()
.expect("fire metric missing correlation_id");
// The post_request metric should carry the same id as the fire metric,
// and its value/dimension should reflect the cache numbers from the
// final text_response_with_cache call (cache_read=1234, cache_write=50).
let post = metrics
.iter()
.find(|m| m.name == "prune.post_request")
.expect("prune.post_request missing");
assert_eq!(post.correlation_id.as_ref(), Some(fire_id));
assert_eq!(post.value, Some(1234.0));
assert_eq!(
post.dimensions.get("cache_write_tokens").map(String::as_str),
Some("50")
);
assert!(post.dimensions.contains_key("history_len"));
}
/// `min_savings` set high enough that candidates exist but the estimated
/// savings always fall short → the second run should record
/// `prune.skip { reason: "below_min_savings" }`.
#[tokio::test]
async fn prune_metrics_record_below_min_savings_skip() {
let client = MockClient::new(vec![
tool_use_response("call-1", "big_tool"),
text_response_with_cache("ok", 0, 100),
text_response_with_cache("done", 0, 0),
]);
let (mut pod, _store_tmp, _pwd_tmp) =
make_pod(manifest_toml(1, u64::MAX), client, "big_tool").await;
let session_id = pod.session_id();
let store = pod.store().clone();
pod.run_text("first").await.unwrap();
pod.run_text("second").await.unwrap();
let state = session_store::restore(&store, session_id).await.unwrap();
let metrics = metrics_from_extensions(&state.extensions);
let below = metrics
.iter()
.find(|m| {
m.name == "prune.skip"
&& m.dimensions.get("reason").map(String::as_str) == Some("below_min_savings")
})
.expect("expected prune.skip with reason=below_min_savings");
assert!(
below.dimensions.contains_key("candidate_count"),
"below_min_savings skip should report candidate_count: {below:?}"
);
assert!(
below.value.is_some(),
"below_min_savings skip should report estimated savings as value: {below:?}"
);
// No prune.fire for this scenario.
assert!(metrics.iter().all(|m| m.name != "prune.fire"));
// No prune.post_request either (no fire to join with).
assert!(metrics.iter().all(|m| m.name != "prune.post_request"));
}
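The three outcomes exercised by these tests (`no_candidates`, `below_min_savings`, and fire) can be pictured as one small decision function. The sketch below is an assumption-laden illustration, not the Pod's actual prune logic: `PruneOutcome`, `evaluate_prune`, and the per-turn savings input are hypothetical, and it ignores the usage-measurement requirement mentioned in the module docs.

```rust
/// Outcome of one prune evaluation, named after the metric it would emit.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum PruneOutcome {
    Skip { reason: &'static str },
    Fire { candidate_count: usize, estimated_savings: u64 },
}

/// `savings_per_turn[i]` lists the estimated token savings of each prunable
/// tool result in turn `i` (oldest turn first). The newest `protected_turns`
/// turns are never pruned.
pub fn evaluate_prune(
    savings_per_turn: &[Vec<u64>],
    protected_turns: usize,
    min_savings: u64,
) -> PruneOutcome {
    // Turns older than the protected window are eligible.
    let unprotected = savings_per_turn.len().saturating_sub(protected_turns);
    let candidates: Vec<u64> = savings_per_turn[..unprotected]
        .iter()
        .flatten()
        .copied()
        .collect();

    if candidates.is_empty() {
        return PruneOutcome::Skip { reason: "no_candidates" };
    }

    // Saturating sum so that extreme inputs cannot overflow.
    let estimated_savings = candidates
        .iter()
        .fold(0u64, |acc, s| acc.saturating_add(*s));

    if estimated_savings < min_savings {
        return PruneOutcome::Skip { reason: "below_min_savings" };
    }

    PruneOutcome::Fire {
        candidate_count: candidates.len(),
        estimated_savings,
    }
}
```

With `protected_turns = 1`, a single-turn history yields `no_candidates`, while a second turn exposes the first turn's tool result as a candidate. That mirrors how the tests above use run 1 and run 2.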
/// `Store` wrapper that delegates to an inner `FsStore` for everything
/// except `LogEntry::Extension { domain: "metrics", .. }` appends, which
/// it rejects with an `Io` error. Lets us drive the `try_record_metric`
/// failure path without affecting any other persistence write.
#[derive(Clone)]
struct MetricFailingStore {
inner: FsStore,
}
impl Store for MetricFailingStore {
async fn append(&self, id: SessionId, entry: &HashedEntry) -> Result<(), StoreError> {
if let LogEntry::Extension { domain, .. } = &entry.entry {
if domain == DOMAIN {
return Err(StoreError::Io(std::io::Error::other("synthetic failure")));
}
}
self.inner.append(id, entry).await
}
async fn read_all(&self, id: SessionId) -> Result<Vec<HashedEntry>, StoreError> {
self.inner.read_all(id).await
}
async fn list_sessions(&self) -> Result<Vec<SessionId>, StoreError> {
self.inner.list_sessions().await
}
async fn create_session(
&self,
id: SessionId,
entries: &[HashedEntry],
) -> Result<(), StoreError> {
self.inner.create_session(id, entries).await
}
async fn exists(&self, id: SessionId) -> Result<bool, StoreError> {
self.inner.exists(id).await
}
async fn read_head_hash(&self, id: SessionId) -> Result<Option<EntryHash>, StoreError> {
self.inner.read_head_hash(id).await
}
async fn append_trace(&self, id: SessionId, entry: &TraceEntry) -> Result<(), StoreError> {
self.inner.append_trace(id, entry).await
}
}
/// Metric write failures are non-fatal: the run still completes, the
/// session log carries no metric entries (drops), but a `Warn` alert
/// fires on the alerter so the TUI surface picks it up.
#[tokio::test]
async fn metric_write_failure_emits_warn_alert_and_does_not_abort_run() {
use protocol::{AlertLevel, AlertSource, Event};
use tokio::sync::broadcast;
let manifest_toml = manifest_toml(1, 1);
let manifest = PodManifest::from_toml(&manifest_toml).unwrap();
let store_tmp = tempfile::tempdir().unwrap();
let inner = FsStore::new(store_tmp.path()).await.unwrap();
let store = MetricFailingStore { inner };
let pwd_tmp = tempfile::tempdir().unwrap();
let pwd = pwd_tmp.path().to_path_buf();
let scope = pod::Scope::writable(&pwd).unwrap();
// Even with a tool registered, this run will only emit
// `prune.skip { reason: "no_candidates" }` (one user message,
// protected_turns=1 covers everything). That is enough to drive
// the failure path: at least one metric attempts to write.
let client = MockClient::new(vec![text_response_with_cache("hi", 0, 0)]);
let worker = Worker::new(client);
let mut pod = Pod::new(manifest, worker, store.clone(), pwd, scope)
.await
.unwrap();
let (tx, mut rx) = broadcast::channel::<Event>(64);
let alerter = pod::Alerter::new(tx);
pod.attach_alerter(alerter);
let session_id = pod.session_id();
// Run completes successfully despite metric failure.
pod.run_text("hello").await.unwrap();
// No metrics ended up in the log (writes were rejected).
let state = session_store::restore(&store, session_id).await.unwrap();
let metrics = metrics_from_extensions(&state.extensions);
assert!(metrics.is_empty(), "metrics must drop on write failure");
// The alerter saw at least one Warn from AlertSource::Pod.
let mut saw_warn = false;
while let Ok(ev) = rx.try_recv() {
if let Event::Alert(a) = ev {
if a.level == AlertLevel::Warn
&& a.source == AlertSource::Pod
&& a.message.contains("metric")
{
saw_warn = true;
break;
}
}
}
assert!(saw_warn, "expected Warn/Pod alert about metric failure");
}
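The behaviour tested above (a failed metric write is dropped and reported, and the run continues) amounts to downgrading an error to a warning. A minimal sketch under stated assumptions: `WarnAlert` and `try_record_metric_sketch` are hypothetical stand-ins, and the real alert types live in `protocol`.

```rust
/// Hypothetical alert record for this sketch only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WarnAlert {
    pub message: String,
}

/// Runs a metric write and downgrades any failure to a warning.
/// Returns `true` when the metric was written, `false` when it was dropped.
pub fn try_record_metric_sketch<E: std::fmt::Display>(
    write: impl FnOnce() -> Result<(), E>,
    alerts: &mut Vec<WarnAlert>,
) -> bool {
    match write() {
        Ok(()) => true,
        Err(err) => {
            // The caller keeps going; only the metric is lost.
            alerts.push(WarnAlert {
                message: format!("failed to record metric: {err}"),
            });
            false
        }
    }
}
```

The caller never sees an `Err`, so a broken metrics lane cannot abort the surrounding run in this model.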
/// Sessions that have no metrics in the log restore cleanly: the
/// `RestoredState.extensions` simply contains no `metrics` domain
/// payloads, and `metrics_from_extensions` returns an empty Vec.
/// Backward-compatibility check for old logs predating this feature.
#[tokio::test]
async fn old_sessions_without_metrics_replay_cleanly() {
// Manifest without any `[compaction]` section → prune (and therefore
// the prune observer) is never installed, so no metrics get written.
let manifest_toml = r#"
[pod]
name = "test-pod"
pwd = "./"
[model]
scheme = "anthropic"
model_id = "test-model"
[worker]
max_tokens = 100
[[scope.allow]]
target = "./"
permission = "write"
"#;
let client = MockClient::new(vec![text_response_with_cache("hi", 0, 0)]);
let manifest = PodManifest::from_toml(manifest_toml).unwrap();
let store_tmp = tempfile::tempdir().unwrap();
let store = FsStore::new(store_tmp.path()).await.unwrap();
let pwd_tmp = tempfile::tempdir().unwrap();
let pwd = pwd_tmp.path().to_path_buf();
let scope = pod::Scope::writable(&pwd).unwrap();
let worker = Worker::new(client);
let mut pod = Pod::new(manifest, worker, store.clone(), pwd, scope)
.await
.unwrap();
let session_id = pod.session_id();
pod.run_text("hello").await.unwrap();
let state = session_store::restore(&store, session_id).await.unwrap();
let metrics = metrics_from_extensions(&state.extensions);
assert!(metrics.is_empty(), "no metrics should be recorded: {metrics:?}");
// And no extension entries at all in the metrics domain.
assert!(state.extensions.iter().all(|(d, _)| d != DOMAIN));
// Smoke check: `Metric::now` keeps the name it was given.
let m = Metric::now("smoke");
assert_eq!(m.name, "smoke");
}


@ -187,6 +187,7 @@ async fn spawn_pod_delegates_scope_and_sends_run() {
None,
dummy_model(),
spawner_scope.clone(),
std::sync::Arc::new(|_| {}),
);
let (_meta, tool) = def();
@ -275,6 +276,7 @@ async fn spawn_pod_rejects_scope_outside_spawner() {
None,
dummy_model(),
spawner_scope.clone(),
std::sync::Arc::new(|_| {}),
);
let (_meta, tool) = def();
@ -346,6 +348,7 @@ async fn spawn_pod_rolls_back_reservation_when_socket_never_appears() {
None,
dummy_model(),
spawner_scope.clone(),
std::sync::Arc::new(|_| {}),
);
let (_meta, tool) = def();


@ -214,6 +214,20 @@ pub enum Event {
UserMessage {
segments: Vec<Segment>,
},
/// Echo of `Method::Notify` received by this Pod. Broadcast on
/// receipt so subscribers can render the external input as a log
/// element. The same `message` is independently pushed into the
/// notification buffer for LLM injection (with prompt-pack
/// wrapping); this echo carries the raw payload and does not
/// imply any turn-boundary semantics.
Notify {
message: String,
},
/// Echo of `Method::PodEvent` received by this Pod. Same rationale
/// as `Notify`: subscribers render the event as a log element,
/// while a rendered summary is independently injected into the LLM
/// context via the notification buffer.
PodEvent(PodEvent),
TurnStart {
turn: usize,
},
@ -425,7 +439,7 @@ pub enum ErrorCode {
// ---------------------------------------------------------------------------
/// A single allow or deny rule inside a scope configuration.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct ScopeRule {
/// Target path. Must be absolute by the time a `Scope` is built from
/// this rule — relative paths are resolved per-layer against the
@ -930,6 +944,43 @@ mod tests {
assert_eq!(parsed["data"]["code"], "already_running");
}
#[test]
fn event_notify_roundtrip() {
let event = Event::Notify {
message: "child-pod finished".into(),
};
let json = serde_json::to_string(&event).unwrap();
let parsed: serde_json::Value = serde_json::from_str(&json).unwrap();
assert_eq!(parsed["event"], "notify");
assert_eq!(parsed["data"]["message"], "child-pod finished");
let decoded: Event = serde_json::from_str(&json).unwrap();
match decoded {
Event::Notify { message } => assert_eq!(message, "child-pod finished"),
other => panic!("expected Notify, got {other:?}"),
}
}
#[test]
fn event_pod_event_roundtrip() {
let event = Event::PodEvent(PodEvent::TurnEnded {
pod_name: "child".into(),
});
let json = serde_json::to_string(&event).unwrap();
let parsed: serde_json::Value = serde_json::from_str(&json).unwrap();
assert_eq!(parsed["event"], "pod_event");
assert_eq!(parsed["data"]["kind"], "turn_ended");
assert_eq!(parsed["data"]["pod_name"], "child");
let decoded: Event = serde_json::from_str(&json).unwrap();
match decoded {
Event::PodEvent(PodEvent::TurnEnded { pod_name }) => {
assert_eq!(pod_name, "child");
}
other => panic!("expected PodEvent::TurnEnded, got {other:?}"),
}
}
#[test]
fn event_user_message_roundtrip() {
let event = Event::UserMessage {


@ -0,0 +1,10 @@
[package]
name = "session-metrics"
version = "0.1.0"
edition.workspace = true
license.workspace = true
[dependencies]
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
session-store = { workspace = true }


@ -0,0 +1,170 @@
//! Session metrics — generic ad-hoc measurement lane on top of
//! `LogEntry::Extension { domain: "metrics" }`.
//!
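//! A thin layer for putting values into the session log that accumulate
//! during a session and are queried later (prune firing frequency, hook
//! execution time, tool retry counts, etc.).
//! session-store treats the payload as an opaque `serde_json::Value`, so
//! this crate provides only the types and the read/write helpers.
//!
//! # Design
//!
//! - No strict label set. Dimensions are a sparse `BTreeMap<String, String>`;
//!   values that cannot be observed are made explicit as `None`.
//! - Values that are "filled in later" (e.g. `cache_read_tokens` right after
//!   a prune fires) are not written back into the earlier entry. They flow as
//!   a separate metric sharing the same `correlation_id`; the reader joins them.
//! - This crate has no aggregation or visualization API. Its goal ends at
//!   "the values can be read back from the session log".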
use std::collections::BTreeMap;
use serde::{Deserialize, Serialize};
use session_store::{EntryHash, SessionId, Store, StoreError, save_extension, session_log};
/// Domain tag used in `LogEntry::Extension` for all metrics records.
pub const DOMAIN: &str = "metrics";
/// A single measurement. `name` is a free-form string in `namespace.metric`
/// form (e.g. `"prune.fire"`, `"hook.duration"`).
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct Metric {
/// Name in `namespace.metric` form.
pub name: String,
/// Epoch milliseconds.
pub ts: u64,
/// Sparse dimensions (labels). For anything that cannot be observed, the key itself is left out.
#[serde(default, skip_serializing_if = "BTreeMap::is_empty")]
pub dimensions: BTreeMap<String, String>,
/// Primary scalar value. The place for a number you do not want to express as a dimension.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub value: Option<f64>,
/// Key for joining related metrics. Multiple metrics with the same ID are
/// read as observing the same event from different angles.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub correlation_id: Option<String>,
}
impl Metric {
/// Minimal constructor. Fills `ts` with the time of the call (epoch ms).
pub fn now(name: impl Into<String>) -> Self {
Self {
name: name.into(),
ts: session_log::now_millis(),
dimensions: BTreeMap::new(),
value: None,
correlation_id: None,
}
}
pub fn with_dimension(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
self.dimensions.insert(key.into(), value.into());
self
}
pub fn with_value(mut self, value: f64) -> Self {
self.value = Some(value);
self
}
pub fn with_correlation_id(mut self, id: impl Into<String>) -> Self {
self.correlation_id = Some(id.into());
self
}
}
/// Appends `LogEntry::Extension { domain: "metrics", payload: <metric> }`.
///
/// Thin wrapper over `save_extension`. Write failures are returned to the
/// caller (whether to halt the main work for the sake of a metric is the
/// caller's decision).
pub async fn record_metric(
store: &impl Store,
session_id: SessionId,
head_hash: &mut Option<EntryHash>,
metric: &Metric,
) -> Result<(), StoreError> {
let payload = serde_json::to_value(metric).expect("Metric serialization cannot fail");
save_extension(store, session_id, head_hash, DOMAIN, payload).await
}
/// Extracts the metrics-domain payloads from `RestoredState.extensions` in
/// order and folds them into a `Vec<Metric>`.
///
/// Payloads that fail to deserialize after a schema change are ignored
/// (backward compatibility).
pub fn metrics_from_extensions(
extensions: &[(String, serde_json::Value)],
) -> Vec<Metric> {
extensions
.iter()
.filter(|(domain, _)| domain == DOMAIN)
.filter_map(|(_, payload)| serde_json::from_value::<Metric>(payload.clone()).ok())
.collect()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn metric_round_trip_via_json() {
let metric = Metric::now("prune.fire")
.with_dimension("border_turn", "3")
.with_dimension("candidate_count", "2")
.with_value(4096.0)
.with_correlation_id("abc-123");
let json = serde_json::to_string(&metric).unwrap();
let parsed: Metric = serde_json::from_str(&json).unwrap();
assert_eq!(parsed, metric);
}
#[test]
fn metric_serializes_minimal_form_compactly() {
// When dimensions is empty and value / correlation_id are None, they are omitted from the output.
let metric = Metric {
name: "x".into(),
ts: 1,
dimensions: BTreeMap::new(),
value: None,
correlation_id: None,
};
let json = serde_json::to_string(&metric).unwrap();
assert!(!json.contains("dimensions"));
assert!(!json.contains("value"));
assert!(!json.contains("correlation_id"));
}
#[test]
fn fold_skips_other_domains() {
let extensions = vec![
(
"memory.extract".into(),
serde_json::json!({ "processed_through_entry": 7 }),
),
(
DOMAIN.into(),
serde_json::to_value(Metric::now("a")).unwrap(),
),
(
DOMAIN.into(),
serde_json::to_value(Metric::now("b")).unwrap(),
),
];
let metrics = metrics_from_extensions(&extensions);
assert_eq!(metrics.len(), 2);
assert_eq!(metrics[0].name, "a");
assert_eq!(metrics[1].name, "b");
}
#[test]
fn fold_skips_undeserializable_payloads() {
// Payloads that become unreadable after a future schema change are skipped without panicking.
let extensions = vec![
(DOMAIN.into(), serde_json::json!({ "garbage": true })),
(
DOMAIN.into(),
serde_json::to_value(Metric::now("ok")).unwrap(),
),
];
let metrics = metrics_from_extensions(&extensions);
assert_eq!(metrics.len(), 1);
assert_eq!(metrics[0].name, "ok");
}
}
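The crate leaves aggregation to the reader: a `prune.fire` metric and its later `prune.post_request` metric share a `correlation_id`. A minimal sketch of that reader-side join follows. `MetricRow` is a hypothetical stand-in that mirrors only the fields of `Metric` needed here, and `join_fire_with_post_request` is an illustrative name, not part of the crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `session_metrics::Metric`, reduced to the
/// fields this sketch needs. Not the crate's actual type.
#[derive(Debug, Clone, PartialEq)]
pub struct MetricRow {
    pub name: String,
    pub value: Option<f64>,
    pub correlation_id: Option<String>,
}

impl MetricRow {
    pub fn new(name: &str, value: Option<f64>, correlation_id: Option<&str>) -> Self {
        Self {
            name: name.to_string(),
            value,
            correlation_id: correlation_id.map(str::to_string),
        }
    }
}

/// Pairs each `prune.fire` row with the `prune.post_request` row that
/// shares its `correlation_id`. Rows without an id, and fires that have no
/// matching post_request, are left out of the result.
pub fn join_fire_with_post_request(rows: &[MetricRow]) -> Vec<(MetricRow, MetricRow)> {
    // Index post_request rows by correlation id (first occurrence wins).
    let mut posts: BTreeMap<&str, &MetricRow> = BTreeMap::new();
    for row in rows.iter().filter(|r| r.name == "prune.post_request") {
        if let Some(id) = row.correlation_id.as_deref() {
            posts.entry(id).or_insert(row);
        }
    }

    rows.iter()
        .filter(|r| r.name == "prune.fire")
        .filter_map(|fire| {
            let id = fire.correlation_id.as_deref()?;
            let post = posts.get(id)?;
            Some((fire.clone(), (*post).clone()))
        })
        .collect()
}
```

Each pair puts the estimated savings recorded at fire time next to the cache-read figure observed on the following request, without either entry having been rewritten in the log.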


@ -16,6 +16,7 @@ thiserror = { workspace = true }
sha2 = { workspace = true }
hex = "0.4.3"
protocol = { workspace = true }
tracing.workspace = true
[dev-dependencies]
tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }


@ -41,11 +41,12 @@ pub use logged_item::{LoggedContentPart, LoggedItem, LoggedRole, from_logged, to
pub use session::{
SessionStartState, create_compacted_session, create_session, create_session_with_id,
ensure_head_or_fork, fork, fork_at, restore, save_config_changed, save_delta, save_extension,
save_run_completed, save_run_errored, save_turn_end, save_usage, save_user_input,
save_pod_scope, save_run_completed, save_run_errored, save_turn_end, save_usage,
save_user_input,
};
pub use session_log::{
EntryHash, HashedEntry, LogEntry, RestoredState, SessionOrigin, build_chain, collect_state,
compute_hash,
EntryHash, HashedEntry, LogEntry, POD_SCOPE_EXTENSION_DOMAIN, PodScopeSnapshot, RestoredState,
SessionOrigin, build_chain, collect_state, compute_hash,
};
pub use store::{Store, StoreError};


@ -6,7 +6,7 @@
use crate::SessionId;
use crate::logged_item::{LoggedItem, to_logged};
use crate::session_log::{self, EntryHash, HashedEntry, LogEntry, SessionOrigin};
use crate::session_log::{self, EntryHash, HashedEntry, LogEntry, PodScopeSnapshot, SessionOrigin};
use crate::store::{Store, StoreError};
use llm_worker::WorkerResult;
use llm_worker::llm_client::RequestConfig;
@ -360,6 +360,24 @@ pub async fn save_extension(
.await
}
/// Log the Pod's latest runtime scope snapshot.
pub async fn save_pod_scope(
store: &impl Store,
session_id: SessionId,
head_hash: &mut Option<EntryHash>,
snapshot: &PodScopeSnapshot,
) -> Result<(), StoreError> {
let payload = serde_json::to_value(snapshot)?;
save_extension(
store,
session_id,
head_hash,
session_log::POD_SCOPE_EXTENSION_DOMAIN,
payload,
)
.await
}
/// Log a `ConfigChanged` entry.
pub async fn save_config_changed(
store: &impl Store,


@ -10,7 +10,7 @@
use llm_worker::llm_client::types::{Item, RequestConfig};
use llm_worker::{UsageRecord, WorkerResult};
use protocol::Segment;
use protocol::{ScopeRule, Segment};
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
@ -197,6 +197,16 @@ pub struct SessionOrigin {
pub at_hash: EntryHash,
}
/// Domain used by Pod to persist its latest effective runtime scope.
pub const POD_SCOPE_EXTENSION_DOMAIN: &str = "pod.scope";
/// Payload stored in `LogEntry::Extension { domain: "pod.scope", .. }`.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct PodScopeSnapshot {
pub allow: Vec<ScopeRule>,
pub deny: Vec<ScopeRule>,
}
/// State collected from log entries.
#[derive(Debug, Clone)]
pub struct RestoredState {
@ -214,6 +224,9 @@ pub struct RestoredState {
/// `LogEntry::Extension` entries accumulated in replay order, as `(domain, payload)`.
/// session-store treats domains as opaque; each domain folds its own payloads.
pub extensions: Vec<(String, serde_json::Value)>,
/// Latest runtime scope snapshot persisted by the Pod. `None` means
/// the session predates scope persistence or the payload was corrupt.
pub pod_scope: Option<PodScopeSnapshot>,
/// User submissions in original typed form, in submit order.
/// One entry per `LogEntry::UserInput`; the K-th entry corresponds to
/// the K-th `Item::user_message` derived during replay (modulo
@ -234,6 +247,7 @@ pub fn collect_state(entries: &[HashedEntry]) -> RestoredState {
head_hash: None,
usage_history: Vec::new(),
extensions: Vec::new(),
pod_scope: None,
user_segments: Vec::new(),
};
@ -296,6 +310,17 @@ pub fn collect_state(entries: &[HashedEntry]) -> RestoredState {
LogEntry::Extension {
domain, payload, ..
} => {
if domain == POD_SCOPE_EXTENSION_DOMAIN {
match serde_json::from_value::<PodScopeSnapshot>(payload.clone()) {
Ok(snapshot) => state.pod_scope = Some(snapshot),
Err(err) => {
tracing::warn!(
error = %err,
"discarding malformed pod.scope snapshot from session log"
);
}
}
}
state.extensions.push((domain.clone(), payload.clone()));
}
}

View File

@ -53,7 +53,6 @@ pub struct App {
pub current_tool: Option<String>,
pub input: InputBuffer,
pub quit: bool,
pub shutdown_confirm: Option<std::time::Instant>,
/// 2-tap guard for `Ctrl-C` when the Pod is not running. First press
/// records the instant; a second press within the timeout exits the
/// TUI (the Pod itself stays alive).
@ -86,7 +85,6 @@ impl App {
current_tool: None,
input: InputBuffer::new(),
quit: false,
shutdown_confirm: None,
quit_confirm: None,
blocks: Vec::new(),
scroll: Scroll::default(),
@ -312,6 +310,14 @@ impl App {
self.blocks.push(Block::UserMessage { segments });
self.assistant_streaming = false;
}
Event::Notify { message } => {
self.blocks.push(Block::Notify { message });
self.assistant_streaming = false;
}
Event::PodEvent(event) => {
self.blocks.push(Block::PodEvent { event });
self.assistant_streaming = false;
}
Event::TurnStart { .. } => {
self.running = true;
self.paused = false;

View File

@ -9,7 +9,7 @@
use std::time::Instant;
use protocol::{AlertLevel, AlertSource, Greeting, Segment};
use protocol::{AlertLevel, AlertSource, Greeting, PodEvent, Segment};
pub enum Block {
Greeting(Greeting),
@ -19,6 +19,17 @@ pub enum Block {
UserMessage {
segments: Vec<Segment>,
},
/// Echo of `Method::Notify` received by this Pod, surfaced as a log
/// element so subscribers see the external input that drove any
/// following auto-kicked turn.
Notify {
message: String,
},
/// Echo of `Method::PodEvent` received by this Pod. Same role as
/// `Notify` — an input log element, not a turn-control signal.
PodEvent {
event: PodEvent,
},
AssistantText {
text: String,
},

View File

@ -12,7 +12,6 @@ mod ui;
use std::io;
use std::path::PathBuf;
use std::process::ExitCode;
use std::time::Duration;
use crossterm::event::{
self, DisableBracketedPaste, DisableMouseCapture, EnableBracketedPaste, EnableMouseCapture,
@ -201,7 +200,7 @@ async fn run_attach(
) -> Result<(), Box<dyn std::error::Error>> {
let socket_path = resolve_socket(&pod_name, socket_override);
let mut terminal = enter_fullscreen()?;
run(&mut terminal, pod_name, &socket_path, false).await
run(&mut terminal, pod_name, &socket_path).await
}
async fn run_resume() -> Result<(), Box<dyn std::error::Error>> {
@ -224,31 +223,18 @@ async fn run_spawn(resume_from: Option<SessionId>) -> Result<(), Box<dyn std::er
let SpawnReady {
pod_name,
socket_path,
mut child,
stderr_drain,
} = ready;
let mut terminal = enter_fullscreen()?;
let result = run(&mut terminal, pod_name, &socket_path, true).await;
let result = run(&mut terminal, pod_name, &socket_path).await;
// Leave alt-screen before reaping the child so any final pod stderr
// (drained off-line by `stderr_drain`) cannot collide with the
// restored scrollback.
// Leave alt-screen explicitly before `main`'s terminal restore path.
let _ = execute!(
terminal.backend_mut(),
DisableMouseCapture,
LeaveAlternateScreen
);
match tokio::time::timeout(Duration::from_secs(3), child.wait()).await {
Ok(Ok(_)) => {}
_ => {
let _ = child.start_kill();
let _ = child.wait().await;
}
}
stderr_drain.abort();
result
}
@ -264,7 +250,6 @@ async fn run(
terminal: &mut Terminal<CrosstermBackend<io::Stdout>>,
pod_name: String,
socket_path: &std::path::Path,
shutdown_pod_on_exit: bool,
) -> Result<(), Box<dyn std::error::Error>> {
let mut app = App::new(pod_name);
@ -272,7 +257,7 @@ async fn run(
Ok(mut client) => {
app.connected = true;
let _ = client.send(&Method::GetHistory).await;
run_loop(terminal, &mut app, client, shutdown_pod_on_exit).await?;
run_loop(terminal, &mut app, client).await?;
}
Err(e) => {
app.push_error(format!(
@ -290,15 +275,11 @@ async fn run_loop(
terminal: &mut Terminal<CrosstermBackend<io::Stdout>>,
app: &mut App,
mut client: PodClient,
shutdown_pod_on_exit: bool,
) -> Result<(), Box<dyn std::error::Error>> {
terminal.draw(|f| ui::draw(f, app))?;
loop {
if app.quit {
if shutdown_pod_on_exit {
let _ = client.send(&Method::Shutdown).await;
}
break;
}
@ -414,10 +395,12 @@ fn handle_key(app: &mut App, key: KeyEvent) -> Option<Method> {
KeyCode::Char('x') if ctrl => Some(if app.running {
Some(Method::Cancel)
} else {
app.push_error("Nothing to cancel (Pod is not running).");
None
Some(Method::Shutdown)
}),
KeyCode::Char('d') if ctrl => Some(handle_shutdown(app)),
KeyCode::Char('d') if ctrl => {
app.quit = true;
Some(None)
}
KeyCode::Enter if alt => {
app.insert_newline();
Some(app.refresh_completion())
@ -550,21 +533,6 @@ fn handle_key(app: &mut App, key: KeyEvent) -> Option<Method> {
const CONFIRM_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(3);
fn handle_shutdown(app: &mut App) -> Option<Method> {
if !app.running {
return Some(Method::Shutdown);
}
if let Some(t) = app.shutdown_confirm
&& t.elapsed() < CONFIRM_TIMEOUT
{
app.shutdown_confirm = None;
return Some(Method::Shutdown);
}
app.shutdown_confirm = Some(std::time::Instant::now());
app.push_error("Turn is running. Press Ctrl-D again to cancel and shut down.");
None
}
/// Running → send `Method::Pause`.
/// Idle / Paused → 2-tap to quit the TUI (the Pod keeps running).
fn handle_pause_or_quit(app: &mut App) -> Option<Method> {

View File

@ -3,9 +3,9 @@
//! Rendered at the user's current cursor position when `tui` is invoked
//! with no positional argument. Walks the cwd for a `.insomnia/manifest.toml`
//! to seed defaults, prompts for the Pod's name, and on confirmation
//! launches the `pod` binary as a subprocess with a freshly built
//! launches the `pod` binary as an independent process with a freshly built
//! overlay (name + cwd scope when no project manifest exists). Once
//! the child reports its socket via the `INSOMNIA-READY` stderr line,
//! the process reports its socket via the `INSOMNIA-READY` stderr line,
//! the dialog hands control back so main can switch the terminal to
//! alternate-screen mode.
//!
@ -18,7 +18,9 @@ use std::process::Stdio;
use std::time::Duration;
use crossterm::event::{self, Event as TermEvent, KeyCode, KeyEventKind, KeyModifiers};
use manifest::{PodManifestConfig, find_project_manifest_from, load_layer, user_manifest_path};
use manifest::{
PodManifestConfig, ScopeConfig, find_project_manifest_from, load_layer, user_manifest_path,
};
use ratatui::Terminal;
use ratatui::backend::CrosstermBackend;
use ratatui::layout::{Constraint, Layout};
@ -27,9 +29,7 @@ use ratatui::text::{Line, Span};
use ratatui::widgets::Paragraph;
use ratatui::{Frame, TerminalOptions, Viewport};
use session_store::SessionId;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::process::{Child, Command};
use tokio::task::JoinHandle;
use tokio::process::Command;
const READY_PREFIX: &str = "INSOMNIA-READY\t";
const VIEWPORT_LINES: u16 = 6;
@ -38,8 +38,6 @@ const READY_TIMEOUT: Duration = Duration::from_secs(20);
pub struct SpawnReady {
pub pod_name: String,
pub socket_path: PathBuf,
pub child: Child,
pub stderr_drain: JoinHandle<()>,
}
pub enum SpawnOutcome {
@ -50,6 +48,8 @@ pub enum SpawnOutcome {
#[derive(Debug)]
pub enum SpawnError {
Io(io::Error),
Store(session_store::StoreError),
MissingResumeScope { session_id: SessionId },
PodLaunchFailed(io::Error),
PodExitedEarly { stderr_tail: String },
Timeout,
@ -59,6 +59,11 @@ impl std::fmt::Display for SpawnError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::Io(e) => write!(f, "io error: {e}"),
Self::Store(e) => write!(f, "failed to read session log: {e}"),
Self::MissingResumeScope { session_id } => write!(
f,
"session {session_id} has no persisted scope snapshot; refusing resume without explicit scope"
),
Self::PodLaunchFailed(e) => write!(f, "failed to launch pod: {e}"),
Self::PodExitedEarly { stderr_tail } => {
if stderr_tail.is_empty() {
@ -84,6 +89,12 @@ impl From<io::Error> for SpawnError {
}
}
impl From<session_store::StoreError> for SpawnError {
fn from(e: session_store::StoreError) -> Self {
Self::Store(e)
}
}
type InlineTerminal = Terminal<CrosstermBackend<io::Stdout>>;
/// Source session for a resume run. `None` = fresh spawn (current
@ -140,6 +151,7 @@ pub async fn run(resume_from: Option<SessionId>) -> Result<SpawnOutcome, SpawnEr
message: None,
editing: true,
resume_from,
resume_scope: None,
};
let mut terminal = make_inline_terminal()?;
@ -173,6 +185,9 @@ pub async fn run(resume_from: Option<SessionId>) -> Result<SpawnOutcome, SpawnEr
}
}
if let Some(id) = form.resume_from {
form.resume_scope = Some(load_resume_scope(id).await?);
}
let overlay_toml = build_overlay_toml(&form);
// Phase 2: launch pod and wait for ready line. Drop the cursor
@ -271,6 +286,16 @@ async fn wait_for_ready(
let pod_bin = resolve_pod_command();
let cwd = std::env::current_dir().map_err(SpawnError::Io)?;
let pod_runtime_dir = manifest::paths::pod_runtime_dir(&form.name).ok_or_else(|| {
io::Error::new(
io::ErrorKind::NotFound,
"could not resolve runtime directory (set INSOMNIA_HOME, INSOMNIA_RUNTIME_DIR, XDG_RUNTIME_DIR, or HOME)",
)
})?;
std::fs::create_dir_all(&pod_runtime_dir).map_err(SpawnError::Io)?;
let stderr_path = pod_runtime_dir.join("stderr.log");
let stderr_file = std::fs::File::create(&stderr_path).map_err(SpawnError::Io)?;
let mut command = Command::new(&pod_bin);
command
.arg("--overlay")
@ -278,77 +303,151 @@ async fn wait_for_ready(
.current_dir(&cwd)
.stdin(Stdio::null())
.stdout(Stdio::null())
.stderr(Stdio::piped())
.kill_on_drop(true);
.stderr(Stdio::from(stderr_file))
.process_group(0);
if let Some(id) = form.resume_from {
command.arg("--session").arg(id.to_string());
}
let mut child = command.spawn().map_err(SpawnError::PodLaunchFailed)?;
let stderr = child
.stderr
.take()
.expect("stderr is piped; take() must succeed");
let mut reader = BufReader::new(stderr).lines();
let mut tail = StderrTail::new();
// Default `kill_on_drop = false` plus `process_group(0)` makes this
// a detached Pod for TUI lifecycle purposes once startup succeeds:
// dropping the handle does not terminate it, and terminal-generated
// signals for the TUI's process group do not hit the Pod. Runtime
// state/socket files are the source of truth after that point.
let ready = match wait_for_ready_file(terminal, form, &stderr_path, &mut child).await {
Ok(ready) => ready,
Err(e) => {
let _ = child.start_kill();
let _ = child.wait().await;
return Err(e);
}
};
tokio::spawn(async move {
let _ = child.wait().await;
});
Ok(ready)
}
let timeout = tokio::time::sleep(READY_TIMEOUT);
tokio::pin!(timeout);
async fn wait_for_ready_file(
terminal: &mut InlineTerminal,
form: &mut Form,
stderr_path: &std::path::Path,
child: &mut tokio::process::Child,
) -> Result<SpawnReady, SpawnError> {
let mut tail = StderrTail::new();
let deadline = tokio::time::Instant::now() + READY_TIMEOUT;
let mut offset = 0usize;
loop {
tokio::select! {
line = reader.next_line() => {
match line {
Ok(Some(line)) => {
if let Some(rest) = line.strip_prefix(READY_PREFIX) {
let mut parts = rest.splitn(2, '\t');
let pod_name = parts.next().unwrap_or("").to_string();
let socket_str = parts.next().unwrap_or("").to_string();
if pod_name.is_empty() || socket_str.is_empty() {
return Err(SpawnError::PodExitedEarly {
stderr_tail: format!("malformed ready line: {line}"),
});
}
let socket_path = PathBuf::from(socket_str);
let stderr_drain = tokio::spawn(async move {
while let Ok(Some(_)) = reader.next_line().await {}
});
return Ok(SpawnReady {
pod_name,
socket_path,
child,
stderr_drain,
});
}
tail.push(&line);
form.message = Some((line, MessageKind::Progress));
let _ = terminal.draw(|f| draw_form(f, form));
}
Ok(None) => {
let _ = child.wait().await;
let content = match tokio::fs::read_to_string(stderr_path).await {
Ok(content) => content,
Err(e) if e.kind() == io::ErrorKind::NotFound => String::new(),
Err(e) => return Err(SpawnError::Io(e)),
};
if content.len() > offset {
for line in content[offset..].lines() {
if let Some(rest) = line.strip_prefix(READY_PREFIX) {
let mut parts = rest.splitn(2, '\t');
let pod_name = parts.next().unwrap_or("").to_string();
let socket_str = parts.next().unwrap_or("").to_string();
if pod_name.is_empty() || socket_str.is_empty() {
return Err(SpawnError::PodExitedEarly {
stderr_tail: tail.into_string(),
stderr_tail: format!("malformed ready line: {line}"),
});
}
Err(e) => return Err(SpawnError::Io(e)),
let socket_path = PathBuf::from(socket_str);
wait_for_socket(
&socket_path,
deadline,
child,
stderr_path,
&mut tail,
&mut offset,
)
.await?;
return Ok(SpawnReady {
pod_name,
socket_path,
});
}
tail.push(line);
form.message = Some((line.to_string(), MessageKind::Progress));
let _ = terminal.draw(|f| draw_form(f, form));
}
offset = content.len();
}
if tokio::time::Instant::now() >= deadline {
return Err(SpawnError::Timeout);
}
tokio::select! {
status = child.wait() => {
let _ = status;
// The Pod may flush its final stderr lines just before exiting.
// Re-read the file after child.wait() resolves so the line that
// explains the failure is not lost and ends up in PodExitedEarly.
drain_stderr_into_tail(stderr_path, &mut tail, &mut offset).await;
return Err(SpawnError::PodExitedEarly {
stderr_tail: tail.into_string(),
});
}
_ = &mut timeout => {
let _ = child.start_kill();
return Err(SpawnError::Timeout);
}
_ = tokio::time::sleep(Duration::from_millis(100)) => {}
}
}
}
async fn wait_for_socket(
socket_path: &std::path::Path,
deadline: tokio::time::Instant,
child: &mut tokio::process::Child,
stderr_path: &std::path::Path,
tail: &mut StderrTail,
offset: &mut usize,
) -> Result<(), SpawnError> {
loop {
match tokio::net::UnixStream::connect(socket_path).await {
Ok(_) => return Ok(()),
Err(e)
if e.kind() == io::ErrorKind::NotFound
|| e.kind() == io::ErrorKind::ConnectionRefused => {}
Err(e) => return Err(SpawnError::Io(e)),
}
if tokio::time::Instant::now() >= deadline {
return Err(SpawnError::Timeout);
}
tokio::select! {
status = child.wait() => {
let _ = status;
drain_stderr_into_tail(stderr_path, tail, offset).await;
return Err(SpawnError::PodExitedEarly {
stderr_tail: tail.as_string(),
});
}
_ = tokio::time::sleep(Duration::from_millis(50)) => {}
}
}
}
async fn drain_stderr_into_tail(
stderr_path: &std::path::Path,
tail: &mut StderrTail,
offset: &mut usize,
) {
let Ok(content) = tokio::fs::read_to_string(stderr_path).await else {
return;
};
if content.len() <= *offset {
return;
}
for line in content[*offset..].lines() {
if !line.starts_with(READY_PREFIX) {
tail.push(line);
}
}
*offset = content.len();
}
fn build_overlay_toml(form: &Form) -> String {
let mut root = toml::value::Table::new();
@ -356,7 +455,12 @@ fn build_overlay_toml(form: &Form) -> String {
pod.insert("name".into(), toml::Value::String(form.name.clone()));
root.insert("pod".into(), toml::Value::Table(pod));
if !form.cascade_has_scope {
if let Some(scope_config) = form.resume_scope.as_ref() {
root.insert(
"scope".into(),
toml::Value::try_from(scope_config).expect("scope serialisation cannot fail"),
);
} else if !form.cascade_has_scope {
let mut rule = toml::value::Table::new();
rule.insert(
"target".into(),
@ -374,6 +478,24 @@ fn build_overlay_toml(form: &Form) -> String {
toml::to_string(&toml::Value::Table(root)).expect("overlay serialisation cannot fail")
}
async fn load_resume_scope(session_id: SessionId) -> Result<ScopeConfig, SpawnError> {
let store_dir = manifest::paths::sessions_dir().ok_or_else(|| {
io::Error::new(
io::ErrorKind::NotFound,
"could not resolve sessions directory (set INSOMNIA_HOME, INSOMNIA_DATA_DIR, or HOME)",
)
})?;
let store = session_store::FsStore::new(&store_dir).await?;
let state = session_store::restore(&store, session_id).await?;
let snapshot = state
.pod_scope
.ok_or(SpawnError::MissingResumeScope { session_id })?;
Ok(ScopeConfig {
allow: snapshot.allow,
deny: snapshot.deny,
})
}
/// Resolves the binary used to launch a child Pod. Must point at a
/// `pod`-compatible executable — the parent reads the child's stderr
/// directly looking for `INSOMNIA-READY`, so any wrapper that emits
@ -407,6 +529,9 @@ impl StderrTail {
}
self.lines.push_back(line.to_string());
}
fn as_string(&self) -> String {
self.lines.iter().cloned().collect::<Vec<_>>().join(" | ")
}
fn into_string(self) -> String {
self.lines.into_iter().collect::<Vec<_>>().join(" | ")
}
@ -450,6 +575,10 @@ struct Form {
/// child pod is launched with `--session <id>` so it restores
/// from `id` and appends to the same session log.
resume_from: Option<SessionId>,
/// Scope snapshot recovered from the source session log. Set only for
/// resume runs, and serialized into the overlay instead of cwd-default
/// scope so resume does not silently broaden access.
resume_scope: Option<ScopeConfig>,
}
impl Form {
@ -625,6 +754,7 @@ mod tests {
message: None,
editing: true,
resume_from: None,
resume_scope: None,
}
}
@ -649,6 +779,30 @@ mod tests {
assert!(parsed.get("scope").is_none());
}
#[test]
fn overlay_uses_resume_scope_snapshot() {
let mut f = form("agent-r", false);
f.resume_from = Some(session_store::new_session_id());
f.resume_scope = Some(ScopeConfig {
allow: vec![manifest::ScopeRule {
target: PathBuf::from("/work/example"),
permission: manifest::Permission::Write,
recursive: true,
}],
deny: vec![manifest::ScopeRule {
target: PathBuf::from("/work/example/child"),
permission: manifest::Permission::Write,
recursive: true,
}],
});
let toml_str = build_overlay_toml(&f);
let parsed: toml::Value = toml::from_str(&toml_str).unwrap();
assert_eq!(parsed["pod"]["name"].as_str(), Some("agent-r"));
assert_eq!(parsed["scope"]["allow"].as_array().unwrap().len(), 1);
let deny = parsed["scope"]["deny"].as_array().unwrap();
assert_eq!(deny[0]["target"].as_str(), Some("/work/example/child"));
}
#[test]
fn cascade_merge_detects_scope_from_any_layer() {
let user = PodManifestConfig::from_toml(

View File

@ -22,7 +22,7 @@ use ratatui::widgets::{
};
use unicode_width::{UnicodeWidthChar, UnicodeWidthStr};
use protocol::{AlertLevel, CompletionEntry, Greeting, Segment};
use protocol::{AlertLevel, CompletionEntry, Greeting, PodEvent, Segment};
use crate::app::{App, CompletionState, alert_source_label, fmt_tokens};
use crate::block::{Block, CompactEvent, ThinkingBlock, ThinkingState};
@ -361,6 +361,20 @@ fn render_block_into(lines: &mut Vec<Line<'static>>, block: &Block, width: u16,
)));
}
Block::UserMessage { segments } => render_user_message(lines, segments, width, mode),
Block::Notify { message } => {
let text = format!("[notify] {message}");
match mode {
Mode::Overview => push_overview_line(lines, &text, width, MessageKind::Notify, ""),
_ => push_padded_lines(lines, &text, MessageKind::Notify),
}
}
Block::PodEvent { event } => {
let text = format_pod_event(event);
match mode {
Mode::Overview => push_overview_line(lines, &text, width, MessageKind::Notify, ""),
_ => push_padded_lines(lines, &text, MessageKind::Notify),
}
}
Block::AssistantText { text } => match mode {
Mode::Overview => push_overview_line(lines, text, width, MessageKind::Assistant, ""),
_ => push_padded_lines(lines, text, MessageKind::Assistant),
@ -913,6 +927,10 @@ fn greeting_lines(g: &Greeting) -> Vec<Line<'static>> {
pub enum MessageKind {
TurnHeader,
User,
/// External-input echoes (`Method::Notify` / `Method::PodEvent`).
/// Visually distinct from User / Assistant / Notice so it's clear
/// the line came from another Pod or operator, not the local user.
Notify,
Assistant,
Thinking,
TurnStats,
@ -924,6 +942,7 @@ pub fn kind_style(kind: MessageKind) -> Style {
match kind {
MessageKind::TurnHeader => Style::default().fg(Color::DarkGray),
MessageKind::User => Style::default().fg(Color::Green),
MessageKind::Notify => Style::default().fg(Color::Yellow),
MessageKind::Assistant => Style::default().fg(Color::White),
MessageKind::Thinking => Style::default()
.fg(Color::Magenta)
@ -939,3 +958,26 @@ pub fn kind_style(kind: MessageKind) -> Style {
.add_modifier(Modifier::BOLD),
}
}
/// One-line summary of a `PodEvent` for display in the activity log.
/// Independent from the LLM-injection wrapper (`crate::ipc::event::render_event`
/// in the pod crate) — that path applies prompt-pack wrapping, while
/// this is the human-facing rendering of the raw structured event.
fn format_pod_event(event: &PodEvent) -> String {
match event {
PodEvent::TurnEnded { pod_name } => {
format!("[pod_event] {pod_name} → turn_ended")
}
PodEvent::Errored { pod_name, message } => {
format!("[pod_event] {pod_name} → errored: {message}")
}
PodEvent::ShutDown { pod_name } => {
format!("[pod_event] {pod_name} → shut_down")
}
PodEvent::ScopeSubDelegated {
parent_pod, sub_pod, ..
} => {
format!("[pod_event] {parent_pod} → scope_sub_delegated: {sub_pod}")
}
}
}

View File

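@ -90,32 +90,34 @@ Pressing Enter while Paused does one of two things, depending on whether there is input:
| Key | While Running | Idle / Paused |
|---|---|---|
| `Ctrl-X` | `Method::Cancel` (discard the in-progress turn → Idle) | no-op (shows an error only) |
| `Ctrl-X` | `Method::Cancel` (discard the in-progress turn → Idle) | `Method::Shutdown` (terminate the Pod) |
| `Ctrl-C` | `Method::Pause` (interrupt the in-progress turn → Paused) | warn on the first press; a second press within 3 seconds exits the TUI (the Pod keeps running) |
| `Ctrl-D` | warn on the first press; a second press within 3 seconds sends `Method::Shutdown` | `Method::Shutdown` (terminate the Pod) |
| `Ctrl-D` | exit the TUI (the Pod keeps running; no Pause is sent) | exit the TUI (the Pod keeps running) |
### Difference between Cancel and Pause
- **Cancel** means "throw the turn away": it aborts the in-flight LLM request and any unfinished tools, and the state becomes Idle. The turn cannot be resumed
- **Pause** means "stop, but keep it resumable": it aborts in the same way, but the state becomes Paused and an empty Enter resumes
To interrupt a Running turn, `Ctrl-C` (Pause) is the natural choice in most cases. Ctrl-X (Cancel) is for when you explicitly want to discard the turn (for example, when the LLM has run away).
To interrupt a Running turn, `Ctrl-C` (Pause) is the natural choice in most cases. Ctrl-X (Cancel) is for when you explicitly want to discard the turn (for example, when the LLM has run away). To terminate the Pod, first get it out of Running (Idle / Paused), then press `Ctrl-X` to shut it down.
### Two-step UX for Ctrl-C and Ctrl-D
Both put a confirmation in front of any "operation that looks destructive":
### Exit UX for Ctrl-C and Ctrl-D
- Ctrl-X while Running: `Method::Cancel`. To terminate, explicitly stop the turn to return to Idle, then press `Ctrl-X` again
- Ctrl-X while Idle / Paused: sends `Method::Shutdown` and terminates the Pod
- Ctrl-C while Running: pauses immediately on the first press (not destructive)
- Ctrl-C while Idle / Paused: warn message on the first press; a second press within 3 seconds exits the TUI (the Pod keeps running)
- Ctrl-D while Running: warn on the first press; a second press within 3 seconds shuts down
- Ctrl-D while Idle / Paused: shuts down immediately on the first press
- Ctrl-D: exits the TUI immediately regardless of state (the Pod keeps running). Even while Running, no Pause / Cancel / Shutdown is sent
`Ctrl-C` exits only the TUI process without taking the Pod down. `Ctrl-D` sends `Method::Shutdown` to the Pod itself and terminates it (the Pod process goes away).
`Ctrl-X` cancels only while Running and shuts down when Idle / Paused. `Ctrl-C` sends `Method::Pause` to the Pod only while Running; otherwise it exits only the TUI process without taking the Pod down. `Ctrl-D` never sends a control method to the Pod and always exits only the TUI process.
On the path where the TUI's dialog launches a Pod, the launched Pod is not managed or terminated as a child process of the TUI; it stays alive as an independent process. After the TUI exits, you can reattach with `tui <pod-name>`.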
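
As a rough illustration of the updated bindings (Ctrl-X shuts the Pod down when nothing is running, Ctrl-D always detaches), the mapping can be written as a pure function of key and Pod state. This is only a sketch: `PodState`, `Key`, `Action`, and `action_for` are hypothetical names introduced here, not the types used in the TUI's `handle_key`.

```rust
/// Hypothetical stand-in for the Pod state as seen by the TUI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PodState {
    Running,
    Idle,
    Paused,
}

/// Hypothetical stand-in for the three control keys discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    CtrlX,
    CtrlC,
    CtrlD,
}

/// What a key press results in (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Send the named control method to the Pod.
    Send(&'static str),
    /// Warn first; a second press within the timeout exits the TUI.
    ConfirmThenQuitTui,
    /// Exit the TUI immediately; nothing is sent to the Pod.
    QuitTui,
}

/// Maps a key press to an action according to the updated table.
pub fn action_for(key: Key, state: PodState) -> Action {
    let running = state == PodState::Running;
    match (key, running) {
        (Key::CtrlX, true) => Action::Send("Cancel"),
        (Key::CtrlX, false) => Action::Send("Shutdown"),
        (Key::CtrlC, true) => Action::Send("Pause"),
        (Key::CtrlC, false) => Action::ConfirmThenQuitTui,
        (Key::CtrlD, _) => Action::QuitTui,
    }
}
```

Keeping the mapping free of side effects like this would make the table above directly checkable in a unit test; the real handler also mutates `App` state, which this sketch leaves out.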
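## History notes
- The former `Ctrl-R` (Resume only) was removed because it was merged into Resume via an empty Enter
- The former `Esc` (exit TUI) was removed because it was merged into the `Ctrl-C` double-press UX
- `Ctrl-D` used to send `Method::Shutdown` to the Pod, but was changed to a detach operation that only leaves the TUI
- In the old inline-viewport days, history scrolling was left to the terminal's own scrollback,
so the TUI had no scroll keys of its own. With the move to the full-screen alt screen
(`tickets/tui-fullscreen-overhaul.md`), `Shift-Up/Down` and others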

View File

@ -0,0 +1,83 @@
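# Persist injected system messages in the worker history
## Background
Currently, `Method::Notify` and `Method::PodEvent` (`TurnEnded` / `Errored` / `ShutDown` / `ScopeSubDelegated`) are queued in `NotifyBuffer` on the parent Pod and injected in `PodInterceptor::pre_llm_request` as system messages that live **for that one request only** (`crates/pod/src/ipc/interceptor.rs:147-159`).
However, `Worker::run` executes `let mut request_context = self.history.clone();` before calling the interceptor (`crates/llm-worker/src/worker.rs:862, 913`), so a notification pushed by the interceptor lands only on the clone. The LLM response that follows (`assistant_items`), on the other hand, is extended into **the real `self.history`** at `crates/llm-worker/src/worker.rs:946`.
As a result, the history shows states such as "the Pod suddenly called `ReadPodOutput` with zero context" or "it said 'I will look into the child Pod that went down' out of nowhere". By the time of the next request the LLM's self-consistency is already broken; it fails without even waiting for compaction or a restart.
The same problem exists for the `<system-reminder>` injection planned in `tickets/session-todo-reminder.md`. The original policy was "a reminder may inject the same content repeatedly, so it should not pollute the history", but that was a category error:
- **Cache breakage**: with a design that transiently mutates the last user message, worker.history stays unchanged, so the "content of the user message actually sent to the LLM" varies every time with the presence or content of the reminder. Anthropic's prompt cache only covers the prefix up to the anchor, so generation right after the anchor is a cache miss every time
- **LLM self-consistency**: on turn N the LLM saw the reminder and called `TaskUpdate` → on turn N+1 the reminder is gone and only its own `TaskUpdate` call remains. This is exactly the same break in causality as with Notify
- **Inconsistency on resume**: the session restarts from a loaded history in which the reminder has vanished entirely
- **The premise that "repeated injection bloats the history" is also weak**: by the cooldown design, a reminder fires once per idle period and the counter resets when the LLM reacts. It does not fire in bursts in the first place. Even if it appears several times, having each one sit in the history as "a snapshot of the active Tasks at that moment" is causally correct
In short, the principle is "a system message sent to the LLM is committed to history at that moment", and Notify / PodEvent / `<system-reminder>` are all aligned to it.
## Approach
- System messages injected right before an LLM request are appended to **the worker's own `history`**, not to the cloned `request_context: &mut Vec<Item>`
- `NotifyBuffer` is redefined as **"a queue whose contents are appended to `worker.history` right before the next request"**
- Persistence (`history.json`) follows automatically via worker.history (`PodSharedState::history_json` → `RuntimeDir::write_history`)
- The targets are the current `Method::Notify` / `Method::PodEvent`, plus the `<system-reminder>` injection planned for session-todo-reminder (whether the latter rides on the same NotifyBuffer or gets a separate queue is up to the implementer; what matters is that it is committed to history)
- The `notify_wrapper` wording (`[Notification] ... not a blocking request`) may stay in the history as-is. It is preferable that a later reader can tell "this was an ambient notification"
- Likewise, `<system-reminder>` stays in history with its tags (the tag format convention `<system-reminder>...</system-reminder>` itself is kept)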
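
The ordering this approach asks for (commit pending notifications to history first, then clone the request context) can be sketched with plain `Vec`s. This is a simplified model under stated assumptions, not the actual `Worker` or `NotifyBuffer` code: `Item`, `NotifyQueue`, `Worker`, `prepare_request`, and `record_reply` are hypothetical stand-ins.

```rust
/// Hypothetical stand-in for a history item; the real `Item` type is richer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Item {
    System(String),
    User(String),
    Assistant(String),
}

/// Hypothetical queue of notifications waiting for the next request.
#[derive(Default)]
pub struct NotifyQueue {
    pending: Vec<String>,
}

impl NotifyQueue {
    pub fn push(&mut self, message: &str) {
        self.pending.push(message.to_string());
    }

    /// Destructive drain: order is preserved and a second call returns nothing.
    pub fn drain(&mut self) -> Vec<Item> {
        self.pending.drain(..).map(Item::System).collect()
    }
}

/// Hypothetical worker that owns the persistent history.
#[derive(Default)]
pub struct Worker {
    pub history: Vec<Item>,
}

impl Worker {
    /// Builds the context for one request. Pending notifications are
    /// committed to `history` first; only then is the context cloned,
    /// so the request and the persisted history agree.
    pub fn prepare_request(&mut self, queue: &mut NotifyQueue) -> Vec<Item> {
        self.history.extend(queue.drain());
        self.history.clone()
    }

    /// Records the reply on the real history, after its trigger.
    pub fn record_reply(&mut self, text: &str) {
        self.history.push(Item::Assistant(text.to_string()));
    }
}
```

In this model the reply recorded by `record_reply` always sits after the `Item::System` entry that triggered it, which is the causal ordering the ticket is after.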
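## Requirements
### Behavior change on the Notify / PodEvent path
- Items produced by `NotifyBuffer::drain` are appended to `worker.history`, not to `request_context`
- The append happens exactly once, **right before the next LLM request** (if several notifies have accumulated, they appear as multiple Items in their original order)
- After the append, persistence to `history.json` happens through the same path as any ordinary history mutation
- Persisted Items are read back as part of the history on the next resume
### `<system-reminder>` injection path (changes the premise of session-todo-reminder)
- Withdraw the design that "mutates the most recent user message and appends" the `<system-reminder>` block, and instead **append it to `worker.history` as a new system message Item**
- The lifecycle matches Notify / PodEvent: committed to history when the injection condition is met, persisted to `history.json`, and read back after resume
- The statements in session-todo-reminder.md about "not polluting the history", "not appearing in `get_history` / the session log", and "mutating the last user message" are overridden by this ticket's approach
### Unifying the injection lane
- State in the module doc of `crates/pod/src/ipc/notify_buffer.rs` that every "system message injected right before an LLM request" goes on the history lane
- The concept of a "transient lane (one that does not pollute history)" is abolished. Future reminder-style additions follow the same principle
- Naming and placement may be revisited during implementation if needed (e.g. renaming `NotifyBuffer` to `PendingSystemMessages`, or creating a separate queue for reminders). This ticket puts correct behavior first; how to shape the abstraction is the implementer's call
### Updating existing tests and documentation
- Rewrite the `pre_llm_request_drains_pending_notifies_into_context` family of tests in `crates/pod/src/ipc/interceptor.rs` so that they verify the effect on `worker.history` rather than on `request_context`
- Add a case to `crates/pod/tests/pod_events_test.rs` that checks, at near-E2E granularity, that the corresponding Item appears in history after a `PodEvent` is received
- Rewrite existing comments that assume the "transient lane" (e.g. `(never into the Worker's persistent history)` at `crates/pod/src/ipc/notify_buffer.rs:5`) to match the new approach
- Update the policy statements in `tickets/session-todo-reminder.md` to match the completion of this ticket (or fix them ahead of time when this ticket completes)
- In the note at the end of `TODO.md` saying "the tag format and the 'do not pollute history' principle were established first in session-todo", withdraw the latter
## Completion criteria
- When the parent Pod receives `Method::Notify` or `Method::PodEvent`, the corresponding system message is **appended to `worker.history`** right before the first subsequent LLM request, and is also included in that request
- The same Item is written to `history.json` and is read back as part of the history after `Pod::resume`
- The action the LLM took in response to the notification (tool call / reply) and the notification Item that triggered it appear in causal order in the history
- The `<system-reminder>` injection introduced by session-todo-reminder is likewise implemented as a history append (alternatively, depending on implementation order, this ticket may complete only the Notify / PodEvent side, with session-todo-reminder following this principle when it is implemented; in that case the policy statements in session-todo-reminder.md must already be updated when this ticket completes)
- Unit tests confirm the above
## Out of scope
- Revisiting the wording / phrasing of `notify_wrapper`
- TUI-side display of Notify / PodEvent (the `Event::PodEvent` path stays as it is)
- Handling of notify Items during compaction (they only need to be subject to compaction like ordinary Items; no special treatment is needed)
- Generalizing the `<system-reminder>` injection mechanism (an existing `TODO.md` item; this ticket only covers unifying the policy of the individual implementations)
## References
- Design guideline: `CLAUDE.md`
- Related: `crates/pod/src/ipc/notify_buffer.rs`, `crates/pod/src/ipc/interceptor.rs`, `crates/pod/src/ipc/event.rs`, `crates/pod/src/controller.rs` (receive handling), `crates/llm-worker/src/worker.rs:862, 913, 946` (the cloning side)
- Policy reversal target: `tickets/session-todo-reminder.md` (the "do not pollute history" premise is withdrawn by this ticket)
## Review
- Status: Approve
- Review details: [./notify-history-persist.review.md](./notify-history-persist.review.md)
- Target commit: `e804577 feat: notify-history-persist実装`
- Date: 2026-05-03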

View File

@ -0,0 +1,47 @@
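# Review: Persist injected system messages in the worker history
Target commit: `e804577 feat: notify-history-persist実装`
## Checking premises and requirements
### Behavior change on the Notify / PodEvent path
- **Append Items from `NotifyBuffer::drain` to `worker.history`**: Achieved. At `crates/llm-worker/src/worker.rs:859-867`, at the top of the turn loop and before the per-request clone, the result of `interceptor.pending_history_appends().await` is committed to the real history via `self.history.extend(...)`. At `crates/pod/src/ipc/interceptor.rs:127-145`, the injection logic of the old `pre_llm_request` is moved into the new method.
- **Exactly once right before the next LLM request, with multiple notifies kept in order**: Achieved. `drain()` is destructive and is called once per loop iteration. The `pending_history_appends_drains_buffer_into_items` test verifies that multiple entries come back in order and that a second call returns nothing.
- **Persistence to `history.json`**: Achieved. It rides the same path as an ordinary mutation of `worker.history` (the existing `PodSharedState::history_json` → `RuntimeDir::write_history`). No additional wiring is needed in the implementation, as designed.
- **Read back on resume**: Achieved (same as above; it goes through the existing path).
### `<system-reminder>` injection path (changes the premise of session-todo-reminder)
- This commit does not touch the reminder implementation itself; it only rewrites the policy statements in `tickets/session-todo.md`. The documentation of `pending_history_appends` (`crates/llm-worker/src/interceptor.rs:121-148`) states that the method is responsible for committing new system messages to history, which structurally guarantees that reminders also go on this lane. The commit takes the latter pattern that the ticket permits (follow the same principle when session-todo-reminder is implemented), so it is consistent.
### Unifying the injection lane
- **`notify_buffer.rs` module doc**: Achieved. `crates/pod/src/ipc/notify_buffer.rs:1-19` states "single lane for system messages produced by Pod state" and "no transient, history-skipping lane", and references `tickets/notify-history-persist.md` and `AGENTS.md`.
- **Trait-level contract**: at `crates/llm-worker/src/interceptor.rs:121-148`, the docs contrast the roles of `pending_history_appends` and `pre_llm_request`. They state that `pre_llm_request` is used only for "deterministic, history-independent transformations" and that external input goes on the former. This is appropriate as an embodiment of the "LLM context processing principles" in CLAUDE.md / AGENTS.md.
### Updating existing tests and documentation
- `pre_llm_request_drains_pending_notifies_into_context` is renamed to `pending_history_appends_drains_buffer_into_items` and replaced with verification of the new lane's behavior.
- `pre_llm_request_skips_notification_injection_when_yielding` is changed to `pre_llm_request_does_not_touch_pending_notifies`. Under the new approach, the responsibility of holding the buffer on the yield path is unnecessary (once committed to history, nothing needs to be re-fetched after resume), so this is a reasonable replacement.
- In `controller_test.rs`, assertions on `handle.shared_state.history()` are added to `notify_while_idle...` and `pod_event_turn_ended_while_idle...`, verifying that the effect shows up on the history side and not only on the request side.
- The "do not pollute history" statements in `tickets/session-todo.md` and at the end of `TODO.md` have been withdrawn and overwritten.
## Architecture and scope
- **Separation of responsibilities**: one new method with a default implementation is added to the `Interceptor` trait, and the Worker simply calls it once in the turn loop. The llm-worker layer does not know "what the external input contains" and is merely hooked as low-level infrastructure, so the layering is preserved.
- **Size of the abstraction**: the only new concept is `pending_history_appends`. The commit does not rename NotifyBuffer or add a separate queue; within the ticket's "implementer's discretion" it picks the lowest-cost option (reuse the existing buffer, update the lane's interpretation in the docs). No distortion.
- **Behavioral symmetry**: the old design of holding the buffer on yield disappears, but under the new principle "external input should become history at the time it is received", so the unconditional append at the top of the turn loop is correct. On resume after compaction, the buffer is already empty and the entry is in the history, which is the intended state.
- **Consistency with the LLM context processing principles**: this ticket is an application of the principle itself, and the docs also spell out the boundary between `pre_llm_request` and `pending_history_appends`. The change follows the principle head-on; far from distorting the codebase, it corrects an existing breakage.
## Findings
### Blocking
- None.
### Non-blocking / Follow-up
- **Mismatch with the test file location named in the ticket**: the ticket requirements explicitly say "add a case to `crates/pod/tests/pod_events_test.rs` that checks, at near-E2E granularity, that the corresponding Item appears in history after a `PodEvent` is received", but this commit instead adds a history assertion to the existing `controller_test.rs::pod_event_turn_ended_while_idle_auto_starts_turn_and_injects_system_message`. Functional coverage is equivalent or better (not duplicating an existing test is healthier), but it does not match the letter of the requirement. No maintenance problems are expected, but annotating the ticket with "coverage is provided in controller_test.rs" would spare future readers confusion.
- **Behavior on the compaction yield path has become implicit**: the old `pre_llm_request_skips_notification_injection_when_yielding` guaranteed that "the buffer keeps its contents while yielding". The new design is "always commit to history at the top of the turn loop; history survives a yield", but no test verifies this directly (compaction fires → the notify is still in history after resume). Existing compaction tests protect this indirectly on the history side, but their regression-detection power is thin. A follow-up is enough, assuming the same kind of check is added when session-todo-reminder puts reminders on this lane.
### Nits
- Import order at `crates/pod/src/ipc/interceptor.rs:14-22`: `tracing::warn` (line 20) sits between `llm_worker::interceptor::...` (lines 16-19) and `llm_worker::tool::ToolOutput` (line 21), splitting the `llm_worker::` group. Placing `tracing::warn` next to `tracing::info` (line 22) would keep the previous grouping. Ignore this if `cargo fmt` produces the current order.
## Verdict
**Approve (ready to complete)**: the requirements are met in substance, and the change is an exemplary embodiment of the "LLM context processing principles" in CLAUDE.md / AGENTS.md. The slight mismatch with the test file location named in the ticket is non-blocking because functional coverage is equivalent.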


@@ -1,40 +0,0 @@
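# Improving the scope claim on resume
## Background
`tickets/dynamic-scope.md` delivered both in-process Scope shrinking (Write revoke on delegation via SpawnPod) and delegation records in pod-registry. As a result, the Pod and the registry can consistently represent a scope that shrinks during a session.
Resume via `tui -r`, on the other hand, composes the overlay through `build_overlay_toml` in `crates/tui/src/spawn.rs` using the same logic as a fresh spawn. When the manifest cascade declares no scope, it adds a recursive `write` rule directly under cwd every time.
This leads to the following conflict:
- While session S is running, it creates child C via SpawnPod and delegates a subpath under cwd
- The parent exits; child C's entry remains in the registry (or C is still running)
- When the user tries to resume S, the new Pod claims `write` on the whole of cwd, which overlaps the delegated part, and the registry rejects it
The intent of resume is to "pick up where a past session left off", not to "newly grab a range wider than the past effective scope". Today it does the latter: resume tries to take back scope that was previously given up.
## Goal
On session resume, the claimed scope matches the effective scope the session last held. Parts that were delegated or are held by other Pods are excluded from the claim, and the resumed Pod operates only within the range it had at the time.
## Requirements
- Overlay composition on resume does not blindly trust cwd; it reflects the scope the session previously held. The source may be the session log, the registry, or anything else, as long as the scope can be restored from some persisted information
- Sessions whose past scope cannot be retrieved (old format / corrupted) either stop with an explicit error or fall back to a fresh claim only after asking the user to confirm (never widen silently)
- When a claim attempt conflicts with an existing allocation in the registry, the error message conveys both the conflicting Pod's name and the target rule (currently only the Pod name)
- When a delegated entry (an allocation with `delegated_from`) belongs to the same session's delegation chain, resume proceeds without claiming that range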
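
The last two requirements can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Allocation`, `Conflict`, and `ResumeClaim` types, the `plan_resume_claim` function, and the idea that `delegated_from` stores the delegating session's id are all hypothetical, and rules are modeled as recursive write roots compared by path prefix. Only direct delegation is handled; a multi-level chain would need a lookup.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical registry entry: a recursive write rule held by a Pod.
#[derive(Debug, Clone, PartialEq)]
pub struct Allocation {
    pub pod: String,
    pub rule: PathBuf,
    /// Assumed to hold the id of the session that delegated this rule.
    pub delegated_from: Option<String>,
}

/// Error carrying both the other Pod's name and the conflicting rule.
#[derive(Debug, PartialEq)]
pub struct Conflict {
    pub pod: String,
    pub rule: PathBuf,
}

/// What the resumed Pod claims, and which delegated subtrees it skips.
#[derive(Debug, Default, PartialEq)]
pub struct ResumeClaim {
    pub write: Vec<PathBuf>,
    pub excluded: Vec<PathBuf>,
}

/// Two recursive rules overlap when one root is a prefix of the other.
fn overlaps(a: &Path, b: &Path) -> bool {
    a.starts_with(b) || b.starts_with(a)
}

pub fn plan_resume_claim(
    session: &str,
    last_scope: &[PathBuf],
    registry: &[Allocation],
) -> Result<ResumeClaim, Conflict> {
    let mut plan = ResumeClaim::default();
    for rule in last_scope {
        let mut fully_delegated = false;
        for alloc in registry.iter().filter(|a| overlaps(rule, &a.rule)) {
            // An unrelated Pod holds an overlapping rule: fail loudly.
            if alloc.delegated_from.as_deref() != Some(session) {
                return Err(Conflict {
                    pod: alloc.pod.clone(),
                    rule: alloc.rule.clone(),
                });
            }
            if rule.starts_with(&alloc.rule) {
                // The whole rule lies inside a delegated subtree.
                fully_delegated = true;
            } else {
                // Delegated subtree under this rule: carve it out.
                plan.excluded.push(alloc.rule.clone());
            }
        }
        if !fully_delegated {
            plan.write.push(rule.clone());
        }
    }
    Ok(plan)
}
```

Modeling the result as "write roots plus excluded subtrees" is only one possible design; the ticket leaves the representation of past scope to the implementation.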
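## Completion criteria
- The flow "parent Pod runs SpawnPod during the session → delegates to a child → parent exits → parent session is resumed" succeeds without conflict while leaving the existing child allocation in place
- A conflict with an existing unrelated Pod fails with a clear error that includes the conflicting rule and the other Pod's name
- The two cases above are verified by unit or integration tests
- The behavior of an existing fresh spawn (no resume) is unchanged
## Out of scope
- Whether to introduce a new persistence schema for past scope is decided at implementation time (nothing is added if existing session log fields suffice)
- Automatically killing / reclaiming existing Pods to make the claim succeed
- External GrantScope / RevokeScope via the protocol (inherits the out-of-scope declaration in `tickets/dynamic-scope.md`)
- A full redesign of the registry's error types (a minimal extension to carry rule information is expected to suffice)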


@@ -1,74 +0,0 @@
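# Session metrics: a generic measurement lane via Extension
## Background
There is growing demand to evaluate in-session behavior quantitatively after the fact. The immediate motivation is measuring the effect of Prune projection (where and how often it fired, and how many tokens were recovered relative to the cost of KV cache invalidation), but values of the same kind, "accumulated during a session and queried later", will keep appearing: Compact trigger conditions, Hook execution time, tool-call retry counts, and so on.
The current session-log only has variants that model state transitions, such as `UserInput` / `AssistantItems` / `LlmUsage`, and has no place for ad hoc measurements. However, a generic escape hatch, `LogEntry::Extension { domain, payload }`, already exists; putting metrics there reuses the hash chain and the replay path.
## Approach
- Implement it as the `domain = "metrics"` namespace of `LogEntry::Extension`. The session-store types are not touched
- The metric type is a minimal schema of roughly `name + dimensions(sparse map) + value(Option) + correlation_id(Option)`. Dimensions whose value cannot be measured are made explicit with `None`; there is no strict Prometheus-style label set
- "Values filled in later" (e.g. `cache_read_tokens` observed in the LLM call right after a prune fires) are not written back to the earlier entry; they are emitted as a separate metric sharing the same `correlation_id`. Aggregation is a join on the reader side
- This ticket does not build an aggregation / visualization API. The goal is only that the values can be extracted by reading the session-log
## Requirements
### Metric type
- Defined in a dedicated crate (or another suitable existing location). JSON round-trippable with `serde`
- Required: `name` (a string in `namespace.metric` form, e.g. `prune.fire`), `ts` (u64 epoch ms)
- Optional: `dimensions: BTreeMap<String, String>`, `value: Option<f64>`, `correlation_id: Option<String>`
- "unknown" is expressed by setting the corresponding field to `None`. Exhaustiveness of dimensions is not required at the schema level
### Write path
- Add a thin helper callable from `Pod` (e.g. `Pod::record_metric(&self, metric)`) to session-store, appending as `LogEntry::Extension { domain: "metrics", payload: serde_json::to_value(metric) }`
- Follow the existing `append_entry` flow so that entries join the hash chain
- Write failures (store IO errors) are swallowed on the metrics side and never block the main processing
### Read path
- On replay, entries are already accumulated in `RestoredState.extensions` as `("metrics", payload)` (existing behavior)
- The metrics domain provides a helper that folds payloads into a `Vec<Metric>`
- It must be possible to write a sample in the session-store test harness that extracts the metric sequence of a given session
### First consumer: Prune projection
Completion of this ticket requires at least one real consumer. Prune is integrated as the first consumer:
- In the `attach_prune` path of `pod::compact::prune`, emit the following on every projection evaluation
  - On fire: `name = "prune.fire"`, `dimensions = { border_turn, candidate_count }`, `value = estimated_savings`, `correlation_id = <ID that links it to the next LLM call>`
  - On skip: `name = "prune.skip"`, `dimensions = { reason }` (`below_min_savings` / `no_candidates`, etc.)
- When `LlmUsage` is recorded for the immediately following LLM request, also emit an auxiliary metric `prune.post_request` with the same `correlation_id`, recording `cache_read_tokens` / `cache_write_tokens` as value/dimension
- How `correlation_id` is generated and propagated is decided by the implementation. Reuse an existing request-id mechanism if there is one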
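
A reader-side join over `correlation_id` might look like the sketch below. It is a standard-library-only illustration under stated assumptions: the field names follow the schema in this ticket, but the `serde` derives the ticket requires are omitted to keep the example dependency-free, `join_prune` is a hypothetical helper name, and storing the cache figure in `value` of `prune.post_request` is just one of the options the ticket allows.

```rust
use std::collections::{BTreeMap, HashMap};

/// Minimal metric record following the schema described in the ticket.
/// (A real implementation would also derive serde's Serialize/Deserialize.)
#[derive(Debug, Clone, PartialEq)]
pub struct Metric {
    pub name: String,
    pub ts: u64, // epoch milliseconds
    pub dimensions: BTreeMap<String, String>,
    pub value: Option<f64>,
    pub correlation_id: Option<String>,
}

/// Pairs each `prune.fire` with the `prune.post_request` metric that
/// shares its correlation_id. Fires without a matching follow-up, and
/// metrics without a correlation_id, are simply left out.
pub fn join_prune<'a>(metrics: &'a [Metric]) -> Vec<(&'a Metric, &'a Metric)> {
    // Index the follow-up metrics by correlation_id.
    let mut posts: HashMap<&'a str, &'a Metric> = HashMap::new();
    for m in metrics.iter().filter(|m| m.name == "prune.post_request") {
        if let Some(id) = m.correlation_id.as_deref() {
            posts.insert(id, m);
        }
    }
    // Look up each fire's partner, keeping log order.
    metrics
        .iter()
        .filter(|m| m.name == "prune.fire")
        .filter_map(|fire| {
            let id = fire.correlation_id.as_deref()?;
            posts.get(id).map(|post| (fire, *post))
        })
        .collect()
}
```

Keeping the join on the reader side means the writer never has to revisit an earlier entry, which matches the append-only hash chain.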
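### Resume compatibility
- Replay of old sessions (logs without metric entries) changes nothing; `extensions` simply has no metrics domain
- If the payload schema changes in the future, metrics that fail to deserialize may be ignored (the fold helper skips the `Err` from `serde_json::from_value`)
## Completion criteria
- The metric type and the write / read helpers are defined and have unit tests
- `prune.fire` / `prune.skip` / `prune.post_request` from Prune projection land in the session-log
- Replay of existing session logs does not break (backward compatibility)
- A test that extracts the prune metric sequence from a session log passes
- A test shows that a prune fire and the cache values of the immediately following LLM call can be joined by correlation_id
## Out of scope
- Aggregation / visualization tools (CLI / TUI); to be handled separately later
- Cross-workspace metric aggregation (analysis across multiple sessions)
- Real-time subscription API (stream delivery via Watcher)
- Sinks other than session-log (a separate jsonl stream, an external time-series DB, etc.)
- Adding metric consumers other than Prune (Compact / Hook etc. go in separate tickets)
- Automatic compression / archiving of stored metrics
## References
- Design guideline: `CLAUDE.md` (minimal structuring / add a concept only once its absence becomes a problem)
- `crates/session-store/src/session_log.rs` (existing spec of `LogEntry::Extension` and `RestoredState.extensions`)
- `crates/llm-worker/src/usage_record.rs`, `crates/llm-worker/src/llm_client/event.rs` (where cache_read / cache_write are obtained)
- `crates/pod/src/compact/prune.rs`, `crates/llm-worker/src/prune.rs` (insertion points for the first consumer)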


@@ -14,8 +14,8 @@
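- **State lives in session-lifetime state in the `tools` layer**. It has the same lifetime as `Tracker` and is owned by `Pod`. A `TodoStore` based on `Arc<Mutex<Vec<TodoItem>>>` is injected into the tool
- **Persistence has no dedicated lane**. Because `tool_call.arguments` is already in the session log, on resume the state is restored by re-applying the arguments of the last `todo_write` to `TodoStore` during history replay
- **The reminder mechanism is the `pre_llm_request` Interceptor**. It only appends a `<system-reminder>` block, volatilely, to the most recent user message. Nothing goes into history or the log
- **system-reminder injection is not generalized**. No abstraction is introduced while TODO is the only consumer (CLAUDE.md: "add a concept only once its absence becomes a problem"). However, this implementation establishes two points, "the tag format is `<system-reminder>...</system-reminder>`" and "history is not polluted", so that future mechanisms can follow the same convention
- **The reminder mechanism is `Interceptor::pending_history_appends`**. When there are unfinished TODOs, it appends a new system message Item to `worker.history`. It shares the lane with Notify / PodEvent, so persistence to `history.json` and read-back after resume come automatically via worker.history (→ `tickets/notify-history-persist.md`)
- **system-reminder injection is not generalized**. No abstraction is introduced while TODO is the only consumer (CLAUDE.md: "add a concept only once its absence becomes a problem"). However, this implementation establishes that "the tag format is `<system-reminder>...</system-reminder>`", so that future mechanisms can follow the same convention
## Requirements
@@ -43,19 +43,19 @@
### Reminder mechanism (Interceptor)
- In `pre_llm_request`, receives the `Vec<Item>` and triggers when at least one unfinished TODO (`pending` or `in_progress`) exists
- Appends a `<system-reminder>` block to the end of the most recent user message's content (or content[last text part])
- The block lists the current TODO list in a concise form that includes status
- History (the `Vec<Item>` held by `Worker`) is not modified. Only the Vec used when sending the request is altered
- When there are no TODOs, nothing is inserted
- `pending_history_appends` triggers when at least one unfinished TODO (`pending` or `in_progress`) exists and returns a new system message Item containing a `<system-reminder>` block
- The Worker appends it to `worker.history`, and the subsequent per-request clone includes it in the request. Persistence / resume / compaction treat it like any ordinary Item
- The block lists the current TODO list in a concise form that includes status
- When there are no TODOs, it returns an empty `Vec<Item>` and nothing is inserted
- Because the cooldown is designed as once per idle period with the counter reset on a reaction, consecutive reminder injection structurally cannot happen (even if several were emitted, each would sit in history as "the active TODO snapshot at that moment", which is causally correct)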
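
As a rough illustration of the `pending_history_appends` behavior for TODOs, the sketch below builds the reminder Item from a `TodoStore`. The `Status` enum (including a `Completed` variant), the `Item` shape, the exact reminder wording, and the free function `todo_reminder` are assumptions for this example; the cooldown logic is deliberately left out.

```rust
use std::sync::{Arc, Mutex};

/// Assumed TODO statuses; `pending` and `in_progress` come from the ticket,
/// `Completed` is a guess at the remaining state.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Status {
    Pending,
    InProgress,
    Completed,
}

#[derive(Clone, Debug)]
pub struct TodoItem {
    pub content: String,
    pub status: Status,
}

/// Shared store, as described in the ticket.
pub type TodoStore = Arc<Mutex<Vec<TodoItem>>>;

/// Hypothetical, simplified history item.
#[derive(Clone, Debug, PartialEq)]
pub struct Item {
    pub role: &'static str,
    pub text: String,
}

/// Returns one system Item with a `<system-reminder>` block when any TODO
/// is unfinished, and an empty Vec otherwise.
pub fn todo_reminder(store: &TodoStore) -> Vec<Item> {
    let todos = store.lock().unwrap();
    let has_unfinished = todos.iter().any(|t| t.status != Status::Completed);
    if !has_unfinished {
        return Vec::new();
    }
    let mut body = String::from("<system-reminder>\nCurrent TODO list:\n");
    for t in todos.iter() {
        let label = match t.status {
            Status::Pending => "pending",
            Status::InProgress => "in_progress",
            Status::Completed => "completed",
        };
        body.push_str(&format!("- [{}] {}\n", label, t.content));
    }
    body.push_str("</system-reminder>");
    vec![Item { role: "system", text: body }]
}
```

The returned Item is meant to be appended to `worker.history` by the Worker, so it is persisted and replayed like any other Item.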
## Completion criteria
- The `todo_write` tool is registered as a builtin tool and is usable from a Pod
- When the LLM calls `todo_write`, TodoStore is updated, and the subsequent `pre_llm_request` re-presents it to the LLM as a system-reminder
- When the LLM calls `todo_write`, TodoStore is updated, and the subsequent `pending_history_appends` appends it to `worker.history` as a system-reminder Item, which is also included in the request
- Resuming a session restarts from the state of the last `todo_write`
- Across a compact, unfinished TODOs remain as a system message at the start of the new session
- system-reminder injection is volatile and does not appear in `get_history` / the session log
- The injected system-reminder Item appears in all of `worker.history` / `history.json` / `get_history` (the policy is to have no volatile lane → `tickets/notify-history-persist.md`)
- Unit tests cover the update behavior of `todo_write` / replay restoration / the Interceptor's injection
## Out of scope


@@ -1,20 +1,29 @@
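# Turns started by auto-kick are not shown in the TUI
# External input to a Pod (Notify / PodEvent) is not rendered in the TUI
## Background
When a Pod receives `Method::PodEvent::TurnEnded` or similar over the socket, the controller pushes the notification onto the notify buffer and, if Idle, starts a new turn with `pod.run_for_notification()` (`crates/pod/src/controller.rs:611-687`). The assistant output of this turn (`Event::TurnStart` / `TextDelta` / `TurnEnd`, etc.) should be delivered to all clients (including the TUI) as broadcast Events as usual
When a Pod receives `Method::Notify` / `Method::PodEvent` over the socket, the controller pushes the content onto the notify buffer and, if Idle, starts a new turn with `pod.run_for_notification()` (`crates/pod/src/controller.rs`). The assistant output of the auto-kicked turn (`Event::TurnStart` / `TextDelta` / `TurnEnd`, etc.) is delivered to all clients (including the TUI) as broadcast Events as usual.
## Problem
When one line of `Method::PodEvent::TurnEnded` was sent with socat to the socket of a running codex-oauth pod, the full turn arrived on the socat-side subscription (thinking_delta / text_done / turn_end were received), but the TUI screen running the same pod did not render the new turn.
Normal turns via `Method::Run` are shown in the TUI, so broadcast delivery itself works. The turn is most likely dropped in a display path specific to auto-kicked turns (turns without a user_message)
Normal turns via `Method::Run` are shown in the TUI, so broadcast delivery itself works. The cause is that **external input received by the Pod (`Method::Notify` / `Method::PodEvent`) is not echoed as a broadcast event**, so from the TUI it looks as if "a turn suddenly starts with no input at all". The auto-kicked turn not being rendered is one downstream symptom of this
## Requirements
- Turns started by auto-kick (turns without user input) are shown in the TUI history just like user-originated turns.
- Whether to mark a turn as "notification-originated" in the turn header or elsewhere is a separate discussion.
- Of the external input a Pod receives over the socket, the items that should remain in the activity log (`Method::Notify` / `Method::PodEvent`) are echoed to all subscribers as broadcast events.
- The TUI renders that event as a log element alongside user messages / assistant text.
- Auto-kicked turns (`TurnStart` onward) continue to be shown through the existing path. Once Notify / PodEvent reception is displayed, the origin of a turn boundary can be told apart naturally in the log.
## Out of scope / non-goals
- Whether to show the LLM-injected text (the wrapped string after `notify_wrapper` is applied) in the UI is a separate decision. This ticket lands on **echoing the raw message as-is**. If the UI later wants to apply the wrapper, that is handled separately by looking up the catalog.
- The new event does not carry an "auto-kick flag" such as `starts_turn`. Controlling turn boundaries is the responsibility of `TurnStart`; the input echo event represents only an input log element.
- The Event variant added to the protocol is responsible **only for input echo** and does not double as a UI notification (toast / OS notification).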
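
One possible shape for the echo event, under the constraints above, is sketched here. The variant names `NotifyReceived` and `PodEventReceived`, the `echo_event` function, and the `TurnEnded { pod }` payload are hypothetical; the actual protocol types are not shown in this ticket.

```rust
/// Hypothetical PodEvent payload; the real variant fields are not specified here.
#[derive(Clone, Debug, PartialEq)]
pub enum PodEvent {
    TurnEnded { pod: String },
}

/// Simplified subset of incoming socket methods.
#[derive(Clone, Debug, PartialEq)]
pub enum Method {
    Run { input: String },
    Notify { message: String },
    PodEvent(PodEvent),
}

/// Simplified subset of broadcast events.
#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    TurnStart,
    /// Echo of a received `Method::Notify`; carries the raw message only.
    NotifyReceived { message: String },
    /// Echo of a received `Method::PodEvent`; carries the raw payload only.
    PodEventReceived(PodEvent),
}

/// Maps an incoming method to its echo event, one-to-one with the payload.
/// No derived flag such as `starts_turn` is attached; turn boundaries stay
/// the responsibility of `Event::TurnStart`.
pub fn echo_event(method: &Method) -> Option<Event> {
    match method {
        Method::Notify { message } => Some(Event::NotifyReceived {
            message: message.clone(),
        }),
        Method::PodEvent(ev) => Some(Event::PodEventReceived(ev.clone())),
        // User input via Run is already visible through the existing path.
        Method::Run { .. } => None,
    }
}
```

Keeping the echo a pure function of the received payload makes it easy for any subscriber, not only the TUI, to render the input as a log element.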
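## Completion criteria
- When a parent pod receives a PodEvent and auto-kicks, thinking / assistant text / turn_end are shown in the TUI just like a user-originated turn.
- Sending `Method::Notify { message }` to the socket shows the notification body in the log of every subscriber (including the TUI) as an element alongside user / assistant.
- Sending `Method::PodEvent::TurnEnded` or similar to the socket shows a log element indicating its receipt, followed by the thinking / assistant text / turn_end of the subsequent auto-kicked turn, in the TUI just like a user-originated turn.
- The added broadcast event corresponds one-to-one with the payload of `Method::Notify` / `Method::PodEvent` and carries no derived flag such as `starts_turn`.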