close: prompt occupancy estimator
parent e51944f045
commit 623f45a254
@@ -2,12 +2,12 @@
 id: 20260601-001616-prompt-occupancy-token-estimator
 slug: prompt-occupancy-token-estimator
 title: Token estimator must keep prompt occupancy accounting whole
-status: open
+status: closed
 kind: task
 priority: P1
 labels: [compaction, token-accounting]
 created_at: 2026-06-01T00:16:16Z
-updated_at: 2026-06-01T00:59:20Z
+updated_at: 2026-06-01T01:10:06Z
 assignee: null
 legacy_ticket: null
 ---
@@ -0,0 +1,15 @@
+Merged and completed.
+
+Implementation:
+- Merged branch `prompt-occupancy-token-estimator` into `develop` with `merge: prompt occupancy estimator`.
+- `llm-worker` token counter extrapolation now keeps exact measured prompt occupancy authoritative and no longer extrapolates one-measurement growth via `total_input_tokens / history_bytes`.
+- Extrapolation past the latest measurement uses a measured incremental span rate when available; otherwise it adds a conservative byte fallback for the unmeasured delta.
+- Added pod interceptor regression coverage for the fresh-session / one-measurement overestimation case.
+
+Validation after merge:
+- `cargo test -p llm-worker token_counter` passed.
+- `cargo test -p pod pre_llm_request_does_not_yield_from_single_measurement_history_rate_projection` passed.
+- `./tickets.sh doctor` passed.
+
+Review:
+- External reviewer approved with no blockers.
@@ -122,4 +122,27 @@ Non-blocking follow-up:
 - Some comments still describe extrapolation as a latest/final measurement rate even though the implementation is now latest measured incremental span or byte fallback. Reviewer classified this as documentation drift only, not a blocker.
 
 
+---
+
+<!-- event: close author: hare at: 2026-06-01T01:10:06Z status: closed -->
+
+## Closed
+
+Merged and completed.
+
+Implementation:
+- Merged branch `prompt-occupancy-token-estimator` into `develop` with `merge: prompt occupancy estimator`.
+- `llm-worker` token counter extrapolation now keeps exact measured prompt occupancy authoritative and no longer extrapolates one-measurement growth via `total_input_tokens / history_bytes`.
+- Extrapolation past the latest measurement uses a measured incremental span rate when available; otherwise it adds a conservative byte fallback for the unmeasured delta.
+- Added pod interceptor regression coverage for the fresh-session / one-measurement overestimation case.
+
+Validation after merge:
+- `cargo test -p llm-worker token_counter` passed.
+- `cargo test -p pod pre_llm_request_does_not_yield_from_single_measurement_history_rate_projection` passed.
+- `./tickets.sh doctor` passed.
+
+Review:
+- External reviewer approved with no blockers.
+
+
 ---
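The extrapolation rule described in the Implementation notes can be sketched roughly as follows. This is an illustrative sketch only, not the `llm-worker` code: the `Measurement` struct, the function names, and the `FALLBACK_BYTES_PER_TOKEN` value are all assumptions made for the example, and the ticket does not state the real fallback ratio or what happens with no measurements at all.

```rust
/// One exact prompt-occupancy measurement (hypothetical shape).
#[derive(Clone, Copy, Debug)]
struct Measurement {
    /// History size in bytes when the measurement was taken.
    history_bytes: u64,
    /// Exact input token count reported for that request.
    total_input_tokens: u64,
}

/// Placeholder ratio for the byte fallback; the real value is not given in the ticket.
const FALLBACK_BYTES_PER_TOKEN: u64 = 4;

/// Computes ceil(bytes * num / den) in u128 to avoid overflow. `den` must be non-zero.
fn scale_ceil(bytes: u64, num: u64, den: u64) -> u64 {
    let product = bytes as u128 * num as u128;
    let den = den as u128;
    ((product + den - 1) / den) as u64
}

/// Token and byte growth between the two most recent measurements,
/// or `None` when fewer than two exist or the span is not usable.
fn latest_span(measurements: &[Measurement]) -> Option<(u64, u64)> {
    let n = measurements.len();
    if n < 2 {
        return None;
    }
    let (prev, last) = (measurements[n - 2], measurements[n - 1]);
    let bytes = last.history_bytes.checked_sub(prev.history_bytes)?;
    let tokens = last.total_input_tokens.checked_sub(prev.total_input_tokens)?;
    if bytes == 0 {
        None
    } else {
        Some((tokens, bytes))
    }
}

/// Estimates prompt occupancy for a history of `current_history_bytes`.
fn estimate_prompt_tokens(measurements: &[Measurement], current_history_bytes: u64) -> u64 {
    // Assumption: with no measurement at all, fall back to bytes only.
    let Some(latest) = measurements.last() else {
        return scale_ceil(current_history_bytes, 1, FALLBACK_BYTES_PER_TOKEN);
    };
    // The exact measured value stays authoritative; only the delta is estimated.
    // Assumption: a history smaller than the measured one yields a zero delta.
    let delta_bytes = current_history_bytes.saturating_sub(latest.history_bytes);
    if delta_bytes == 0 {
        return latest.total_input_tokens;
    }
    let extra = match latest_span(measurements) {
        // Measured incremental span rate is available.
        Some((tokens, bytes)) => scale_ceil(delta_bytes, tokens, bytes),
        // Single measurement: never project total_input_tokens / history_bytes.
        None => scale_ceil(delta_bytes, 1, FALLBACK_BYTES_PER_TOKEN),
    };
    latest.total_input_tokens + extra
}
```

Under these assumptions, a fresh session with one measurement of 12,000 tokens at 1,000 history bytes and a current history of 1,400 bytes is estimated at 12,100 tokens. Projecting the whole-prompt ratio `total_input_tokens / history_bytes` instead would give 16,800, which is the kind of overestimate the regression test guards against.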