close: prompt occupancy estimator
parent e51944f045
commit 623f45a254
@@ -2,12 +2,12 @@
 id: 20260601-001616-prompt-occupancy-token-estimator
 slug: prompt-occupancy-token-estimator
 title: Token estimator must keep prompt occupancy accounting whole
-status: open
+status: closed
 kind: task
 priority: P1
 labels: [compaction, token-accounting]
 created_at: 2026-06-01T00:16:16Z
-updated_at: 2026-06-01T00:59:20Z
+updated_at: 2026-06-01T01:10:06Z
 assignee: null
 legacy_ticket: null
 ---
@@ -0,0 +1,15 @@
+Merged and completed.
+
+Implementation:
+- Merged branch `prompt-occupancy-token-estimator` into `develop` with `merge: prompt occupancy estimator`.
+- `llm-worker` token counter extrapolation now keeps exact measured prompt occupancy authoritative and no longer extrapolates one-measurement growth via `total_input_tokens / history_bytes`.
+- Extrapolation past the latest measurement uses a measured incremental span rate when available; otherwise it adds a conservative byte fallback for the unmeasured delta.
+- Added pod interceptor regression coverage for the fresh-session / one-measurement overestimation case.
+
+Validation after merge:
+- `cargo test -p llm-worker token_counter` passed.
+- `cargo test -p pod pre_llm_request_does_not_yield_from_single_measurement_history_rate_projection` passed.
+- `./tickets.sh doctor` passed.
+
+Review:
+- External reviewer approved with no blockers.
@@ -122,4 +122,27 @@ Non-blocking follow-up:
 - Some comments still describe extrapolation as a latest/final measurement rate even though the implementation is now latest measured incremental span or byte fallback. Reviewer classified this as documentation drift only, not a blocker.
 
 
 ---
+
+<!-- event: close author: hare at: 2026-06-01T01:10:06Z status: closed -->
+
+## Closed
+
+Merged and completed.
+
+Implementation:
+- Merged branch `prompt-occupancy-token-estimator` into `develop` with `merge: prompt occupancy estimator`.
+- `llm-worker` token counter extrapolation now keeps exact measured prompt occupancy authoritative and no longer extrapolates one-measurement growth via `total_input_tokens / history_bytes`.
+- Extrapolation past the latest measurement uses a measured incremental span rate when available; otherwise it adds a conservative byte fallback for the unmeasured delta.
+- Added pod interceptor regression coverage for the fresh-session / one-measurement overestimation case.
+
+Validation after merge:
+- `cargo test -p llm-worker token_counter` passed.
+- `cargo test -p pod pre_llm_request_does_not_yield_from_single_measurement_history_rate_projection` passed.
+- `./tickets.sh doctor` passed.
+
+Review:
+- External reviewer approved with no blockers.
+
+
+---
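The estimator behaviour described in the close notes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the `llm-worker` implementation: the `Measurement` struct, the `estimate_tokens` and `incremental_span` functions, and the `FALLBACK_BYTES_PER_TOKEN` constant (set to 3 here) are all hypothetical names and values chosen for the example. The sketch shows the three rules from the notes: an exact measurement is returned as-is, growth past it is priced with the rate observed between the two latest measurements when such a span exists, and otherwise only the unmeasured byte delta is priced with a byte fallback.

```rust
/// One exact prompt-occupancy measurement (hypothetical shape).
#[derive(Clone, Copy, Debug)]
pub struct Measurement {
    /// History size in bytes when the measurement was taken.
    pub history_bytes: u64,
    /// Exact input token count reported for that history.
    pub total_input_tokens: u64,
}

/// Assumed fallback ratio; the real constant is not given in the ticket.
const FALLBACK_BYTES_PER_TOKEN: u64 = 3;

/// Byte fallback: ceil(bytes / FALLBACK_BYTES_PER_TOKEN).
fn fallback_tokens(bytes: u64) -> u64 {
    bytes / FALLBACK_BYTES_PER_TOKEN + u64::from(bytes % FALLBACK_BYTES_PER_TOKEN != 0)
}

/// Token and byte growth between the two latest measurements, if usable.
fn incremental_span(measurements: &[Measurement]) -> Option<(u64, u64)> {
    // Fewer than two measurements means there is no measured span.
    let [.., prev, latest] = measurements else {
        return None;
    };
    let byte_delta = latest.history_bytes.checked_sub(prev.history_bytes)?;
    let token_delta = latest.total_input_tokens.checked_sub(prev.total_input_tokens)?;
    (byte_delta > 0).then_some((token_delta, byte_delta))
}

/// Estimate prompt occupancy in tokens for the current history size.
pub fn estimate_tokens(measurements: &[Measurement], current_history_bytes: u64) -> u64 {
    // With no measurement at all, only the byte fallback is available.
    let Some(latest) = measurements.last() else {
        return fallback_tokens(current_history_bytes);
    };

    // History not larger than at the latest measurement: the exact value stands.
    let delta_bytes = current_history_bytes.saturating_sub(latest.history_bytes);
    if delta_bytes == 0 {
        return latest.total_input_tokens;
    }

    // Price only the unmeasured delta, never the whole history.
    let extra = match incremental_span(measurements) {
        Some((token_delta, byte_delta)) => {
            // ceil(delta_bytes * token_delta / byte_delta), widened to avoid overflow.
            let num = u128::from(delta_bytes) * u128::from(token_delta);
            let den = u128::from(byte_delta);
            u64::try_from((num + den - 1) / den).unwrap_or(u64::MAX)
        }
        None => fallback_tokens(delta_bytes),
    };
    latest.total_input_tokens.saturating_add(extra)
}
```

With a single measurement of 5000 tokens at 1000 history bytes and a current history of 1300 bytes, a whole-history projection via `total_input_tokens / history_bytes` would give 6500 tokens, while this sketch gives 5100 (5000 measured plus 100 for the 300 unmeasured bytes). That gap is the fresh-session overestimation the regression test is described as covering.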