llm_worker_rs/docs/research/2026-01-02-llm-streaming.md
2026-01-05 23:03:48 +09:00

# 2026-01-02 LLM Streaming & Hooks Research
## Summary Table
| Topic | Key Takeaways | Sources |
| --- | --- | --- |
| Fine-grained tool streaming | Anthropic beta header `fine-grained-tool-streaming-2025-05-14` streams tool parameters without intermediate JSON validation; reduces latency but may emit invalid/partial JSON that callers must sanitize. | Anthropic Docs: Fine-grained tool streaming (https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/fine-grained-tool-streaming) [turn1search0]; AWS Bedrock Anthropic tool-use reference (https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages-tool-use.html) [turn1search5]; Anthropic release notes (https://docs.anthropic.com/en/release-notes/api) [turn1search3] |
| Streaming SSE events | Anthropic SSE stream emits typed events (`message_start`, `content_block_start`, `content_block_delta`, etc.) that clients must map to internal state machines for deterministic playback. | Anthropic Streaming Messages (https://docs.anthropic.com/en/docs/build-with-claude/streaming) [turn1search6] |
| Tool streaming ergonomics | LangChain Anthropic integration exposes `betas=["fine-grained-tool-streaming-2025-05-14"]` and warns about invalid JSON, reinforcing need for resilient parsers. | LangChain Anthropic integration guide (https://docs.langchain.com/oss/python/integrations/chat/anthropic) [turn1search8] |
| Hook architectures | Claude Code hook lifecycle (SessionStart, UserPromptSubmit, Tool Use, etc.) keeps hooks non-blocking, context-injecting, and failure-isolated; a useful template for worker hooks/routers. | Claude-Mem hook architecture overview (https://docs.claude-mem.ai/hooks-architecture) [turn0search0]; Claude Blog: configuring hooks (https://claude.com/blog/how-to-configure-hooks) [turn0search2] |
## Detailed Notes
### Fine-grained Tool Streaming
- Anthropic's beta header `fine-grained-tool-streaming-2025-05-14` streams tool parameters as they are generated, shrinking first-byte latency from ~15s to ~3s in their example. Clients must expect partial JSON and wrap any invalid payload in a valid JSON object before echoing it back to the model. [turn1search0]
- AWS Bedrock mirrors the same header, confirming availability on Claude Sonnet 4.5/4 and Opus 4. Their docs explicitly caution about invalid/partial JSON and show the request schema. [turn1search5]
- Anthropic's June 11, 2025 release notes document the beta launch, so the feature is recent and, as a beta, its API may still change. [turn1search3]
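As a minimal sketch of the "prepare for partial JSON, wrap invalid payloads" handling described above, the following buffers streamed fragments and wraps anything that does not look complete. `ToolInputBuffer`, `looks_complete`, and `escape_json_string` are hypothetical names, the `INVALID_JSON` key is an illustrative choice, and the completeness check is a structural heuristic rather than a real JSON validator; a real client would more likely run a proper parser (for example the `serde_json` crate) at this point.

```rust
/// Accumulates the streamed tool-input fragments of one content block.
/// (Hypothetical type, not part of any SDK.)
#[derive(Default)]
pub struct ToolInputBuffer {
    raw: String,
}

impl ToolInputBuffer {
    /// Append one streamed fragment exactly as received.
    pub fn push(&mut self, fragment: &str) {
        self.raw.push_str(fragment);
    }

    /// Return the payload to echo back: the raw text if it looks complete,
    /// otherwise the raw text wrapped as a string inside a valid JSON object.
    pub fn finish(self) -> String {
        if looks_complete(&self.raw) {
            self.raw
        } else {
            format!("{{\"INVALID_JSON\": \"{}\"}}", escape_json_string(&self.raw))
        }
    }
}

/// Heuristic only: the text must start with `{`, brackets must balance,
/// and no string literal may be left open. This is NOT full JSON validation.
fn looks_complete(s: &str) -> bool {
    let trimmed = s.trim();
    if !trimmed.starts_with('{') {
        return false;
    }
    let mut stack: Vec<char> = Vec::new();
    let mut in_string = false;
    let mut escaped = false;
    for c in trimmed.chars() {
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' => stack.push('}'),
            '[' => stack.push(']'),
            '}' | ']' => {
                if stack.pop() != Some(c) {
                    return false;
                }
            }
            _ => {}
        }
    }
    !in_string && stack.is_empty()
}

/// Escape a string so it can be embedded inside a JSON string literal.
fn escape_json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

Keeping the raw text untouched until `finish` means a stream cut off mid-parameter is still reported verbatim, just wrapped so that the echoed payload is itself valid JSON.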
### SSE Event Model
- Anthropic exposes SSE event names plus JSON `type` fields; implementers should parse events like `content_block_start`, `content_block_delta`, and `message_stop` to drive a deterministic timeline. This justifies a dedicated timeline router/state machine as in the spec. [turn1search6]
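The event-to-state mapping above could be sketched roughly as follows. The event names come from Anthropic's streaming documentation; `StreamEvent`, `Frame`, `Phase`, `Timeline`, and `parse_sse` are hypothetical names, the `data` payload is kept as an unparsed string, and the SSE parsing is simplified (frames without an `event:` line are dropped), so treat it as an illustration of the idea rather than the spec's design.

```rust
/// Typed view of the SSE `event:` name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StreamEvent {
    MessageStart,
    ContentBlockStart,
    ContentBlockDelta,
    ContentBlockStop,
    MessageDelta,
    MessageStop,
    /// Anything not modelled here (for example `ping`).
    Other(String),
}

impl StreamEvent {
    pub fn from_name(name: &str) -> Self {
        match name {
            "message_start" => Self::MessageStart,
            "content_block_start" => Self::ContentBlockStart,
            "content_block_delta" => Self::ContentBlockDelta,
            "content_block_stop" => Self::ContentBlockStop,
            "message_delta" => Self::MessageDelta,
            "message_stop" => Self::MessageStop,
            other => Self::Other(other.to_string()),
        }
    }
}

/// One SSE frame: the typed event plus its raw `data:` payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Frame {
    pub event: StreamEvent,
    pub data: String,
}

/// Split raw SSE text into frames. A blank line terminates a frame.
pub fn parse_sse(input: &str) -> Vec<Frame> {
    let mut frames = Vec::new();
    let mut event_name: Option<String> = None;
    let mut data = String::new();
    // The trailing "" flushes a final frame that lacks a closing blank line.
    for line in input.lines().chain(std::iter::once("")) {
        if line.is_empty() {
            match event_name.take() {
                Some(name) => frames.push(Frame {
                    event: StreamEvent::from_name(&name),
                    data: std::mem::take(&mut data),
                }),
                None => data.clear(),
            }
        } else if let Some(rest) = line.strip_prefix("event:") {
            event_name = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("data:") {
            if !data.is_empty() {
                data.push('\n');
            }
            data.push_str(rest.strip_prefix(' ').unwrap_or(rest));
        }
    }
    frames
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Idle,
    InMessage,
    InBlock,
    Done,
}

/// Tiny state machine that rejects out-of-order events.
pub struct Timeline {
    phase: Phase,
}

impl Timeline {
    pub fn new() -> Self {
        Self { phase: Phase::Idle }
    }

    pub fn phase(&self) -> Phase {
        self.phase
    }

    pub fn apply(&mut self, event: &StreamEvent) -> Result<Phase, String> {
        use StreamEvent::*;
        let next = match (self.phase, event) {
            (Phase::Idle, MessageStart) => Phase::InMessage,
            (Phase::InMessage, ContentBlockStart) => Phase::InBlock,
            (Phase::InBlock, ContentBlockDelta) => Phase::InBlock,
            (Phase::InBlock, ContentBlockStop) => Phase::InMessage,
            (Phase::InMessage, MessageDelta) => Phase::InMessage,
            (Phase::InMessage, MessageStop) => Phase::Done,
            // Unmodelled events never change the phase.
            (phase, Other(_)) => phase,
            (phase, ev) => return Err(format!("unexpected {:?} in {:?}", ev, phase)),
        };
        self.phase = next;
        Ok(next)
    }
}
```

Feeding each `Frame` from `parse_sse` through `Timeline::apply` gives the same sequence of phases for the same input, and an out-of-order event surfaces as an `Err` instead of silently corrupting state.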
### Tool Streaming Ergonomics in SDKs
- LangChain's Anthropic integration demonstrates how third-party SDKs surface the beta header and reiterates the need for error handling when incomplete JSON arrives, because `max_tokens` can stop a stream mid-parameter. This informs library-level abstractions for `scheme` and `llm_client` layers. [turn1search8]
### Hook / Lifecycle Design Patterns
- Claude-Mem's architecture treats hooks as lifecycle-triggered closures that must stay non-blocking, degrade gracefully, and respect security constraints (frozen configs, permission prompts). This maps closely to the proposed `Tools/Hooks` system. [turn0search0]
- Claude's official hook configuration guide enumerates eight hook types (PreToolUse, PostToolUse, PermissionRequest, SessionStart, Stop, etc.) and their contexts, reinforcing the need for a trait/macro system to statically describe hooks and route events. [turn0search2]
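One way the trait-based routing and failure isolation noted above might look is sketched here. Everything in it is hypothetical (`Hook`, `HookRouter`, `HookContext`, and the two example hooks), only a subset of the hook types is modelled, and the dispatch is synchronous, so the non-blocking property is not demonstrated; a real worker would presumably run hooks on separate tasks with timeouts.

```rust
/// Subset of lifecycle events, named after the hook types listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HookEvent {
    SessionStart,
    PreToolUse,
    PostToolUse,
    Stop,
}

/// Shared context that hooks may append to ("context injection").
#[derive(Debug, Default)]
pub struct HookContext {
    pub notes: Vec<String>,
}

/// Statically described hook: which events it handles and what it does.
pub trait Hook {
    fn name(&self) -> &str;
    fn handles(&self, event: HookEvent) -> bool;
    fn run(&self, event: HookEvent, ctx: &mut HookContext) -> Result<(), String>;
}

#[derive(Default)]
pub struct HookRouter {
    hooks: Vec<Box<dyn Hook>>,
}

impl HookRouter {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn register(&mut self, hook: Box<dyn Hook>) {
        self.hooks.push(hook);
    }

    /// Run every matching hook in registration order. A failing hook is
    /// recorded and skipped; it never stops later hooks or the caller.
    /// (Panics are not caught in this sketch.)
    pub fn dispatch(&self, event: HookEvent, ctx: &mut HookContext) -> Vec<String> {
        let mut failures = Vec::new();
        for hook in self.hooks.iter().filter(|h| h.handles(event)) {
            if let Err(e) = hook.run(event, ctx) {
                failures.push(format!("{}: {}", hook.name(), e));
            }
        }
        failures
    }
}

/// Example hook that injects a note before each tool call.
pub struct InjectNote;

impl Hook for InjectNote {
    fn name(&self) -> &str {
        "inject_note"
    }
    fn handles(&self, event: HookEvent) -> bool {
        event == HookEvent::PreToolUse
    }
    fn run(&self, _event: HookEvent, ctx: &mut HookContext) -> Result<(), String> {
        ctx.notes.push("pre-tool check passed".to_string());
        Ok(())
    }
}

/// Example hook that always errors, to show failure isolation.
pub struct AlwaysFails;

impl Hook for AlwaysFails {
    fn name(&self) -> &str {
        "always_fails"
    }
    fn handles(&self, event: HookEvent) -> bool {
        event == HookEvent::PreToolUse
    }
    fn run(&self, _event: HookEvent, _ctx: &mut HookContext) -> Result<(), String> {
        Err("boom".to_string())
    }
}
```

Registering `AlwaysFails` before `InjectNote` and dispatching `HookEvent::PreToolUse` still lets `InjectNote` add its note; the failure comes back in the returned list instead of aborting the dispatch.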