close: webfetch readable extraction

Keisuke Hirata 2026-05-31 05:55:56 +09:00
parent c83bb9360f
commit 5187a77dcd
4 changed files with 28 additions and 2 deletions

@@ -2,12 +2,12 @@
id: 20260530-204045-webfetch-readable-extraction
slug: webfetch-readable-extraction
title: WebFetch: extract main HTML content with lightweight readability
-status: open
+status: closed
kind: task
priority: P2
labels: [web, tools, html]
created_at: 2026-05-30T20:40:45Z
-updated_at: 2026-05-30T20:54:26Z
+updated_at: 2026-05-30T20:55:13Z
assignee: null
legacy_ticket: null
---

@@ -0,0 +1 @@
Implemented `WebFetch` HTML reader-mode extraction with pure-Rust `readability-rs`, preserving existing safety checks and fallback to local `html_to_text`. Output now reports `html_extraction` metadata and bounded main text without exposing extracted HTML by default. Reviewed externally and approved; validation passed including focused tools tests and `nix build .#insomnia`.

@@ -75,4 +75,29 @@ Non-blocking follow-up:
- Optional future direct test for a stable readability error path; current fallback coverage is sufficient for this ticket.
---
<!-- event: implementation_report author: hare at: 2026-05-30T20:55:12Z -->
## Implementation report
Main workspace validation after merge:
- `cargo fmt --check` passed
- `cargo test -p tools web` passed (10 passed)
- `cargo check -p tools` passed with existing `llm-worker` dead_code warning
- `./tickets.sh doctor` passed
- `git diff --check` passed
- `nix build .#insomnia` passed (with dirty tree warning due to unrelated `.insomnia/workflow/multi-agent-workflow.md` local modification)
---
<!-- event: close author: hare at: 2026-05-30T20:55:13Z status: closed -->
## Closed
Implemented `WebFetch` HTML reader-mode extraction with pure-Rust `readability-rs`, preserving existing safety checks and fallback to local `html_to_text`. Output now reports `html_extraction` metadata and bounded main text without exposing extracted HTML by default. Reviewed externally and approved; validation passed including focused tools tests and `nix build .#insomnia`.
---
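
The closing note describes a reader-mode pass that falls back to local `html_to_text`, reports which path ran, and bounds the returned text. A minimal sketch of that control flow follows, under stated assumptions: it does not call `readability-rs` (the reader is passed in as a closure), and `HtmlExtraction`, `FetchOutput`, `bound_text`, `extract_main_text`, and the tag-stripping body of `html_to_text` are hypothetical stand-ins rather than the code in this commit.

```rust
/// Hypothetical stand-in for the `html_extraction` metadata: which path produced the text.
#[derive(Debug, PartialEq)]
pub enum HtmlExtraction {
    Readability,
    Fallback,
}

/// Hypothetical output shape: metadata plus bounded plain text, no extracted HTML.
#[derive(Debug)]
pub struct FetchOutput {
    pub html_extraction: HtmlExtraction,
    pub text: String,
}

/// Keep at most `max_chars` characters, never splitting a UTF-8 code point.
fn bound_text(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}

/// Crude tag stripper standing in for the project's local `html_to_text` (assumption).
fn html_to_text(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => {
                in_tag = false;
                out.push(' '); // a tag boundary acts as a word separator
            }
            _ if !in_tag => out.push(c),
            _ => {} // characters inside a tag are dropped
        }
    }
    // Collapse runs of whitespace into single spaces.
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Try the reader first; on an error or empty result, fall back to `html_to_text`.
pub fn extract_main_text<F>(html: &str, max_chars: usize, reader: F) -> FetchOutput
where
    F: Fn(&str) -> Result<String, String>,
{
    match reader(html) {
        Ok(text) if !text.trim().is_empty() => FetchOutput {
            html_extraction: HtmlExtraction::Readability,
            text: bound_text(text.trim(), max_chars),
        },
        _ => FetchOutput {
            html_extraction: HtmlExtraction::Fallback,
            text: bound_text(&html_to_text(html), max_chars),
        },
    }
}
```

Passing the reader as a closure keeps the fallback branch testable without a real readability failure, which is one way the optional follow-up test for a stable error path could be approached.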