review: webfetch local reader
parent aa81aa8c6f
commit d2ad131eb2
@@ -7,7 +7,7 @@ kind: task
priority: P2
labels: [web, tools, html]
created_at: 2026-05-30T21:59:28Z
-updated_at: 2026-05-30T22:00:33Z
+updated_at: 2026-05-30T22:20:54Z
assignee: null
legacy_ticket: null
---
@@ -20,4 +20,59 @@ Implementation plan:
6. Validate with focused web tests, tools check, doctor, diff check, and Nix build/hash handling.


---

<!-- event: implementation_report author: hare at: 2026-05-30T22:20:54Z -->

## Implementation report

Implementation report from coder Pod `webfetch-local-reader-coder-20260530`:

- Branch: `webfetch-local-reader-markdown`
- Commits:
  - `2a3208b96e151d748fd80d5a1c4236ba616b3c6e` `web: render readable html as markdown`
  - `44262c42365cbba1dad87629c06ce539cbd25105` `web: harden reader navigation fallbacks`
- Changed files: `crates/tools/src/web.rs`, `crates/tools/Cargo.toml`, `Cargo.lock`, `package.nix`
- Removed direct `readability-rs` dependency and replaced it with a local `html5ever`/rcdom reader implementation.
- Added `include_navigation: Option<bool>` to `WebFetchInput`.
- Renders readable HTML as Markdown-ish text, preserving inline body links as `[label](absolute-url)`.
- Detects generic navigation/sidebar/toc/menu/breadcrumb/pagination/prev-next content, omits it by default with metadata/notice, and includes bounded `## Navigation` when requested.
- Reader failure/fallback is reported with `readable=false`/fallback metadata; fallback output omits detected navigation by default so metadata and text agree.
- Added/strengthened regression tests for link preservation, navigation omission/inclusion, link-heavy main failure, fallback navigation consistency, and navigation truncation.

Validation reported by coder:

- `cargo fmt --check` passed
- `cargo test -p tools web` passed (14 passed)
- `cargo check -p tools` passed
- `./tickets.sh doctor` passed
- `git diff --check` passed
- `nix build .#insomnia` passed

Unresolved issues: none.


---
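
The navigation handling described in the implementation report above (omitted by default with a notice, bounded `## Navigation` section when requested, metadata that matches the text) might look roughly like the following. This is a minimal sketch, not the code in `crates/tools/src/web.rs`: only `WebFetchInput` and `include_navigation` come from the report, while `NavMeta`, `render`, `MAX_NAV_ITEMS`, and the notice wording are hypothetical names and values chosen for illustration.

```rust
// Hypothetical sketch. Only `WebFetchInput` and `include_navigation` are
// named in the report; everything else here is an assumption.

/// Input for a fetch. Other fields (such as the URL) are left out of this sketch.
pub struct WebFetchInput {
    /// `None` is treated like `Some(false)`: navigation is omitted by default.
    pub include_navigation: Option<bool>,
}

/// Assumed upper bound on navigation entries that get rendered.
const MAX_NAV_ITEMS: usize = 3;

/// Metadata describing what happened to the detected navigation links.
#[derive(Debug, PartialEq)]
pub struct NavMeta {
    pub detected: usize,
    pub included: usize,
    pub truncated: bool,
}

/// Appends either a bounded `## Navigation` section or an omission notice to
/// `body`, and returns metadata that agrees with the returned text.
/// `nav` holds `(label, absolute_url)` pairs found by an earlier detection step.
pub fn render(input: &WebFetchInput, body: &str, nav: &[(&str, &str)]) -> (String, NavMeta) {
    let mut out = String::from(body);
    let want_nav = input.include_navigation.unwrap_or(false);
    let included = if want_nav { nav.len().min(MAX_NAV_ITEMS) } else { 0 };

    if want_nav && included > 0 {
        out.push_str("\n\n## Navigation\n");
        for (label, url) in &nav[..included] {
            out.push_str(&format!("- [{label}]({url})\n"));
        }
    } else if !nav.is_empty() {
        out.push_str(&format!("\n\n[{} navigation links omitted]\n", nav.len()));
    }

    let meta = NavMeta {
        detected: nav.len(),
        included,
        truncated: want_nav && nav.len() > included,
    };
    (out, meta)
}
```

Deriving the text and the metadata from the same `included` count is one way to keep the two from disagreeing, which is the property the report calls out for the fallback path.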

<!-- event: review author: hare at: 2026-05-30T22:20:54Z status: approve -->

## Review: approve

External review by reviewer Pod `webfetch-local-reader-reviewer-20260530`: approve.

First review requested changes for two blockers:

1. link-heavy `body` / `main` could be accepted as readable main content;
2. fallback could claim navigation omission while returning detected navigation text.

Follow-up commit `44262c42365cbba1dad87629c06ce539cbd25105` resolved both:

- `candidate_score` rejects high link density for all candidate tags, including `body` and `main`;
- fallback text is generated through the DOM reader path so detected navigation is omitted by default when `include_navigation=false`;
- metadata aligns with included/omitted navigation state;
- tests cover link-heavy main, fallback nav omission consistency, strengthened omitted nav labels, and navigation truncation metadata.

Reviewer found no new blocker. Reported validation is adequate.


---
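
The link-density rule from the review above can be illustrated with a small sketch. Only the function name `candidate_score` appears in the review; the `Candidate` struct, the `MAX_LINK_DENSITY` threshold of 0.5, the tag bonus, and the scoring formula are assumptions made for this example and may differ from the actual implementation.

```rust
// Hypothetical sketch of link-density rejection. Only the name
// `candidate_score` comes from the review; the rest is assumed.

/// Text statistics for one candidate element.
pub struct Candidate<'a> {
    pub tag: &'a str,
    /// Characters of visible text in the element.
    pub text_len: usize,
    /// Characters of that text that sit inside `<a>` elements.
    pub link_text_len: usize,
}

/// Assumed cutoff: a candidate whose text is more than half link text is rejected.
const MAX_LINK_DENSITY: f64 = 0.5;

/// Returns `None` when the candidate is rejected, otherwise a score.
pub fn candidate_score(c: &Candidate) -> Option<f64> {
    if c.text_len == 0 {
        return None;
    }
    let density = c.link_text_len as f64 / c.text_len as f64;
    // No exemption for `body` or `main`: the check applies to every tag.
    if density > MAX_LINK_DENSITY {
        return None;
    }
    // The tag may influence the score, but never the rejection above.
    let bonus = if matches!(c.tag, "article" | "main") { 1.25 } else { 1.0 };
    // Favor long text with few links.
    Some(c.text_len as f64 * (1.0 - density) * bonus)
}
```

Running the density check before any tag-specific handling is what keeps a link-heavy `main` or `body` from being accepted as readable content, matching the first blocker raised in the review.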