# Workspace DB canonical schema v0 design

## Purpose

This document defines the first concrete Workspace control-plane schema target. It is precise enough that implementation work can create migrations and API read/write surfaces without inventing meanings ad hoc.

The important corrections in this version are:

- **Ticket thread/events remain the execution history authority**.
- A separate top-level `Run` entity is not part of v0.
- Separate `validation_results` / CI tables are not part of v0.
- Worker is not a DB-canonical entity in v0. Worker state is obtained from runtime inspection / Host protocol as a live view.
- Ticket-to-Worker management is represented by typed WorkerRef snapshots on Ticket events and Ticket-Worker association records.
- v0 does **not** use generic JSON payload/metadata columns. If a value matters, give it a typed column or a small relation table. If it is large evidence, store it as an Artifact.

## Schema categories

1. **Current-state records**: long-lived records with stable ids and current snapshots, such as Ticket, Objective, Repository, Artifact.
2. **Event logs**: append-oriented records attached to current-state records, primarily `ticket_events` and `audit_events`.
3. **Relationship records**: explicit links such as Ticket-to-WorkerRef, Ticket-to-Repository target, Objective-to-Ticket.
4. **Snapshot references**: typed authorship / worker / host references embedded in event or relation records. These are not full entities in v0.
5. **Live views**: API results produced by inspecting local runtime or future Host protocol state. Host/Worker lists are live views in v0, not canonical DB tables.

All main tables include `workspace_id`. v0 is SQLite-first, but table shapes should not prevent later Postgres/multi-workspace hosting.

## Design rules

- Ticket and Objective belong to Workspace, not to Repository.
- Repository is a Workspace-connected source/storage. Git Repository is one provider, not the definition of Repository.
- Ticket target selectors are mutable intent/scope. Evidence artifacts may record the concrete repository revision they were produced from with typed source fields.
- Ticket thread is the human-readable and structured execution/audit history for work on that Ticket.
- Ticket current state is a snapshot derived/maintained from structured state transition events.
- Worker is a logical agent/session participating in work, but Worker registry/persistence is out of v0 DB scope.
- Host is an execution environment or observed placement. In v0, Host/Worker information is returned as a live view from local runtime inspection or future Host protocol, not stored as canonical DB records.
- Ticket-associated Worker management uses WorkerRef fields and `ticket_worker_links` snapshots. This lets the Ticket be managed without making the Worker itself DB-canonical.
- Orchestrator should be able to operate from DB/API records only: Ticket, TicketEvents, TicketWorkerLinks, live Host/Worker views, Artifact, and review/evidence summaries.
- Raw fs/Bash/Git authority belongs to Host/Worker execution, not to Orchestrator.
- Memory/Knowledge are intentionally out of v0 canonical schema. They are deferred until Workspace storage migration for Memory.
- Event authorship is mandatory, but a full Actor table is not required in v0.
- Generic JSON columns are intentionally excluded in v0. Do not add `metadata_json`, `payload_json`, `diagnostics_json`, or similar catch-all fields.

## Common columns and conventions

### IDs

Use opaque string ids allocated by the control plane for DB-canonical records.

Recommended prefixes are an implementation detail, but the type must be obvious from column names:

- `workspace_id`
- `ticket_id`
- `event_id`
- `objective_id`
- `repository_id`
- `target_id`
- `artifact_id`
- `audit_event_id`

Worker and Host references use `*_ref_kind` / `*_ref_key` in v0 because they are not canonical DB entities.

### Timestamps

Store UTC timestamps as RFC3339 strings in SQLite v0.

Common names:

- `created_at`
- `updated_at`
- `observed_at`
- `started_at`
- `finished_at`
- `closed_at`
- `last_seen_at`

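
As a minimal sketch of the timestamp convention, the helper below produces fixed-width UTC strings. The function names and the millisecond precision are assumptions for illustration; the document only requires UTC RFC3339 strings.

```python
from datetime import datetime, timezone


def utc_now_rfc3339() -> str:
    """Return the current UTC time as a fixed-width RFC 3339 string."""
    now = datetime.now(timezone.utc)
    # Millisecond precision and a literal 'Z' keep every value the same width,
    # so plain string comparison in SQLite matches chronological order.
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def parse_rfc3339(value: str) -> datetime:
    """Parse a string produced by utc_now_rfc3339 into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```

Keeping one precision and one offset spelling across all writers matters more than which precision is chosen, because mixed widths break lexicographic ordering in `ORDER BY created_at`.
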
### No catch-all payload columns

v0 avoids generic JSON/text payload columns because they make the schema ambiguous and move authority into untyped blobs.

Rules:

- Fields used for lifecycle transitions, permissions, joins, filtering, or orchestration decisions must be typed columns or relation tables.
- Event kinds may have nullable typed columns such as `subject_kind`, `subject_id`, `previous_state`, `new_state`, `status`, `activity_id`, or `artifact_id`.
- Repository capabilities are derived from `repositories.kind` / `repositories.provider` and backend configuration in v0; do not add a separate capability table until provider-specific overrides are actually needed.
- Paths use relation tables such as `ticket_target_paths`.
- Diagnostics that matter should be Ticket events or Artifacts.
- Large logs, diffs, transcripts, prompts, raw tool outputs, and file contents must not be embedded in records. Store them in an artifact file/blob store and link through Artifact URI records.
- Secrets are never stored in this schema. Secret references, if needed, use typed reference columns such as `auth_ref_kind` and `auth_ref_key`.

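
One way to keep the rule above from eroding is a small schema check in the test suite. The sketch below is hypothetical: the deny-list, the function name, and the idea of running it in tests are assumptions, not part of the design.

```python
import sqlite3

# Hypothetical deny-list; the suffix and bare names are assumptions for illustration.
FORBIDDEN_SUFFIXES = ("_json",)
FORBIDDEN_NAMES = {"metadata", "payload", "diagnostics"}


def find_catch_all_columns(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return (table, column) pairs whose names look like catch-all payload columns."""
    offenders = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            column = row[1]
            if column.endswith(FORBIDDEN_SUFFIXES) or column in FORBIDDEN_NAMES:
                offenders.append((table, column))
    return offenders
```

A test could assert that `find_catch_all_columns(conn)` is empty after migrations run. Name matching is only a heuristic; it cannot detect a free-form blob stored under an innocent column name.
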
## Authorship fields v0

Authorship is an embedded typed snapshot, not a full table in v0.

Use the following columns on event/request/audit records that need authorship:

```text
author_kind text not null
author_key text not null
author_display text not null
author_source_kind text null
author_source_key text null
```

`author_kind` allowed values:

- `human`
- `agent`
- `system`
- `integration`
- `unknown`

`author_key` is stable within its source namespace, for example:

- `local-user`
- `agent:orchestrator`
- `worker:<worker_ref_key>`
- `system:yoi-control-plane`
- `integration:ci:<provider>`

`author_display` is a display snapshot at event creation time. It must be sufficient for historical display even if a future Actor/User record changes name.

`author_source_kind` and `author_source_key` can point to bounded source context such as `worker`, `profile`, `external_account`, or `provider`. They must not hold secrets.

A future `actors` table may be added for auth, assignment, team membership, and permissions. v0 must not require it. If it is added later, historical events still keep their authorship snapshot and may optionally link to `actor_id`.

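
The authorship snapshot can be modelled in application code as a small immutable value that is validated before it is written. This is a sketch only: the class name `AuthorSnapshot`, the `as_columns` helper, and the rule that the two source fields are set together are assumptions, not stated in the design.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Allowed values copied from the list above.
AUTHOR_KINDS = {"human", "agent", "system", "integration", "unknown"}


@dataclass(frozen=True)
class AuthorSnapshot:
    author_kind: str
    author_key: str
    author_display: str
    author_source_kind: Optional[str] = None
    author_source_key: Optional[str] = None

    def __post_init__(self) -> None:
        if self.author_kind not in AUTHOR_KINDS:
            raise ValueError(f"unknown author_kind: {self.author_kind!r}")
        if not self.author_key or not self.author_display:
            raise ValueError("author_key and author_display are mandatory")
        # Assumption: the source fields travel as a pair.
        if (self.author_source_kind is None) != (self.author_source_key is None):
            raise ValueError("author_source_kind and author_source_key go together")

    def as_columns(self) -> dict:
        """Flatten to the column names used on event/request/audit records."""
        return asdict(self)
```

For example, `AuthorSnapshot("agent", "agent:orchestrator", "Orchestrator").as_columns()` yields a dict whose keys match the five columns, ready to bind as named SQL parameters.
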
## WorkerRef and HostRef v0

Worker and Host are runtime concepts in v0. They are referenced by typed snapshots instead of DB foreign keys.

Use WorkerRef fields where a Ticket event, Ticket association, artifact, or check report needs to identify a Worker:

```text
worker_ref_kind text null -- local_pod | remote_worker | hosted_worker | external | unknown
worker_ref_key text null
worker_display text null
```

Examples:

- `worker_ref_kind = local_pod`, `worker_ref_key = coder-sidebar`, `worker_display = Coder sidebar`
- `worker_ref_kind = hosted_worker`, `worker_ref_key = worker_...`, `worker_display = Hosted coder`

Use HostRef fields only when observed placement matters:

```text
host_ref_kind text null -- local | self_hosted | cloud | external | unknown
host_ref_key text null
host_display text null
```

HostRef is not ownership. It means “this Worker or event was observed on this execution environment at this time”.

Future work may add canonical `workers`, `hosts`, `worker_archive`, and `host_connections` tables when Worker lifecycle, persistence, and archive requirements are concrete. v0 deliberately does not create those tables.

## Execution model without a Run entity

v0 does not create a separate `runs` table.

A concrete execution attempt is represented by:

- a `ticket_event` such as `execution_requested`, `worker_assigned`, `worker_status`, `implementation_report`, `review`, `check_report`, `artifact_link`, or `state_transition`;
- optional `activity_id` on related `ticket_events` to group a burst of execution activity;
- `ticket_worker_links` records showing which WorkerRefs are associated with the Ticket and in what role/status;
- `artifacts` linked to `ticket_id`, `event_id`, optional WorkerRef fields, and optional typed repository source revision fields.

`activity_id` is a correlation key, not an authority entity. It can be generated when a user/Orchestrator accepts an execution request, but the Ticket thread remains the authority.

This avoids duplicating Ticket events and Run records while preserving machine-readable execution state.

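
To illustrate how an execution attempt can be read back without a `runs` table, the sketch below correlates events by `activity_id`. The function names and the reduced `DEMO_SCHEMA` (only the columns the queries touch) are assumptions for illustration.

```python
import sqlite3
from typing import Optional

# Reduced ticket_events shape: only the columns these queries touch.
DEMO_SCHEMA = """
CREATE TABLE ticket_events (
    event_id    TEXT PRIMARY KEY,
    ticket_id   TEXT NOT NULL,
    event_seq   INTEGER NOT NULL,
    kind        TEXT NOT NULL,
    activity_id TEXT,
    status      TEXT,
    UNIQUE (ticket_id, event_seq)
)
"""


def activity_timeline(conn: sqlite3.Connection, ticket_id: str, activity_id: str):
    """Return (event_seq, kind, status) rows for one burst of activity, oldest first."""
    return conn.execute(
        "SELECT event_seq, kind, status FROM ticket_events "
        "WHERE ticket_id = ? AND activity_id = ? ORDER BY event_seq",
        (ticket_id, activity_id),
    ).fetchall()


def latest_worker_status(
    conn: sqlite3.Connection, ticket_id: str, activity_id: str
) -> Optional[str]:
    """Return the status of the newest worker_status event in the activity, if any."""
    row = conn.execute(
        "SELECT status FROM ticket_events "
        "WHERE ticket_id = ? AND activity_id = ? AND kind = 'worker_status' "
        "ORDER BY event_seq DESC LIMIT 1",
        (ticket_id, activity_id),
    ).fetchone()
    return row[0] if row else None
```

Both functions only read the Ticket thread, which is consistent with `activity_id` being a correlation key rather than an entity with its own row.
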
## Live Host/Worker API view

v0 API may expose Host and Worker lists, but they are live views, not DB tables.

Examples:

- `GET /api/hosts` may inspect the backend-local machine and return one synthetic local Host.
- `GET /api/workers` may scan current local Pod metadata and sockets and return Worker summaries.
- Future Host protocol can provide the same API shape from heartbeat/connection state.

These API responses must not imply DB persistence. If a Worker disappears from runtime inspection, it can disappear from the live view. Durable history belongs to Ticket events, TicketWorkerLinks, and Artifacts.

## Tables

### `workspaces`

```text
workspace_id text primary key
display_name text not null
state text not null -- active | archived
created_at text not null
updated_at text not null
```

### `tickets`

Current Ticket state and body snapshot.

```text
workspace_id text not null
ticket_id text primary key
title text not null
state text not null -- planning | ready | queued | inprogress | done | closed
priority text null
assignee_kind text null
assignee_key text null
assignee_display text null
body_md text not null
created_at text not null
updated_at text not null
closed_at text null
resolution_event_id text null
```

Notes:

- `tickets` stores the current read model.
- Historical changes belong to `ticket_events`.
- Ticket state transitions must be represented by structured `ticket_events`.
- Assignee is a snapshot, not a foreign key to `actors` in v0.

### `ticket_events`

Append-oriented Ticket thread/event log. This is also the execution history authority for work on a Ticket.

```text
workspace_id text not null
event_id text primary key
ticket_id text not null
event_seq integer not null
kind text not null
activity_id text null
author_kind text not null
author_key text not null
author_display text not null
author_source_kind text null
author_source_key text null
created_at text not null
body_md text null
subject_kind text null -- ticket | worker | artifact | check | repository | objective | system
subject_id text null
previous_state text null
new_state text null
status text null
artifact_id text null
worker_ref_kind text null
worker_ref_key text null
worker_display text null
host_ref_kind text null
host_ref_key text null
host_display text null
repository_id text null
caused_by_event_id text null
```

`kind` allowed values in v0:

- `comment`
- `plan`
- `decision`
- `review`
- `implementation_report`
- `state_transition`
- `close`
- `execution_requested`
- `worker_assigned`
- `worker_status`
- `check_report`
- `artifact_link`
- `system_note`

Constraints:

- unique `(ticket_id, event_seq)`.
- events are append-only, except for administrative repair migrations.
- state transitions and close events must include `previous_state` and `new_state` where applicable.
- execution events should use typed columns such as `activity_id`, WorkerRef fields, `artifact_id`, and `repository_id` instead of opaque payloads.

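
The constraints above could be expressed in SQLite roughly as follows. This is a sketch under stated assumptions: the table is reduced to a subset of the columns, `event_seq` numbering from 1 is an assumption, and `create_ticket_events` / `append_event` are hypothetical helper names rather than the server's actual API.

```python
import sqlite3

# The thirteen v0 kinds listed above.
TICKET_EVENT_KINDS = (
    "comment", "plan", "decision", "review", "implementation_report",
    "state_transition", "close", "execution_requested", "worker_assigned",
    "worker_status", "check_report", "artifact_link", "system_note",
)


def create_ticket_events(conn: sqlite3.Connection) -> None:
    """Create a reduced ticket_events table (a subset of the columns above)."""
    kinds = ", ".join(f"'{k}'" for k in TICKET_EVENT_KINDS)
    conn.execute(f"""
        CREATE TABLE ticket_events (
            workspace_id   TEXT NOT NULL,
            event_id       TEXT PRIMARY KEY,
            ticket_id      TEXT NOT NULL,
            event_seq      INTEGER NOT NULL,
            kind           TEXT NOT NULL CHECK (kind IN ({kinds})),
            author_kind    TEXT NOT NULL,
            author_key     TEXT NOT NULL,
            author_display TEXT NOT NULL,
            created_at     TEXT NOT NULL,
            body_md        TEXT,
            previous_state TEXT,
            new_state      TEXT,
            UNIQUE (ticket_id, event_seq)
        )
    """)


def append_event(conn, *, workspace_id, event_id, ticket_id, kind, author,
                 created_at, body_md=None, previous_state=None, new_state=None) -> int:
    """Append one event and return its event_seq (numbering from 1 is an assumption)."""
    if kind == "state_transition" and (previous_state is None or new_state is None):
        raise ValueError("state_transition needs previous_state and new_state")
    author_kind, author_key, author_display = author
    (next_seq,) = conn.execute(
        "SELECT COALESCE(MAX(event_seq), 0) + 1 FROM ticket_events WHERE ticket_id = ?",
        (ticket_id,),
    ).fetchone()
    # `with conn` commits on success and rolls back on error. The UNIQUE constraint
    # is the real guard: a racing writer that picked the same number gets an
    # IntegrityError and can retry.
    with conn:
        conn.execute(
            "INSERT INTO ticket_events (workspace_id, event_id, ticket_id, event_seq,"
            " kind, author_kind, author_key, author_display, created_at, body_md,"
            " previous_state, new_state) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (workspace_id, event_id, ticket_id, next_seq, kind, author_kind,
             author_key, author_display, created_at, body_md, previous_state, new_state),
        )
    return next_seq
```

Sequence numbers are allocated per Ticket, so two Tickets can both have an event with `event_seq = 1`. The sketch exposes no update or delete helper, which mirrors the append-only rule at the application level; SQLite itself would still need triggers or access control to enforce it.
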
### `ticket_relations`

```text
workspace_id text not null
source_ticket_id text not null
target_ticket_id text not null
kind text not null -- depends_on | blocks | related | supersedes | duplicate_of
created_at text not null
author_kind text not null
author_key text not null
author_display text not null
author_source_kind text null
author_source_key text null
note text null
primary key (source_ticket_id, target_ticket_id, kind)
```

### `objectives`

```text
workspace_id text not null
objective_id text primary key
title text not null
state text not null -- active | paused | done | closed | archived
body_md text not null
created_at text not null
updated_at text not null
```

### `objective_ticket_links`

```text
workspace_id text not null
objective_id text not null
ticket_id text not null
kind text not null -- tracks | related | milestone | blocker
created_at text not null
primary key (objective_id, ticket_id, kind)
```

### `repositories`

Workspace-connected source/storage. Git is one provider.

```text
workspace_id text not null
repository_id text primary key
name text not null
kind text not null -- git | local | object_store | artifact_store | custom
provider text null -- git, local_fs, s3, etc.
uri text not null
default_ref text null
auth_ref_kind text null
auth_ref_key text null
created_at text not null
updated_at text not null
```

Notes:

- `uri` is identity/config data. It may be redacted in API responses.
- `auth_ref_kind` / `auth_ref_key` contain secret references only, never secret values.
- v0 does not store per-Repository capability rows. Capabilities are derived from `kind`, `provider`, and backend configuration. Add explicit capability/override records later only if a real provider needs per-Repository variance.

### `ticket_targets`

Ticket scope/intent against one or more Repositories.

```text
workspace_id text not null
ticket_id text not null
target_id text not null
repository_id text not null
role text not null -- primary | related | reference | check | output
intent text not null -- read | change | check | output
ref_selector text null
created_at text not null
updated_at text not null
primary key (ticket_id, target_id)
```

### `ticket_target_paths`

```text
workspace_id text not null
ticket_id text not null
target_id text not null
path text not null
primary key (ticket_id, target_id, path)
```

### `ticket_worker_links`

Current relationship between Ticket and a WorkerRef.

```text
workspace_id text not null
ticket_id text not null
worker_ref_kind text not null
worker_ref_key text not null
worker_display text null
role text not null -- companion | intake | orchestrator | coder | reviewer | validator | custom
status text not null -- requested | assigned | active | blocked | completed | released | failed | cancelled
activity_id text null
assigned_at text null
released_at text null
last_event_id text null
primary key (ticket_id, worker_ref_kind, worker_ref_key, role)
```

Notes:

- This is the main DB management relation for Ticket-associated Workers.
- It is not a Worker registry.
- Ticket thread events should record assignment/release/status changes.

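
Because the primary key identifies one link per Ticket, WorkerRef, and role, a status change can be written as an upsert. The sketch below assumes SQLite 3.24 or newer (for `ON CONFLICT ... DO UPDATE`); the helper names are hypothetical, and `last_event_id` is expected to reference the Ticket event that recorded the change.

```python
import sqlite3

# Status values copied from the column comment above.
LINK_STATUSES = {
    "requested", "assigned", "active", "blocked",
    "completed", "released", "failed", "cancelled",
}


def create_ticket_worker_links(conn: sqlite3.Connection) -> None:
    conn.execute("""
        CREATE TABLE ticket_worker_links (
            workspace_id    TEXT NOT NULL,
            ticket_id       TEXT NOT NULL,
            worker_ref_kind TEXT NOT NULL,
            worker_ref_key  TEXT NOT NULL,
            worker_display  TEXT,
            role            TEXT NOT NULL,
            status          TEXT NOT NULL,
            activity_id     TEXT,
            assigned_at     TEXT,
            released_at     TEXT,
            last_event_id   TEXT,
            PRIMARY KEY (ticket_id, worker_ref_kind, worker_ref_key, role)
        )
    """)


def upsert_worker_link(conn, *, workspace_id, ticket_id, worker_ref_kind,
                       worker_ref_key, role, status, last_event_id,
                       worker_display=None, activity_id=None) -> None:
    """Insert the link, or update the current snapshot if it already exists."""
    if status not in LINK_STATUSES:
        raise ValueError(f"unknown link status: {status!r}")
    with conn:  # commit on success, roll back on error
        conn.execute(
            """
            INSERT INTO ticket_worker_links
                (workspace_id, ticket_id, worker_ref_kind, worker_ref_key,
                 worker_display, role, status, activity_id, last_event_id)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
            ON CONFLICT (ticket_id, worker_ref_kind, worker_ref_key, role)
            DO UPDATE SET status = excluded.status,
                          worker_display = excluded.worker_display,
                          activity_id = excluded.activity_id,
                          last_event_id = excluded.last_event_id
            """,
            (workspace_id, ticket_id, worker_ref_kind, worker_ref_key,
             worker_display, role, status, activity_id, last_event_id),
        )
```

The row holds only the current snapshot. The history of assignment and release stays in `ticket_events`, so the matching event should be appended before this upsert is called.
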
### `artifacts`

Evidence/output linked to Ticket, Objective, event, WorkerRef, or Repository source revision.

Artifact content is not stored inline in the DB. Every Artifact points to a URI. The URI may be served by the Workspace backend's artifact/static-file service, a blob store, or an external system.

```text
workspace_id text not null
artifact_id text primary key
kind text not null -- diff | patch | log | report | check_report | review | file | external_link | summary
uri text not null
media_type text null
sha256 text null
size_bytes integer null
summary text null
created_at text not null
created_by_kind text not null
created_by_key text not null
created_by_display text not null
created_by_source_kind text null
created_by_source_key text null
ticket_id text null
objective_id text null
event_id text null
worker_ref_kind text null
worker_ref_key text null
worker_display text null
repository_id text null
source_kind text null -- git_commit | file_snapshot | object_version | custom
source_revision text null -- commit hash, snapshot id, or object version id
```

Rules:

- `uri` is mandatory.
- DB rows store metadata and summary only, never artifact body content.
- `source_kind` / `source_revision` are optional typed source fields for artifacts produced against a concrete repository revision. They do not represent branch/ref selectors; mutable selectors remain on `ticket_targets.ref_selector` or in the related Ticket event.
- Workspace-owned artifact content should use a stable internal URI scheme or backend-served URL, for example `artifact://<workspace_id>/<artifact_id>` or `/api/artifacts/<artifact_id>/content`.
- External artifacts may use redacted `https://...` or provider-specific URIs when policy allows.
- API list/detail responses return artifact metadata and URI by default. Fetching content is a separate artifact-content operation with bounds and permission checks.

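
A minimal sketch of the metadata-only rule: the content bytes are hashed and measured, but only the resulting metadata is returned for the DB row. The function name `artifact_metadata` is hypothetical, and the URI uses the `artifact://<workspace_id>/<artifact_id>` example form from the rules above; where the bytes are actually stored is out of scope here.

```python
import hashlib
from typing import Optional


def artifact_metadata(workspace_id: str, artifact_id: str, kind: str, content: bytes,
                      media_type: Optional[str] = None,
                      summary: Optional[str] = None) -> dict:
    """Build the metadata columns for an artifact row; the body itself is not included."""
    return {
        "workspace_id": workspace_id,
        "artifact_id": artifact_id,
        "kind": kind,
        # Internal URI in the example form given in the rules above.
        "uri": f"artifact://{workspace_id}/{artifact_id}",
        "media_type": media_type,
        # Hex digest and byte length let readers verify fetched content later.
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "summary": summary,
    }
```

The caller would write `content` to the artifact file/blob store separately and insert this dict, together with authorship and link columns, into `artifacts`.
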
## CI / actions-like checks are future work

v0 does not add `validation_results`, `ci_results`, or action tables.

For now, local checks, CI summaries, and check evidence are represented by:

- `ticket_events.kind = check_report` or `artifact_link`;
- Artifacts such as logs, check reports, or external CI URLs;
- Ticket state transitions or review events that reference those artifacts.

If first-class CI status is needed, design it as a separate actions-like subsystem rather than a generic validation table inside the core Ticket schema. That future subsystem should model workflow/check names, jobs, steps, attempts, statuses, logs, annotations, external provider ids, retention, and rerun semantics explicitly.

### `audit_events`

Control-plane operation audit trail.

```text
workspace_id text not null
audit_event_id text primary key
created_at text not null
actor_kind text not null
actor_key text not null
actor_display text not null
actor_source_kind text null
actor_source_key text null
action text not null
target_kind text not null
target_id text null
outcome text not null -- allowed | denied | succeeded | failed
request_id text null
summary text null
```

Audit events record the control-plane action and outcome. They should not duplicate full Ticket event payloads unless needed for audit.

## Read surfaces for Orchestrator without fs/Bash

The DB/API must let an Orchestrator read:

- Ticket current state, thread events, relations, targets, and TicketWorkerLinks.
- Objective body and linked Tickets.
- Repository summaries, Ticket target selectors, and Artifact source revision fields.
- Live Host/Worker views from runtime inspection or future Host protocol.
- Artifact summaries and selected artifact contents through bounded artifact APIs.
- Check/CI summaries as TicketEvents and Artifacts.
- Review evidence as TicketEvents/Artifacts.

## Write surfaces for Orchestrator without fs/Bash

The DB/API must let an Orchestrator create:

- Ticket comments/decisions/state transition requests.
- Ticket execution request events with target selectors and optional `activity_id`.
- TicketWorkerLink assignment/release/status changes.
- Review/check request events.
- Artifact links for logs, reports, diffs, CI/external check URLs, and review evidence.
- Close/done decisions that reference evidence artifacts and structured Ticket events.

The Orchestrator must not need raw repository filesystem reads, shell execution, or direct Git merge authority to perform control-plane routing.

## Migration stance

v0 implementation should support three modes conceptually:

1. `filesystem_read_through`: current `.yoi/tickets` and `.yoi/objectives` remain authority; DB holds runtime/projection tables.
2. `imported_projection`: filesystem records are imported into DB read models, but filesystem remains the write authority.
3. `db_authority`: Ticket/Objective write path moves to DB; filesystem export becomes compatibility/export snapshot.

This Ticket designs the schema target and can implement non-breaking migrations, but it does not require switching active authority to DB.

## Minimal implementation guidance

If implementation is included in this Ticket, prefer a small non-breaking migration:

- Keep Host/Worker API as live runtime views in v0.
- Add explicit schema versioning.
- Add tables that are safe to create empty: `repositories`, `ticket_targets`, `ticket_target_paths`, `ticket_worker_links`, `artifacts`, `audit_events`.
- Keep existing filesystem read APIs working.
- Do not create a full `actors` table in v0.
- Do not create `hosts` / `workers` canonical tables in v0.
- Do not create a separate `runs` table in v0; use structured Ticket events and TicketWorkerLink relationships.

## Implementation alignment notes

The `yoi-workspace-server` SQLite bootstrap migration implements this v0 schema as schema version 2. Fresh databases create the typed tables listed above and deliberately do not create canonical `runs`, `hosts`, `workers`, `actors`, or check/validation result tables. Host and Worker HTTP read APIs remain live runtime views backed by local inspection, not DB tables.

For databases created by the earlier workspace-server bootstrap, migration version 2 preserves old `workspaces`, `repositories`, `runs`, `artifacts`, `ticket_projections`, and `objective_projections` data by renaming those tables to `legacy_workspaces`, `legacy_repositories`, `legacy_runs`, `legacy_artifacts`, `legacy_ticket_projections`, and `legacy_objective_projections`, then creating the v0 typed tables. Existing legacy workspace rows are copied into the canonical v0 `workspaces` table with `state = active` when the old row had no typed state. The legacy names are compatibility preservation only and are not canonical schema tables or active write authority.

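
The rename step described above might look roughly like the following. This is an illustrative sketch and not the actual `yoi-workspace-server` migration: the function name is hypothetical, and version gating, transaction handling, and the workspace row copy are left out.

```python
import sqlite3

# Table names taken from the paragraph above.
LEGACY_TABLES = (
    "workspaces", "repositories", "runs", "artifacts",
    "ticket_projections", "objective_projections",
)


def preserve_legacy_tables(conn: sqlite3.Connection) -> list[str]:
    """Rename each existing legacy table to legacy_<name>; return the names renamed."""
    renamed = []
    for name in LEGACY_TABLES:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone()
        if exists:
            # Names come from the fixed tuple above, so interpolation is safe here.
            conn.execute(f'ALTER TABLE "{name}" RENAME TO "legacy_{name}"')
            renamed.append(name)
    return renamed
```

A real migration would run this only once, gated on the stored schema version, before creating the v0 typed tables. Running it again after the new `workspaces` table exists would try to rename that table as well and fail because `legacy_workspaces` is already present.
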