feat: add ontology material ingestion

user
2026-04-26 17:49:35 +09:00
parent a413f1ef15
commit 4936cdf4c9
19 changed files with 766 additions and 13 deletions


@@ -0,0 +1,37 @@
# Phase 5 Context: Ontology and Learning Materials
**Status:** Ready for execution
**Started:** 2026-04-26
## Goal
Accept learning material input and produce source-backed ontology candidates.
## Inputs
- OpenSpec `learning-ontology` requirements.
- Existing workflow contracts for `OntologyGap`.
- Backend Developer Interview seed concepts.
## Decisions
- Use an in-memory ontology store for MVP proof.
- Accept JSON material ingestion before multipart file upload.
- Mark all generated nodes, edges, and gaps as `candidate`.
- Preserve source evidence for every supported ontology candidate.
## Boundaries
In scope:
- Material ingestion API.
- Source-backed ontology candidate nodes and edges.
- Gap detection for missing prerequisites and weak evidence.
- Ontology snapshot API.
Out of scope:
- File storage.
- PDF/PPT parsing.
- Human review UI.
- Canonical promotion workflow.


@@ -0,0 +1,42 @@
# Phase 5 Plan: Ontology and Learning Materials
**Status:** Ready for execution
**Phase Goal:** Ingest learning materials into source-backed ontology candidates.
## Requirements Covered
- ONTO-01: User or operator can upload learning materials.
- ONTO-02: System creates source-backed ontology candidate nodes and edges.
- ONTO-03: System detects missing prerequisites and weakly supported concepts.
- ONTO-04: Generated or inferred content is marked as candidate until reviewed.
## Tasks
### 1. Add ontology package
- Define material, concept candidate, edge candidate, gap, and snapshot types.
- Add in-memory store and service.
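
A minimal sketch of how the types and in-memory store from this task might look. All type, field, and method names below are illustrative assumptions, not the contents of `internal/ontology`; `EvidenceRef` stands in for the existing workflow shared type, whose real shape is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// ReviewState is a hypothetical review marker. Only "candidate" is used in this phase.
type ReviewState string

const ReviewCandidate ReviewState = "candidate"

// EvidenceRef is an assumed stand-in for the workflow shared evidence type.
type EvidenceRef struct {
	MaterialID string
	Excerpt    string
}

type Material struct {
	ID, Title, Text string
}

type ConceptCandidate struct {
	Name     string
	State    ReviewState
	Evidence []EvidenceRef
}

type EdgeCandidate struct {
	From, To string // From is a prerequisite of To
	State    ReviewState
	Evidence []EvidenceRef
}

type Gap struct {
	Concept, Reason string
	State           ReviewState
}

type Snapshot struct {
	Materials []Material
	Concepts  []ConceptCandidate
	Edges     []EdgeCandidate
	Gaps      []Gap
}

// Store keeps everything in process memory behind a mutex.
type Store struct {
	mu   sync.RWMutex
	snap Snapshot
}

func NewStore() *Store { return &Store{} }

func (s *Store) AddMaterial(m Material) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.snap.Materials = append(s.snap.Materials, m)
}

// AddConcept forces the candidate state so callers cannot store canonical data.
func (s *Store) AddConcept(c ConceptCandidate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c.State = ReviewCandidate
	s.snap.Concepts = append(s.snap.Concepts, c)
}

// Snapshot copies the top-level slices so callers cannot append into the store.
func (s *Store) Snapshot() Snapshot {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return Snapshot{
		Materials: append([]Material(nil), s.snap.Materials...),
		Concepts:  append([]ConceptCandidate(nil), s.snap.Concepts...),
		Edges:     append([]EdgeCandidate(nil), s.snap.Edges...),
		Gaps:      append([]Gap(nil), s.snap.Gaps...),
	}
}

func main() {
	store := NewStore()
	store.AddMaterial(Material{ID: "m1", Title: "HTTP notes", Text: "HTTP runs over TCP."})
	store.AddConcept(ConceptCandidate{
		Name:     "HTTP",
		Evidence: []EvidenceRef{{MaterialID: "m1", Excerpt: "HTTP runs over TCP."}},
	})
	snap := store.Snapshot()
	fmt.Println(len(snap.Materials), snap.Concepts[0].Name, snap.Concepts[0].State)
}
```

Forcing the state inside the store, rather than trusting callers, is one way to keep the "candidate until reviewed" rule in a single place.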
### 2. Implement deterministic MVP analyzer
- Extract known backend interview concept candidates from material text.
- Create prerequisite edges for supported concept pairs.
- Create gap candidates for missing prerequisites and weak evidence.
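
The analyzer steps above can be sketched as follows. This is an illustration under stated assumptions, not the committed analyzer: the seed concept list, the prerequisite table, the `minEvidence` threshold, the gap kind strings, and the choice to reuse the dependent concept's excerpts as edge evidence are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

const candidate = "candidate"

// minEvidence is an assumed threshold: fewer excerpts counts as weak evidence.
const minEvidence = 2

type Evidence struct {
	MaterialID string
	Excerpt    string
}

type Concept struct {
	Name     string
	State    string
	Evidence []Evidence
}

type Edge struct {
	From, To string // From is a prerequisite of To
	State    string
	Evidence []Evidence
}

type Gap struct {
	Concept string
	Kind    string
	State   string
}

// Hypothetical seed data; the real backend interview seed concepts are not shown here.
var knownConcepts = []string{"tcp", "http", "rest"}
var prerequisites = map[string]string{"http": "tcp", "rest": "http"} // concept -> prerequisite

// Analyze matches whole words per sentence, so the result is deterministic.
func Analyze(materialID, text string) (concepts []Concept, edges []Edge, gaps []Gap) {
	evidence := map[string][]Evidence{}
	for _, sentence := range strings.Split(text, ".") {
		sentence = strings.TrimSpace(sentence)
		if sentence == "" {
			continue
		}
		words := map[string]bool{}
		notLetter := func(r rune) bool { return !unicode.IsLetter(r) }
		for _, w := range strings.FieldsFunc(strings.ToLower(sentence), notLetter) {
			words[w] = true
		}
		for _, name := range knownConcepts {
			if words[name] {
				evidence[name] = append(evidence[name], Evidence{MaterialID: materialID, Excerpt: sentence})
			}
		}
	}
	// Iterate the seed list, not the map, to keep output order stable.
	for _, name := range knownConcepts {
		ev := evidence[name]
		if len(ev) == 0 {
			continue
		}
		concepts = append(concepts, Concept{Name: name, State: candidate, Evidence: ev})
		if len(ev) < minEvidence {
			gaps = append(gaps, Gap{Concept: name, Kind: "weak_evidence", State: candidate})
		}
		pre, ok := prerequisites[name]
		if !ok {
			continue
		}
		if len(evidence[pre]) > 0 {
			edges = append(edges, Edge{From: pre, To: name, State: candidate, Evidence: ev})
		} else {
			gaps = append(gaps, Gap{Concept: pre, Kind: "missing_prerequisite", State: candidate})
		}
	}
	return concepts, edges, gaps
}

func main() {
	concepts, edges, gaps := Analyze("m1", "HTTP is a protocol. REST uses HTTP.")
	fmt.Println(len(concepts), "concepts,", len(edges), "edges,", len(gaps), "gaps")
	for _, g := range gaps {
		fmt.Println(g.Kind, g.Concept, g.State)
	}
}
```

Matching whole words rather than substrings avoids false hits such as `rest` inside "interest".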
### 3. Add HTTP endpoints
- `POST /api/v1/materials`
- `GET /api/v1/ontology`
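
A minimal sketch of how the two endpoints could be wired with the standard library. The request body fields (`title`, `text`), the `201 Created` response, and the snapshot shape are assumptions made for illustration; analysis is omitted, so the snapshot only echoes stored materials.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// materialRequest is a hypothetical JSON body for material ingestion.
type materialRequest struct {
	Title string `json:"title"`
	Text  string `json:"text"`
}

// snapshot is a hypothetical response body for the ontology endpoint.
type snapshot struct {
	Materials []materialRequest `json:"materials"`
}

type server struct {
	mu        sync.Mutex
	materials []materialRequest
}

func (s *server) handleMaterials(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var req materialRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Text == "" {
		http.Error(w, "invalid material", http.StatusBadRequest)
		return
	}
	s.mu.Lock()
	s.materials = append(s.materials, req)
	s.mu.Unlock()
	w.WriteHeader(http.StatusCreated)
}

func (s *server) handleOntology(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	s.mu.Lock()
	snap := snapshot{Materials: append([]materialRequest{}, s.materials...)}
	s.mu.Unlock()
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(snap)
}

func newMux(s *server) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/materials", s.handleMaterials)
	mux.HandleFunc("/api/v1/ontology", s.handleOntology)
	return mux
}

// roundTrip sends one in-process request and returns status code and body.
func roundTrip(h http.Handler, method, path, body string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, strings.NewReader(body)))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux(&server{})
	code, _ := roundTrip(mux, http.MethodPost, "/api/v1/materials",
		`{"title":"HTTP notes","text":"HTTP runs over TCP."}`)
	fmt.Println("POST status:", code)
	code, body := roundTrip(mux, http.MethodGet, "/api/v1/ontology", "")
	fmt.Println("GET status:", code, body)
}
```

Because the handlers only read JSON from the request body, a multipart upload path could be added later without changing the snapshot endpoint.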
### 4. Add tests and verification
- Test material ingestion creates source-backed candidates.
- Test gaps are candidate-only.
- Test HTTP ingestion and ontology snapshot flow.
- Run Go tests, OpenSpec validation, the line-count check, and a smoke test.
## Out of Scope
- Multipart upload.
- Real document parsers.
- Human review promotion.


@@ -0,0 +1,28 @@
# Phase 5 Research: Ontology and Learning Materials
## Findings
The first useful ontology proof does not need heavy parsing. It needs a clean
boundary that proves uploaded material can become inspectable candidate
knowledge with provenance.
The MVP should:
- store material metadata and source text
- extract concept candidates from known backend interview concepts
- create prerequisite edges from a small deterministic rule set
- identify weak concepts when source support is thin
- never mark generated or inferred content as canonical
## Recommended Shape
- `internal/ontology` owns material ingestion, candidate storage, and snapshot.
- HTTP exposes JSON ingestion first.
- Evidence references use the existing workflow shared type.
- Gap records distinguish source-backed weakness from generated inference.
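
One way the last point could be modeled is an explicit origin field on each gap record. The sketch below is an assumption for illustration only: the `GapOrigin` values, the field names, and the rule that a source-backed gap must carry at least one excerpt are hypothetical, not taken from the committed code.

```go
package main

import (
	"errors"
	"fmt"
)

// GapOrigin is a hypothetical discriminator between the two kinds of gap.
type GapOrigin string

const (
	OriginSourceBacked GapOrigin = "source_backed" // weakness observed in ingested material
	OriginInferred     GapOrigin = "inferred"      // produced by a rule, no direct excerpt
)

type Gap struct {
	Concept  string
	Origin   GapOrigin
	State    string
	Evidence []string // excerpts from the source material, if any
}

// Validate enforces the assumed invariants for a gap record.
func Validate(g Gap) error {
	if g.State != "candidate" {
		return errors.New("gap must stay in candidate state until reviewed")
	}
	switch g.Origin {
	case OriginSourceBacked:
		if len(g.Evidence) == 0 {
			return errors.New("source-backed gap needs at least one excerpt")
		}
	case OriginInferred:
		// Inferred gaps may have no excerpt; they are still candidates.
	default:
		return fmt.Errorf("unknown gap origin %q", g.Origin)
	}
	return nil
}

func main() {
	weak := Gap{Concept: "rest", Origin: OriginSourceBacked, State: "candidate",
		Evidence: []string{"REST uses HTTP"}}
	missing := Gap{Concept: "tcp", Origin: OriginInferred, State: "candidate"}
	fmt.Println(Validate(weak), Validate(missing))
}
```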
## Risks
- Overbuilding parsers too early would violate YAGNI.
- Treating keyword extraction as canonical knowledge would violate OpenSpec.
- The analyzer is a placeholder; a future parser can replace it behind the same service boundary.


@@ -0,0 +1,36 @@
# Phase 5 Summary
**Status:** Complete
**Completed:** 2026-04-26
## Delivered
- Added `internal/ontology` for materials, concept candidates, edge candidates,
gaps, and snapshots.
- Added deterministic MVP analyzer for known backend interview concepts.
- Added source evidence to every supported concept and edge candidate.
- Added candidate-only gap records for missing prerequisites and weak evidence.
- Added HTTP endpoints:
  - `POST /api/v1/materials`
  - `GET /api/v1/ontology`
- Added ontology unit tests and HTTP flow tests.
## Verification
```powershell
gofmt -w cmd internal
go test ./...
openspec validate bootstrap-job-tutor-platform --strict
```
Additional smoke check:
- Material ingestion followed by ontology snapshot returned candidate concepts,
edges, and gaps.
## Deferred
- Multipart uploads.
- PPT/PDF/document parsing.
- Human review and canonical promotion.
- Graph database persistence.


@@ -0,0 +1,29 @@
# Phase 5 Verification
## Verdict
PASS
## Requirement Coverage
- ONTO-01: PASS. JSON material ingestion API accepts operator-provided learning
material.
- ONTO-02: PASS. Ingestion creates source-backed candidate concepts and
prerequisite edges.
- ONTO-03: PASS. The analyzer creates candidate gaps for missing prerequisites
and weak source evidence.
- ONTO-04: PASS. All generated ontology candidates and gaps use `candidate`
review state.
## Evidence
- `go test ./...` passed.
- `openspec validate bootstrap-job-tutor-platform --strict` passed.
- Live smoke test of material ingestion followed by ontology snapshot passed.
- Go source line-count check passed.
## Residual Risk
The analyzer is deterministic and intentionally shallow. It proves the product
boundary but should later be replaced or supplemented with parser-backed and
LLM-assisted extraction.