feat: add ontology material ingestion
37
.planning/phases/005-ontology-materials/005-CONTEXT.md
Normal file
@@ -0,0 +1,37 @@
# Phase 5 Context: Ontology and Learning Materials

**Status:** Ready for execution
**Started:** 2026-04-26

## Goal

Accept learning material input and produce source-backed ontology candidates.

## Inputs

- OpenSpec `learning-ontology` requirements.
- Existing workflow contracts for `OntologyGap`.
- Backend Developer Interview seed concepts.

## Decisions

- Use an in-memory ontology store for MVP proof.
- Accept JSON material ingestion before multipart file upload.
- Mark all generated nodes, edges, and gaps as `candidate`.
- Preserve source evidence for every supported ontology candidate.
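
The last two decisions can be enforced at construction time. The following is a minimal sketch, not the committed code: `ReviewState`, `EvidenceRef`, `ConceptCandidate`, and `NewConceptCandidate` are hypothetical names, and `EvidenceRef` stands in for the existing workflow shared type, whose fields are not shown here.

```go
package main

import "errors"

// ReviewState is a hypothetical enum. This phase only ever emits "candidate".
type ReviewState string

const (
	StateCandidate ReviewState = "candidate"
	// StateCanonical is reserved for a later review workflow (out of scope here).
	StateCanonical ReviewState = "canonical"
)

// EvidenceRef is an assumed stand-in for the shared workflow evidence type.
type EvidenceRef struct {
	MaterialID string // which ingested material supports the candidate
	Excerpt    string // the supporting text taken from that material
}

// ConceptCandidate is a proposed ontology node awaiting review.
type ConceptCandidate struct {
	Name     string
	State    ReviewState
	Evidence []EvidenceRef
}

// ErrNoEvidence is returned when a supported candidate has no source evidence.
var ErrNoEvidence = errors.New("supported candidate requires at least one evidence reference")

// NewConceptCandidate always sets the candidate state and rejects
// candidates that carry no evidence.
func NewConceptCandidate(name string, evidence ...EvidenceRef) (ConceptCandidate, error) {
	if len(evidence) == 0 {
		return ConceptCandidate{}, ErrNoEvidence
	}
	return ConceptCandidate{Name: name, State: StateCandidate, Evidence: evidence}, nil
}
```

Routing every creation through one constructor keeps the `candidate` rule in a single place, so a later promotion workflow would be the only code path allowed to set another state.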

## Boundaries

In scope:

- Material ingestion API.
- Source-backed ontology candidate nodes and edges.
- Gap detection for missing prerequisites and weak evidence.
- Ontology snapshot API.

Out of scope:

- File storage.
- PDF/PPT parsing.
- Human review UI.
- Canonical promotion workflow.
42
.planning/phases/005-ontology-materials/005-PLAN.md
Normal file
@@ -0,0 +1,42 @@
# Phase 5 Plan: Ontology and Learning Materials

**Status:** Ready for execution
**Phase Goal:** Ingest learning materials into source-backed ontology candidates.

## Requirements Covered

- ONTO-01: User or operator can upload learning materials.
- ONTO-02: System creates source-backed ontology candidate nodes and edges.
- ONTO-03: System detects missing prerequisites and weakly supported concepts.
- ONTO-04: Generated or inferred content is marked as candidate until reviewed.

## Tasks

### 1. Add ontology package

- Define material, concept candidate, edge candidate, gap, and snapshot types.
- Add in-memory store and service.
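
One possible shape for the in-memory store is sketched below. It assumes a mutex-guarded slice and a snapshot that returns a copy. `Material`, `Snapshot`, and `MemoryStore` are hypothetical names, and the real snapshot would also carry concept candidates, edge candidates, and gaps.

```go
package main

import "sync"

// Material holds the metadata and source text of one ingested item.
type Material struct {
	ID    string
	Title string
	Text  string
}

// Snapshot is a point-in-time copy of the store's contents.
type Snapshot struct {
	Materials []Material
}

// MemoryStore keeps everything in process memory. The zero value is ready to use.
type MemoryStore struct {
	mu        sync.RWMutex
	materials []Material
}

// AddMaterial appends a material under the write lock.
func (s *MemoryStore) AddMaterial(m Material) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.materials = append(s.materials, m)
}

// Snapshot returns a copy so callers cannot mutate the store's state.
func (s *MemoryStore) Snapshot() Snapshot {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]Material, len(s.materials))
	copy(out, s.materials)
	return Snapshot{Materials: out}
}
```

Returning a copy from `Snapshot` means HTTP handlers can serialize the result without holding the lock.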

### 2. Implement deterministic MVP analyzer

- Extract known backend interview concept candidates from material text.
- Create prerequisite edges for supported concept pairs.
- Create gap candidates for missing prerequisites and weak evidence.
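
The three analyzer steps above can be sketched as follows. This is an illustration under stated assumptions rather than the committed analyzer: the `prerequisites` seed table, the `Edge`, `Gap`, and `Analysis` types, and the rule that fewer than two mentions counts as weak evidence are all hypothetical.

```go
package main

import (
	"sort"
	"strings"
	"unicode"
)

// Edge says that From is a prerequisite of To.
type Edge struct{ From, To string }

// Gap records a concept that is missing or thinly supported.
type Gap struct {
	Concept string
	Kind    string // "missing_prerequisite" or "weak_evidence"
}

// Analysis is the analyzer output for one material.
type Analysis struct {
	Concepts []string
	Edges    []Edge
	Gaps     []Gap
}

// prerequisites maps each known concept to its prerequisite ("" if none).
// The entries are an assumed seed, not the project's real concept list.
var prerequisites = map[string]string{
	"tcp":  "",
	"http": "tcp",
	"rest": "http",
}

// Analyze matches whole words against the seed table and derives
// edges and gaps. Results are sorted so the output is deterministic.
func Analyze(text string) Analysis {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	counts := map[string]int{}
	for _, w := range words {
		if _, known := prerequisites[w]; known {
			counts[w]++
		}
	}
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names)

	var out Analysis
	for _, name := range names {
		out.Concepts = append(out.Concepts, name)
		if counts[name] < 2 { // assumed threshold for thin source support
			out.Gaps = append(out.Gaps, Gap{Concept: name, Kind: "weak_evidence"})
		}
		pre := prerequisites[name]
		if pre == "" {
			continue
		}
		if counts[pre] > 0 {
			out.Edges = append(out.Edges, Edge{From: pre, To: name})
		} else {
			out.Gaps = append(out.Gaps, Gap{Concept: pre, Kind: "missing_prerequisite"})
		}
	}
	return out
}
```

For the text `REST builds on HTTP. HTTP requests carry headers.`, this sketch finds `http` and `rest`, links them with one prerequisite edge, flags `tcp` as a missing prerequisite, and flags `rest` as weakly supported because it appears only once.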

### 3. Add HTTP endpoints

- `POST /api/v1/materials`
- `GET /api/v1/ontology`
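
A rough sketch of how the two routes could be wired with the standard library follows. Only the paths come from the plan. The request and response shapes (`materialRequest`, `ontologyResponse`), the status codes, and the `NewMux` and `SmokeFlow` helpers are assumptions, and the snapshot is reduced to a material count to keep the example short.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// materialRequest is an assumed JSON body for material ingestion.
type materialRequest struct {
	Title string `json:"title"`
	Text  string `json:"text"`
}

// ontologyResponse is a simplified snapshot for this sketch.
type ontologyResponse struct {
	Materials int    `json:"materials"`
	State     string `json:"state"`
}

// NewMux registers both endpoints over a tiny in-memory counter.
func NewMux() *http.ServeMux {
	var (
		mu    sync.Mutex
		count int
	)
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/materials", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var req materialRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Text == "" {
			http.Error(w, "invalid material", http.StatusBadRequest)
			return
		}
		mu.Lock()
		count++
		mu.Unlock()
		w.WriteHeader(http.StatusCreated)
	})
	mux.HandleFunc("/api/v1/ontology", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		mu.Lock()
		n := count
		mu.Unlock()
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(ontologyResponse{Materials: n, State: "candidate"})
	})
	return mux
}

// SmokeFlow ingests one material and reads the snapshot back, in process.
func SmokeFlow() (ontologyResponse, error) {
	mux := NewMux()
	body := bytes.NewBufferString(`{"title":"HTTP basics","text":"HTTP runs over TCP."}`)
	post := httptest.NewRecorder()
	mux.ServeHTTP(post, httptest.NewRequest(http.MethodPost, "/api/v1/materials", body))
	if post.Code != http.StatusCreated {
		return ontologyResponse{}, fmt.Errorf("unexpected POST status %d", post.Code)
	}
	get := httptest.NewRecorder()
	mux.ServeHTTP(get, httptest.NewRequest(http.MethodGet, "/api/v1/ontology", nil))
	var out ontologyResponse
	err := json.NewDecoder(get.Body).Decode(&out)
	return out, err
}
```

`SmokeFlow` mirrors the ingestion-then-snapshot flow that task 4 tests, using `httptest` recorders so no network listener is needed.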

### 4. Add tests and verification

- Test material ingestion creates source-backed candidates.
- Test gaps are candidate-only.
- Test HTTP ingestion and ontology snapshot flow.
- Run Go tests, OpenSpec validation, line-count check, and a smoke test.

## Out of Scope

- Multipart upload.
- Real document parsers.
- Human review promotion.
28
.planning/phases/005-ontology-materials/005-RESEARCH.md
Normal file
@@ -0,0 +1,28 @@
# Phase 5 Research: Ontology and Learning Materials

## Findings

The first useful ontology proof does not need heavy parsing. It needs a clean
boundary that proves uploaded material can become inspectable candidate
knowledge with provenance.

The MVP should:

- store material metadata and source text
- extract concept candidates from known backend interview concepts
- create prerequisite edges from a small deterministic rule set
- identify weak concepts when source support is thin
- never mark generated or inferred content as canonical

## Recommended Shape

- `internal/ontology` owns material ingestion, candidate storage, and snapshot.
- HTTP exposes JSON ingestion first.
- Evidence references use the existing workflow shared type.
- Gap records distinguish source-backed weakness from generated inference.
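
The last point could be modeled with an explicit origin field on each gap record. The sketch below is one hedged way to do it: `GapOrigin`, `GapRecord`, and `ClassifyGap` are hypothetical names, and plain evidence ID strings stand in for the shared workflow evidence type.

```go
package main

// GapOrigin says whether a gap rests on ingested source text or on inference.
type GapOrigin string

const (
	OriginSourceBacked GapOrigin = "source_backed"
	OriginInferred     GapOrigin = "inferred"
)

// GapRecord is a candidate-only gap with its provenance made explicit.
type GapRecord struct {
	Concept     string
	Reason      string
	Origin      GapOrigin
	EvidenceIDs []string // assumed placeholder for shared evidence references
}

// ClassifyGap marks a gap as source-backed only when evidence is attached.
// Anything without evidence is treated as generated inference.
func ClassifyGap(concept, reason string, evidenceIDs []string) GapRecord {
	origin := OriginInferred
	if len(evidenceIDs) > 0 {
		origin = OriginSourceBacked
	}
	return GapRecord{
		Concept:     concept,
		Reason:      reason,
		Origin:      origin,
		EvidenceIDs: evidenceIDs,
	}
}
```

Keeping the origin as data, rather than inferring it later from an empty evidence list, lets a reviewer filter inferred gaps without re-running the analyzer.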

## Risks

- Overbuilding parsers too early would violate YAGNI.
- Treating keyword extraction as canonical knowledge would violate OpenSpec.
- A future parser can replace the analyzer behind the same service boundary.
36
.planning/phases/005-ontology-materials/005-SUMMARY.md
Normal file
@@ -0,0 +1,36 @@
# Phase 5 Summary

**Status:** Complete
**Completed:** 2026-04-26

## Delivered

- Added `internal/ontology` for materials, concept candidates, edge candidates,
  gaps, and snapshots.
- Added deterministic MVP analyzer for known backend interview concepts.
- Added source evidence to every supported concept and edge candidate.
- Added candidate-only gap records for missing prerequisites and weak evidence.
- Added HTTP endpoints:
  - `POST /api/v1/materials`
  - `GET /api/v1/ontology`
- Added ontology unit tests and HTTP flow tests.

## Verification

```powershell
gofmt -w cmd internal
go test ./...
openspec validate bootstrap-job-tutor-platform --strict
```

Additional smoke check:

- Material ingestion followed by ontology snapshot returned candidate concepts,
  edges, and gaps.

## Deferred

- Multipart uploads.
- PPT/PDF/document parsing.
- Human review and canonical promotion.
- Graph database persistence.
29
.planning/phases/005-ontology-materials/005-VERIFICATION.md
Normal file
@@ -0,0 +1,29 @@
# Phase 5 Verification

## Verdict

PASS

## Requirement Coverage

- ONTO-01: PASS. JSON material ingestion API accepts operator-provided learning
  material.
- ONTO-02: PASS. Ingestion creates source-backed candidate concepts and
  prerequisite edges.
- ONTO-03: PASS. The analyzer creates candidate gaps for missing prerequisites
  and weak source evidence.
- ONTO-04: PASS. All generated ontology candidates and gaps use `candidate`
  review state.

## Evidence

- `go test ./...` passed.
- `openspec validate bootstrap-job-tutor-platform --strict` passed.
- Live material ingestion and ontology snapshot smoke test passed.
- Go source line-count check passed.

## Residual Risk

The analyzer is deterministic and intentionally shallow. It proves the product
boundary but should later be replaced or supplemented with parser-backed and
LLM-assisted extraction.