docs: bootstrap tutor planning

user
2026-04-26 15:35:26 +09:00
commit 9edaddd092
25 changed files with 2205 additions and 0 deletions


@@ -0,0 +1,156 @@
---
name: openspec-apply-change
description: Implement tasks from an OpenSpec change. Use when the user wants to start implementing, continue implementation, or work through tasks.
license: MIT
compatibility: Requires openspec CLI.
metadata:
author: openspec
version: "1.0"
generatedBy: "1.2.0"
---
Implement tasks from an OpenSpec change.
**Input**: Optionally specify a change name. If omitted, check whether it can be inferred from conversation context. If the name is vague or ambiguous, you MUST prompt for available changes.
**Steps**
1. **Select the change**
If a name is provided, use it. Otherwise:
- Infer from conversation context if the user mentioned a change
- Auto-select if only one active change exists
- If ambiguous, run `openspec list --json` to get available changes and use the **AskUserQuestion tool** to let the user select
Always announce: "Using change: <name>" and how to override (e.g., `/opsx:apply <other>`).
2. **Check status to understand the schema**
```bash
openspec status --change "<name>" --json
```
Parse the JSON to understand:
- `schemaName`: The workflow being used (e.g., "spec-driven")
- Which artifact contains the tasks (typically "tasks" for spec-driven; check the status output for other schemas)
3. **Get apply instructions**
```bash
openspec instructions apply --change "<name>" --json
```
This returns:
- Context file paths (these vary by schema: could be proposal/specs/design/tasks or spec/tests/implementation/docs)
- Progress (total, complete, remaining)
- Task list with status
- Dynamic instruction based on current state
**Handle states:**
- If `state: "blocked"` (missing artifacts): show message, suggest using openspec-continue-change
- If `state: "all_done"`: congratulate, suggest archive
- Otherwise: proceed to implementation
4. **Read context files**
Read the files listed in `contextFiles` from the apply instructions output.
The files depend on the schema being used:
- **spec-driven**: proposal, specs, design, tasks
- Other schemas: follow the contextFiles from CLI output
5. **Show current progress**
Display:
- Schema being used
- Progress: "N/M tasks complete"
- Remaining tasks overview
- Dynamic instruction from CLI
6. **Implement tasks (loop until done or blocked)**
For each pending task:
- Show which task is being worked on
- Make the code changes required
- Keep changes minimal and focused
- Mark task complete in the tasks file: `- [ ]` → `- [x]`
- Continue to next task
**Pause if:**
- Task is unclear → ask for clarification
- Implementation reveals a design issue → suggest updating artifacts
- Error or blocker encountered → report and wait for guidance
- User interrupts
7. **On completion or pause, show status**
Display:
- Tasks completed this session
- Overall progress: "N/M tasks complete"
- If all done: suggest archive
- If paused: explain why and wait for guidance
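The checkbox bookkeeping in steps 6 and 7 can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the openspec CLI: the helper names `count_tasks` and `mark_complete` are hypothetical, and the sample task text is invented. It assumes tasks are written as `- [ ]` / `- [x]` list items, as the steps above describe.

```python
import re

# Matches a markdown task line such as "- [ ] Add handler" or "- [x] Done".
TASK_RE = re.compile(r"^(\s*- \[)([ xX])(\]\s+)(.*)$")


def count_tasks(text: str) -> tuple[int, int]:
    """Return (complete, total) for the checkbox items in a tasks file."""
    complete = total = 0
    for line in text.splitlines():
        match = TASK_RE.match(line)
        if match:
            total += 1
            if match.group(2) in "xX":
                complete += 1
    return complete, total


def mark_complete(text: str, description: str) -> str:
    """Flip the first pending task with this exact description to `- [x]`."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        match = TASK_RE.match(line)
        if match and match.group(2) == " " and match.group(4).strip() == description:
            lines[i] = f"{match.group(1)}x{match.group(3)}{match.group(4)}"
            break
    result = "\n".join(lines)
    # Preserve a trailing newline if the input had one.
    return result + "\n" if text.endswith("\n") else result


# Hypothetical tasks file content, for illustration only.
tasks = "- [x] Scaffold module\n- [ ] Add handler\n- [ ] Write tests"
updated = mark_complete(tasks, "Add handler")
done, total = count_tasks(updated)
print(f"{done}/{total} tasks complete")  # prints "2/3 tasks complete"
```

Updating the checkbox right after each task, rather than in a batch at the end, keeps the "N/M tasks complete" figure accurate if the session pauses midway.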
**Output During Implementation**
```
## Implementing: <change-name> (schema: <schema-name>)
Working on task 3/7: <task description>
[...implementation happening...]
✓ Task complete
Working on task 4/7: <task description>
[...implementation happening...]
✓ Task complete
```
**Output On Completion**
```
## Implementation Complete
**Change:** <change-name>
**Schema:** <schema-name>
**Progress:** 7/7 tasks complete ✓
### Completed This Session
- [x] Task 1
- [x] Task 2
...
All tasks complete! Ready to archive this change.
```
**Output On Pause (Issue Encountered)**
```
## Implementation Paused
**Change:** <change-name>
**Schema:** <schema-name>
**Progress:** 4/7 tasks complete
### Issue Encountered
<description of the issue>
**Options:**
1. <option 1>
2. <option 2>
3. Other approach
What would you like to do?
```
**Guardrails**
- Keep going through tasks until done or blocked
- Always read context files before starting (from the apply instructions output)
- If task is ambiguous, pause and ask before implementing
- If implementation reveals issues, pause and suggest artifact updates
- Keep code changes minimal and scoped to each task
- Update task checkbox immediately after completing each task
- Pause on errors, blockers, or unclear requirements - don't guess
- Use contextFiles from CLI output, don't assume specific file names
**Fluid Workflow Integration**
This skill supports the "actions on a change" model:
- **Can be invoked anytime**: Before all artifacts are done (if tasks exist), after partial implementation, interleaved with other actions
- **Allows artifact updates**: If implementation reveals design issues, suggest updating artifacts. The workflow is not phase-locked, so work fluidly


@@ -0,0 +1,114 @@
---
name: openspec-archive-change
description: Archive a completed change in the experimental workflow. Use when the user wants to finalize and archive a change after implementation is complete.
license: MIT
compatibility: Requires openspec CLI.
metadata:
author: openspec
version: "1.0"
generatedBy: "1.2.0"
---
Archive a completed change in the experimental workflow.
**Input**: Optionally specify a change name. If omitted, check whether it can be inferred from conversation context. If the name is vague or ambiguous, you MUST prompt for available changes.
**Steps**
1. **If no change name provided, prompt for selection**
Run `openspec list --json` to get available changes. Use the **AskUserQuestion tool** to let the user select.
Show only active changes (not already archived).
Include the schema used for each change if available.
**IMPORTANT**: Do NOT guess or auto-select a change. Always let the user choose.
2. **Check artifact completion status**
Run `openspec status --change "<name>" --json` to check artifact completion.
Parse the JSON to understand:
- `schemaName`: The workflow being used
- `artifacts`: List of artifacts with their status (`done` or other)
**If any artifacts are not `done`:**
- Display warning listing incomplete artifacts
- Use **AskUserQuestion tool** to confirm user wants to proceed
- Proceed if user confirms
3. **Check task completion status**
Read the tasks file (typically `tasks.md`) to check for incomplete tasks.
Count tasks marked with `- [ ]` (incomplete) vs `- [x]` (complete).
**If incomplete tasks found:**
- Display warning showing count of incomplete tasks
- Use **AskUserQuestion tool** to confirm user wants to proceed
- Proceed if user confirms
**If no tasks file exists:** Proceed without task-related warning.
4. **Assess delta spec sync state**
Check for delta specs at `openspec/changes/<name>/specs/`. If none exist, proceed without sync prompt.
**If delta specs exist:**
- Compare each delta spec with its corresponding main spec at `openspec/specs/<capability>/spec.md`
- Determine what changes would be applied (adds, modifications, removals, renames)
- Show a combined summary before prompting
**Prompt options:**
- If changes needed: "Sync now (recommended)", "Archive without syncing"
- If already synced: "Archive now", "Sync anyway", "Cancel"
If user chooses sync, use Task tool (subagent_type: "general-purpose", prompt: "Use Skill tool to invoke openspec-sync-specs for change '<name>'. Delta spec analysis: <include the analyzed delta spec summary>"). Unless the user chooses "Cancel", proceed to archive whether or not the specs were synced.
5. **Perform the archive**
Create the archive directory if it doesn't exist:
```bash
mkdir -p openspec/changes/archive
```
Generate target name using current date: `YYYY-MM-DD-<change-name>`
**Check if target already exists:**
- If yes: Fail with error, suggest renaming existing archive or using different date
- If no: Move the change directory to archive
```bash
mv openspec/changes/<name> openspec/changes/archive/YYYY-MM-DD-<name>
```
6. **Display summary**
Show archive completion summary including:
- Change name
- Schema that was used
- Archive location
- Whether specs were synced (if applicable)
- Note about any warnings (incomplete artifacts/tasks)
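The archive move in step 5 can be sketched in Python as follows. This is only an illustration of the naming and collision check described above, not how the openspec CLI does it: the function names `archive_target` and `archive_change` are hypothetical, and the caller is assumed to pass the `openspec/changes` directory.

```python
from datetime import date
from pathlib import Path


def archive_target(changes_dir: Path, name: str, today: date) -> Path:
    """Build archive/YYYY-MM-DD-<name> and refuse to reuse an existing archive."""
    target = changes_dir / "archive" / f"{today.isoformat()}-{name}"
    if target.exists():
        raise FileExistsError(f"archive already exists: {target}")
    return target


def archive_change(changes_dir: Path, name: str, today: date) -> Path:
    """Move changes/<name> into the archive directory and return the new path."""
    source = changes_dir / name
    target = archive_target(changes_dir, name, today)
    target.parent.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    source.rename(target)  # the whole directory moves, including .openspec.yaml
    return target
```

Checking for the target before moving matters because moving a directory onto an existing one can either fail or nest the change inside the old archive, depending on the tool used.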
**Output On Success**
```
## Archive Complete
**Change:** <change-name>
**Schema:** <schema-name>
**Archived to:** openspec/changes/archive/YYYY-MM-DD-<name>/
**Specs:** ✓ Synced to main specs (or "No delta specs" or "Sync skipped")
All artifacts complete. All tasks complete.
```
**Guardrails**
- Always prompt for change selection if not provided
- Use artifact graph (openspec status --json) for completion checking
- Don't block archive on warnings - just inform and confirm
- Preserve .openspec.yaml when moving to archive (it moves with the directory)
- Show clear summary of what happened
- If sync is requested, use openspec-sync-specs approach (agent-driven)
- If delta specs exist, always run the sync assessment and show the combined summary before prompting


@@ -0,0 +1,288 @@
---
name: openspec-explore
description: Enter explore mode - a thinking partner for exploring ideas, investigating problems, and clarifying requirements. Use when the user wants to think through something before or during a change.
license: MIT
compatibility: Requires openspec CLI.
metadata:
author: openspec
version: "1.0"
generatedBy: "1.2.0"
---
Enter explore mode. Think deeply. Visualize freely. Follow the conversation wherever it goes.
**IMPORTANT: Explore mode is for thinking, not implementing.** You may read files, search code, and investigate the codebase, but you must NEVER write code or implement features. If the user asks you to implement something, remind them to exit explore mode first and create a change proposal. You MAY create OpenSpec artifacts (proposals, designs, specs) if the user asks—that's capturing thinking, not implementing.
**This is a stance, not a workflow.** There are no fixed steps, no required sequence, no mandatory outputs. You're a thinking partner helping the user explore.
---
## The Stance
- **Curious, not prescriptive** - Ask questions that emerge naturally, don't follow a script
- **Open threads, not interrogations** - Surface multiple interesting directions and let the user follow what resonates. Don't funnel them through a single path of questions.
- **Visual** - Use ASCII diagrams liberally when they'd help clarify thinking
- **Adaptive** - Follow interesting threads, pivot when new information emerges
- **Patient** - Don't rush to conclusions, let the shape of the problem emerge
- **Grounded** - Explore the actual codebase when relevant, don't just theorize
---
## What You Might Do
Depending on what the user brings, you might:
**Explore the problem space**
- Ask clarifying questions that emerge from what they said
- Challenge assumptions
- Reframe the problem
- Find analogies
**Investigate the codebase**
- Map existing architecture relevant to the discussion
- Find integration points
- Identify patterns already in use
- Surface hidden complexity
**Compare options**
- Brainstorm multiple approaches
- Build comparison tables
- Sketch tradeoffs
- Recommend a path (if asked)
**Visualize**
```
┌─────────────────────────────────────────┐
│     Use ASCII diagrams liberally        │
├─────────────────────────────────────────┤
│                                         │
│   ┌────────┐         ┌────────┐         │
│   │ State  │────────▶│ State  │         │
│   │   A    │         │   B    │         │
│   └────────┘         └────────┘         │
│                                         │
│   System diagrams, state machines,      │
│   data flows, architecture sketches,    │
│   dependency graphs, comparison tables  │
│                                         │
└─────────────────────────────────────────┘
```
**Surface risks and unknowns**
- Identify what could go wrong
- Find gaps in understanding
- Suggest spikes or investigations
---
## OpenSpec Awareness
You have full context of the OpenSpec system. Use it naturally, don't force it.
### Check for context
At the start, quickly check what exists:
```bash
openspec list --json
```
This tells you:
- If there are active changes
- Their names, schemas, and status
- What the user might be working on
### When no change exists
Think freely. When insights crystallize, you might offer:
- "This feels solid enough to start a change. Want me to create a proposal?"
- Or keep exploring - no pressure to formalize
### When a change exists
If the user mentions a change or you detect one is relevant:
1. **Read existing artifacts for context**
- `openspec/changes/<name>/proposal.md`
- `openspec/changes/<name>/design.md`
- `openspec/changes/<name>/tasks.md`
- etc.
2. **Reference them naturally in conversation**
- "Your design mentions using Redis, but we just realized SQLite fits better..."
- "The proposal scopes this to premium users, but we're now thinking everyone..."
3. **Offer to capture when decisions are made**
| Insight Type | Where to Capture |
|--------------|------------------|
| New requirement discovered | `specs/<capability>/spec.md` |
| Requirement changed | `specs/<capability>/spec.md` |
| Design decision made | `design.md` |
| Scope changed | `proposal.md` |
| New work identified | `tasks.md` |
| Assumption invalidated | Relevant artifact |
Example offers:
- "That's a design decision. Capture it in design.md?"
- "This is a new requirement. Add it to specs?"
- "This changes scope. Update the proposal?"
4. **The user decides** - Offer and move on. Don't pressure. Don't auto-capture.
---
## What You Don't Have To Do
- Follow a script
- Ask the same questions every time
- Produce a specific artifact
- Reach a conclusion
- Stay on topic if a tangent is valuable
- Be brief (this is thinking time)
---
## Handling Different Entry Points
**User brings a vague idea:**
```
User: I'm thinking about adding real-time collaboration
You: Real-time collab is a big space. Let me think about this...
COLLABORATION SPECTRUM
════════════════════════════════════════════
Awareness         Coordination           Sync
    │                  │                  │
    ▼                  ▼                  ▼
┌────────┐         ┌────────┐         ┌────────┐
│Presence│         │Cursors │         │  CRDT  │
│  "3    │         │ Multi  │         │Conflict│
│online" │         │ select │         │  free  │
└────────┘         └────────┘         └────────┘
    │                  │                  │
 trivial            moderate           complex
Where's your head at?
```
**User brings a specific problem:**
```
User: The auth system is a mess
You: [reads codebase]
┌─────────────────────────────────────┐
│          CURRENT AUTH FLOW          │
└─────────────────────────────────────┘
     ┌─────────────┼─────────────┐
     ▼             ▼             ▼
┌─────────┐   ┌─────────┐   ┌─────────┐
│ Google  │   │ GitHub  │   │  Email  │
│  OAuth  │   │  OAuth  │   │  Magic  │
└────┬────┘   └────┬────┘   └────┬────┘
     │             │             │
     └─────────────┼─────────────┘
             ┌───────────┐
             │  Session  │
             └─────┬─────┘
             ┌───────────┐
             │   Perms   │
             └───────────┘
I see three tangles. Which one's burning?
```
**User is stuck mid-implementation:**
```
User: /opsx:explore add-auth-system
The OAuth integration is more complex than expected
You: [reads change artifacts]
You're on task 4: "Implement OAuth flow"
Let me trace what's involved...
[draws diagram, explores options, suggests paths]
Want to update the design to reflect this?
Or add a spike task to investigate?
```
**User wants to compare options:**
```
User: Should we use Postgres or SQLite?
You: Generic answer is boring. What's the context?
User: A CLI tool that tracks local dev environments
You: That changes everything.
┌─────────────────────────────────────────┐
│          CLI TOOL DATA STORAGE          │
└─────────────────────────────────────────┘
Key constraints:
• No daemon running
• Must work offline
• Single user
              SQLite        Postgres
Deployment    embedded ✓    needs server ✗
Offline       yes ✓         no ✗
Single file   yes ✓         no ✗
SQLite. Not even close.
Unless... is there a sync component?
```
---
## Ending Discovery
There's no required ending. Discovery might:
- **Flow into a proposal**: "Ready to start? I can create a change proposal."
- **Result in artifact updates**: "Updated design.md with these decisions"
- **Just provide clarity**: User has what they need, moves on
- **Continue later**: "We can pick this up anytime"
When it feels like things are crystallizing, you might summarize:
```
## What We Figured Out
**The problem**: [crystallized understanding]
**The approach**: [if one emerged]
**Open questions**: [if any remain]
**Next steps** (if ready):
- Create a change proposal
- Keep exploring: just keep talking
```
But this summary is optional. Sometimes the thinking IS the value.
---
## Guardrails
- **Don't implement** - Never write code or implement features. Creating OpenSpec artifacts is fine, writing application code is not.
- **Don't fake understanding** - If something is unclear, dig deeper
- **Don't rush** - Discovery is thinking time, not task time
- **Don't force structure** - Let patterns emerge naturally
- **Don't auto-capture** - Offer to save insights, don't just do it
- **Do visualize** - A good diagram is worth many paragraphs
- **Do explore the codebase** - Ground discussions in reality
- **Do question assumptions** - Including the user's and your own


@@ -0,0 +1,110 @@
---
name: openspec-propose
description: Propose a new change with all artifacts generated in one step. Use when the user wants to quickly describe what they want to build and get a complete proposal with design, specs, and tasks ready for implementation.
license: MIT
compatibility: Requires openspec CLI.
metadata:
author: openspec
version: "1.0"
generatedBy: "1.2.0"
---
Propose a new change - create the change and generate all artifacts in one step.
I'll create a change with artifacts:
- proposal.md (what & why)
- design.md (how)
- tasks.md (implementation steps)
When ready to implement, run /opsx:apply
---
**Input**: The user's request should include a change name (kebab-case) OR a description of what they want to build.
**Steps**
1. **If no clear input provided, ask what they want to build**
Use the **AskUserQuestion tool** (open-ended, no preset options) to ask:
> "What change do you want to work on? Describe what you want to build or fix."
From their description, derive a kebab-case name (e.g., "add user authentication" → `add-user-auth`).
**IMPORTANT**: Do NOT proceed without understanding what the user wants to build.
2. **Create the change directory**
```bash
openspec new change "<name>"
```
This creates a scaffolded change at `openspec/changes/<name>/` with `.openspec.yaml`.
3. **Get the artifact build order**
```bash
openspec status --change "<name>" --json
```
Parse the JSON to get:
- `applyRequires`: array of artifact IDs needed before implementation (e.g., `["tasks"]`)
- `artifacts`: list of all artifacts with their status and dependencies
4. **Create artifacts in sequence until apply-ready**
Use the **TodoWrite tool** to track progress through the artifacts.
Loop through artifacts in dependency order (artifacts with no pending dependencies first):
a. **For each artifact that is `ready` (dependencies satisfied)**:
- Get instructions:
```bash
openspec instructions <artifact-id> --change "<name>" --json
```
- The instructions JSON includes:
- `context`: Project background (constraints for you - do NOT include in output)
- `rules`: Artifact-specific rules (constraints for you - do NOT include in output)
- `template`: The structure to use for your output file
- `instruction`: Schema-specific guidance for this artifact type
- `outputPath`: Where to write the artifact
- `dependencies`: Completed artifacts to read for context
- Read any completed dependency files for context
- Create the artifact file using `template` as the structure
- Apply `context` and `rules` as constraints - but do NOT copy them into the file
- Show brief progress: "Created <artifact-id>"
b. **Continue until all `applyRequires` artifacts are complete**
- After creating each artifact, re-run `openspec status --change "<name>" --json`
- Check if every artifact ID in `applyRequires` has `status: "done"` in the artifacts array
- Stop when all `applyRequires` artifacts are done
c. **If an artifact requires user input** (unclear context):
- Use **AskUserQuestion tool** to clarify
- Then continue with creation
5. **Show final status**
```bash
openspec status --change "<name>"
```
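The apply-readiness check in step 4b can be sketched in Python. This is a rough illustration that assumes the status JSON carries `applyRequires` and an `artifacts` array with a `status` field, as described above. The `id` key for an artifact's identifier, the function name `missing_for_apply`, and the sample payload are assumptions made for the example, so check them against the real CLI output.

```python
import json


def missing_for_apply(status: dict) -> list[str]:
    """Return the applyRequires artifact IDs that are not yet `done`."""
    # Assumption: each artifact entry exposes its identifier under "id".
    done = {a["id"] for a in status["artifacts"] if a.get("status") == "done"}
    return [artifact_id for artifact_id in status["applyRequires"] if artifact_id not in done]


# Hypothetical payload shaped like the fields described in step 3.
sample = json.loads("""
{
  "applyRequires": ["tasks"],
  "artifacts": [
    {"id": "proposal", "status": "done"},
    {"id": "design", "status": "done"},
    {"id": "tasks", "status": "ready"}
  ]
}
""")

remaining = missing_for_apply(sample)
print("apply-ready" if not remaining else f"still needed: {remaining}")
```

Returning the missing IDs rather than a bare boolean makes it easy to report which artifact to create next.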
**Output**
After completing all artifacts, summarize:
- Change name and location
- List of artifacts created with brief descriptions
- What's ready: "All artifacts created! Ready for implementation."
- Prompt: "Run `/opsx:apply` or ask me to implement to start working on the tasks."
**Artifact Creation Guidelines**
- Follow the `instruction` field from `openspec instructions` for each artifact type
- The schema defines what each artifact should contain - follow it
- Read dependency artifacts for context before creating new ones
- Use `template` as the structure for your output file - fill in its sections
- **IMPORTANT**: `context` and `rules` are constraints for YOU, not content for the file
- Do NOT copy `<context>`, `<rules>`, `<project_context>` blocks into the artifact
- These guide what you write, but should never appear in the output
**Guardrails**
- Create ALL artifacts needed for implementation (as defined by schema's `apply.requires`)
- Always read dependency artifacts before creating a new one
- If context is critically unclear, ask the user - but prefer making reasonable decisions to keep momentum
- If a change with that name already exists, ask if user wants to continue it or create a new one
- Verify each artifact file exists after writing before proceeding to next

.gitignore (new file, 1 line)

@@ -0,0 +1 @@
.omx/

.planning/PROJECT.md (new file, 84 lines)

@@ -0,0 +1,84 @@
# Tutor Platform
## What This Is
Tutor Platform is a web service for software job seekers preparing for
technical interviews. It combines adaptive interview practice, evidence-backed
learner memory, source-backed learning ontology, and game-inspired progression
to make interview readiness visible and repeatable.
The backend is Go because the product will internalize `agent-farm-go` workflow
patterns and call `third-one` with `deepseek-v4-flash` through typed service
boundaries.
## Core Value
The user should feel and prove that they are becoming more interview-ready after
each short practice loop.
## Requirements
### Validated
(None yet; ship to validate.)
### Active
- [ ] Developer job seekers can complete a diagnostic technical interview.
- [ ] Answers are graded with rubrics and preserved as evidence.
- [ ] Learner memory tracks concept mastery, misconceptions, evidence, and
interventions.
- [ ] The system selects the next best interview challenge from learner state.
- [ ] The user sees a readiness map and meaningful progression after each loop.
- [ ] Uploaded learning materials can become source-backed ontology candidates.
- [ ] Generated learning assets preserve prompt, source, and review lineage.
- [ ] Backend implementation uses Go and keeps `agent-farm-go` workflow patterns
internalized behind typed interfaces.
### Out of Scope
- Full school LMS replacement; the first product target is job seekers.
- Marketplace course publishing; not needed to prove the learning loop.
- Automatic certification or hiring decisions; readiness is advisory.
- Unreviewed generated canonical content; generated ontology and assets require
provenance and review state.
- Gambling-like rewards or shame-based leaderboards; progression must be tied to
learning evidence.
## Context
- Product planning lives in `docs/planning/`.
- OpenSpec change baseline lives in
`openspec/changes/bootstrap-job-tutor-platform/`.
- The service should use a Go backend.
- Workflow behavior should be configuration-first and inspired by
`agent-farm-go`.
- LLM execution should use `third-one`, defaulting to `deepseek-v4-flash`.
- Memory should be structured learner state, not a flat RAG transcript.
- Gamification should use Flow, adaptive difficulty, growth lines, and strong
session endings without exploitative mechanics.
## Constraints
- **Backend stack**: Go, to align with internalized `agent-farm-go` workflow
patterns.
- **File size**: manually authored source files must stay at or below 600 lines.
- **Design principles**: SOLID, KISS, and YAGNI govern implementation.
- **Workflow state**: product state changes should use typed contracts, not
freeform prose parsing.
- **Privacy**: learner memory and evidence may become sensitive, especially for
future student/school expansion.
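One way to keep the file-size constraint honest is a small check script. The sketch below is an assumption about how that might look, not an existing project tool: the name `files_over_limit` and the `*.go` glob are hypothetical, and it makes no attempt to tell manually authored files from generated ones.

```python
from pathlib import Path

MAX_LINES = 600  # limit stated in the constraints above


def files_over_limit(root: Path, pattern: str = "*.go", limit: int = MAX_LINES) -> list[tuple[Path, int]]:
    """Return (path, line_count) for every matching file longer than `limit` lines."""
    offenders = []
    for path in sorted(root.rglob(pattern)):
        with path.open(encoding="utf-8") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > limit:
            offenders.append((path, line_count))
    return offenders


if __name__ == "__main__":
    for path, line_count in files_over_limit(Path(".")):
        print(f"{path}: {line_count} lines (limit {MAX_LINES})")
```

A file with exactly 600 lines passes, matching the "at or below 600 lines" wording.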
## Key Decisions
| Decision | Rationale | Outcome |
|----------|-----------|---------|
| Start with software job seekers | Clear, testable interview-practice loop | Pending |
| Use Go backend | Aligns service with internalized `agent-farm-go` substrate | Pending |
| Use `third-one` and `deepseek-v4-flash` by default | Matches current local model/runtime direction | Pending |
| Structured learner memory, not RAG-first | Product value is learner modeling and readiness | Pending |
| Game-inspired progression must be evidence-backed | Creates retention without empty rewards | Pending |
| 600-line source limit | Forces responsibility boundaries early | Pending |
---
*Last updated: 2026-04-26 after Go backend and GSD planning direction were set.*

.planning/REQUIREMENTS.md (new file, 111 lines)

@@ -0,0 +1,111 @@
# Requirements: Tutor Platform
**Defined:** 2026-04-26
**Core Value:** The user should feel and prove that they are becoming more
interview-ready after each short practice loop.
## v1 Requirements
### Backend and Workflow
- [ ] **BACK-01**: Backend service is implemented in Go.
- [ ] **BACK-02**: Backend exposes typed interfaces for tutor workflows.
- [ ] **BACK-03**: Backend integrates internalized `agent-farm-go` workflow
patterns without ad hoc handler shellouts.
- [ ] **BACK-04**: Workflow LLM execution uses `third-one` with configurable
runtime and default `deepseek-v4-flash`.
- [ ] **BACK-05**: Manually authored source files stay at or below 600 lines.
### Interview Practice
- [ ] **INT-01**: User can select target role, stack, and interview timeline.
- [ ] **INT-02**: User can complete a diagnostic technical interview.
- [ ] **INT-03**: System can generate role-specific interview questions.
- [ ] **INT-04**: System can grade user answers against explicit rubrics.
- [ ] **INT-05**: System can ask targeted follow-up questions for weak answers.
- [ ] **INT-06**: System preserves original answers and grading evidence.
### Learner Memory
- [ ] **MEM-01**: System stores learner profile with role, stack, timeline, and
preferences.
- [ ] **MEM-02**: System stores concept mastery states with evidence.
- [ ] **MEM-03**: System stores recurring misconceptions with supporting
answers.
- [ ] **MEM-04**: System stores intervention history and review schedule.
- [ ] **MEM-05**: Temporary session context does not become durable memory
without evidence.
### Progression
- [ ] **PROG-01**: User can see a role-specific readiness map.
- [ ] **PROG-02**: Concepts have challenge ladders from definition to interview
pressure.
- [ ] **PROG-03**: System selects next challenge based on learner memory and
grading evidence.
- [ ] **PROG-04**: System unlocks boss-style integrated questions after
prerequisite stability.
- [ ] **PROG-05**: Streaks and rewards avoid punitive or gambling-like mechanics.
### Ontology and Learning Materials
- [ ] **ONTO-01**: User or operator can upload learning materials.
- [ ] **ONTO-02**: System creates source-backed ontology candidate nodes and
edges.
- [ ] **ONTO-03**: System detects missing prerequisites and weakly supported
concepts.
- [ ] **ONTO-04**: Generated or inferred content is marked as candidate until
reviewed.
### Teaching Assets
- [ ] **ASSET-01**: System can generate prompt candidates for visual teaching
assets.
- [ ] **ASSET-02**: Generated assets store source concept, evidence, prompt,
model config, and review state.
- [ ] **ASSET-03**: Image model configuration verifies the actual OpenAI model
identifier before production calls.
## v2 Requirements
### General Student Expansion
- **GEN-01**: Support non-interview learning tracks.
- **GEN-02**: Support teacher or parent progress summaries.
- **GEN-03**: Support school or organization tenant policies.
### Advanced Content Operations
- **ADV-01**: Human review workflow for promoted ontology content.
- **ADV-02**: Multi-format teaching material generation beyond static images.
- **ADV-03**: Company-specific or role-specific premium interview campaigns.
## Out of Scope
| Feature | Reason |
|---------|--------|
| Full LMS | Too broad for v1; first validate job-seeker loop |
| Hiring decisions | Product gives practice/readiness, not employment judgment |
| Course marketplace | Not needed for core value |
| Social leaderboards | Risk of shame mechanics and weak learning signal |
| Random reward economy | Misaligned with evidence-backed learning |
## Traceability
| Requirement | Phase | Status |
|-------------|-------|--------|
| BACK-01..BACK-05 | Phase 1 | Pending |
| INT-01..INT-06 | Phase 2 | Pending |
| MEM-01..MEM-05 | Phase 3 | Pending |
| PROG-01..PROG-05 | Phase 4 | Pending |
| ONTO-01..ONTO-04 | Phase 5 | Pending |
| ASSET-01..ASSET-03 | Phase 6 | Pending |
**Coverage:**
- v1 requirements: 28 total
- Mapped to phases: 28
- Unmapped: 0
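
The coverage figures can be re-derived from the Traceability table. The following Python sketch is illustrative only: the `expand` helper is hypothetical, and the ranges are copied by hand from the table rather than parsed from this file.

```python
import re

# Requirement ranges copied from the Traceability table.
RANGES = {
    "Phase 1": "BACK-01..BACK-05",
    "Phase 2": "INT-01..INT-06",
    "Phase 3": "MEM-01..MEM-05",
    "Phase 4": "PROG-01..PROG-05",
    "Phase 5": "ONTO-01..ONTO-04",
    "Phase 6": "ASSET-01..ASSET-03",
}


def expand(range_spec: str) -> list[str]:
    """Expand 'BACK-01..BACK-05' into ['BACK-01', ..., 'BACK-05']."""
    match = re.fullmatch(r"([A-Z]+)-(\d+)\.\.([A-Z]+)-(\d+)", range_spec)
    if match is None or match.group(1) != match.group(3):
        raise ValueError(f"unsupported range: {range_spec}")
    prefix, first, last = match.group(1), int(match.group(2)), int(match.group(4))
    return [f"{prefix}-{n:02d}" for n in range(first, last + 1)]


mapped = [req for spec in RANGES.values() for req in expand(spec)]
print(f"v1 requirements mapped to phases: {len(mapped)}")  # prints 28
```

The per-phase counts are 5 + 6 + 5 + 5 + 4 + 3 = 28, which agrees with the totals listed above.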
---
*Requirements defined: 2026-04-26*
*Last updated: 2026-04-26 after Go backend direction was locked.*

.planning/ROADMAP.md (new file, 98 lines)

@@ -0,0 +1,98 @@
# Roadmap: Tutor Platform
## Milestone 1: Job-Seeker Interview Tutor MVP
### Phase 1: Go Backend Foundation and Workflow Boundary
**Goal:** Establish the Go service skeleton and typed workflow boundary for
internalized `agent-farm-go` patterns.
**Requirements:** BACK-01, BACK-02, BACK-03, BACK-04, BACK-05
**Success Criteria:**
- Go backend scaffold exists with clear module boundaries.
- No manually authored source file exceeds 600 lines.
- Workflow interfaces are typed and isolated from HTTP handlers.
- Runtime config can identify the `third-one` / `deepseek-v4-flash` target.
- Basic build/test command is documented in `AGENTS.md`.
### Phase 2: Diagnostic Interview Loop
**Goal:** Prove the first job-seeker loop from role selection through graded
diagnostic interview.
**Requirements:** INT-01, INT-02, INT-03, INT-04, INT-05, INT-06
**Success Criteria:**
- User can choose target role and stack.
- Backend can create a diagnostic session.
- System produces role-specific interview questions.
- User answers are graded through typed workflow results.
- Grading evidence and original answer are persisted.
### Phase 3: Learner Memory
**Goal:** Convert graded answer evidence into structured learner memory.
**Requirements:** MEM-01, MEM-02, MEM-03, MEM-04, MEM-05
**Success Criteria:**
- Learner profile is persisted.
- Concept mastery updates require evidence.
- Misconceptions link to supporting answers.
- Session context and durable memory remain separate.
- Memory extraction workflow emits typed candidates.
### Phase 4: Progression and Gamified Learning Routine
**Goal:** Make readiness and next challenge visible without empty rewards.
**Requirements:** PROG-01, PROG-02, PROG-03, PROG-04, PROG-05
**Success Criteria:**
- Readiness map displays concept states.
- Challenge ladder exists for the first backend interview track.
- Next challenge is selected from learner memory and grading evidence.
- Boss question unlocks after prerequisite stability.
- Streak/reward behavior avoids punitive and random-reward mechanics.
### Phase 5: Source-Backed Ontology Builder
**Goal:** Start material ingestion and ontology candidate generation.
**Requirements:** ONTO-01, ONTO-02, ONTO-03, ONTO-04
**Success Criteria:**
- User/operator can add source material.
- Concepts, prerequisites, rubrics, and question candidates carry provenance.
- Missing prerequisites and weak areas are flagged.
- Generated/inferred content is not promoted as canonical automatically.
### Phase 6: Visual Teaching Asset Pipeline
**Goal:** Generate reviewable teaching asset candidates from ontology concepts.
**Requirements:** ASSET-01, ASSET-02, ASSET-03
**Success Criteria:**
- Asset prompt generation contract exists.
- Generated assets store prompt lineage, source concept, source evidence, model
config, and review state.
- Actual image model identifier is verified before production image calls.
## Parking Lot
- General student mode.
- Teacher/parent dashboards.
- School tenant administration.
- Company-specific interview packs.
- Human ontology review console.
---
*Roadmap created: 2026-04-26 after initial product planning and Go backend decision.*

36
.planning/STATE.md Normal file

@@ -0,0 +1,36 @@
# Project State
## Project Reference
See: `.planning/PROJECT.md` (updated 2026-04-26)
**Core value:** The user should feel and prove that they are becoming more
interview-ready after each short practice loop.
**Current focus:** Phase 1: Go Backend Foundation and Workflow Boundary.
## Current Decisions
- Backend is Go.
- `agent-farm-go` workflow patterns should be internalized behind typed backend
interfaces.
- `third-one` is the LLM execution kernel.
- Default runtime target is `deepseek-v4-flash`.
- Source files should stay at or below 600 lines.
- SOLID, KISS, and YAGNI are active implementation constraints.
- OpenSpec is the intent/requirements source of truth.
## Next Actions
1. Choose first interview track and canonical concept seed list.
2. Define typed contracts for diagnostic, grading, memory extraction, next
challenge, readiness update, ontology gap, and asset prompt.
3. Plan Phase 1 with GSD before writing backend code.
## Validation Log
- 2026-04-26: `openspec validate bootstrap-job-tutor-platform --strict` passed
before GSD planning docs were created.
---
*State initialized: 2026-04-26.*

14
.planning/config.json Normal file

@@ -0,0 +1,14 @@
{
  "mode": "yolo",
  "granularity": "standard",
  "parallelization": true,
  "commit_docs": true,
  "model_profile": "inherit",
  "workflow": {
    "research": true,
    "plan_check": true,
    "verifier": true,
    "nyquist_validation": true,
    "auto_advance": true
  }
}

60
AGENTS.md Normal file

@@ -0,0 +1,60 @@
# AGENTS.md
## Planning and Spec Source of Truth
When doing coding work in this repository, treat OpenSpec as the source of truth
for product intent and requirements. Before implementation, read the relevant
change, specs, and tasks under `openspec/`.
For medium or larger changes, multi-file changes, API/DTO/service-boundary
changes, or architecture-impacting changes, update the touched OpenSpec
surfaces after implementation.
## Engineering Principles
Use these principles as default implementation constraints:
- Backend code is Go unless an OpenSpec change explicitly revisits the stack.
- `agent-farm-go` workflow patterns should be internalized behind typed backend
interfaces, not treated as ad hoc shell scripts from request handlers.
- SOLID: keep responsibilities clear, dependencies explicit, and extension
points narrow.
- KISS: choose the simplest design that satisfies the current requirement.
- YAGNI: do not add future-facing abstractions, services, queues, plugins, or
configuration surfaces until there is a current need.
- Evidence-first: prefer tests, builds, smoke checks, and inspectable artifacts
over claims.
## File Size Rule
No source file should exceed 600 lines.
If a file approaches 600 lines, split it by responsibility before adding more
behavior. Prefer small modules with explicit boundaries over large coordinator
files. Generated files, lockfiles, vendored files, and external data snapshots
are exempt, but should not be edited manually unless necessary.
## Design Style
- Prefer small, cohesive functions and modules.
- Keep business rules close to the domain they describe.
- Keep workflow orchestration separate from low-level adapters.
- Keep learner memory, ontology, grading, progression, and asset generation as
separate responsibilities.
- Prefer typed contracts for workflow inputs and outputs.
- Avoid broad helper packages that become dumping grounds.
- Avoid speculative generic frameworks.
## Validation
After code changes, run the narrowest meaningful validation first, then broader
checks when the touched surface justifies it.
For planning/spec-only changes, run:
```powershell
openspec validate bootstrap-job-tutor-platform --strict
```
For future code changes, add project-specific build and test commands here once
the implementation stack is initialized.


@@ -0,0 +1,157 @@
# Tutor Platform Architecture
## System Shape
The platform is a web service built around workflow-driven tutoring and
structured learner memory.
```text
Web App
  Student interview practice
  Review plan
  Readiness map
  Challenge ladder
  Material ingestion
  Asset review
API Backend
  Go service
  Auth and accounts
  Learning sessions
  Interview questions
  Learner memory
  Ontology and source evidence
  Asset generation jobs
Workflow Runtime
  internalized agent-farm-go workflow substrate
  YAML/config-authored workflow definitions
  diagnostic interview
  answer grading
  memory extraction
  ontology analysis
  review-plan generation
  asset prompt generation
  progression and challenge selection
LLM Kernel
  third-one
  default model_key: deepseek-v4-flash
Memory and Knowledge
  learner memory tables
  ontology graph tables
  source evidence ledger
  generated asset lineage
```
## Workflow Responsibilities
Use a Go backend as the product service boundary and internalize
`agent-farm-go` workflow patterns there. Workflow behavior should still be
configuration-first: prefer YAML/config composition for agent behavior and only
add code when a capability cannot be expressed through existing workflow or
runtime-loadable node patterns.
Implementation should follow the engineering rules in
`docs/planning/ENGINEERING.md`: no manually authored source file over 600 lines,
SOLID responsibility boundaries, KISS implementation choices, and YAGNI for
future-only abstractions.
Initial workflow set:
- `diagnose_job_seeker`
- `generate_interview_question`
- `grade_interview_answer`
- `ask_followup_question`
- `extract_learning_memory`
- `build_review_plan`
- `select_next_challenge`
- `update_readiness_map`
- `award_learning_progress`
- `ingest_learning_material`
- `build_learning_ontology`
- `detect_ontology_gaps`
- `generate_teaching_asset_prompt`
- `verify_generated_learning_asset`
## Gamification Strategy
Game-inspired engagement should live on top of learner memory and evidence, not
beside it. The product should not award progress just for time spent. Progress
is earned through answer quality, misconception repair, review completion, and
successful transfer to harder interview scenarios.
Core progression surfaces:
- readiness map by role and concept
- challenge ladder per concept
- short daily interview loops
- boss questions for integrated concept clusters
- strong-answer portfolio
- interview-date campaign plan
Progression decisions should read from learner memory and grading evidence.
They should be exposed as workflow outputs so the service can explain why a
question, reward, or unlock appeared.
## LLM Runtime
Use `third-one` as the bounded model execution kernel. The default target is
`deepseek-v4-flash` through runtime configuration. Product workflows should pass
explicit task contracts and consume typed outputs rather than relying on freeform
assistant prose.
The Go backend should call the workflow/runtime layer through narrow typed
interfaces. Product domain code should not shell out ad hoc from handlers or
parse arbitrary assistant text to mutate learner state.
## Memory Strategy
Do not make RAG the product center. Retrieval can support evidence lookup, but
the durable product memory should be structured:
- learner profile
- concept mastery
- misconceptions
- practice evidence
- intervention history
- spaced review schedule
- readiness progression
- challenge history
MemPalace can inform temporal, scoped, evidence-preserving memory design.
Graphify can inform ontology extraction from mixed source materials. The service
should own its privacy, review, tenant, and deletion semantics directly.
## Ontology Strategy
Uploaded materials should produce a learning graph:
- concepts
- prerequisites
- examples
- interview questions
- rubrics
- source evidence
- missing areas
- generated candidate assets
Every inferred or generated node should carry provenance and review state.
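
As a rough illustration of provenance and review state on a graph node, consider the sketch below. The type and field names, and the three review states, are assumptions made for this example; the document does not define a schema.

```go
package main

import "fmt"

// Origin records how a node entered the graph (assumed values).
type Origin string

const (
	FromSource Origin = "source"
	Inferred   Origin = "inferred"
	Generated  Origin = "generated"
)

// ReviewState is an assumed three-state review lifecycle.
type ReviewState string

const (
	Candidate ReviewState = "candidate"
	Approved  ReviewState = "approved"
	Rejected  ReviewState = "rejected"
)

// Node is a hypothetical ontology node carrying provenance.
type Node struct {
	ID         string
	Origin     Origin
	SourceRefs []string // identifiers of supporting source evidence
	Review     ReviewState
}

// LacksSourceSupport flags inferred or generated nodes with no evidence.
func LacksSourceSupport(n Node) bool {
	return n.Origin != FromSource && len(n.SourceRefs) == 0
}

// CanPromote allows canonical promotion only after explicit approval.
func CanPromote(n Node) bool { return n.Review == Approved }

func main() {
	n := Node{ID: "idempotency", Origin: Inferred, Review: Candidate}
	fmt.Println("lacks source support:", LacksSourceSupport(n))
	fmt.Println("can promote:", CanPromote(n))
}
```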
## Visual Asset Strategy
Use image generation behind a provider abstraction. Product language may call
the desired provider key `gpt-image-v2`, but implementation must confirm the
current OpenAI model identifier and API surface before production wiring.
Generated asset types:
- concept diagrams
- slide-like lesson slices
- interview explanation cards
- worksheet visuals
- analogy images
Each asset should store prompt, source concept, source evidence, model config,
generation time, review state, and usage context.
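
A sketch of how the asset record and the model-verification gate might fit together follows. Everything here is hypothetical: the struct fields simply mirror the list above, and `CheckProductionReady` is an illustrative guard, not a provider API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ImageModelConfig separates the product-level key from the real
// API model identifier, which stays empty until verified.
type ImageModelConfig struct {
	ProviderKey string // product language, e.g. "gpt-image-v2"
	ModelID     string // actual identifier, set after verification
	Verified    bool
}

var ErrModelUnverified = errors.New("image model identifier not verified")

// CheckProductionReady refuses production use of an unverified model.
func (c ImageModelConfig) CheckProductionReady() error {
	if !c.Verified || c.ModelID == "" {
		return ErrModelUnverified
	}
	return nil
}

// AssetRecord mirrors the lineage fields listed in this section.
type AssetRecord struct {
	Prompt       string
	ConceptID    string
	EvidenceIDs  []string
	Model        ImageModelConfig
	GeneratedAt  time.Time
	ReviewState  string // assumed to start as "candidate"
	UsageContext string
}

func main() {
	cfg := ImageModelConfig{ProviderKey: "gpt-image-v2"}
	if err := cfg.CheckProductionReady(); err != nil {
		fmt.Println("blocked:", err)
	}
	rec := AssetRecord{
		Prompt:      "Diagram of a cache-aside read path",
		ConceptID:   "caching",
		EvidenceIDs: []string{"src-12"},
		Model:       cfg,
		GeneratedAt: time.Now(),
		ReviewState: "candidate",
	}
	fmt.Println(rec.ConceptID, rec.ReviewState)
}
```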


@@ -0,0 +1,121 @@
# Engineering Principles
## Purpose
This document defines how the tutor platform should be implemented once coding
begins. The product is expected to grow across web UI, API backend, workflow
orchestration, learner memory, ontology processing, and generated learning
assets. Without explicit constraints, those areas can easily collapse into large
files and overbuilt abstractions.
## Core Rules
### 0. Backend language
The backend is Go.
This decision aligns the service with `agent-farm-go` so workflow orchestration
can become an internal product capability rather than a loosely attached
automation script. Keep the Go service modular and avoid turning the backend
into one large workflow coordinator.
### 1. File size limit
No source file should exceed 600 lines.
When a file approaches the limit, split it by responsibility. Good split points:
- route registration vs handler logic
- handler logic vs service logic
- service orchestration vs domain rules
- domain rules vs persistence adapter
- workflow contract definitions vs workflow execution
- UI page shell vs reusable components
Exemptions:
- generated files
- lockfiles
- vendored files
- external source snapshots
- large fixture data
### 2. SOLID
Apply SOLID pragmatically:
- Single responsibility: learner memory, ontology, grading, progression, and
asset generation should not live in one service.
- Open/closed: add new interview tracks or asset types through data/config and
narrow extension points where practical.
- Liskov substitution: adapters should honor shared contracts without hidden
behavior changes.
- Interface segregation: avoid giant service interfaces.
- Dependency inversion: domain logic should not depend directly on provider SDKs
or database details.
### 3. KISS
Prefer the simplest implementation that proves the current product loop:
```text
question -> answer -> grading -> memory update -> next challenge
```
Do not introduce queues, distributed workers, plugin systems, or complex
multi-agent orchestration until the MVP loop needs them.
### 4. YAGNI
Do not build features only because they may be useful later.
Examples to defer until proven necessary:
- multi-school LMS administration
- marketplace course publishing
- company-specific interview packs
- generalized ontology editor
- multiple image providers
- complex economy systems
- social leaderboards
## Product Module Boundaries
Initial implementation should keep these responsibilities distinct:
- `auth`: users, sessions, identity providers.
- `interview`: questions, rubrics, answers, grading records.
- `learner_memory`: profiles, concept mastery, misconceptions, evidence.
- `ontology`: concepts, prerequisites, source evidence, generated candidates.
- `progression`: readiness maps, challenge ladders, boss questions, streaks.
- `workflows`: typed contracts and calls into `agent-farm-go` / `third-one`.
- `assets`: generated diagrams, lesson slices, prompt lineage, review state.
The names can change with the chosen stack, but the responsibilities should stay
separate.
## Workflow Contracts
LLM workflow outputs should be typed and inspectable. Avoid relying on freeform
assistant prose for product state changes.
First contracts to define:
- `DiagnosticResult`
- `GradedAnswer`
- `MemoryUpdateCandidate`
- `NextChallenge`
- `ReadinessUpdate`
- `OntologyGap`
- `TeachingAssetPrompt`
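
To make "typed and inspectable" concrete, here is a minimal sketch of strict decoding for one contract. The JSON field names and the validation rules are assumptions for illustration; only the contract name `GradedAnswer` comes from the list above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// GradedAnswer is a hypothetical shape for the contract of the same name.
type GradedAnswer struct {
	ConceptID   string   `json:"concept_id"`
	Score       float64  `json:"score"` // assumed range 0..1
	EvidenceIDs []string `json:"evidence_ids"`
}

// decodeGradedAnswer rejects prose, unknown fields, and invalid values.
func decodeGradedAnswer(raw string) (GradedAnswer, error) {
	var g GradedAnswer
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&g); err != nil {
		return GradedAnswer{}, fmt.Errorf("decode graded answer: %w", err)
	}
	if g.ConceptID == "" {
		return GradedAnswer{}, errors.New("missing concept_id")
	}
	if g.Score < 0 || g.Score > 1 {
		return GradedAnswer{}, errors.New("score out of range")
	}
	if len(g.EvidenceIDs) == 0 {
		return GradedAnswer{}, errors.New("missing evidence_ids")
	}
	return g, nil
}

func main() {
	ok := `{"concept_id":"tx-isolation","score":0.7,"evidence_ids":["ans-1"]}`
	g, err := decodeGradedAnswer(ok)
	fmt.Println(g.ConceptID, g.Score, err)

	// Freeform prose never reaches product state.
	_, err = decodeGradedAnswer("Great answer, mostly correct.")
	fmt.Println("prose rejected:", err != nil)
}
```

Rejecting unknown fields is a deliberate choice in this sketch: a workflow that starts emitting extra data fails loudly instead of being silently ignored.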
## Review Checklist
Before considering an implementation slice complete:
- No manually authored source file exceeds 600 lines.
- New behavior maps to an OpenSpec requirement or updates OpenSpec.
- The implementation keeps responsibilities separated.
- The simplest useful design was chosen.
- No future-only abstraction was added.
- Tests or smoke checks prove the touched behavior.


@@ -0,0 +1,166 @@
# Learning Gamification Design
## Source Reference
This note adapts ideas from a game-design markdown summary that the user
attached.
The product should use game design to create healthy learning persistence, not
exploitative compulsion. The goal is a strong achievement loop that makes users
want to return because they feel measurable progress toward interview readiness.
## Design Translation
### Experience before content volume
The source material emphasizes that content is surface and experience is the
core. For this product, a large question bank is not enough. The core experience
must be:
- "I know what I am weak at."
- "The next question is exactly the right challenge."
- "I can feel myself becoming interview-ready."
- "Every session ends with a clear win and a next step."
### Flow and adaptive difficulty
Learning sessions should target a flow band:
- too easy: boredom and low trust
- too hard: shame, avoidance, and churn
- just above current ability: useful struggle and pride
The tutor should adjust difficulty using learner memory:
- lower difficulty after repeated failure
- increase specificity after vague but correct answers
- add time pressure only after concept mastery is stable
- switch from recall to applied scenario questions as mastery rises
- insert recovery questions after a hard miss
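
The adjustment rules above could be expressed as a small pure function. This is a sketch under stated assumptions: the `Signal` and `Adjustment` types are hypothetical, and the threshold of two consecutive misses for "repeated failure" is an illustrative choice, not a product decision.

```go
package main

import "fmt"

// Signal summarizes what learner memory says about one concept.
type Signal struct {
	ConsecutiveMisses int
	VagueButCorrect   bool
	MasteryStable     bool
	HardMiss          bool
}

// Adjustment describes how the next question should change.
type Adjustment struct {
	LevelDelta      int  // -1 easier, 0 same, +1 harder
	AskForSpecifics bool // push for precision after vague answers
	TimePressure    bool // only once mastery is stable
	Recovery        bool // insert a recovery question
}

// adjust applies the rules; repeated failure wins over stability.
func adjust(s Signal) Adjustment {
	var a Adjustment
	switch {
	case s.ConsecutiveMisses >= 2: // assumed threshold
		a.LevelDelta = -1
	case s.MasteryStable:
		a.LevelDelta = 1
	}
	a.AskForSpecifics = s.VagueButCorrect
	a.TimePressure = s.MasteryStable && s.ConsecutiveMisses == 0
	a.Recovery = s.HardMiss
	return a
}

func main() {
	fmt.Printf("%+v\n", adjust(Signal{ConsecutiveMisses: 2}))
	fmt.Printf("%+v\n", adjust(Signal{MasteryStable: true}))
}
```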
### Learning main loop
Use a repeated loop similar to challenge, action, reward:
```text
Readiness goal
-> interview question
-> user answer
-> rubric feedback
-> follow-up or correction
-> memory update
-> visible progress
-> next best challenge
```
This loop should be short enough to complete in 5-10 minutes, with optional
longer sessions composed from multiple loops.
### Expectation curve
Each session needs a visible promise:
- today's target concept
- expected time
- reward or unlock
- interview readiness impact
- next milestone preview
The product should maintain open loops carefully:
- show what the next unlock or milestone is
- avoid creating many unfinished tasks at once
- close each session with a concrete result
### Growth lines
Use two growth lines:
1. Permanent mastery growth
   - concept mastery
   - misconception resolved
   - interview skill badges
   - portfolio of strong answers
2. Seasonal or campaign growth
   - weekly interview sprint
   - target-company prep campaign
   - stack-specific challenge ladder
   - mock interview streak
Permanent growth provides long-term identity. Campaign growth provides freshness
without erasing real learning progress.
## Product Systems
### Readiness Map
A role-specific map that shows concept readiness:
- unknown
- fragile
- improving
- interview-ready
- strong signal
### Challenge Ladder
Each concept gets a ladder:
1. define
2. explain tradeoffs
3. debug a scenario
4. design under constraints
5. answer under interview pressure
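
The ladder maps naturally onto an ordered Go type. The sketch below is illustrative only; the `Rung` type and its method names are assumptions, while the rung labels are taken from the list above.

```go
package main

import "fmt"

// Rung is one step on a concept's challenge ladder.
type Rung int

const (
	Define Rung = iota
	ExplainTradeoffs
	DebugScenario
	DesignUnderConstraints
	InterviewPressure
)

var rungNames = [...]string{
	"define",
	"explain tradeoffs",
	"debug a scenario",
	"design under constraints",
	"answer under interview pressure",
}

// String returns the label used in this document.
func (r Rung) String() string {
	if r < 0 || int(r) >= len(rungNames) {
		return "unknown"
	}
	return rungNames[r]
}

// Next returns the following rung, or false at the top of the ladder.
func (r Rung) Next() (Rung, bool) {
	if r >= InterviewPressure {
		return InterviewPressure, false
	}
	return r + 1, true
}

func main() {
	for r, ok := Define, true; ok; r, ok = r.Next() {
		fmt.Println(int(r)+1, r)
	}
}
```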
### Boss Questions
After a cluster of concepts is stable, the user gets a boss-style integrated
question. Example:
"Design a rate-limited API endpoint with database transactions, cache behavior,
failure handling, and test strategy."
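
A possible unlock check is sketched below. It assumes that "stable" means every concept in the cluster has reached at least the interview-ready state from the Readiness Map; that threshold is an assumption of this example, and the names are hypothetical.

```go
package main

import "fmt"

// Readiness mirrors the Readiness Map states, in ascending order.
type Readiness int

const (
	Unknown Readiness = iota
	Fragile
	Improving
	InterviewReady
	StrongSignal
)

// BossUnlocked reports whether every concept in the cluster is at
// least InterviewReady. An empty cluster never unlocks a boss question.
func BossUnlocked(cluster map[string]Readiness) bool {
	if len(cluster) == 0 {
		return false
	}
	for _, r := range cluster {
		if r < InterviewReady {
			return false
		}
	}
	return true
}

func main() {
	cluster := map[string]Readiness{
		"rate-limiting": InterviewReady,
		"transactions":  StrongSignal,
		"caching":       Improving,
	}
	fmt.Println("boss unlocked:", BossUnlocked(cluster))
}
```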
### Reward Types
Prefer meaningful rewards:
- readiness percentage
- concept unlocks
- strong answer saved to portfolio
- mock interview token
- new scenario type unlocked
- visual certificate for a completed track
- generated review card or diagram
Avoid rewards that are disconnected from learning value.
### Session Ending
Every session should end with a strong closure:
- one thing improved
- one misconception discovered or resolved
- one recommended next step
- one visible progress change
## Safety Rules
- Do not use gambling-like random rewards as the primary motivator.
- Do not punish users for missing a day.
- Do not hide progress behind manipulative scarcity.
- Do not optimize only for time-on-site.
- Do not create shame-based leaderboards.
- Prefer mastery, autonomy, competence, and readiness over compulsion.
## MVP Gamification Features
- Daily 10-minute interview loop.
- Role readiness map.
- Concept challenge ladder.
- Streak with grace days, not punishment.
- Boss question after each concept cluster.
- Strong-answer portfolio.
- Session-end progress summary.
- Review campaign for interview date countdown.
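
One reading of "streak with grace days, not punishment" is sketched here. The policy details are assumptions: missed days are absorbed by a grace budget, and when the budget runs out a new run starts while the best streak is kept. The `Streak` type and `RecordPractice` method are hypothetical.

```go
package main

import "fmt"

// Streak tracks the current run, the best run, and remaining grace days.
type Streak struct {
	Current   int
	Best      int
	GraceLeft int
}

// RecordPractice updates the streak for a session that happens
// gapDays after the previous one (1 means the next calendar day).
func (s Streak) RecordPractice(gapDays int) Streak {
	if gapDays <= 0 {
		return s // same day: no change
	}
	missed := gapDays - 1
	if missed <= s.GraceLeft {
		s.GraceLeft -= missed // grace absorbs the missed days
		s.Current++
	} else {
		s.Current = 1 // new run starts; Best is preserved
	}
	if s.Current > s.Best {
		s.Best = s.Current
	}
	return s
}

func main() {
	s := Streak{Current: 3, Best: 3, GraceLeft: 1}
	s = s.RecordPractice(2) // one missed day, covered by grace
	fmt.Printf("%+v\n", s)
	s = s.RecordPractice(3) // two missed days, no grace left
	fmt.Printf("%+v\n", s)
}
```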

269
docs/planning/PRD.md Normal file

@@ -0,0 +1,269 @@
# Tutor Platform PRD
## Product Thesis
Build a web service that helps software job seekers prepare for technical
interviews through adaptive tutoring, interview-question practice, and a
student-specific learning memory. The first market is developers preparing for
employment or career transition. The platform should later expand to general
students by reusing the same curriculum, ontology, assessment, and tutoring
workflow foundations.
This is not a classic RAG chatbot. Source materials are ingested as evidence,
analyzed into a learning ontology, checked for missing or weak areas, verified,
and then transformed into structured learning material, practice questions, and
teaching aids.
## Target Users
### Primary: software job seekers
- Bootcamp graduates preparing for interviews.
- Junior developers preparing for first jobs.
- Developers changing stacks.
- Experienced developers preparing for system design, backend, frontend, data,
AI, DevOps, or language-specific interviews.
### Secondary: general students, later
- Students who need adaptive study plans.
- Teachers or parents who need progress summaries.
- Institutions that want a private learning memory and curriculum engine.
## Problem
Job seekers have abundant content but weak feedback loops:
- They do not know which concepts they truly understand.
- They memorize interview answers without building transferable understanding.
- Existing tools give generic questions, not diagnosis-based practice.
- Study materials are fragmented across notes, PDFs, slides, videos, blogs, and
repositories.
- Progress is hard to measure across concepts, mistakes, and repeated sessions.
## Product Goals
- Provide interview-first tutoring for software job seekers.
- Build a durable learner model from every practice answer and tutor session.
- Convert uploaded materials into a concept ontology and verified study assets.
- Detect missing, weak, outdated, or unverified parts of a learning corpus.
- Generate practice questions, explanations, review plans, and visual teaching
aids from the verified ontology.
- Use game-inspired progress loops to make learning feel rewarding, repeatable,
and visibly connected to interview readiness.
- Keep the architecture reusable for future general-student learning flows.
## Technology Direction
- Backend: Go.
- Workflow substrate: internalize `agent-farm-go` patterns and execution
contracts into the backend boundary instead of treating workflow execution as
a loose external script.
- LLM kernel: `third-one`, defaulting to `deepseek-v4-flash` through runtime
configuration.
- Frontend: web-first. The exact frontend stack remains open until the first UI
implementation slice, but it should stay lightweight and product-focused.
## Non-Goals for MVP
- Full school LMS replacement.
- Marketplace for courses.
- Automatic certification or hiring decisions.
- Broad multi-subject K-12 coverage.
- Unverified autonomous content publishing.
## MVP Scope
The first MVP should prove one loop:
1. User chooses a target role and stack.
2. Platform runs a diagnostic technical interview.
3. Tutor asks follow-up questions based on weak answers.
4. System extracts concept mastery, misconceptions, and evidence.
5. User receives a focused review plan.
6. User repeats practice with generated interview questions.
7. User sees visible readiness progress, next unlocks, and a recommended next
challenge.
Recommended first track:
- Backend developer interview preparation.
- Topics: HTTP, REST, databases, transactions, caching, concurrency, testing,
system design basics, Go or JavaScript/TypeScript depending on first content
corpus.
## Core User Flows
### Diagnostic interview
The user selects role, stack, target company type, and interview date. The
system asks a short series of adaptive questions, grades answers, identifies
weak concepts, and creates an initial study map.
### Practice session
The tutor asks one interview question at a time, requests the user's answer,
grades it against a rubric, asks follow-ups, and records learning evidence.
### Review plan
After each session, the system creates a concise plan:
- concepts to review
- mistakes to fix
- next practice questions
- suggested study order
- estimated readiness
### Gamified learning routine
The user follows a short loop:
1. choose or accept today's target
2. answer one interview question
3. receive rubric feedback
4. handle one follow-up or correction
5. see memory and readiness progress
6. unlock the next challenge or review card
The loop should feel like a game challenge ladder, but its rewards must be tied
to real learning evidence. The product should favor mastery, autonomy, and
readiness over empty points or exploitative streak pressure.
### Material ingestion
The user or operator uploads PDFs, notes, slides, docs, links, code snippets, or
existing interview-question sets. The system analyzes them into a concept graph,
detects missing prerequisites, flags weak evidence, and proposes generated
study assets.
### Teaching-aid generation
For concepts that need visual explanation, the system generates images,
slide-like lesson slices, diagrams, and worksheet-style teaching aids through
the configured image generation provider.
## Functional Requirements
### Interview question engine
- The system SHALL generate role-specific technical interview questions.
- The system SHALL support difficulty levels and follow-up questions.
- The system SHALL grade answers with rubric-backed evidence.
- The system SHALL separate factual correctness, depth, communication clarity,
and production judgment.
- The system SHALL keep original user answers as evidence for later review.
### Learner memory
- The system SHALL maintain a per-user learner profile.
- The system SHALL track concept mastery over time.
- The system SHALL track recurring misconceptions and weak reasoning patterns.
- The system SHALL store evidence for every memory update.
- The system SHALL distinguish durable memory from temporary session context.
### Ontology builder
- The system SHALL ingest source materials into a learning ontology.
- The system SHALL represent concepts, prerequisites, examples, questions,
rubrics, and source evidence as separate entities.
- The system SHALL detect missing prerequisite concepts.
- The system SHALL flag generated or inferred content that lacks source support.
- The system SHALL support human review before promoted learning assets become
canonical.
### Tutor workflows
- The system SHALL run tutoring behavior through configurable LLM workflows.
- The system SHALL use `agent-farm-go` as the workflow orchestration substrate.
- The backend SHALL be implemented in Go so the service can internalize
`agent-farm-go` workflow patterns and contracts directly.
- The system SHALL use `third-one` as the LLM execution kernel.
- The default LLM runtime SHALL target `deepseek-v4-flash` unless changed by
configuration.
- Workflow outputs SHALL prefer typed JSON contracts for grading, memory
extraction, ontology updates, and review-plan generation.
### Visual teaching assets
- The system SHALL generate educational visual assets for selected concepts.
- The system SHALL support slide-like lesson slices, diagrams, worksheets, and
interview explanation cards.
- Image generation SHALL be behind a provider/model configuration key. The
initial product intent is `gpt-image-v2`; implementation must verify the
actual OpenAI model identifier before wiring production calls.
- Generated assets SHALL keep source links, prompt lineage, and review state.
### Engagement and progression
- The system SHALL expose a role-specific readiness map.
- The system SHALL organize concepts into challenge ladders from definition to
pressure-tested interview answers.
- The system SHALL provide short daily or session-based learning loops.
- The system SHALL use adaptive difficulty to keep questions near the user's
current ability.
- The system SHALL provide meaningful rewards such as concept readiness,
strong-answer portfolio entries, boss questions, review cards, or visual
completion assets.
- The system SHALL avoid gambling-like random rewards, shame-based leaderboards,
and punitive streak loss.
## Memory Model
The memory layer should store structured learning state, not just retrieved
text chunks.
Core memory objects:
- `LearnerProfile`: target role, stack, timeline, preferences.
- `ConceptMastery`: concept-level state such as unknown, fragile, improving, or
mastered.
- `Misconception`: recurring wrong model or reasoning pattern.
- `Evidence`: original answer, quiz result, source passage, or tutor note that
supports a memory update.
- `Intervention`: explanation, hint, visual, analogy, or practice type that was
tried and whether it helped.
- `ReviewSchedule`: when and why a concept should be revisited.
MemPalace is useful as a reference for scoped, temporal, evidence-preserving
memory. Graphify is useful as a reference for building queryable knowledge
graphs from mixed materials. The product memory should still be implemented as
an application-owned data model because learner privacy, tenant boundaries,
review states, and deletion policies are product requirements.
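
The requirement that every memory update carries evidence can be enforced at the type boundary. This is a minimal in-memory sketch with hypothetical names and fields; a real implementation would sit on application-owned tables rather than a map.

```go
package main

import (
	"errors"
	"fmt"
)

// MasteryState uses the states named for ConceptMastery above.
type MasteryState string

const (
	StateUnknown   MasteryState = "unknown"
	StateFragile   MasteryState = "fragile"
	StateImproving MasteryState = "improving"
	StateMastered  MasteryState = "mastered"
)

// Evidence points at an answer, quiz result, source passage, or tutor note.
type Evidence struct {
	ID   string
	Kind string
}

// MasteryUpdate is a proposed change to durable memory.
type MasteryUpdate struct {
	ConceptID string
	NewState  MasteryState
	Evidence  []Evidence
}

var ErrNoEvidence = errors.New("memory update has no supporting evidence")

// LearnerMemory is a toy in-memory store for illustration.
type LearnerMemory struct {
	Mastery map[string]MasteryState
	History []MasteryUpdate // kept so every state is explainable
}

func NewLearnerMemory() *LearnerMemory {
	return &LearnerMemory{Mastery: map[string]MasteryState{}}
}

// Apply rejects any update that arrives without evidence.
func (m *LearnerMemory) Apply(u MasteryUpdate) error {
	if len(u.Evidence) == 0 {
		return ErrNoEvidence
	}
	m.Mastery[u.ConceptID] = u.NewState
	m.History = append(m.History, u)
	return nil
}

func main() {
	mem := NewLearnerMemory()
	err := mem.Apply(MasteryUpdate{ConceptID: "caching", NewState: StateImproving})
	fmt.Println("without evidence:", err)
	err = mem.Apply(MasteryUpdate{
		ConceptID: "caching",
		NewState:  StateImproving,
		Evidence:  []Evidence{{ID: "ans-42", Kind: "answer"}},
	})
	fmt.Println("with evidence:", err, mem.Mastery["caching"])
}
```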
## Success Metrics
- A user can complete a diagnostic interview in under 15 minutes.
- The system produces concept weaknesses that match human reviewer judgment.
- Generated follow-up questions target the user's actual weak points.
- The user can see progress across repeated practice sessions.
- Users complete repeated 5-10 minute learning loops without needing manual
planning.
- Readiness progress corresponds to actual graded answer evidence.
- Every durable memory update has inspectable evidence.
- Uploaded material produces a useful concept graph and a list of missing or
weak areas.
- Generated learning assets are reviewable before becoming canonical.
## Risks
- The tutor may overstate correctness or readiness.
- Generated ontology edges may look plausible but lack evidence.
- Job seekers may want company-specific interview prep before the core learning
loop is reliable.
- Image generation model names and API capabilities may change.
- Learner data can become sensitive, especially when expanding to minors or
school contexts.
## Open Questions
- Which stack should be the first interview track: backend Go, backend
Java/Spring, frontend React, or full-stack TypeScript?
- Should users upload their resume first, or should the first session start
with role/stack selection only?
- How much human review is required before generated ontology content becomes
canonical?
- Should teacher/operator review exist in MVP, or only after the job-seeker loop
is proven?
- Which progression surface should ship first: readiness map, challenge ladder,
strong-answer portfolio, or interview-date campaign?


@@ -0,0 +1,74 @@
# Design
## Product Boundary
The first user is a software job seeker. The product should begin with
technical interview preparation because it gives clear task loops:
- ask a question
- receive an answer
- grade against a rubric
- ask follow-ups
- extract memory
- recommend review
- show progress and next challenge
General students remain a future expansion path, but the first requirements
should not be diluted by full K-12 scope.
## Architecture Boundary
Use a Go backend as the product service boundary. Internalize `agent-farm-go`
workflow patterns and contracts inside that backend boundary while keeping agent
behavior configuration-first where possible. Use `third-one` as the LLM
execution kernel with `deepseek-v4-flash` as the default configured model
target.
The product backend owns durable user, learner, memory, ontology, and asset
records. External memory or graph projects may inform design or become adapters,
but they should not own product privacy or tenant semantics.
## Memory Boundary
The platform should not model memory as a flat RAG corpus. It should keep
structured learning state:
- learner profile
- concept mastery
- misconception
- evidence
- intervention
- review schedule
Every durable memory update must include evidence so the product can explain why
it believes a learner is weak or strong on a concept.
## Ontology Boundary
Uploaded materials should become source-backed learning graphs. Inferred gaps
and generated explanations are candidates until reviewed or otherwise validated.
## Visual Asset Boundary
Image generation should support diagrams and slide-like learning slices, but the
asset pipeline must preserve prompt lineage, source concept links, and review
state. The desired image provider key is `gpt-image-v2`, but production
implementation must verify the current OpenAI API model name before wiring.
## Gamification Boundary
Use game-design principles to create healthy persistence: adaptive challenge,
visible growth, clear goals, strong session endings, and long-term readiness
progress. Do not optimize for compulsion alone. Random rewards, punitive streaks,
and shame-based leaderboards are out of scope for the first product baseline.
The initial learning loop is:
- readiness goal
- interview question
- answer
- rubric feedback
- follow-up or correction
- memory update
- visible progress
- next best challenge


@@ -0,0 +1,44 @@
# Bootstrap Job Tutor Platform
## Summary
Create the first product baseline for a web-based AI tutor aimed at software job
seekers. The platform uses workflow-driven technical interview practice,
structured learner memory, and source-backed ontology building from uploaded
learning materials.
## Why
Software job seekers need adaptive practice and evidence-backed feedback more
than another generic interview-question list. A narrow first audience lets the
platform prove diagnosis, tutoring, memory extraction, and review planning
before expanding to broader student use cases.
## Motivation
The first product target should be narrow enough to build and evaluate:
developers preparing for technical interviews. This target naturally exercises
adaptive questioning, answer grading, misconception tracking, review planning,
and material-to-curriculum transformation. The same foundation can later expand
to general students.
## Scope
- Define the job-seeker-first product direction.
- Define learner memory requirements.
- Define ontology ingestion and gap-detection requirements.
- Define workflow boundaries for `agent-farm-go` and `third-one`.
- Define generated visual teaching asset requirements.
## Non-Goals
- Implement the full web service in this change.
- Replace a school LMS.
- Build a marketplace or certification product.
- Treat generated ontology content as canonical without review.
## Impact
This establishes the planning baseline for future implementation. Future code
changes should trace back to these specs and keep OpenSpec updated as the
product shape evolves.


@@ -0,0 +1,52 @@
# engineering-quality Specification
## ADDED Requirements
### Requirement: Implementation follows bounded file-size and design principles
The system SHALL be implemented with bounded file sizes and pragmatic SOLID,
KISS, and YAGNI principles.
#### Scenario: backend source is Go by default
- **GIVEN** a developer adds backend service code
- **WHEN** no explicit OpenSpec change revisits the stack
- **THEN** the backend code is implemented in Go
- **AND** workflow integration uses typed Go boundaries.
#### Scenario: manually authored source files stay within the file-size limit
- **GIVEN** a developer adds or modifies manually authored source code
- **WHEN** the change is ready for review
- **THEN** no manually authored source file exceeds 600 lines
- **AND** files approaching the limit are split by responsibility.
#### Scenario: domain responsibilities remain separated
- **GIVEN** a feature touches interview grading, learner memory, ontology,
progression, or generated assets
- **WHEN** the implementation is designed
- **THEN** each responsibility remains in a cohesive module or service
- **AND** provider SDKs, storage adapters, and workflow calls do not leak into
unrelated domain logic.
#### Scenario: speculative abstractions are deferred
- **GIVEN** a future feature might need a generic framework, plugin system,
queue, or provider abstraction
- **WHEN** the current MVP slice does not require it
- **THEN** the implementation defers that abstraction
- **AND** uses the simplest design that satisfies the current OpenSpec
requirement.
### Requirement: Workflow state changes use typed contracts
The system SHALL prefer typed contracts for workflow inputs and outputs that
change product state.
#### Scenario: grading output updates learner state
- **GIVEN** a tutor workflow grades a user's answer
- **WHEN** the result can affect memory, readiness, or next-challenge selection
- **THEN** the workflow returns a typed result
- **AND** the product state update does not depend on parsing freeform prose.
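
One way to read this scenario is that state updates consume struct fields only. The sketch below is illustrative: `GradingResult`, `LearnerState`, `applyGrading`, and the 0.2 update step are all assumptions, not contracts defined by this change.

```go
package main

import (
	"errors"
	"fmt"
)

// GradingResult is a hypothetical typed contract returned by a grading workflow.
type GradingResult struct {
	ConceptID  string
	Correct    bool
	EvidenceID string // reference to the graded answer
}

// LearnerState is a hypothetical slice of product state.
type LearnerState struct {
	Mastery  map[string]float64  // assumed range 0.0 to 1.0
	Evidence map[string][]string // concept ID -> evidence IDs
}

// applyGrading updates state from typed fields only; no prose is parsed.
// The 0.2 step size is an illustrative choice, not a value from the spec.
func applyGrading(s *LearnerState, r GradingResult) error {
	if r.ConceptID == "" || r.EvidenceID == "" {
		return errors.New("grading result needs a concept ID and an evidence ID")
	}
	target := 0.0
	if r.Correct {
		target = 1.0
	}
	m := s.Mastery[r.ConceptID]
	s.Mastery[r.ConceptID] = m + 0.2*(target-m)
	s.Evidence[r.ConceptID] = append(s.Evidence[r.ConceptID], r.EvidenceID)
	return nil
}

func main() {
	s := &LearnerState{Mastery: map[string]float64{}, Evidence: map[string][]string{}}
	err := applyGrading(s, GradingResult{ConceptID: "go-channels", Correct: true, EvidenceID: "answer-17"})
	fmt.Println(s.Mastery["go-channels"], s.Evidence["go-channels"], err)
}
```

Rejecting results without an evidence ID keeps every mastery change traceable to a graded answer.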


@@ -0,0 +1,37 @@
# job-seeker-tutor Specification
## ADDED Requirements
### Requirement: Job seeker interview preparation is the first product target
The system SHALL prioritize software job seekers as the first product audience
while preserving an architecture that can later support general students.
#### Scenario: diagnostic session creates an initial study map
- **GIVEN** a user chooses a target software role and technology stack
- **WHEN** the user starts a diagnostic interview
- **THEN** the system asks role-relevant technical questions
- **AND** grades the user's answers against explicit rubrics
- **AND** creates an initial concept weakness map.
#### Scenario: practice session adapts to weak answers
- **GIVEN** a user answers an interview question weakly or incorrectly
- **WHEN** the tutoring workflow evaluates the answer
- **THEN** the system asks a targeted follow-up question or gives a corrective
explanation
- **AND** records the evidence used for that decision.
### Requirement: Review plans are based on learner evidence
The system SHALL generate review plans from graded answers, concept mastery, and
misconception evidence rather than generic topic lists.
#### Scenario: session produces next actions
- **GIVEN** a completed tutoring session
- **WHEN** the review-plan workflow runs
- **THEN** the user receives prioritized concepts, practice questions, and
review timing
- **AND** each recommendation links back to learning evidence.
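
A review plan driven by evidence might be sketched as a filter plus a sort. The types and the ordering rule here (weakest mastery first, then more misconceptions) are assumptions for illustration; the spec does not fix a ranking formula.

```go
package main

import (
	"fmt"
	"sort"
)

// ConceptState is a hypothetical per-concept summary of learner evidence.
type ConceptState struct {
	ConceptID      string
	Mastery        float64 // assumed range 0.0 to 1.0
	Misconceptions int
	EvidenceIDs    []string
}

// Recommendation links a prioritized concept back to its evidence.
type Recommendation struct {
	ConceptID   string
	EvidenceIDs []string
}

// buildReviewPlan skips concepts with no evidence, then orders the rest by
// weakest mastery first, breaking ties by the higher misconception count.
func buildReviewPlan(states []ConceptState) []Recommendation {
	var backed []ConceptState
	for _, s := range states {
		if len(s.EvidenceIDs) > 0 {
			backed = append(backed, s)
		}
	}
	sort.SliceStable(backed, func(i, j int) bool {
		if backed[i].Mastery != backed[j].Mastery {
			return backed[i].Mastery < backed[j].Mastery
		}
		return backed[i].Misconceptions > backed[j].Misconceptions
	})
	plan := make([]Recommendation, 0, len(backed))
	for _, s := range backed {
		plan = append(plan, Recommendation{ConceptID: s.ConceptID, EvidenceIDs: s.EvidenceIDs})
	}
	return plan
}

func main() {
	plan := buildReviewPlan([]ConceptState{
		{ConceptID: "goroutines", Mastery: 0.9, EvidenceIDs: []string{"answer-1"}},
		{ConceptID: "channels", Mastery: 0.2, Misconceptions: 1, EvidenceIDs: []string{"answer-2"}},
		{ConceptID: "generics", Mastery: 0.1}, // no evidence, so not recommended
	})
	for _, r := range plan {
		fmt.Println(r.ConceptID, r.EvidenceIDs)
	}
}
```

Review timing is left out of this sketch because the spec does not yet define a scheduling rule.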


@@ -0,0 +1,33 @@
# learner-memory Specification
## ADDED Requirements
### Requirement: Durable learner memory is structured and evidence-backed
The system SHALL store durable learner memory as structured learning state
rather than an untyped RAG transcript.
#### Scenario: answer grading updates concept mastery
- **GIVEN** a user answers a technical interview question
- **WHEN** the answer is graded
- **THEN** the system may update concept mastery
- **AND** the update includes the original answer or grading evidence.
#### Scenario: repeated mistakes become misconceptions
- **GIVEN** the same reasoning error appears across multiple answers
- **WHEN** the memory extraction workflow runs
- **THEN** the system records or strengthens a misconception entry
- **AND** links it to the supporting answers.
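
The "record or strengthen" behavior can be illustrated with a small in-memory log. This is a sketch under assumptions: `MisconceptionLog`, the error-key labels, and the threshold of two distinct answers are hypothetical.

```go
package main

import "fmt"

// minSupportingAnswers is an assumed threshold for "multiple answers".
const minSupportingAnswers = 2

// Misconception links a reasoning error to the answers that exhibit it.
type Misconception struct {
	ErrorKey  string   // hypothetical label for the reasoning error
	AnswerIDs []string // supporting answers
}

// MisconceptionLog is a hypothetical in-memory store keyed by error label.
type MisconceptionLog struct {
	entries map[string]*Misconception
}

func NewMisconceptionLog() *MisconceptionLog {
	return &MisconceptionLog{entries: make(map[string]*Misconception)}
}

// Observe links answerID to errorKey and returns the number of distinct
// supporting answers. Repeated observations of the same answer are ignored.
func (l *MisconceptionLog) Observe(errorKey, answerID string) int {
	m, ok := l.entries[errorKey]
	if !ok {
		m = &Misconception{ErrorKey: errorKey}
		l.entries[errorKey] = m
	}
	for _, id := range m.AnswerIDs {
		if id == answerID {
			return len(m.AnswerIDs)
		}
	}
	m.AnswerIDs = append(m.AnswerIDs, answerID)
	return len(m.AnswerIDs)
}

// IsMisconception reports whether errorKey has enough distinct support.
func (l *MisconceptionLog) IsMisconception(errorKey string) bool {
	m, ok := l.entries[errorKey]
	return ok && len(m.AnswerIDs) >= minSupportingAnswers
}

func main() {
	log := NewMisconceptionLog()
	log.Observe("confuses-mutex-with-channel", "answer-3")
	log.Observe("confuses-mutex-with-channel", "answer-9")
	fmt.Println(log.IsMisconception("confuses-mutex-with-channel"))
}
```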
### Requirement: Temporary context does not become durable truth automatically
The system SHALL separate session context from durable learner memory.
#### Scenario: inferred memory requires evidence
- **GIVEN** a tutor infers that a learner may misunderstand a concept
- **WHEN** the inference lacks enough evidence
- **THEN** the system records it as tentative or ignores it
- **AND** does not promote it to durable mastery state as confirmed truth.
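
The separation between tentative and durable memory could be expressed as a classification step before any write. The status names and both thresholds below are assumptions chosen for illustration.

```go
package main

import "fmt"

// MemoryStatus is a hypothetical label for how an inference is stored.
type MemoryStatus string

const (
	StatusIgnored   MemoryStatus = "ignored"
	StatusTentative MemoryStatus = "tentative"
	StatusConfirmed MemoryStatus = "confirmed"
)

// Assumed promotion thresholds; the spec does not define concrete values.
const (
	minEvidence   = 2
	minConfidence = 0.8
)

// Inference is a hypothetical tutor inference about a learner.
type Inference struct {
	ConceptID   string
	EvidenceIDs []string
	Confidence  float64 // assumed range 0.0 to 1.0
}

// classify decides how an inference may be stored. Only inferences with
// enough evidence and confidence are eligible for durable, confirmed state.
func classify(inf Inference) MemoryStatus {
	switch {
	case len(inf.EvidenceIDs) == 0:
		return StatusIgnored
	case len(inf.EvidenceIDs) >= minEvidence && inf.Confidence >= minConfidence:
		return StatusConfirmed
	default:
		return StatusTentative
	}
}

func main() {
	fmt.Println(classify(Inference{ConceptID: "closures"}))
	fmt.Println(classify(Inference{ConceptID: "closures", EvidenceIDs: []string{"answer-4"}, Confidence: 0.9}))
	fmt.Println(classify(Inference{ConceptID: "closures", EvidenceIDs: []string{"answer-4", "answer-8"}, Confidence: 0.9}))
}
```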


@@ -0,0 +1,34 @@
# learning-ontology Specification
## ADDED Requirements
### Requirement: Uploaded materials produce a source-backed ontology
The system SHALL analyze uploaded learning materials into concepts,
prerequisites, examples, questions, rubrics, and source evidence.
#### Scenario: material ingestion creates ontology candidates
- **GIVEN** an operator uploads learning materials
- **WHEN** the ingestion workflow completes
- **THEN** the system creates ontology candidate nodes and edges
- **AND** links each supported candidate to source evidence.
#### Scenario: gaps are identified separately from verified content
- **GIVEN** the ontology builder detects a missing prerequisite or weakly
supported concept
- **WHEN** it proposes missing content
- **THEN** the proposed content is marked as generated or inferred
- **AND** it is not treated as canonical until reviewed or validated.
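
A node model that keeps generated gaps apart from verified content might look like the following. This is a sketch: the `Origin` values and the rule that source-backed nodes count as canonical once they carry source references are assumptions, since the spec only states the rule for generated content.

```go
package main

import "fmt"

// Origin records where an ontology node came from (hypothetical values).
type Origin string

const (
	OriginSource    Origin = "source"
	OriginGenerated Origin = "generated"
)

// Node is a hypothetical ontology candidate.
type Node struct {
	ID         string
	Origin     Origin
	SourceRefs []string // references into uploaded materials
	Reviewed   bool
}

// proposeGap creates a generated node for a missing prerequisite. It starts
// unreviewed, so it cannot be canonical yet.
func proposeGap(id string) Node {
	return Node{ID: id, Origin: OriginGenerated}
}

// approve returns a copy of n marked as reviewed.
func approve(n Node) Node {
	n.Reviewed = true
	return n
}

// IsCanonical applies the scenario's rule to generated nodes. Treating
// source-backed nodes with references as canonical is an assumption.
func (n Node) IsCanonical() bool {
	if n.Origin == OriginGenerated {
		return n.Reviewed
	}
	return len(n.SourceRefs) > 0
}

func main() {
	gap := proposeGap("pointer-receivers")
	fmt.Println(gap.IsCanonical(), approve(gap).IsCanonical())
}
```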
### Requirement: Generated study assets keep lineage
The system SHALL preserve provenance for generated learning materials.
#### Scenario: visual teaching asset is generated
- **GIVEN** a concept needs a diagram or slide-like explanation
- **WHEN** the asset generation workflow runs
- **THEN** the generated asset stores its prompt, model configuration, source
concept, source evidence, and review state.
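
The lineage fields named in this scenario map naturally onto a record with a completeness check. The Go field names, string types, and review-state values below are assumptions.

```go
package main

import "fmt"

// AssetLineage holds the provenance fields named in the scenario. The Go
// field names and types are assumptions.
type AssetLineage struct {
	Prompt          string
	ModelConfig     string
	SourceConceptID string
	SourceEvidence  []string
	ReviewState     string // e.g. "pending" or "approved" (hypothetical values)
}

// MissingFields lists the lineage fields that are empty, in a fixed order.
func (a AssetLineage) MissingFields() []string {
	var missing []string
	if a.Prompt == "" {
		missing = append(missing, "prompt")
	}
	if a.ModelConfig == "" {
		missing = append(missing, "model_config")
	}
	if a.SourceConceptID == "" {
		missing = append(missing, "source_concept")
	}
	if len(a.SourceEvidence) == 0 {
		missing = append(missing, "source_evidence")
	}
	if a.ReviewState == "" {
		missing = append(missing, "review_state")
	}
	return missing
}

func main() {
	a := AssetLineage{Prompt: "Draw a channel handoff diagram", SourceConceptID: "go-channels"}
	fmt.Println(a.MissingFields())
}
```

An asset store could refuse to persist any asset whose `MissingFields` result is non-empty.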


@@ -0,0 +1,63 @@
# learning-progression Specification
## ADDED Requirements
### Requirement: Learning progression is evidence-backed and game-inspired
The system SHALL use game-inspired progression loops to make study sessions
rewarding while tying progress to learning evidence.
#### Scenario: session loop ends with visible progress
- **GIVEN** a user completes an interview-practice loop
- **WHEN** the system grades the answer and extracts memory
- **THEN** the user sees what improved, what remains weak, and the next
recommended challenge
- **AND** any progress shown is linked to grading or memory evidence.
#### Scenario: adaptive challenge stays near learner ability
- **GIVEN** the learner repeatedly fails a concept at the current difficulty
- **WHEN** the next challenge is selected
- **THEN** the system lowers difficulty, changes explanation strategy, or inserts
a recovery question
- **AND** does not simply continue escalating difficulty.
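
The difficulty rule in this scenario can be sketched as a small pure function. The 1 to 5 scale and the two-failure trigger are assumptions; the sketch covers only the "lower difficulty" option, not a change of explanation strategy or a recovery question.

```go
package main

import "fmt"

// Assumed difficulty scale and failure trigger; not defined by the spec.
const (
	minLevel     = 1
	maxLevel     = 5
	failureLimit = 2
)

// nextDifficulty lowers the level after repeated failures, raises it after a
// success (zero consecutive failures), and otherwise holds it steady.
func nextDifficulty(current, consecutiveFailures int) int {
	next := current
	switch {
	case consecutiveFailures >= failureLimit:
		next = current - 1
	case consecutiveFailures == 0:
		next = current + 1
	}
	if next < minLevel {
		next = minLevel
	}
	if next > maxLevel {
		next = maxLevel
	}
	return next
}

func main() {
	fmt.Println(nextDifficulty(3, 2), nextDifficulty(3, 1), nextDifficulty(3, 0))
}
```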
### Requirement: Progression surfaces support interview readiness
The system SHALL expose progression surfaces that help job seekers understand
interview readiness.
#### Scenario: readiness map summarizes concept state
- **GIVEN** the learner has completed diagnostic or practice sessions
- **WHEN** the readiness map is displayed
- **THEN** each concept has a readiness state
- **AND** the state can be traced to answer evidence, misconception evidence, or
review completion.
#### Scenario: boss question unlocks after prerequisite stability
- **GIVEN** a cluster of prerequisite concepts is stable enough
- **WHEN** the user reaches the next milestone
- **THEN** the system may unlock an integrated boss-style interview question
- **AND** the question combines multiple concepts under realistic constraints.
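
"Stable enough" needs a concrete test before a boss question can unlock. One possible reading, assuming a per-concept mastery score and a caller-supplied threshold (both hypothetical), is:

```go
package main

import "fmt"

// bossUnlocked reports whether every prerequisite concept meets the mastery
// threshold. Concepts absent from the map count as 0.0, and an empty
// prerequisite list never unlocks (an assumption of this sketch).
func bossUnlocked(prereqs []string, mastery map[string]float64, threshold float64) bool {
	if len(prereqs) == 0 {
		return false
	}
	for _, id := range prereqs {
		if mastery[id] < threshold {
			return false
		}
	}
	return true
}

func main() {
	mastery := map[string]float64{"goroutines": 0.9, "channels": 0.85, "select": 0.4}
	fmt.Println(bossUnlocked([]string{"goroutines", "channels"}, mastery, 0.8))
	fmt.Println(bossUnlocked([]string{"goroutines", "select"}, mastery, 0.8))
}
```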
### Requirement: Engagement mechanics avoid exploitative patterns
The system SHALL avoid engagement mechanics that encourage unhealthy compulsion
or shame.
#### Scenario: streaks do not punish missed days
- **GIVEN** a user misses a scheduled practice day
- **WHEN** the user returns
- **THEN** the system helps the user resume with a recovery plan
- **AND** does not erase durable learning progress.
#### Scenario: rewards are connected to learning value
- **GIVEN** the system awards progress, badges, unlocks, or generated assets
- **WHEN** the reward is shown
- **THEN** it reflects mastery, effort, review completion, or interview readiness
- **AND** is not primarily a gambling-like random reward.


@@ -0,0 +1,49 @@
# tutor-workflows Specification
## ADDED Requirements
### Requirement: Tutor behavior is workflow-driven
The system SHALL express core tutor behavior as configurable workflows using
`agent-farm-go` patterns internalized behind the Go backend boundary, with
`third-one` as the LLM execution kernel.
#### Scenario: grading workflow emits typed output
- **GIVEN** a user answer and grading rubric
- **WHEN** the grading workflow runs
- **THEN** it emits a typed grading result
- **AND** includes correctness, depth, communication clarity, evidence, and
follow-up recommendation fields.
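
A typed grading result with the five fields above could be decoded strictly from workflow JSON. This is a sketch: the JSON key names and the 0 to 1 score scale are assumptions, since the typed JSON contracts are still an open task in this change.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// GradingOutput carries the fields named in the scenario. JSON key names and
// the 0..1 score scale are assumptions.
type GradingOutput struct {
	Correctness float64  `json:"correctness"`
	Depth       float64  `json:"depth"`
	Clarity     float64  `json:"communication_clarity"`
	Evidence    []string `json:"evidence"`
	FollowUp    string   `json:"follow_up_recommendation"`
}

// inUnit reports whether v lies in the closed interval [0, 1].
func inUnit(v float64) bool { return v >= 0 && v <= 1 }

// parseGradingOutput rejects unknown fields, out-of-range scores, and results
// without evidence, so freeform prose never reaches product state.
func parseGradingOutput(raw string) (GradingOutput, error) {
	var out GradingOutput
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&out); err != nil {
		return GradingOutput{}, fmt.Errorf("decode grading output: %w", err)
	}
	if !inUnit(out.Correctness) || !inUnit(out.Depth) || !inUnit(out.Clarity) {
		return GradingOutput{}, errors.New("score outside 0..1")
	}
	if len(out.Evidence) == 0 {
		return GradingOutput{}, errors.New("grading output has no evidence")
	}
	return out, nil
}

func main() {
	raw := `{"correctness":0.4,"depth":0.3,"communication_clarity":0.8,
	         "evidence":["answer-17"],"follow_up_recommendation":"ask about buffered channels"}`
	out, err := parseGradingOutput(raw)
	fmt.Printf("%+v %v\n", out, err)
}
```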
#### Scenario: memory extraction workflow emits typed candidates
- **GIVEN** a graded interview answer
- **WHEN** the memory extraction workflow runs
- **THEN** it emits memory update candidates
- **AND** each candidate identifies its evidence and confidence.
### Requirement: Default LLM runtime is configurable
The system SHALL keep the LLM model target configurable while defaulting the
initial planning baseline to `deepseek-v4-flash`.
#### Scenario: workflow invokes the default model
- **GIVEN** no product override is configured
- **WHEN** a tutor workflow invokes the LLM kernel
- **THEN** the workflow uses the default `deepseek-v4-flash` runtime target.
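
Resolving the model target with a fallback could be as small as the sketch below. `resolveModel` and the idea that an override arrives as a plain string are assumptions; only the default name comes from the requirement.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultModel is the planning-baseline default named in the requirement.
const defaultModel = "deepseek-v4-flash"

// resolveModel returns the trimmed override when one is set, otherwise the
// default. Blank or whitespace-only overrides count as unset.
func resolveModel(override string) string {
	if trimmed := strings.TrimSpace(override); trimmed != "" {
		return trimmed
	}
	return defaultModel
}

func main() {
	fmt.Println(resolveModel(""))
	fmt.Println(resolveModel("some-other-model")) // hypothetical override value
}
```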
### Requirement: Backend stack is Go
The system SHALL use Go for the backend service so workflow orchestration,
typed contracts, and learner state updates share one strongly typed service
boundary.
#### Scenario: workflow calls cross typed backend interfaces
- **GIVEN** an API handler needs diagnostic, grading, memory extraction, or
next-challenge behavior
- **WHEN** it invokes the workflow layer
- **THEN** it calls a typed Go interface
- **AND** does not mutate product state by parsing freeform shell output.
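
The typed boundary between handlers and the workflow layer might be sketched as a Go interface. Every name here (`Grader`, `GradeRequest`, `GradeResult`, `nextStep`, the stub, the 0.6 cutoff, and the 5-second timeout) is hypothetical; a real implementation would wrap the workflow runtime behind the same interface.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// GradeRequest and GradeResult are hypothetical typed contracts.
type GradeRequest struct {
	Answer   string
	RubricID string
}

type GradeResult struct {
	Correctness float64 // assumed range 0.0 to 1.0
	FollowUp    string
}

// Grader is the typed boundary an API handler calls instead of shelling out.
type Grader interface {
	Grade(ctx context.Context, req GradeRequest) (GradeResult, error)
}

// stubGrader stands in for a workflow-backed implementation.
type stubGrader struct{}

func (stubGrader) Grade(_ context.Context, req GradeRequest) (GradeResult, error) {
	if req.Answer == "" {
		return GradeResult{}, errors.New("empty answer")
	}
	return GradeResult{Correctness: 0.4, FollowUp: "explain buffered vs unbuffered channels"}, nil
}

// nextStep decides what to do next from typed fields only. The 0.6 cutoff
// and the 5-second timeout are illustrative values.
func nextStep(g Grader, req GradeRequest) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	res, err := g.Grade(ctx, req)
	if err != nil {
		return "", fmt.Errorf("grade answer: %w", err)
	}
	if res.Correctness < 0.6 {
		return res.FollowUp, nil
	}
	return "advance", nil
}

func main() {
	step, err := nextStep(stubGrader{}, GradeRequest{Answer: "Channels are queues.", RubricID: "rubric-1"})
	fmt.Println(step, err)
}
```

Because handlers depend only on the interface, tests can swap in a stub like the one above without running any workflow.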


@@ -0,0 +1,14 @@
# Tasks
- [x] 1. Create initial PRD for job-seeker-first tutoring.
- [x] 2. Create initial architecture note for workflow, LLM, memory, ontology,
and visual assets.
- [x] 3. Add OpenSpec capability specs for the first product baseline.
- [x] 4. Incorporate game-design learning loop and progression planning.
- [x] 5. Add repository engineering principles and code-size constraints.
- [x] 6. Lock Go backend and internalized agent-farm workflow direction.
- [ ] 7. Choose the first interview track and canonical concept seed list.
- [ ] 8. Define typed JSON contracts for diagnostic, grading, memory extraction,
ontology gap detection, and asset prompt generation.
- [ ] 9. Draft the first `agent-farm-go` YAML workflow package.
- [x] 10. Validate the OpenSpec change.

openspec/config.yaml

@@ -0,0 +1,20 @@
schema: spec-driven
# Project context (optional)
# This is shown to AI when creating artifacts.
# Add your tech stack, conventions, style guides, domain knowledge, etc.
# Example:
# context: |
# Tech stack: TypeScript, React, Node.js
# We use conventional commits
# Domain: e-commerce platform
# Per-artifact rules (optional)
# Add custom rules for specific artifacts.
# Example:
# rules:
# proposal:
# - Keep proposals under 500 words
# - Always include a "Non-goals" section
# tasks:
# - Break tasks into chunks of max 2 hours