Commit f6fe8355 authored by whlwhlwhl

Initial LightOp KernelPilot skill pack

---
name: plan-understanding-quiz
description: Analyzes a plan and generates multiple-choice technical comprehension questions to verify user understanding before RLCR loop. Use when validating user readiness for start-rlcr-loop command.
model: opus
tools: Read, Glob, Grep
---
# Plan Understanding Quiz
You are a specialized agent that analyzes an implementation plan and generates targeted multiple-choice technical comprehension questions. Your goal is to test whether the user genuinely understands HOW the plan will be implemented, not just what the plan title says.
## Your Task
When invoked, you will be given the content of a plan file. You need to:
### Analyze the Plan
1. **Read the plan thoroughly** to understand:
- What components, files, or systems are being modified
- What technical approach or mechanism is being used
- How different pieces of the implementation connect together
- What existing patterns or systems the plan builds upon
2. **Explore the repository** to add context:
- Check README.md, CLAUDE.md, or other documentation files
- Look at the directory structure and key files referenced in the plan
- Understand the existing architecture that the plan interacts with
### Generate Multiple-Choice Questions
Create exactly 2 multiple-choice questions that test the user's understanding of the plan's **technical implementation details**. Each question must have exactly 4 options (A through D), with exactly 1 correct answer.
- **QUESTION_1**: Should test whether the user knows what components/systems are being changed and how. Focus on the core technical mechanism or approach.
- **QUESTION_2**: Should test whether the user understands how different parts of the implementation connect, what existing patterns are being followed, or what the key technical constraints are.
**Good question characteristics:**
- Derived from the plan's specific content, not generic templates
- Test understanding of HOW things will be done, not just WHAT the plan describes
- Not too low-level (no exact line numbers, exact syntax, or trivial details)
- A user who has carefully read and understood the plan should pick the correct answer
- A user who just skimmed the title or blindly accepted a generated plan would likely pick wrong
- Wrong options should be plausible (not obviously absurd) but clearly incorrect to someone who read the plan
**Example good questions:**
- "How does this plan integrate the new validation step into the startup flow?" with options covering different integration approaches
- "Which components need to change and why?" with options describing different component sets
**Example bad questions (avoid these):**
- "What is the plan about?" (too vague, tests nothing)
- "What are the risks?" (generic, not about implementation)
- "On which line does function X start?" (too low-level)
### Generate Plan Summary
Write a 2-3 sentence summary explaining what the plan does and how, suitable for educating a user who showed gaps in understanding. Focus on the technical approach, not just the goal.
## Output Format
You MUST output in this exact format, with each field on its own line:
```
QUESTION_1: <your first question>
OPTION_1A: <option A text>
OPTION_1B: <option B text>
OPTION_1C: <option C text>
OPTION_1D: <option D text>
ANSWER_1: <A, B, C, or D>
QUESTION_2: <your second question>
OPTION_2A: <option A text>
OPTION_2B: <option B text>
OPTION_2C: <option C text>
OPTION_2D: <option D text>
ANSWER_2: <A, B, C, or D>
PLAN_SUMMARY: <2-3 sentence technical summary>
```
## Important Notes
- Always output all 13 fields - never skip any
- ANSWER must be exactly one letter: A, B, C, or D
- Randomize the position of the correct answer (do not always put it in A or D)
- The plan may be written in any language - generate questions and options in the same language as the plan
- Focus on substance over format
- If the plan is very short or lacks technical detail, derive questions from whatever implementation hints are available
- Questions should feel like a friendly knowledge check, not an adversarial interrogation
## Example Output
```
QUESTION_1: How does this plan integrate the new validation step into the existing build pipeline?
OPTION_1A: By replacing the existing lint step with a combined lint-and-validate step
OPTION_1B: By adding a new PostToolUse hook that runs between the lint step and the compilation step
OPTION_1C: By modifying the compilation step to include inline validation checks
OPTION_1D: By creating a standalone pre-build script that runs before any other steps
ANSWER_1: B
QUESTION_2: Why does the plan require changes to both the CLI parser and the state file, rather than just the CLI?
OPTION_2A: The state file stores the original CLI arguments for audit logging purposes
OPTION_2B: The CLI parser is deprecated and the state file is the new configuration mechanism
OPTION_2C: The CLI parser adds the flag, the state file persists it across loop iterations, and the stop hook reads it at exit time
OPTION_2D: Both files share a common schema and must always be updated together
ANSWER_2: C
PLAN_SUMMARY: This plan adds a build output validation step by hooking into the PostToolUse lifecycle event. It modifies the hook configuration to insert a format checker between linting and compilation, and updates the state file schema to track validation results across RLCR rounds.
```
---
description: "Cancel active RLCR loop"
allowed-tools: ["Bash(${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh)", "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh --force)", "AskUserQuestion"]
disable-model-invocation: true
---
# Cancel RLCR Loop
To cancel the active loop:
1. Run the cancel script:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh"
```
2. Check the first line of output:
- **NO_LOOP** or **NO_ACTIVE_LOOP**: Say "No active RLCR loop found."
- **CANCELLED**: Report the cancellation message from the output
- **CANCELLED_METHODOLOGY_ANALYSIS**: Report the cancellation message from the output
- **CANCELLED_FINALIZE**: Report the cancellation message from the output
- **FINALIZE_NEEDS_CONFIRM**: The loop is in Finalize Phase. Continue to step 3
3. **If FINALIZE_NEEDS_CONFIRM**:
   - Use AskUserQuestion to confirm cancellation with these options:
     - Question: "The loop is currently in Finalize Phase. After this phase completes, the loop will end without returning to Codex review. Are you sure you want to cancel now?"
     - Header: "Cancel?"
     - Options:
       1. Label: "Yes, cancel now", Description: "Cancel the loop immediately, finalize-state.md will be renamed to cancel-state.md"
       2. Label: "No, let it finish", Description: "Continue with the Finalize Phase, the loop will complete normally"
   - **If user chooses "Yes, cancel now"**:
     - Run: `"${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh" --force`
     - Report the cancellation message from the output
   - **If user chooses "No, let it finish"**:
     - Report: "Understood. The Finalize Phase will continue. Once complete, the loop will end normally."
**Key principle**: The script handles all cancellation logic. A loop is active if `state.md` (normal loop), `methodology-analysis-state.md` (Methodology Analysis Phase), or `finalize-state.md` (Finalize Phase) exists in the newest loop directory.
The loop directory with summaries, review results, and state information will be preserved for reference.
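The dispatch in step 2 can be sketched as a small lookup. This is an illustrative sketch only, not the script's logic: it assumes the status token is the leading run of uppercase letters and underscores on the first output line, and the function name, the returned action labels, and the `unexpected` fallback are all hypothetical.

```python
import re

REPORT_STATUSES = {"CANCELLED", "CANCELLED_METHODOLOGY_ANALYSIS", "CANCELLED_FINALIZE"}


def cancel_action(script_output):
    """Map the first line of the cancel script's output to the next action."""
    lines = script_output.splitlines()
    # Assumption: the status token is the leading A-Z/underscore run on line 1.
    token = re.match(r"[A-Z_]+", lines[0].strip()) if lines else None
    status = token.group(0) if token else ""
    if status in {"NO_LOOP", "NO_ACTIVE_LOOP"}:
        return "say_no_active_loop"
    if status in REPORT_STATUSES:
        return "report_message"
    if status == "FINALIZE_NEEDS_CONFIRM":
        return "ask_user_then_force"
    return "unexpected"  # hypothetical fallback, not defined by the command
```

Matching the whole token rather than a prefix keeps `CANCELLED` from shadowing `CANCELLED_FINALIZE`.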
---
description: "Generate a repo-grounded idea draft via directed-swarm exploration"
argument-hint: "<idea-text-or-path> [--n <int>] [--output <path>]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-idea-io.sh:*)"
- "Read"
- "Glob"
- "Grep"
- "Task"
- "Write"
---
# Generate Idea Draft from Loose Input
Read and execute below with ultrathink.
## Hard Constraint: Draft-Only Output
This command MUST NOT implement features, modify source code, or create commits while producing the draft. Permitted writes are limited to the single output draft file produced in Phase 4; prerequisite directory creation for the default `.humanize/ideas/` path by the validation script is permitted as part of that write. All exploration subagents run read-only.
This command transforms a loose idea into a repo-grounded draft suitable as input to `/humanize:gen-plan`. It applies directed-diversity exploration: a lead picks N orthogonal directions, N parallel `Explore` subagents develop each, the lead synthesizes a draft with one primary direction plus N-1 alternatives. Each direction carries objective evidence from the repo.
## Workflow Overview
> **Sequential Execution Constraint**: All phases MUST execute strictly in order. Each phase fully completes before the next.
0. Parse Input
1. IO Validation
2. Direction Generation
3. Parallel Exploration
4. Synthesis and Write
---
## Phase 0: Parse Input
Extract from `$ARGUMENTS`:
- First positional: inline idea text or path to a `.md` file (required).
- `--n <int>`: number of directions. Default 6.
- `--output <path>`: target draft path. Default resolved by the validation script.
Do not interpret or rewrite the idea text here. Pass `$ARGUMENTS` through to Phase 1 unchanged.
---
## Phase 1: IO Validation
Run:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-idea-io.sh" $ARGUMENTS
```
Handle exit codes:
- `0`: Parse stdout to extract `INPUT_MODE`, `OUTPUT_FILE`, `SLUG`, `TEMPLATE_FILE`, `N` (each appears on its own `KEY: value` line). When `INPUT_MODE` is `file`, stdout additionally contains an `IDEA_BODY_FILE: <path>` line; extract that too. Continue to Phase 2. (`SLUG` is informational — the script has already incorporated it into `OUTPUT_FILE`, so later phases do not need to use `SLUG` directly.)
- `1`: Report "Missing or empty idea input" and stop.
- `2`: Report "Input looks like a file path but is missing, not readable, or not `.md`" and stop.
- `3`: Report "Output directory does not exist — please create it or choose a different path" and stop.
- `4`: Report "Output file already exists — choose a different path" and stop.
- `5`: Report "No write permission to output directory" and stop.
- `6`: Report "Invalid arguments" with the stdout usage text and stop.
- `7`: Report "Template file missing — plugin configuration error" and stop.
Before `VALIDATION_SUCCESS`, stdout may contain one or more lines starting with `WARNING:` (for example, `WARNING: short idea (<N> chars); proceeding` when an inline idea is under 10 characters). Surface these warnings to the user in your final report but continue Phase 2 normally. `WARNING:` lines are informational, not errors.
Obtain the idea body into memory as `IDEA_BODY`, based on `INPUT_MODE`:
- `inline`: stdout contains a sentinel block at the end of the success output; extract all text between the `=== IDEA_BODY_BEGIN ===` and `=== IDEA_BODY_END ===` lines (exclusive). The script emits a trailing newline after the last body line.
- `file`: read the full contents of `IDEA_BODY_FILE` using the `Read` tool.
Preserve byte-identical content in memory for later phases. No on-disk tempfile is created in inline mode — the stdout sentinel block is the authoritative source.
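As a rough illustration of the success-path parsing described above, the sketch below splits stdout into `KEY: value` pairs, `WARNING:` lines, and the inline idea body. The function name and return shape are assumptions. It also assumes the sentinel lines appear exactly as quoted and that stdout uses `\n` line endings.

```python
BEGIN = "=== IDEA_BODY_BEGIN ==="
END = "=== IDEA_BODY_END ==="


def parse_validation_stdout(stdout):
    """Return (values, warnings, idea_body); idea_body is None in file mode."""
    values, warnings, body_lines = {}, [], None
    for line in stdout.split("\n"):
        if body_lines is None and line == BEGIN:
            body_lines = []  # everything until END belongs to the idea body
        elif body_lines is not None and line == END:
            break
        elif body_lines is not None:
            body_lines.append(line)  # keep body text untouched
        elif line.startswith("WARNING:"):
            warnings.append(line)  # informational, surfaced in the final report
        elif ": " in line:
            key, value = line.split(": ", 1)
            values[key] = value
    # The script emits a trailing newline after the last body line.
    idea_body = None if body_lines is None else "\n".join(body_lines) + "\n"
    return values, warnings, idea_body
```

Lines inside the sentinel block are never interpreted as keys, so an idea that happens to contain `KEY: value` text stays intact.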
---
## Phase 2: Direction Generation
Generate exactly `N` orthogonal directions for exploring the idea.
### Context to Gather
Before generating directions, read (paths relative to the project root, which is `$(git rev-parse --show-toplevel)`):
- `README.md` at the project root.
- `CLAUDE.md` at the project root (if it exists).
- `.claude/CLAUDE.md` (if it exists).
- Top-level directory listing via `Glob` with pattern `*` (one level, no recursion).
This context grounds the directions in the actual repo rather than generic brainstorming.
### Generation Rules
Produce exactly `N` direction entries. Each entry has:
- `name`: a 2-5 word short label.
- `rationale`: a single sentence explaining why this angle is distinct from the other directions.
Hard constraint: **orthogonality**. Two near-duplicate directions defeat the directed-diversity premise. Before returning:
- If two directions feel like dupes, replace one with a genuinely different angle.
- If a direction collapses to "just do X better" with no angle distinction, replace it.
- Do not emit directions that merely restate the idea in different words.
### Retry and Degradation
- If the first pass returns fewer than `N` entries, regenerate once with an explicit "you MUST produce `N` orthogonal directions" instruction.
- If the second pass still returns fewer than `N` but at least 2, proceed with the reduced count and emit a warning to the user: `Warning: direction generation returned <count> of <N> requested directions; proceeding with reduced count.`
- If fewer than 2 directions are produced, stop with error: `direction generation degraded; retry.`
Store the final direction list as `DIRECTIONS` (ordered; index 0..len-1).
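The retry-once rule can be pictured as the control flow below. It is a sketch under assumptions: `generate` is a hypothetical callable standing in for the lead's generation pass, and `strict=True` stands in for the explicit "you MUST produce `N` orthogonal directions" instruction.

```python
def generate_directions(generate, n):
    """Run at most two generation passes and apply the degradation rules."""
    directions = generate(n, strict=False)
    if len(directions) < n:
        # Second and final pass with the explicit instruction.
        directions = generate(n, strict=True)
    if len(directions) < 2:
        raise RuntimeError("direction generation degraded; retry.")
    warning = None
    if len(directions) < n:
        warning = (
            f"Warning: direction generation returned {len(directions)} of {n} "
            "requested directions; proceeding with reduced count."
        )
    return directions[:n], warning
```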
---
## Phase 3: Parallel Exploration
Dispatch all directions in a **single Task-tool message** containing one Task invocation per direction. This is the W2S parallel-swarm step.
### Subagent Invocation
For each direction in `DIRECTIONS`, launch one `Explore` subagent. Each invocation prompt MUST include:
1. A verbatim copy of the idea body (`IDEA_BODY`) captured in Phase 1.
2. The assigned direction (name + rationale).
3. The following instruction block (reproduce verbatim in the subagent prompt):
> Explore this direction within the current repo. Gather OBJECTIVE EVIDENCE:
> - Specific repo paths with existing patterns worth extending.
> - Prior art or precedent in the codebase or adjacent tooling.
> - Measurable considerations (approximate complexity, LOC surface, performance implications) where discoverable from reading the code.
>
> Read-only. Do not write any files.
>
> If no concrete evidence exists for this direction, report the literal string `exploratory, no concrete precedent` once in OBJECTIVE_EVIDENCE and stop exploring further. Fabrication of references is forbidden.
>
> Return a structured proposal with exactly these fields:
> - `APPROACH_SUMMARY`: concrete design description (what to build, core mechanism, affected components).
> - `OBJECTIVE_EVIDENCE`: bullet list of repo paths, prior art, or the `exploratory, no concrete precedent` sentinel.
> - `KNOWN_RISKS`: short bullet list.
> - `CONFIDENCE`: one of `high`, `medium`, `low`.
### Collection and Degradation
Collect all subagent responses. For each response:
- Parse the four required fields. If a field is missing, mark that proposal as degraded and drop it.
- If fewer than 2 proposals survive, stop with error: `exploration phase degraded; retry.`
- Otherwise continue with the surviving proposals.
Associate each surviving proposal with its originating direction (so Phase 4 can label it with the original direction name). When numbering alternatives in Phase 4 after any drops, renumber survivors sequentially as Alt-1..Alt-K (where K is the count of surviving non-primary directions). Do not preserve gaps from dropped proposals.
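A minimal sketch of the drop-and-continue rule, assuming each subagent response has already been parsed into a dict keyed by field name (the dict shape and the function name are illustrative, not part of the command):

```python
REQUIRED_FIELDS = ("APPROACH_SUMMARY", "OBJECTIVE_EVIDENCE", "KNOWN_RISKS", "CONFIDENCE")


def collect_proposals(directions, responses):
    """Pair each response with its direction and drop degraded proposals."""
    survivors = []
    for direction, response in zip(directions, responses):
        # A proposal missing (or leaving empty) any required field is dropped.
        if all(response.get(field) for field in REQUIRED_FIELDS):
            survivors.append({"direction": direction, **response})
    if len(survivors) < 2:
        raise RuntimeError("exploration phase degraded; retry.")
    return survivors
```

Because survivors keep their original order, sequential Alt numbering in Phase 4 follows directly from list position.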
---
## Phase 4: Synthesis and Write
### Step 4.1: Pick the Primary Direction
Review all surviving proposals. Choose the strongest as the primary based on:
1. Evidence density — more concrete repo references outranks fewer.
2. Fit with existing repo patterns — extending patterns outranks introducing unfamiliar paradigms.
3. Implementation surface area — prefer smaller surface where quality is otherwise comparable.
4. Declared `CONFIDENCE`: `high` > `medium` > `low` as tiebreaker.
Record the chosen direction as `PRIMARY`; the remaining surviving directions become the Alt-1..Alt-K list (where K is the number of non-primary survivors, K ≤ N-1), numbered sequentially in their original direction order with no gaps for any dropped proposals.
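One way to read the four criteria is as a lexicographic sort key, sketched below. This is an interpretation, not the command's prescribed method: criteria 2 and 3 are judgment calls by the lead, so the `pattern_fit` and `surface_area` scores here are hypothetical fields that default to 0, and evidence density is approximated by counting non-sentinel evidence bullets.

```python
SENTINEL = "exploratory, no concrete precedent"
CONFIDENCE_RANK = {"high": 2, "medium": 1, "low": 0}


def pick_primary(proposals):
    """Return (primary, alternatives) with alternatives labeled Alt-1..Alt-K."""
    def rank(proposal):
        evidence = [e for e in proposal["OBJECTIVE_EVIDENCE"] if e != SENTINEL]
        return (
            len(evidence),                      # 1. evidence density
            proposal.get("pattern_fit", 0),     # 2. hypothetical lead-assigned score
            -proposal.get("surface_area", 0),   # 3. smaller surface ranks higher
            CONFIDENCE_RANK[proposal["CONFIDENCE"]],  # 4. tiebreaker
        )

    primary = max(proposals, key=rank)
    others = [p for p in proposals if p is not primary]
    # Original direction order is preserved, so numbering has no gaps.
    alternatives = [(f"Alt-{i}", p) for i, p in enumerate(others, start=1)]
    return primary, alternatives
```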
### Step 4.2: Infer Title
Generate a 4-10 word Title Case title that captures the primary direction, not the original input phrasing verbatim. Example: idea `add undo/redo` with primary direction `command-pattern history` yields title `Command-Pattern Undo Stack For The Editor`.
### Step 4.3: Populate the Template
Read the template file located at `TEMPLATE_FILE` (from Phase 1 stdout).
Produce the finalized draft content in memory by replacing placeholders:
- `<TITLE>` — the inferred title.
- `<ORIGINAL_IDEA>` — byte-identical value of `IDEA_BODY` captured in Phase 1. Preserve line breaks, trailing newline, and all formatting. Do NOT paraphrase or re-indent.
- `<PRIMARY_NAME>` — primary direction's short name.
- `<PRIMARY_RATIONALE>` — primary direction's rationale (from Phase 2).
- `<PRIMARY_APPROACH_SUMMARY>` — primary proposal's `APPROACH_SUMMARY`.
- `<PRIMARY_OBJECTIVE_EVIDENCE>` — primary proposal's `OBJECTIVE_EVIDENCE`, rendered as a bullet list. If the subagent returned only the literal sentinel `exploratory, no concrete precedent`, render it as a single bullet: `- exploratory, no concrete precedent`.
- `<PRIMARY_KNOWN_RISKS>` — primary proposal's `KNOWN_RISKS`, rendered as a bullet list.
- `<ALTERNATIVES>` — for each non-primary survivor at its Alt index `i` (1-based, sequential per Step 4.1), emit:
```markdown
### Alt-<i>: <name>
- Gist: <one-paragraph summary derived from APPROACH_SUMMARY>
- Objective Evidence:
  - <bullet from OBJECTIVE_EVIDENCE>
  - ...
- Why not primary: <one sentence stating the tradeoff vs PRIMARY>
```
Separate consecutive Alt entries with a single blank line.
- `<SYNTHESIS_NOTES>` — one paragraph describing which elements from the alternatives could fold into the primary if the user chose a different direction. This is the lead's own synthesis note, not a subagent output.
### Step 4.4: Write the Draft File
Write the finalized content to `OUTPUT_FILE` using the `Write` tool. Single write; no progressive edits.
### Step 4.5: Report
Report to the user:
- Path written (`OUTPUT_FILE`).
- Primary direction name.
- Requested `N` and the actual direction count (note if reduced due to degradation).
- Next-step hint: `To turn this draft into a plan, run: /humanize:gen-plan --input <OUTPUT_FILE> --output <plan-path>`.
---
## Error Handling
- Phase 1 validation errors stop the command with a clear message. No partial output.
- Phase 2 degradation follows the retry-once + ≥2 minimum rule stated above.
- Phase 3 degradation follows the drop-and-continue + ≥2 minimum rule stated above.
- Never fabricate repo references or prior art. The `exploratory, no concrete precedent` sentinel from subagents is preserved verbatim in the draft.
- If any phase stops with an error, do not write a partial `OUTPUT_FILE`.
---
description: "Start iterative loop with Codex review"
argument-hint: "[path/to/plan.md | --plan-file path/to/plan.md] [--max N] [--codex-model MODEL:EFFORT] [--codex-timeout SECONDS] [--track-plan-file] [--push-every-round] [--base-branch BRANCH] [--full-review-round N] [--skip-impl] [--claude-answer-codex] [--agent-teams] [--yolo] [--skip-quiz] [--privacy]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh:*)"
- "Read"
- "Task"
- "AskUserQuestion"
---
# Start RLCR Loop
## Plan Compliance Pre-Check
Before running the setup script, validate the plan file for compliance. This is a fool-proofing mechanism that catches obviously wrong plan files early.
**Skip this entire pre-check if** any of these conditions are true:
- `$ARGUMENTS` contains `--skip-impl` (no plan file to validate)
- `$ARGUMENTS` contains `-h` or `--help` (just showing help)
### Extract the plan file path from arguments
Parse `$ARGUMENTS` to find the plan file path:
- If `--plan-file <path>` is present, use `<path>`
- Otherwise, use the first positional argument (the first argument that does not start with `--` and is not a value following a known flag like `--max`, `--codex-model`, `--codex-timeout`, `--base-branch`, `--full-review-round`, `--plan-file`)
- If no plan file path can be determined, skip the pre-check and let the setup script handle the error
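The extraction rules above can be sketched as follows, assuming `$ARGUMENTS` has already been split into a list of tokens. The function name is illustrative, and only the value-taking flags listed above are treated as consuming the next token.

```python
VALUE_FLAGS = {
    "--max", "--codex-model", "--codex-timeout",
    "--base-branch", "--full-review-round", "--plan-file",
}


def extract_plan_path(args):
    """Return the plan file path from tokenized arguments, or None."""
    if "--plan-file" in args:
        index = args.index("--plan-file")
        return args[index + 1] if index + 1 < len(args) else None
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False   # this token is the value of a known flag
        elif arg in VALUE_FLAGS:
            skip_next = True
        elif not arg.startswith("--"):
            return arg          # first true positional argument
    return None
```

A `None` result corresponds to skipping the pre-check and letting the setup script report the error.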
### Basic path safety gate
Only proceed with the pre-check if the extracted path meets ALL of these conditions:
- Is a relative path (does not start with forward slash)
- Does not contain parent directory traversal (double dot path components)
- Contains only safe path characters: letters, digits, hyphen, underscore, dot, and forward slash
If any condition fails, skip the pre-check and let the setup script handle path validation.
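The three conditions translate into a short predicate. The sketch below is illustrative (the function name is an assumption) and treats `..` as traversal only when it is a whole path component.

```python
import re

SAFE_CHARS = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_plan_path(path):
    """True only if the path passes all three safety conditions."""
    if not path or path.startswith("/"):
        return False                      # must be relative
    if ".." in path.split("/"):
        return False                      # no parent-directory components
    return SAFE_CHARS.fullmatch(path) is not None  # safe characters only
```

For example, `docs/plan.md` passes, while `/etc/plan.md`, `../plan.md`, and `docs/my plan.md` do not.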
### Read and validate plan content
1. Use the Read tool to read the plan file. If the file does not exist or cannot be read, skip the pre-check and let the setup script handle the error.
2. Use the Task tool to invoke the `humanize:plan-compliance-checker` agent (sonnet model):
```
Task tool parameters:
- model: "sonnet"
- prompt: Include the plan file content and ask the agent to:
  1. Explore the repository structure (README, CLAUDE.md, main files)
  2. Check if the plan content relates to this repository
  3. Check if the plan contains branch-switching instructions
  4. Return exactly one of: `PASS: <summary>`, `FAIL_RELEVANCE: <reason>`, or `FAIL_BRANCH_SWITCH: <details>`
```
3. **Parse the result** (fail-closed):
- If output contains `PASS`: continue to setup script below
- If output contains `FAIL_RELEVANCE`: report "Plan compliance check failed: the plan does not appear to be related to this repository." Show the reason. **Stop the command.**
- If output contains `FAIL_BRANCH_SWITCH`: report "Plan compliance check failed: the plan contains branch-switching instructions, which are incompatible with RLCR. The RLCR loop requires the working branch to remain constant across all rounds." Show the details. **Stop the command.**
- If output contains none of the above (malformed): report "Plan compliance check produced unexpected output. Cannot proceed." **Stop the command.**
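A fail-closed reading of these rules is sketched below. The function name and the returned action labels are hypothetical. One deliberate choice differs from the bullet order above: failure markers are checked before `PASS`, so output that somehow contains both still stops the command.

```python
def compliance_verdict(agent_output):
    """Classify the compliance checker's output; anything unclear stops."""
    if "FAIL_RELEVANCE" in agent_output:
        return "stop_relevance"
    if "FAIL_BRANCH_SWITCH" in agent_output:
        return "stop_branch_switch"
    if "PASS" in agent_output:
        return "continue"
    return "stop_malformed"  # no recognized marker: cannot proceed
```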
---
## Plan Understanding Quiz
Before running the setup script, verify the user genuinely understands what the plan will do. This is an advisory check -- it never blocks the loop, but catches "wishful thinking" users who blindly accepted a generated plan without reading it.
**Skip this entire quiz if** any of these conditions are true:
- `$ARGUMENTS` contains `--skip-impl` (no plan to quiz about)
- `$ARGUMENTS` contains `--yolo` (user explicitly opted out of all pre-flight checks)
- `$ARGUMENTS` contains `--skip-quiz` (user explicitly opted out of the quiz)
- `$ARGUMENTS` contains `-h` or `--help` (just showing help)
- No plan content is available (the compliance pre-check was skipped because no plan file path could be determined)
### Run the quiz agent
1. Reuse the plan content that was already read during the compliance pre-check above (do not re-read the file).
2. Use the Task tool to invoke the `humanize:plan-understanding-quiz` agent (opus model):
```
Task tool parameters:
- model: "opus"
- prompt: Include the plan file content and ask the agent to:
  1. Explore the repository structure for context
  2. Analyze the plan's technical implementation details
  3. Generate 2 multiple-choice questions (4 options each) and a plan summary
  4. Return in the structured format: QUESTION_1, OPTION_1A-D, ANSWER_1, QUESTION_2, OPTION_2A-D, ANSWER_2, PLAN_SUMMARY
```
3. **Parse the result**: Extract all 13 fields from the agent output (QUESTION_1, OPTION_1A through OPTION_1D, ANSWER_1, QUESTION_2, OPTION_2A through OPTION_2D, ANSWER_2, PLAN_SUMMARY). If the output is malformed (any field missing or ANSWER not A/B/C/D), warn: "Plan understanding quiz unavailable, continuing without it." and proceed to the Setup section below.
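The 13-field check can be sketched as a line-oriented parser. This is a minimal sketch, assuming each field sits on its own `NAME: value` line as the agent's output format requires; the function name is illustrative. A `None` result corresponds to the "quiz unavailable" warning.

```python
import re

FIELDS = (
    ["QUESTION_1"] + [f"OPTION_1{c}" for c in "ABCD"] + ["ANSWER_1"]
    + ["QUESTION_2"] + [f"OPTION_2{c}" for c in "ABCD"] + ["ANSWER_2"]
    + ["PLAN_SUMMARY"]
)


def parse_quiz_output(text):
    """Return a dict of all 13 fields, or None if the output is malformed."""
    found = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z_0-9]+):\s*(.*)$", line.strip())
        if match and match.group(1) in FIELDS:
            found[match.group(1)] = match.group(2).strip()
    if any(not found.get(name) for name in FIELDS):
        return None  # a field is missing or empty
    if any(found[key] not in {"A", "B", "C", "D"} for key in ("ANSWER_1", "ANSWER_2")):
        return None  # ANSWER must be exactly one letter
    return found
```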
### Ask questions and evaluate
4. Use AskUserQuestion to present QUESTION_1 as a multiple-choice question with the 4 options (OPTION_1A through OPTION_1D). Compare the user's choice against ANSWER_1:
- If the user selected the correct answer, mark QUESTION_1 as **PASS**
- Otherwise, mark as **WRONG**
5. Use AskUserQuestion to present QUESTION_2 as a multiple-choice question with the 4 options (OPTION_2A through OPTION_2D). Compare the user's choice against ANSWER_2 using the same criteria.
### Decide whether to proceed
6. **If both questions PASS**: Briefly acknowledge ("Your understanding of the plan looks solid. Proceeding with setup.") and continue to the Setup section below.
7. **If one or both questions are WRONG**: Show the PLAN_SUMMARY to the user to help them understand what the plan does and the correct answers to the questions they missed. Then use AskUserQuestion with the question: "Would you like to proceed with the RLCR loop anyway, or stop and review the plan more carefully first?" with these choices:
- "Proceed with RLCR loop"
- "Stop and review the plan first"
- If the user chooses **"Proceed with RLCR loop"**: Continue to the Setup section below.
- If the user chooses **"Stop and review the plan first"**: Report "Stopping. Please review the plan file and re-run start-rlcr-loop when ready." and **stop the command**.
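Steps 4 through 7 reduce to a small decision function. The sketch below is illustrative: the function name, parameters, and returned labels are assumptions, and `proceed_anyway` models the user's reply to the follow-up question (`None` means it has not been asked yet).

```python
def quiz_next_step(user_answers, correct_answers, proceed_anyway=None):
    """Return 'setup', 'ask_user', or 'stop' for the advisory quiz."""
    all_pass = all(u == c for u, c in zip(user_answers, correct_answers))
    if all_pass:
        return "setup"
    if proceed_anyway is None:
        # Show PLAN_SUMMARY and the missed answers, then ask proceed-or-stop.
        return "ask_user"
    return "setup" if proceed_anyway else "stop"
```

The quiz never stops the command by itself: the only path to `stop` is the user's explicit choice.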
---
## Setup
If the pre-check passed (or was skipped), and the quiz passed (or was skipped or user chose to proceed), execute the setup script to initialize the loop:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" $ARGUMENTS
```
This command starts an iterative development loop where:
1. You execute the implementation plan with task-tag routing
- `coding` tasks: Claude executes directly
- `analyze` tasks: execute via `/humanize:ask-codex`
2. Write a summary of your work to the specified summary file
3. When you try to exit, Codex reviews your summary
4. If Codex finds issues, you receive feedback and continue
5. If Codex outputs "COMPLETE", the loop enters **Review Phase**
6. In Review Phase, `codex review --base <branch>` performs code review
7. If code review finds issues (`[P0-9]` markers), you fix them and continue
8. When no issues are found, the loop ends with a Finalize Phase
## What Is a Round
**One round = the agent believes the entire plan is finished.** A round boundary is when the agent writes a summary and attempts to exit, triggering Codex review. This is the fundamental semantic:
- A round is NOT one task, one milestone, one stage, or one layer of the plan.
- If the plan has multiple stages or milestones, they are all completed within a single round before writing the round summary.
- Intermediate progress checks (e.g., verifying a stage before starting the next) should use manual `ask-codex` calls, not round boundaries.
- Only write `round-N-summary.md` and attempt to exit when you believe ALL tasks in the plan are done.
## Goal Tracker System
This loop uses a **Goal Tracker** to prevent goal drift across iterations:
### Structure
- **IMMUTABLE SECTION**: Ultimate Goal and Acceptance Criteria (set in Round 0, never changed)
- **MUTABLE SECTION**: Active Tasks, Completed Items, Deferred Items, Plan Evolution Log
### Key Features
1. **Acceptance Criteria**: Each task maps to a specific AC - nothing can be "forgotten"
2. **Task Tag Routing**: Every task should carry `coding` or `analyze` tag from plan generation
- `coding -> Claude`, `analyze -> Codex`
3. **Plan Evolution Log**: If you discover the plan needs changes, document the change with justification
4. **Explicit Deferrals**: Deferred tasks require strong justification and impact analysis
5. **Full Alignment Checks**: At configurable intervals (default every 5 rounds: rounds 4, 9, 14, etc.), Codex conducts a comprehensive goal alignment audit. Use `--full-review-round N` to customize (min: 2)
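The round schedule in item 5 is consistent with a simple modulo rule, sketched here. The formula is inferred from the stated example (rounds 4, 9, 14 for the default of 5) with rounds counted from 0; it is not taken from the setup script, and the function name is hypothetical.

```python
def is_full_review_round(round_index, interval=5):
    """True when a 0-based round index falls on a full alignment check."""
    if interval < 2:
        raise ValueError("--full-review-round must be at least 2")
    return (round_index + 1) % interval == 0
```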
### How to Use
1. **Round 0**: Initialize the Goal Tracker with Ultimate Goal and Acceptance Criteria
2. **Each Round**: Update task status, log plan changes, note discovered issues
3. **Before Exit**: Ensure goal-tracker.md reflects current state accurately
## Important Rules
1. **Write summaries**: Always write your work summary to the specified file before exiting
2. **Maintain Goal Tracker**: Keep goal-tracker.md up-to-date with your progress
3. **Be thorough**: Include details about what was implemented, files changed, and tests added
4. **No cheating**: Do not try to exit the loop by editing state files or running cancel commands
5. **Trust the process**: Codex's feedback helps improve the implementation
## BitLesson Workflow (Project Level)
Each project must maintain its own `.humanize/bitlesson.md` file.
If missing, `start-rlcr-loop` initializes it automatically with a strict template.
Per round requirements:
1. Read `.humanize/bitlesson.md` before execution
2. Run `bitlesson-selector` for each task/sub-task
3. Apply selected lesson IDs (or `NONE`) during implementation
4. Include `## BitLesson Delta` in the round summary with `Action: none|add|update`
If a problem is solved only after multiple rounds, add or update a precise lesson entry in `.humanize/bitlesson.md` (specific problem + specific solution).
By default, empty `.humanize/bitlesson.md` does not block `Action: none`; use `--require-bitlesson-entry-for-none` to enforce strict blocking.
## Stopping the Loop
- Reach the maximum iteration count
- Codex confirms completion with "COMPLETE", followed by successful code review (no `[P0-9]` issues)
- User runs `/humanize:cancel-rlcr-loop`
## Two-Phase System
The RLCR loop has two phases within the active loop:
1. **Implementation Phase**: Work by task tags (`coding -> Claude`, `analyze -> /humanize:ask-codex`), then Codex reviews your summary
2. **Review Phase**: After COMPLETE, `codex review` checks code quality with `[P0-9]` severity markers
The `--base-branch` option specifies the base branch for code review comparison. If not provided, it auto-detects from: remote default > local main > local master.
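The detection order can be pictured as a priority chain. This sketch only models the ordering: the function and its parameters are hypothetical, and how the remote default and the local branch list are actually obtained is left to the setup script.

```python
def detect_base_branch(explicit=None, remote_default=None, local_branches=()):
    """Pick the base branch: explicit flag, remote default, main, then master."""
    if explicit:
        return explicit            # --base-branch always wins
    if remote_default:
        return remote_default
    for name in ("main", "master"):
        if name in local_branches:
            return name
    return None                    # nothing detected; caller must handle this
```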
## Skip Implementation Mode
Use `--skip-impl` to skip the implementation phase and go directly to code review:
```bash
/humanize:start-rlcr-loop --skip-impl
```
In this mode:
- Plan file is optional
- No goal tracker initialization needed
- Immediately starts code review when you try to exit
- Useful for reviewing existing changes without an implementation plan
This is helpful when you want to:
- Review code changes made outside of an RLCR loop
- Get code quality feedback on existing work
- Skip the implementation tracking overhead for simple tasks
{
  "description": "Humanize Codex Hooks - Native Stop hooks for RLCR loops",
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "{{HUMANIZE_RUNTIME_ROOT}}/hooks/loop-codex-stop-hook.sh",
            "timeout": 7200,
            "statusMessage": "humanize RLCR stop hook"
          }
        ]
      }
    ]
  }
}
{
  "codex_model": "gpt-5.5",
  "codex_effort": "high",
  "bitlesson_model": "haiku",
  "agent_teams": false,
  "alternative_plan_language": "",
  "gen_plan_mode": "discussion"
}
# Bitter Lesson Workflow
BitLesson is the repository's Bitter Lesson-style knowledge capture system for RLCR rounds.
## Configuration
The selector reads `bitlesson_model` from the merged config hierarchy:
1. `config/default_config.json`
2. `~/.config/humanize/config.json`
3. `.humanize/config.json`
4. CLI flags where applicable
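A layered merge of this kind might look like the sketch below. It assumes that later layers in the list override earlier ones and that unset values are represented as `None`; neither detail is stated in this document, and the function name is hypothetical.

```python
def merge_config(*layers):
    """Merge config dicts in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        # Skip None values so an unset CLI flag does not erase a file setting.
        merged.update({key: value for key, value in layer.items() if value is not None})
    return merged


defaults = {"codex_model": "gpt-5.5", "bitlesson_model": "haiku"}
project = {"bitlesson_model": "sonnet"}
cli = {"codex_model": None}
effective = merge_config(defaults, {}, project, cli)
```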
Provider routing is automatic:
- `gpt-*`, `o[N]-*` (e.g. `o1-*`, `o3-*`, `o4-*`) route to Codex
- `claude-*`, `haiku`, `sonnet`, `opus` route to Claude
If the configured provider binary is missing, the selector falls back to the default Codex model so the loop can still proceed.
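The routing patterns above can be expressed as a small classifier. This is a sketch of the name matching only: the function name is an assumption, it returns `None` for unrecognized names, and it does not model the missing-binary fallback.

```python
import re


def route_provider(model):
    """Return 'codex', 'claude', or None based on the model name pattern."""
    if re.match(r"gpt-|o\d+-", model):
        return "codex"    # gpt-* and o[N]-* names
    if model.startswith("claude-") or model in {"haiku", "sonnet", "opus"}:
        return "claude"
    return None           # unrecognized name; handling is up to the caller
```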
When installing the Humanize runtime into Codex CLI, Humanize writes
`provider_mode: "codex-only"` into that runtime's user config. When that mode
is present, the selector forces BitLesson selection onto the Codex/OpenAI path
before provider resolution, even if an older default such as `haiku` would
otherwise route to Claude. This is not a repository-level limitation: Claude
Code and Kimi installs are supported separately.
## Workflow
Each project keeps its BitLesson knowledge base at `.humanize/bitlesson.md`.
When `start-rlcr-loop` begins:
1. The file is initialized from `templates/bitlesson.md` if it does not already exist
2. Each task or sub-task runs through `scripts/bitlesson-select.sh`
3. The selected lesson IDs are applied during implementation, or `NONE` is recorded when nothing matches
4. The stop gate validates a required `## BitLesson Delta` section in every round summary
## Summary Contract
Required summary shape:
```markdown
## BitLesson Delta
- Action: none|add|update
- Lesson ID(s): <IDs or NONE>
- Notes: <what changed and why>
```
Validation rules are strict:
- `Action: none` must use `Lesson ID(s): NONE` or leave the field empty
- `Action: add` and `Action: update` must reference concrete `BL-YYYYMMDD-short-name` IDs that exist in `.humanize/bitlesson.md`
- `--require-bitlesson-entry-for-none` can be used to block empty knowledge bases from repeatedly reporting `none`
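As an illustration of the first two rules, the check could look roughly like the sketch below. It is not the stop gate's actual implementation: `validate_bitlesson_delta` is a hypothetical name, and the exact character set allowed in the `short-name` part of an ID is an assumption.

```python
import re

# Assumed ID shape: BL-YYYYMMDD-short-name with a lowercase/digit/hyphen name.
LESSON_ID = re.compile(r"BL-\d{8}-[a-z0-9][a-z0-9-]*")


def validate_bitlesson_delta(summary: str, known_ids: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the section passes."""
    section = re.search(
        r"^## BitLesson Delta[ \t]*\n(.*?)(?=^## |\Z)", summary, re.M | re.S
    )
    if section is None:
        return ["missing '## BitLesson Delta' section"]
    # Collect "- Key: value" bullets from the section body.
    fields = dict(re.findall(r"^- ([^:\n]+):[ \t]*(.*)$", section.group(1), re.M))
    action = fields.get("Action", "").strip()
    ids_field = fields.get("Lesson ID(s)", "").strip()
    if action not in {"none", "add", "update"}:
        return [f"unknown action: {action!r}"]
    if action == "none":
        return [] if ids_field in {"", "NONE"} else ["'none' must not list lesson IDs"]
    ids = LESSON_ID.findall(ids_field)
    if not ids:
        return [f"'{action}' needs at least one BL-YYYYMMDD-short-name ID"]
    # add/update must point at entries that exist in the knowledge base.
    return [f"unknown lesson ID: {i}" for i in ids if i not in known_ids]
```

Here `known_ids` stands for the set of IDs parsed from `.humanize/bitlesson.md`; how that file is parsed is left out of the sketch.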
# Draft: Optimize viz-dashboard — Merge into `humanize monitor` as a Web View
## Background
The `feat/viz-dashboard` branch currently introduces a `/humanize:viz` Claude
slash command and a local visualization dashboard for Humanize. While the
dashboard does show some data, the visualization of a *live, dynamically
running RLCR loop* is not clear enough today: status, progress per round, and
streamed log output are hard to follow as a loop progresses.
Separately, Humanize already ships a CLI-side monitoring capability that the
user runs in another terminal (NOT inside Claude Code):
```bash
source <path/to/humanize>/scripts/humanize.sh # or add to .bashrc / .zshrc
humanize monitor rlcr # RLCR loop
humanize monitor skill # All skill invocations (codex + gemini)
humanize monitor codex # Codex invocations only
humanize monitor gemini # Gemini invocations only
```
This monitor capability already captures live state (RLCR rounds, skill / Codex
/ Gemini invocations, log output). The web dashboard does not need to invent
its own capture pipeline — it should consume what `humanize monitor` already
provides.
## Goal
Optimize the viz-dashboard branch so that:
1. The dashboard becomes a **web view** layered on top of the existing
`humanize monitor` data sources, rather than an independent capture layer.
2. The dashboard can show **multiple live RLCR loops simultaneously**, with
per-loop status and streamed log output.
3. The entry point moves out of Claude (no more `/humanize:viz` slash command)
and into the `humanize monitor` CLI command, as a new web viewing
subcommand.
4. The new capability targets **online / remote viewing in a browser**, not a
local-only viewer that requires the user to be on the same machine running
Claude.
5. Useful features from the existing viz-dashboard branch, notably
**cross-conversation querying** (browsing past sessions / loops across
different Claude conversations), are preserved.
## Non-goals
- Reimplementing the monitor capture pipeline
(`humanize monitor rlcr/skill/codex/gemini`). The dashboard consumes it; it
does not replace it.
- Continuing to ship `/humanize:viz` as a Claude slash command.
- Adding chart panels or features explicitly removed in commit 1b575fe
("multi-project switcher + restart + remove chart panels").
## Required behaviors
1. **CLI entry point unification**
- Remove `commands/viz.md` and any `/humanize:viz` Claude command surface.
- Add a new `humanize monitor` subcommand (name to be agreed during
planning, e.g. `humanize monitor web` or `humanize monitor dashboard`)
that starts the web dashboard server.
- The other `humanize monitor rlcr|skill|codex|gemini` subcommands must
keep working unchanged (terminal-attached live tail).
2. **Live multi-loop view**
- The web dashboard MUST be able to display 2+ concurrently running RLCR
loops at the same time, each with:
- current status (running, paused, converged, stopped, …)
- current round / phase
- live streamed log output, updated in near real time
3. **Reuse existing monitor data**
- The dashboard MUST source its data from the same files / events that
`humanize monitor rlcr/skill/codex/gemini` already read. It MUST NOT add
a parallel capture mechanism (no new hooks just for the dashboard).
4. **Online / remote-viewable**
- The dashboard MUST be reachable from a browser over the network, not
only via `localhost` on the machine running Claude. Concrete binding /
auth design to be agreed during planning.
5. **Cross-conversation history**
- Cross-conversation querying (browsing past loops from different Claude
conversations / sessions) from the existing viz-dashboard branch MUST be
preserved.
## Branch hygiene
Before implementation begins, the branch `feat/viz-dashboard` MUST be rebased
onto the latest `upstream/dev` (humania-org/humanize). Several relevant changes
have landed on `upstream/dev` after the branch diverged, including:
- `Add ask-gemini skill and tool-filtered monitor subcommands` (introduces the
`humanize monitor skill|codex|gemini` subcommands the dashboard must reuse)
- `Remove PR loop feature entirely` (the viz-dashboard branch still references
PR-loop concepts via `commands/cancel-pr-loop.md`, `commands/start-pr-loop.md`,
`hooks/pr-loop-stop-hook.sh`)
- Multiple monitor / hook fixes
The rebase is therefore both a precondition for correctness (the dashboard
consumes the new monitor subcommands) and a cleanup step (PR-loop references
must be dropped).
## Out of scope (for this plan)
- Changes to RLCR semantics, hooks, or skill behavior.
- Authentication providers, identity systems, or multi-user account models —
basic remote-access protection is in scope, but full IAM is not.
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 750 370" width="750" height="370">
<!-- Arrows (behind nodes) -->
<!-- Plan -> Implement -->
<line x1="155" y1="160" x2="225" y2="160" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Implement -> Review Summary -->
<line x1="375" y1="160" x2="445" y2="160" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Review Summary -> Code Review (COMPLETE) -->
<line x1="510" y1="190" x2="510" y2="240" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Code Review -> Done (No Issues) -->
<line x1="570" y1="275" x2="675" y2="330" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Feedback loop: Review Summary -> Implement (curved back, top) -->
<path d="M 510 135 C 510 70, 300 70, 300 135" fill="none" stroke="#e07020" stroke-width="2" stroke-dasharray="6,3" marker-end="url(#arrowOrange)"/>
<!-- Issues loop: Code Review -> Implement bottom-right corner -->
<path d="M 450 275 C 425 245, 400 215, 370 195" fill="none" stroke="#9050c0" stroke-width="2" stroke-dasharray="6,3" marker-end="url(#arrowPurple)"/>
<!-- Arrow markers -->
<defs>
<marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#888"/>
</marker>
<marker id="arrowOrange" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#e07020"/>
</marker>
<marker id="arrowPurple" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#9050c0"/>
</marker>
</defs>
<!-- Node: Your Plan -->
<rect x="30" y="130" width="120" height="60" rx="10" fill="#dbeafe" stroke="#3b82f6" stroke-width="2"/>
<text x="90" y="155" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="14" font-weight="bold">Your Plan</text>
<text x="90" y="175" text-anchor="middle" fill="#555" font-family="sans-serif" font-size="11">(plan.md)</text>
<!-- Node: Implement & Summarize (with Swarm Mode) -->
<!-- Swarm Mode: stacked instances behind main node -->
<rect x="238" y="122" width="140" height="60" rx="10" fill="#fff7ed" stroke="#f97316" stroke-width="1.5" stroke-dasharray="4,2"/>
<rect x="234" y="126" width="140" height="60" rx="10" fill="#ffedd5" stroke="#f97316" stroke-width="1.5" stroke-dasharray="4,2"/>
<!-- Main Implement node (front) -->
<rect x="230" y="130" width="140" height="60" rx="10" fill="#ffedd5" stroke="#f97316" stroke-width="2"/>
<text x="300" y="150" text-anchor="middle" fill="#b45309" font-family="sans-serif" font-size="11" font-style="italic">Claude:</text>
<text x="300" y="170" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">Working on it!</text>
<!-- Swarm workers (bigger) -->
<line x1="260" y1="190" x2="260" y2="204" stroke="#f97316" stroke-width="1.5"/>
<line x1="300" y1="190" x2="300" y2="204" stroke="#f97316" stroke-width="1.5"/>
<line x1="340" y1="190" x2="340" y2="204" stroke="#f97316" stroke-width="1.5"/>
<circle cx="260" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<circle cx="300" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<circle cx="340" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<text x="260" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T1</text>
<text x="300" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T2</text>
<text x="340" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T3</text>
<text x="300" y="248" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-style="italic">Army of Swarm</text>
<!-- Node: Review Summary -->
<rect x="450" y="130" width="120" height="60" rx="10" fill="#f3e8ff" stroke="#a855f7" stroke-width="2"/>
<text x="510" y="150" text-anchor="middle" fill="#7c3aed" font-family="sans-serif" font-size="11" font-style="italic">Codex:</text>
<text x="510" y="170" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">You finished?</text>
<!-- Node: Code Review -->
<rect x="450" y="245" width="120" height="60" rx="10" fill="#f3e8ff" stroke="#a855f7" stroke-width="2"/>
<text x="510" y="268" text-anchor="middle" fill="#7c3aed" font-family="sans-serif" font-size="11" font-style="italic">Codex:</text>
<text x="510" y="288" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">Your work good?</text>
<!-- Node: Done -->
<circle cx="700" cy="330" r="25" fill="#dcfce7" stroke="#22c55e" stroke-width="2"/>
<text x="700" y="335" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="14" font-weight="bold">Done</text>
<!-- Labels on arrows -->
<text x="400" y="80" text-anchor="middle" fill="#e07020" font-family="sans-serif" font-size="11" font-style="italic">No! Your work not finished!</text>
<text x="525" y="222" text-anchor="start" fill="#888" font-family="sans-serif" font-size="11">COMPLETE</text>
<text x="388" y="236" text-anchor="start" fill="#9050c0" font-family="sans-serif" font-size="11" font-style="italic">No! Bug found!</text>
<text x="625" y="295" text-anchor="start" fill="#888" font-family="sans-serif" font-size="11">No Issues</text>
</svg>
# Install LightOp KernelPilot Humanize for Claude Code
## Prerequisites
- [codex](https://github.com/openai/codex) -- OpenAI Codex CLI (for review). Verify with `codex --version`.
- `jq` -- JSON processor. Verify with `jq --version`.
- `git` -- Git version control. Verify with `git --version`.
## Option 1: LightOp KernelPilot Marketplace (Recommended)
Clone KernelPilot, add the repository root as a Claude Code marketplace, install
the Humanize plugin, and expose the LightOp/DCU knowledge base as a Claude Code
skill:
```bash
git clone https://github.com/BBuf/kernel-pilot.git
cd kernel-pilot
humanize/scripts/install-skills-claude.sh
```
The installer performs the marketplace install, links `lightop-kernel-knowledge`,
installs the query dependency, hydrates Claude Code's installed skill cache with
absolute `HUMANIZE_RUNTIME_ROOT` and `KERNELPILOT_ROOT` paths, and fails if
either placeholder remains. Use the wrapper after manual plugin updates too,
because Claude Code does not hydrate `SKILL.md` placeholders during
`plugin install`.
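The hydrate-then-verify step can be pictured with the sketch below. It assumes the `{{NAME}}` placeholder syntax seen for `{{HUMANIZE_RUNTIME_ROOT}}` also applies to `KERNELPILOT_ROOT`, and it rejects any leftover `{{UPPER_CASE}}` token rather than only those two. `hydrate` is a hypothetical helper, not the installer's code.

```python
import re

# Matches {{HUMANIZE_RUNTIME_ROOT}}-style placeholders (assumed syntax).
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def hydrate(text: str, roots: dict[str, str]) -> str:
    """Substitute known placeholders and fail if any placeholder remains."""
    hydrated = PLACEHOLDER.sub(
        lambda match: roots.get(match.group(1), match.group(0)), text
    )
    leftover = sorted(set(PLACEHOLDER.findall(hydrated)))
    if leftover:
        raise ValueError(f"unhydrated placeholders: {', '.join(leftover)}")
    return hydrated
```

Failing loudly on a leftover placeholder is what makes a half-hydrated `SKILL.md` visible at install time instead of at the first skill invocation.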
Manual equivalent:
```bash
claude plugin marketplace add ./
claude plugin install humanize@KernelPilot
mkdir -p ~/.claude/skills
ln -s "$PWD/knowledge" ~/.claude/skills/lightop-kernel-knowledge
python3 -m pip install -r knowledge/requirements.txt
humanize/scripts/install-skills-claude.sh --skip-pip
```
Restart Claude Code after installing. If you prefer to run the marketplace
commands inside an existing Claude Code session, the equivalent slash commands
are:
```text
/plugin marketplace add /path/to/kernel-pilot
/plugin install humanize@KernelPilot
```
## Option 2: One-session Local Development
If you have the plugin cloned locally:
```bash
claude --plugin-dir /path/to/kernel-pilot/humanize \
--add-dir /path/to/kernel-pilot
```
This loads the plugin only for that Claude Code session. Add the knowledge skill
separately if you want `lightop-kernel-knowledge` discovery:
```bash
mkdir -p ~/.claude/skills
ln -s /path/to/kernel-pilot/knowledge ~/.claude/skills/lightop-kernel-knowledge
```
## Option 3: Upstream Humanize Only
If you only need generic Humanize RLCR and do not need KernelPilot's kernel
loop or knowledge pack, install the upstream Humanize marketplace instead:
```text
/plugin marketplace add PolyArch/humanize
/plugin install humanize@PolyArch
```
That upstream plugin is useful for general implementation loops, but it does
not provide `lightop-kernel-knowledge` from this repository.
## Verify Installation
After installing the LightOp KernelPilot marketplace, you should see Humanize
commands and the LightOp/DCU skills:
```text
/humanize:start-rlcr-loop
/humanize:gen-plan
/humanize:refine-plan
/humanize:ask-codex
lightop-kernel-agent-loop
lightop-kernel-knowledge
dcu-profiler-report
```
You can also inspect the installed plugin from a shell:
```bash
claude plugin list
claude plugin details humanize@KernelPilot
```
## Monitor Setup (Optional)
Add the monitoring helper to your shell for real-time progress tracking:
```bash
# Add to your .bashrc or .zshrc
source ~/.claude/plugins/cache/KernelPilot/humanize/<LATEST.VERSION>/scripts/humanize.sh
```
Then use:
```bash
humanize monitor rlcr # Monitor RLCR loop
```
## Other Install Guides
- [Install for Codex](install-for-codex.md)
- [Install for Kimi](install-for-kimi.md)
## Next Steps
See the [Usage Guide](usage.md) for detailed command reference and configuration options.
# Install Humanize Skills for Codex
This guide explains how to install Humanize for Codex CLI, including the skill runtime (`$CODEX_HOME/skills`) and the native Codex `Stop` hook (`$CODEX_HOME/hooks.json`).
## Quick Install (Recommended)
One-line install from anywhere:
```bash
tmp_dir="$(mktemp -d)" && git clone --depth 1 https://github.com/PolyArch/humanize.git "$tmp_dir/humanize" && "$tmp_dir/humanize/scripts/install-skills-codex.sh"
```
From the Humanize repo root:
```bash
./scripts/install-skills-codex.sh
```
Or use the unified installer directly:
```bash
./scripts/install-skill.sh --target codex
```
This will:
- Sync `humanize`, `humanize-gen-plan`, `humanize-refine-plan`, and `humanize-rlcr` into `${CODEX_HOME:-~/.codex}/skills`
- Copy runtime dependencies into `${CODEX_HOME:-~/.codex}/skills/humanize`
- Install/update native Humanize Stop hooks in `${CODEX_HOME:-~/.codex}/hooks.json`
- Enable the native `hooks` feature in `${CODEX_HOME:-~/.codex}/config.toml` when `codex` is available
- Seed `~/.config/humanize/config.json` with a Codex/OpenAI `bitlesson_model` when that key is not already set
- Mark that target's runtime config as `provider_mode: "codex-only"` when
using `--target codex`, so helper model routing stays on the Codex/OpenAI
path for that Codex installation.
- Use RLCR defaults: `codex exec` with `gpt-5.5:high`, `codex review` with `gpt-5.5:high`
Requires Codex CLI `0.114.0` or newer for native hooks. The hooks feature was renamed to `hooks`; older Codex builds that still expose `codex_hooks` are not supported by the Codex install path.
## Verify
```bash
ls -la "${CODEX_HOME:-$HOME/.codex}/skills"
```
Expected directories:
- `humanize`
- `humanize-gen-plan`
- `humanize-refine-plan`
- `humanize-rlcr`
Runtime dependencies in `humanize/`:
- `scripts/`
- `hooks/`
- `prompt-template/`
- `templates/`
- `config/`
- `agents/`
Installed files/directories:
- `${CODEX_HOME:-~/.codex}/skills/humanize/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-gen-plan/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-refine-plan/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-rlcr/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize/scripts/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/hooks/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/prompt-template/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/templates/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/config/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/agents/`
- `${CODEX_HOME:-~/.codex}/hooks.json`
- `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` (created or updated only when Humanize config keys are unset)
Verify native hooks:
```bash
codex features list | rg '^hooks\s'
sed -n '1,220p' "${CODEX_HOME:-$HOME/.codex}/hooks.json"
```
Expected:
- `hooks` is present in `codex features list`
- `hooks.json` contains `loop-codex-stop-hook.sh`
- `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` contains `bitlesson_model` set to a Codex/OpenAI model such as `gpt-5.5`
- for `--target codex`, `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json`
also contains `provider_mode: "codex-only"` for that Codex runtime
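The `hooks.json` expectation can also be checked programmatically. The sketch below assumes the nesting used by the Humanize hooks template (`hooks` → `Stop` → groups → `hooks` → entries with `type` and `command`); the function names are hypothetical.

```python
def stop_hook_commands(hooks_config: dict) -> list[str]:
    """Collect the command strings of all command-type Stop hooks."""
    commands = []
    for group in hooks_config.get("hooks", {}).get("Stop", []):
        for hook in group.get("hooks", []):
            if hook.get("type") == "command":
                commands.append(hook.get("command", ""))
    return commands


def has_humanize_stop_hook(hooks_config: dict) -> bool:
    """True when some Stop hook command references loop-codex-stop-hook.sh."""
    return any(
        "loop-codex-stop-hook.sh" in command
        for command in stop_hook_commands(hooks_config)
    )
```

Feeding it the result of `json.load` on `hooks.json` gives a yes/no answer that is easier to script than reading the `sed` output by eye.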
## Optional: Install for Both Codex and Kimi
```bash
./scripts/install-skill.sh --target both
```
## Useful Options
```bash
# Preview without writing
./scripts/install-skills-codex.sh --dry-run
# Custom Codex skills dir
./scripts/install-skills-codex.sh --codex-skills-dir /custom/codex/skills
# Reinstall only the native hooks/config
./scripts/install-codex-hooks.sh
```
## Troubleshooting
If scripts are not found from installed skills:
```bash
ls -la "${CODEX_HOME:-$HOME/.codex}/skills/humanize/scripts"
```
If native exit gating does not trigger:
```bash
codex features enable hooks
sed -n '1,220p' "${CODEX_HOME:-$HOME/.codex}/hooks.json"
```
If the installer reports that your config or installed Codex still uses `codex_hooks`, upgrade Codex first, or edit `${CODEX_HOME:-~/.codex}/config.toml` so that the `[features]` table contains `hooks = true`.
# Install Humanize for Kimi CLI
This guide explains how to install the Humanize skills for [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli).
## Overview
Humanize provides four Agent Skills for Kimi:
| Skill | Type | Purpose |
|-------|------|---------|
| `humanize` | Standard | General guidance for all workflows |
| `humanize-gen-plan` | Flow | Generate structured plan from draft |
| `humanize-refine-plan` | Flow | Refine annotated plan with CMT blocks |
| `humanize-rlcr` | Flow | Iterative development with Codex review |
## Installation
### Quick Install (Recommended)
From the Humanize repo root, run:
```bash
./scripts/install-skills-kimi.sh
```
This command will:
- Sync `humanize`, `humanize-gen-plan`, `humanize-refine-plan`, and `humanize-rlcr` into `~/.config/agents/skills`
- Copy runtime dependencies into `~/.config/agents/skills/humanize`
Common installer script (all targets):
```bash
./scripts/install-skill.sh --target kimi
```
### Manual Install
### 1. Clone or navigate to the humanize repository
```bash
cd /path/to/humanize
```
### 2. Copy skills and runtime bundle to kimi's skills directory
```bash
# Create the skills directory if it doesn't exist
mkdir -p ~/.config/agents/skills
# Copy all four skills
cp -r skills/humanize ~/.config/agents/skills/
cp -r skills/humanize-gen-plan ~/.config/agents/skills/
cp -r skills/humanize-refine-plan ~/.config/agents/skills/
cp -r skills/humanize-rlcr ~/.config/agents/skills/
# Copy runtime dependencies used by the skills
# (must match install-skill.sh's install_runtime_bundle)
cp -r scripts ~/.config/agents/skills/humanize/
cp -r hooks ~/.config/agents/skills/humanize/
cp -r prompt-template ~/.config/agents/skills/humanize/
cp -r templates ~/.config/agents/skills/humanize/
cp -r config ~/.config/agents/skills/humanize/
cp -r agents ~/.config/agents/skills/humanize/
# Hydrate runtime root placeholders inside SKILL.md files
for skill in humanize humanize-gen-plan humanize-refine-plan humanize-rlcr; do
sed -i.bak "s|{{HUMANIZE_RUNTIME_ROOT}}|$HOME/.config/agents/skills/humanize|g" \
"$HOME/.config/agents/skills/$skill/SKILL.md"
done
# Strip user-invocable flag from SKILL.md files for runtime visibility
# (This matches the behavior of scripts/install-skill.sh)
for skill in humanize humanize-gen-plan humanize-refine-plan humanize-rlcr; do
awk '
BEGIN { in_fm = 0; fm_done = 0 }
/^---[[:space:]]*$/ {
if (fm_done == 0) {
in_fm = !in_fm
if (in_fm == 0) {
fm_done = 1
}
}
print
next
}
in_fm && $0 ~ /^user-invocable:[[:space:]]*/ { next }
{ print }
' "$HOME/.config/agents/skills/$skill/SKILL.md" > "$HOME/.config/agents/skills/$skill/SKILL.md.tmp"
mv "$HOME/.config/agents/skills/$skill/SKILL.md.tmp" "$HOME/.config/agents/skills/$skill/SKILL.md"
done
```
### 3. Verify installation
```bash
# List installed skills
ls -la ~/.config/agents/skills/
# Should show:
# humanize/
# humanize-gen-plan/
# humanize-refine-plan/
# humanize-rlcr/
```
### 4. Restart kimi (if already running)
Skills are loaded at startup. Restart kimi to pick up the new skills:
```bash
# Exit current kimi session
/exit
# Or press Ctrl-D
# Start kimi again
kimi
```
## Usage
### List available skills
```bash
/help
```
Look for the "Skills" section in the help output.
### Use the skills
#### 1. Generate plan from draft
```bash
# Start the flow (will ask for input/output paths)
/flow:humanize-gen-plan
# Or load as standard skill
/skill:humanize-gen-plan
```
#### 2. Start RLCR development loop
```bash
# Start with plan file
/flow:humanize-rlcr path/to/plan.md
# With options
/flow:humanize-rlcr path/to/plan.md --max 20 --push-every-round
# Skip implementation, go directly to code review
/flow:humanize-rlcr --skip-impl
# Load as standard skill (no auto-execution)
/skill:humanize-rlcr
```
#### 3. Get general guidance
```bash
/skill:humanize
```
## Command Options
### RLCR Loop Options
| Option | Description | Default |
|--------|-------------|---------|
| `path/to/plan.md` | Plan file path | Required (unless --skip-impl) |
| `--max N` | Maximum iterations | 84 |
| `--codex-model MODEL:EFFORT` | Codex model | gpt-5.5:high |
| `--codex-timeout SECONDS` | Review timeout | 5400 |
| `--base-branch BRANCH` | Base for code review | auto-detect |
| `--full-review-round N` | Full alignment check interval | 5 |
| `--skip-impl` | Skip to code review | false |
| `--push-every-round` | Push after each round | false |
### Generate Plan Options
| Option | Description | Required |
|--------|-------------|----------|
| `--input <path>` | Draft file path | Yes |
| `--output <path>` | Plan output path | Yes |
## Prerequisites
Ensure you have `codex` CLI installed:
```bash
codex --version
```
The skills will use `gpt-5.5` with `high` effort level by default.
## Uninstall
To remove the skills:
```bash
rm -rf ~/.config/agents/skills/humanize
rm -rf ~/.config/agents/skills/humanize-gen-plan
rm -rf ~/.config/agents/skills/humanize-refine-plan
rm -rf ~/.config/agents/skills/humanize-rlcr
```
## Troubleshooting
### Skills not showing up
1. Check the skills directory exists:
```bash
ls ~/.config/agents/skills/
```
2. Ensure SKILL.md files are present:
```bash
head -n 5 ~/.config/agents/skills/humanize/SKILL.md
```
3. Restart kimi completely
### Codex not found
The skills expect `codex` to be in your PATH. If using a proxy, ensure `~/.zprofile` is configured:
```bash
# Add to ~/.zprofile if needed
export OPENAI_API_KEY="your-api-key"
# or other proxy settings
```
### Scripts not found
If skills report missing scripts like `setup-rlcr-loop.sh`, verify:
```bash
ls -la ~/.config/agents/skills/humanize/scripts
```
### Installer options
The installer supports:
```bash
./scripts/install-skill.sh --help
```
Common examples:
```bash
# Preview only
./scripts/install-skills-kimi.sh --dry-run
# Custom skills directory
./scripts/install-skills-kimi.sh --skills-dir /custom/skills/dir
```
### Output files not found
The skills save output to:
- Cache: `~/.cache/humanize/<project>/<timestamp>/`
- Loop data: `.humanize/rlcr/<timestamp>/`
Ensure these directories are writable.
## See Also
- [Kimi CLI Documentation](https://moonshotai.github.io/kimi-cli/)
- [Agent Skills Format](https://agentskills.io/)
- [Install for Codex](./install-for-codex.md)
- [Humanize README](../README.md)
# Streaming Protocol Contract
## Status
Frozen on April 17, 2026. Any change requires a new dated revision section appended below.
## Scope
This contract governs live streaming of RLCR round log files discovered for a single server project under `${XDG_CACHE_HOME:-$HOME/.cache}/humanize/SANITIZED/SID/`, named `round-N-{codex,gemini}-{run,review}.log`, where `SANITIZED` follows the rule implemented in `viz/server/rlcr_sources.py`. Session identity and liveness are derived from `.humanize/rlcr/SID/` metadata, but this contract does not define polling, parsing, or REST retrieval of frontmatter status files, goal-tracker files, round summaries, or review-result files.
## Channel Model
Streams are per-session, per-file. A stream is identified by `GET /api/sessions/SID/logs/FNAME`, where `SID` is the RLCR session id and `FNAME` is the exact cache-log basename such as `round-3-codex-run.log`. Each URL maps to one logical byte stream for one file generation within one session. Multiple sessions MAY be active concurrently, and clients MAY open multiple such channels in parallel.
## Event Shape
The live-log transport is Server-Sent Events. Every SSE frame MUST include `event: TYPE`, `id: N`, and one `data:` line containing exactly one JSON object. `TYPE` MUST equal the JSON `type` field. `id` MUST be a strictly increasing decimal string within the stream. `path` MUST be the canonical `FNAME` for the channel, not an absolute filesystem path. Raw file bytes MUST be base64 encoded into `bytes_b64` with standard RFC 4648 base64 and no line breaks. `offset` is the starting byte offset represented by `bytes_b64`. Payloads are:
- `snapshot` = `{ "type": "snapshot", "path": "...", "offset": 0, "bytes_b64": "...", "eof": false }`
- `append` = `{ "type": "append", "path": "...", "offset": N, "bytes_b64": "..." }`
- `resync` = `{ "type": "resync", "path": "...", "reason": "truncated|rotated|recreated|missing|overflow" }`
- `eof` = `{ "type": "eof", "path": "..." }`
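A minimal sketch of how a server might serialize one `append` frame under these rules. The helper names `append_event` and `sse_frame` are hypothetical; only the field names and the SSE line layout come from the contract.

```python
import base64
import json


def append_event(path: str, offset: int, chunk: bytes) -> dict:
    """Build an append payload; bytes are RFC 4648 base64 without line breaks."""
    return {
        "type": "append",
        "path": path,
        "offset": offset,
        "bytes_b64": base64.b64encode(chunk).decode("ascii"),
    }


def sse_frame(event_id: int, payload: dict) -> str:
    """Serialize one frame: event, id, a single data line, then a blank line."""
    data = json.dumps(payload, separators=(",", ":"))
    # The SSE event name mirrors the JSON "type" field.
    return f"event: {payload['type']}\nid: {event_id}\ndata: {data}\n\n"
```

Keeping the JSON on a single `data:` line avoids multi-line SSE data handling on the client, which matches the one-object-per-frame rule.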
## Truncation and Rotation Resync
The server MUST track the last emitted byte offset for each stream and, on POSIX, MUST also track `(st_dev, st_ino)` for the currently open file. If observed size shrinks below the last known offset, or `(st_dev, st_ino)` changes, or the file disappears, the server MUST emit `resync` and MUST restart the channel at offset `0` with a fresh `snapshot` as soon as the current file generation is readable again.
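The detection rule can be sketched as a pure function over stat observations. Mapping every identity change to `rotated` is an assumption made for brevity, because the contract does not say how `rotated` and `recreated` are told apart. `observe` and `resync_reason` are hypothetical names.

```python
import os
from typing import Optional, Tuple

Observation = Tuple[int, int, int]  # (st_dev, st_ino, st_size)


def observe(path: str) -> Optional[Observation]:
    """Stat the file; None means it has disappeared."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino, st.st_size)


def resync_reason(
    last_offset: int,
    last_identity: Tuple[int, int],
    current: Optional[Observation],
) -> Optional[str]:
    """Return a resync reason, or None when plain appends may continue."""
    if current is None:
        return "missing"
    dev, ino, size = current
    if (dev, ino) != last_identity:
        return "rotated"  # assumption: identity change is reported as rotated
    if size < last_offset:
        return "truncated"
    return None
```

Whenever the result is not `None`, the server would emit `resync` with that reason and restart the channel with a fresh `snapshot` at offset `0`.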
## Snapshot vs Append Semantics
A late-joining client MUST receive `snapshot` first. After that, only `append` events flow until a resync condition fires. Initial snapshots MUST be chunked at a maximum of `64 KiB` raw bytes per event; large files therefore produce multiple ordered `snapshot` events with increasing `offset` values until current EOF. `snapshot.eof=true` MAY be used only when the file is already terminal at snapshot time.
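The chunking rule can be illustrated as a generator over the file content. This sketch assumes the content is already in memory and that an empty file still yields one empty `snapshot`; neither detail is stated by the contract, and `snapshot_events` is a hypothetical name.

```python
import base64
from typing import Iterator

MAX_SNAPSHOT_BYTES = 64 * 1024  # 64 KiB of raw bytes per snapshot event


def snapshot_events(path: str, content: bytes, terminal: bool = False) -> Iterator[dict]:
    """Yield ordered snapshot payloads covering content from offset 0 to EOF."""
    # Assumption: an empty file still produces a single empty snapshot.
    starts = list(range(0, len(content), MAX_SNAPSHOT_BYTES)) or [0]
    for start in starts:
        chunk = content[start:start + MAX_SNAPSHOT_BYTES]
        yield {
            "type": "snapshot",
            "path": path,
            "offset": start,
            "bytes_b64": base64.b64encode(chunk).decode("ascii"),
            # eof=true only on the last chunk of an already-terminal file.
            "eof": terminal and start == starts[-1],
        }
```

A 150,000-byte log would therefore arrive as three `snapshot` events at offsets 0, 65536, and 131072.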
## Transport Mapping
When the server host is not `127.0.0.1`, live logs MUST be delivered only as SSE over HTTPS, and clients MUST authenticate with `?token=BEARER` on the stream URL. In that mode, WebSocket endpoints MUST be disabled or otherwise unreachable. When the server host equals `127.0.0.1`, SSE remains the live-log transport; `flask_sock` WebSocket MAY serve coarse session-level notifications such as `session-list-changed`, but MUST NOT carry per-file append data.
## Reconnect Behavior
On disconnect, the client SHOULD reconnect to the same stream URL and send `Last-Event-Id`. The server MUST retain the last `256` events per stream and MUST replay all events newer than that id when available. If the requested id is older than retained history or invalid for the current file generation, the server MUST recover by emitting `resync` and then a fresh `snapshot` from offset `0`.
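One way to picture the retention and replay rule is a bounded buffer keyed by event id. The sketch below is deliberately conservative: it replays only when the client's last id is itself still retained, and otherwise signals that a `resync` plus fresh `snapshot` is needed. `ReplayBuffer` is a hypothetical class, not the server's.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class ReplayBuffer:
    """Keeps the most recent frames of one stream for Last-Event-Id replay."""

    def __init__(self, capacity: int = 256) -> None:
        self._events: Deque[Tuple[int, str]] = deque(maxlen=capacity)

    def record(self, event_id: int, frame: str) -> None:
        self._events.append((event_id, frame))

    def replay_after(self, last_event_id: str) -> Optional[List[str]]:
        """Frames newer than last_event_id, or None when a resync is required."""
        if not (last_event_id.isascii() and last_event_id.isdigit()):
            return None  # invalid id
        last = int(last_event_id)
        if all(event_id != last for event_id, _ in self._events):
            return None  # older than retained history, or never issued
        return [frame for event_id, frame in self._events if event_id > last]
```

Returning `None` rather than an empty list keeps "nothing missed" (an empty replay) distinct from "history lost" (resync required).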
## Latency Budget
Under nominal load of one project, up to `5` concurrent active sessions, and append rate not exceeding `100 KB/s` per stream, median append-to-render latency MUST be `<= 2.0s`. Tail `p95` latency MUST be `<= 5.0s`. Failure of the median assertion in CI MUST fail the build.
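A CI check for this budget could be sketched as follows, assuming append-to-render latencies have already been collected in seconds (at least two samples). The 95th percentile is taken as the last of the 19 cut points from `statistics.quantiles(..., n=20)`; `latency_report` is a hypothetical helper.

```python
import statistics

MEDIAN_BUDGET_S = 2.0
P95_BUDGET_S = 5.0


def latency_report(samples: list[float]) -> dict:
    """Summarize latencies and compare them with the contract's budgets."""
    median = statistics.median(samples)
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(samples, n=20)[-1]
    return {
        "median": median,
        "p95": p95,
        "median_ok": median <= MEDIAN_BUDGET_S,
        "p95_ok": p95 <= P95_BUDGET_S,
    }
```

A test would then assert `latency_report(samples)["median_ok"]`, so that a median regression fails the build as the contract requires.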
## Backpressure
If a client cannot keep up, the server MAY drop the oldest pending or retained `append` events for that stream, but it MUST emit a final `resync` with reason `overflow` and then provide a fresh `snapshot`. Silent data loss is forbidden.
## Out of Scope
This contract does not define the cancel control channel at `POST /api/sessions/SID/cancel`, project switching, daemon lifecycle, token issuance or validation, coarse session-list events, or any non-log REST payloads. Those surfaces require their own specifications.
## Example Event Stream
```text
event: snapshot
id: 101
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"U3RhcnQK","eof":false}

event: append
id: 102
data: {"type":"append","path":"round-3-codex-run.log","offset":6,"bytes_b64":"TW9yZQo="}

event: append
id: 103
data: {"type":"append","path":"round-3-codex-run.log","offset":11,"bytes_b64":"RGF0YQo="}

event: resync
id: 104
data: {"type":"resync","path":"round-3-codex-run.log","reason":"rotated"}

event: snapshot
id: 105
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"TmV3IGZpbGUK","eof":false}
```
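On the client side, a stream like the example can be folded back into file bytes. The sketch below assumes frames are separated by a blank line, as SSE requires, and treats any offset gap as an error; `parse_sse` and `reassemble` are hypothetical names.

```python
import base64
import json
from typing import Iterable, Iterator


def parse_sse(stream: str) -> Iterator[dict]:
    """Yield the JSON payload of each blank-line-separated SSE frame."""
    for block in stream.strip().split("\n\n"):
        fields = dict(line.split(": ", 1) for line in block.splitlines())
        yield json.loads(fields["data"])


def reassemble(events: Iterable[dict]) -> bytes:
    """Rebuild the current file generation from snapshot/append events."""
    buffer = bytearray()
    for event in events:
        if event["type"] == "resync":
            buffer.clear()  # a fresh snapshot from offset 0 follows
        elif event["type"] in ("snapshot", "append"):
            if event["offset"] != len(buffer):
                raise ValueError(f"offset gap at {event['offset']}")
            buffer += base64.b64decode(event["bytes_b64"])
    return bytes(buffer)
```

Because a `resync` clears the buffer, only the bytes of the newest file generation survive, which is what a log viewer would display after a rotation.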
# Humanize Usage Guide
Detailed usage documentation for the Humanize plugin. For installation, see [Install for Claude Code](install-for-claude.md).
## How It Works
Humanize creates an iterative feedback loop with two phases:
1. **Implementation Phase**: Claude works on your plan, Codex reviews summaries until COMPLETE
2. **Review Phase**: `codex review --base <branch>` checks code quality with `[P0-9]` severity markers
The loop continues until all acceptance criteria are met or no issues remain.
## Begin with the End in Mind
Before the RLCR loop starts any work, Humanize runs a **Plan Understanding Quiz** -- a brief pre-flight check that verifies you genuinely understand the plan you are about to execute.
### Why This Exists
The most expensive failure in AI-assisted development is not a bug. It is running a 40-round RLCR loop on a plan you never actually read. We call this **wishful coding**: treating a generated plan like a wish -- toss it in, hope for the best, check back later.
The problem is structural. An RLCR loop is an amplifier: it will faithfully execute whatever plan you give it. If the plan is wrong, the loop makes it wrong faster and at scale. If the plan is right but you do not understand it, you cannot course-correct when Codex raises questions, and the loop drifts.
Understanding your plan before execution is not optional overhead. It is the single highest-leverage thing you can do to ensure the loop succeeds.
### How the Quiz Works
When you run `start-rlcr-loop`, an independent agent analyzes the plan and generates two multiple-choice questions about the plan's technical implementation details:
1. **What components are changing and how?** -- Tests whether you know the core mechanism.
2. **How do the pieces connect?** -- Tests whether you understand the architecture being modified.
If you answer both correctly, the loop proceeds immediately. If you miss one or both, Humanize explains what the plan actually does and offers a choice: proceed anyway, or stop and review.
The quiz is advisory, not a gate. You always have the option to proceed. But that moment of friction -- the two seconds it takes to read the question and realize you do not know the answer -- is the entire point.
### Skipping the Quiz
- `--skip-quiz` -- Skip the quiz only. The rest of the RLCR loop behaves normally.
- `--yolo` -- Skip the quiz AND let Claude answer Codex's open questions directly (`--claude-answer-codex`). This is full automation mode for users who have already reviewed the plan and want to hand over complete control.
- Plans started via `gen-plan --auto-start-rlcr-if-converged` skip the quiz automatically, because the gen-plan convergence discussion already verified the user's understanding.
## Typical Planning Flow
1. Generate the initial implementation plan:
```bash
/humanize:gen-plan --input draft.md --output docs/plan.md
```
2. If the plan is reviewed with comment annotations, refine it and generate a QA ledger:
```bash
/humanize:refine-plan --input docs/plan.md
```
3. Start the RLCR loop on the refined plan:
```bash
/humanize:start-rlcr-loop docs/plan.md
```
## Commands
| Command | Purpose |
|---------|---------|
| `/start-rlcr-loop <plan.md>` | Start iterative development with Codex review |
| `/cancel-rlcr-loop` | Cancel active loop |
| `/gen-plan --input <draft.md> --output <plan.md>` | Generate structured plan from draft |
| `/refine-plan --input <annotated-plan.md>` | Refine an annotated plan and generate a QA ledger |
| `/ask-codex [question]` | One-shot consultation with Codex |
## Command Reference
### start-rlcr-loop
```
/humanize:start-rlcr-loop [path/to/plan.md | --plan-file path/to/plan.md] [OPTIONS]

OPTIONS:
  --plan-file <path>     Explicit plan file path (alternative to positional arg)
  --max <N>              Maximum iterations before auto-stop (default: 84)
  --codex-model <MODEL:EFFORT>
                         Codex model and reasoning effort (default from config, fallback gpt-5.5:high)
  --codex-timeout <SECONDS>
                         Timeout for each Codex review in seconds (default: 5400)
  --track-plan-file      Indicate plan file should be tracked in git (must be clean)
  --push-every-round     Require git push after each round (default: commits stay local)
  --base-branch <BRANCH> Base branch for code review phase (default: auto-detect)
                         Priority: user input > remote default > main > master
  --full-review-round <N>
                         Interval for Full Alignment Check rounds (default: 5, min: 2)
                         Full Alignment Checks occur at rounds N-1, 2N-1, 3N-1, etc.
  --skip-impl            Skip implementation phase, go directly to code review
                         Plan file is optional when using this flag
  --claude-answer-codex  When Codex finds Open Questions, let Claude answer them
                         directly instead of asking user via AskUserQuestion
  --agent-teams          Enable Claude Code Agent Teams mode for parallel development.
                         Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable.
                         Claude acts as team leader, splitting tasks among team members.
  --yolo                 Skip Plan Understanding Quiz and let Claude answer Codex Open
                         Questions directly. Alias for --skip-quiz --claude-answer-codex.
  --skip-quiz            Skip the Plan Understanding Quiz only (without other changes).
  -h, --help             Show help message
```
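The `--full-review-round` schedule (rounds N-1, 2N-1, 3N-1, ...) reduces to a single modulo check. The sketch below is an assumption about how such a check could look, not the plugin's code; the function name `is_full_alignment_round` is hypothetical.

```python
def is_full_alignment_round(round_number: int, interval: int = 5) -> bool:
    """Return True when round_number is one of N-1, 2N-1, 3N-1, ..."""
    if interval < 2:
        # The documented minimum for --full-review-round is 2.
        raise ValueError("interval must be at least 2")
    return round_number >= 0 and (round_number + 1) % interval == 0
```

With the default interval of 5, the rounds that qualify among the first fifteen (0 through 14) are 4, 9, and 14.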
### gen-plan
```
/humanize:gen-plan --input <path/to/draft.md> --output <path/to/plan.md> [OPTIONS]

OPTIONS:
  --input                Path to the input draft file (required)
  --output               Path to the output plan file (required)
  --auto-start-rlcr-if-converged
                         Start the RLCR loop automatically when the plan is converged
                         (discussion mode only; ignored in --direct)
  --discussion           Use discussion mode (iterative Claude/Codex convergence rounds)
  --direct               Use direct mode (skip convergence rounds, proceed immediately to plan)
  -h, --help             Show help message
```
The gen-plan command transforms rough draft documents into structured implementation plans.
Workflow:
1. Validates input/output paths
2. Checks if draft is relevant to the repository
3. Analyzes draft for clarity, consistency, completeness, and functionality
4. Engages user to resolve any issues found
5. Generates a structured plan.md with acceptance criteria
6. Optionally starts `/humanize:start-rlcr-loop` if `--auto-start-rlcr-if-converged` conditions are met
If reviewers later annotate the generated plan with comment blocks, run
`/humanize:refine-plan --input <plan.md>` before starting or resuming implementation.
### refine-plan
```
/humanize:refine-plan --input <path/to/annotated-plan.md> [OPTIONS]

OPTIONS:
  --input <path>         Path to the annotated plan file (required)
  --output <path>        Path to the refined plan output file
                         Defaults to refining --input in place
  --qa-dir <path>        Directory for QA document output
                         Default: .humanize/plan_qa
  --alt-language <LANG>
                         Generate translated plan and QA variants
                         Supported: zh, ko, ja, es, fr, de, pt, ru, ar
                         Full language names are also accepted; en/English is a no-op
  --discussion           Interactive mode for ambiguous comment classification
  --direct               Non-interactive mode; makes minimal safe assumptions
  -h, --help             Show help message
```
The refine-plan command reads an annotated `gen-plan` document, processes embedded review
comments, removes those comment blocks from the final plan, and writes a QA ledger that records
how each comment was handled.
**Usage examples:**
```bash
# Refine a plan in place and write QA output to the default directory
/humanize:refine-plan --input docs/plan.md
# Write the refined plan to a new file and store QA output in a custom directory
/humanize:refine-plan --input docs/plan.annotated.md --output docs/plan.refined.md --qa-dir docs/plan-qa
# Run in direct mode and generate translated variants
/humanize:refine-plan --input docs/plan.md --direct --alt-language zh
```
**Annotated comment block format:**
`refine-plan` supports three comment formats for reviewer annotations. Both inline
and multi-line comment blocks are supported in all formats:
**Classic format (CMT:/ENDCMT):**
```markdown
Text before CMT: clarify why AC-3 is split here ENDCMT text after
```
```markdown
CMT:
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
ENDCMT
```
**Short tag format (<cmt></cmt>):**
```markdown
Text before <cmt>clarify why AC-3 is split here</cmt> text after
```
```markdown
<cmt>
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
</cmt>
```
**Long tag format (<comment></comment>):**
```markdown
Text before <comment>clarify why AC-3 is split here</comment> text after
```
```markdown
<comment>
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
</comment>
```
Rules:
- At least one non-empty comment block must exist in the input file.
- Comment markers inside fenced code blocks or HTML comments are ignored.
- Empty comment blocks are removed but do not create QA ledger entries.
- The input plan must still follow the `gen-plan` section schema.
- All three formats can be mixed within the same file.
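To make the rules above concrete, here is a minimal regex-based sketch of comment extraction. It is not the `refine-plan` implementation: the names `FENCE`, `HIDDEN`, `COMMENT`, and `extract_comments` are hypothetical, and the sketch only handles triple-backtick fences and simple HTML comments.

```python
import re

FENCE = "`" * 3  # triple backtick, built this way to keep the example fence-safe

# Regions whose contents must be ignored: fenced code blocks and HTML comments.
HIDDEN = re.compile(FENCE + r".*?" + FENCE + r"|<!--.*?-->", re.DOTALL)

# The three documented formats; DOTALL lets one pattern cover inline and multi-line blocks.
COMMENT = re.compile(
    r"CMT:(.*?)ENDCMT|<cmt>(.*?)</cmt>|<comment>(.*?)</comment>",
    re.DOTALL,
)


def extract_comments(plan_text):
    """Return the non-empty comment bodies in document order."""
    visible = HIDDEN.sub("", plan_text)
    comments = []
    for match in COMMENT.finditer(visible):
        # Exactly one alternative matched, so exactly one group is not None.
        body = next(g for g in match.groups() if g is not None).strip()
        if body:  # empty blocks produce no ledger entry
            comments.append(body)
    return comments
```

A caller enforcing the first rule would reject the input when `extract_comments` returns an empty list.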
**QA output structure:**
For an input plan named `plan.md`, the default QA output path is `.humanize/plan_qa/plan-qa.md`.
The generated QA document includes:
- `## Summary`: overall refinement outcome and comment counts
- `## Comment Ledger`: one row per raw `CMT-N` block with classification, location, excerpt, and disposition
- `## Answers`: responses to question comments and any clarifying edits
- `## Research Findings`: repository research performed for `research_request` comments
- `## Plan Changes Applied`: changes made for `change_request` comments and cross-reference updates
- `## Remaining Decisions`: unresolved items or assumption-heavy decisions that still need user input
- `## Refinement Metadata`: input/output paths, QA path, classification counts, modified sections, convergence status, and date
Disposition values in the ledger are `answered`, `applied`, `researched`, `deferred`, or
`resolved`.
If `--alt-language` is set to a supported non-English language, the command also generates
translated plan and QA variants by inserting `_<code>` before the file extension, such as
`plan_zh.md` and `plan-qa_zh.md`.
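The naming conventions above can be expressed with `pathlib`. This sketch generalizes from the single documented example (`plan.md` to `plan-qa.md`), which is an assumption; the helper names `default_qa_path` and `translated_variant` are hypothetical.

```python
from pathlib import Path


def default_qa_path(plan_path, qa_dir=".humanize/plan_qa"):
    """Assumed rule: <qa_dir>/<plan stem>-qa<plan suffix>."""
    plan = Path(plan_path)
    return Path(qa_dir) / f"{plan.stem}-qa{plan.suffix}"


def translated_variant(path, lang_code):
    """Insert _<code> before the file extension, keeping the directory."""
    p = Path(path)
    return p.with_name(f"{p.stem}_{lang_code}{p.suffix}")
```

For example, `translated_variant("docs/plan.md", "zh")` yields `docs/plan_zh.md`, and applying it to the default QA path for `plan.md` yields a file named `plan-qa_zh.md`.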
### ask-codex
```
/humanize:ask-codex [OPTIONS] <question or task>
OPTIONS:
--codex-model <MODEL:EFFORT>
Codex model and reasoning effort (default from config, fallback gpt-5.5:high)
--codex-timeout <SECONDS>
Timeout for the Codex query in seconds (default: 3600)
-h, --help Show help message
```
The ask-codex skill sends a one-shot question or task to Codex and returns the response
inline. Unlike the RLCR loop, this is a single consultation without iteration -- useful
for getting a second opinion, reviewing a design, or asking domain-specific questions.
Responses are saved to `.humanize/skill/<timestamp>/` with `input.md`, `output.md`,
and `metadata.md` for reference.
## Configuration
Humanize uses a 4-layer config hierarchy (lowest to highest priority):
1. **Plugin defaults**: `config/default_config.json`
2. **User config**: `~/.config/humanize/config.json`
3. **Project config**: `.humanize/config.json`
4. **CLI flags**: Command-line arguments (where available)
Current built-in keys:
| Key | Default | Description |
|-----|---------|-------------|
| `codex_model` | `gpt-5.5` | Shared default model for Codex-backed review and analysis |
| `codex_effort` | `high` | Shared default reasoning effort (`xhigh`, `high`, `medium`, `low`) |
| `bitlesson_model` | `haiku` | Model used by the BitLesson selector agent |
| `provider_mode` | unset | Optional runtime mode hint such as `codex-only` |
| `agent_teams` | `false` | Project-level default for agent teams workflow |
| `alternative_plan_language` | `""` | Optional translated plan variant language; supported values include `Chinese`, `Korean`, `Japanese`, `Spanish`, `French`, `German`, `Portuguese`, `Russian`, `Arabic`, or ISO codes like `zh` |
| `gen_plan_mode` | `discussion` | Default plan-generation mode |
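The layered lookup described above amounts to merging dictionaries from lowest to highest priority. The following is a sketch under stated assumptions, not Humanize's loader: `load_layer` and `merge_layers` are hypothetical names, the merge is shallow, and unreadable or malformed files are treated as empty layers.

```python
import json


def load_layer(path):
    """Read one JSON config layer; a missing or malformed file counts as empty."""
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def merge_layers(*layers):
    """Merge layers given in lowest-to-highest priority order (shallow merge)."""
    merged = {}
    for layer in layers:
        # Assumption: a None value means "not set" and does not override.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged
```

A caller would pass the plugin defaults first, then the user config, the project config, and finally a dictionary built from CLI flags.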
### Codex Model Configuration
All Codex-using features (RLCR loop, ask-codex) share the same model configuration:
| Key | Default | Description |
|-----|---------|-------------|
| `codex_model` | `gpt-5.5` | Model used for Codex operations (reviews, analysis, queries) |
| `codex_effort` | `high` | Reasoning effort (`xhigh`, `high`, `medium`, `low`) |
To override, add to `.humanize/config.json`:
```json
{
  "codex_model": "gpt-5.2",
  "codex_effort": "xhigh",
  "bitlesson_model": "sonnet"
}
```
When installing the Humanize runtime into Codex CLI, Humanize also seeds
`${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` with a Codex/OpenAI
`bitlesson_model` and `provider_mode: "codex-only"` when those keys are unset.
The `provider_mode` value is only a routing hint for that Codex runtime; the
repository also supports Claude Code and Kimi installs.
Codex model is resolved with this precedence:
1. CLI `--codex-model` flag (highest priority)
2. Feature-specific defaults
3. Config-backed defaults from the 4-layer hierarchy above
4. Hardcoded fallback (`gpt-5.5:high`)
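The precedence chain above can be sketched as a first-match search over the four sources. This is illustrative only: `resolve_codex_model` is a hypothetical name, and the handling of a value without an `:EFFORT` suffix (borrowing the fallback's effort) is an assumption the source does not confirm.

```python
def resolve_codex_model(cli_value=None, feature_default=None, config=None,
                        fallback="gpt-5.5:high"):
    """Return (model, effort) from the highest-priority source that is set."""
    config = config or {}
    config_value = None
    if config.get("codex_model"):
        effort = config.get("codex_effort", "")
        config_value = f"{config['codex_model']}:{effort}"

    fallback_effort = fallback.partition(":")[2]
    for candidate in (cli_value, feature_default, config_value, fallback):
        if candidate:
            model, _, effort = candidate.partition(":")
            # Assumption: a missing effort falls back to the hardcoded one.
            return model, effort or fallback_effort
    raise ValueError("no Codex model could be resolved")
```

Here `config` stands for the already-merged result of the 4-layer hierarchy.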
**Migration note**: If your `.humanize/config.json` contains the legacy keys
`loop_reviewer_model` or `loop_reviewer_effort`, they are silently ignored.
Use `codex_model` and `codex_effort` instead.
## Monitoring
Set up the monitoring helper for real-time progress tracking:
```bash
# Add to your .bashrc or .zshrc
source ~/.claude/plugins/cache/PolyArch/humanize/<LATEST.VERSION>/scripts/humanize.sh
# Terminal monitors (one project per terminal):
humanize monitor rlcr # latest RLCR loop log
humanize monitor skill # all skill invocations (codex + gemini)
humanize monitor codex # ask-codex skill invocations only
humanize monitor gemini # ask-gemini skill invocations only
# Browser dashboard (multiple loops at once, foreground default):
humanize monitor web --project /path/to/project
```
Progress data is stored in `.humanize/rlcr/<timestamp>/` for each loop session.
### Browser dashboard (`humanize monitor web`)
The web dashboard layers on top of the same `.humanize/rlcr/<session>/`
metadata and `~/.cache/humanize/<sanitized-project>/<session>/round-*-codex-{run,review}.log`
cache logs that the terminal monitors read. There is no parallel
capture pipeline; the dashboard is a reader, not a writer.
Lifecycle (per DEC-1, DEC-3):
- Foreground default (`humanize monitor web --project <path>`). Press
Ctrl+C to stop. The server is bound at startup to the single project
given on the command line; to monitor several projects simultaneously,
run multiple instances (one per project) with different `--port` values.
- `--daemon` runs the same server inside a per-project tmux session
(`humanize-viz-<8-hex>`); use `viz-stop.sh --project <path>` or
the project's own tmux kill command to stop it.
Per-session inline live log panes appear on the home page for every
active session, driven by Server-Sent Events from
`/api/sessions/<session_id>/logs/<basename>`. Multiple loops stream
in parallel without leaving the home page.
### Remote browser access
The dashboard binds to `127.0.0.1` by default. To expose it over the
network, supply `--host` and an authentication token. The token is
required for any non-loopback host; the server refuses to start
otherwise.
Token-aware endpoints honor `Authorization: Bearer <tok>` for normal
fetch requests and `?token=<tok>` query parameters for the SSE stream
(per DEC-4: browsers cannot set arbitrary headers on EventSource).
WebSocket transport is rejected entirely in remote mode.
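The token rules above can be illustrated with standard-library calls. This is a sketch, not the dashboard's server code: `requires_token`, `extract_token`, and `is_authorized` are hypothetical names, and treating unresolvable hostnames as non-loopback is an assumption.

```python
import hmac
import ipaddress
from urllib.parse import parse_qs, urlsplit


def requires_token(host):
    """A token is mandatory for any host that is not loopback."""
    if host == "localhost":
        return False
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Assumption: hostnames that are not IP literals count as non-loopback.
        return True


def extract_token(headers, url):
    """Prefer the Authorization header; fall back to ?token= (used by SSE)."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    values = parse_qs(urlsplit(url).query).get("token", [])
    return values[0] if values else None


def is_authorized(headers, url, expected_token):
    supplied = extract_token(headers, url)
    if supplied is None:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())
```

Under this sketch, `requires_token("127.0.0.1")` is false and `requires_token("0.0.0.0")` is true, which matches the refusal-to-start rule described above.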
#### Pattern 1 (recommended): SSH tunnel
The safest remote pattern keeps the server bound to localhost and
forwards the port over SSH:
```bash
# On the server machine:
humanize monitor web --project /path/to/project --port 18000
# On your laptop:
ssh -N -L 18000:localhost:18000 user@server.example.com
# Then open http://localhost:18000 in the local browser.
```
No token is required because the server still binds to loopback. The
SSH tunnel provides authentication and encryption.
#### Pattern 2: Direct LAN bind
For trusted-network deployments where SSH tunneling is impractical:
```bash
# Generate a strong random token (one-time):
TOKEN="$(openssl rand -hex 32)"
# Start the dashboard:
humanize monitor web \
  --project /path/to/project \
  --host 0.0.0.0 \
  --port 18000 \
  --auth-token "$TOKEN"
# Or supply the token via env var instead of CLI:
HUMANIZE_VIZ_TOKEN="$TOKEN" humanize monitor web \
  --project /path/to/project --host 0.0.0.0 --port 18000
```
Open the dashboard with `http://server:18000/?token=<TOKEN>` once;
the browser caches the token in `sessionStorage` and propagates it
on subsequent fetches and SSE reconnects.
## Cancellation
- **RLCR loop**: `/humanize:cancel-rlcr-loop`
## Environment Variables
### HUMANIZE_CODEX_BYPASS_SANDBOX
**WARNING: This is a dangerous option that disables security protections. Use only if you understand the implications.**
- **Purpose**: Controls whether Codex runs with sandbox protection
- **Default**: Not set (uses `--full-auto` with sandbox protection)
- **Values**:
- `true` or `1`: Bypasses Codex sandbox and approvals (uses `--dangerously-bypass-approvals-and-sandbox`)
- Any other value or unset: Uses safe mode with sandbox
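The value handling above maps to a small lookup. The sketch below is an assumption about how a wrapper could choose the Codex flag, not the plugin's shell logic; `codex_sandbox_args` is a hypothetical name, and the comparison is assumed to be exact and case-sensitive.

```python
import os


def codex_sandbox_args(env=None):
    """Pick the Codex flag based on HUMANIZE_CODEX_BYPASS_SANDBOX."""
    env = os.environ if env is None else env
    value = env.get("HUMANIZE_CODEX_BYPASS_SANDBOX", "")
    if value in ("true", "1"):
        # Dangerous: disables both the sandbox and approval prompts.
        return ["--dangerously-bypass-approvals-and-sandbox"]
    # Any other value, or unset, keeps the safe default.
    return ["--full-auto"]
```

Passing an explicit `env` dictionary keeps the function easy to test without touching the real environment.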
**When to use this**:
- Linux servers without landlock kernel support (where Codex sandbox fails)
- Automated CI/CD pipelines in trusted environments
- Development environments where you have full control
**When NOT to use this**:
- Public or shared development servers
- When reviewing untrusted code or pull requests
- Production systems
- Any environment where unauthorized system access could cause damage
**Security implications**:
- Codex will have unrestricted access to your filesystem
- Codex can execute arbitrary commands without approval prompts
- Review all code changes carefully when using this mode
**Usage example**:
```bash
# Export before starting Claude Code
export HUMANIZE_CODEX_BYPASS_SANDBOX=true
# Or set for a single session
HUMANIZE_CODEX_BYPASS_SANDBOX=true claude --plugin-dir /path/to/humanize
```
#!/usr/bin/env python3
"""
Helper script to check for incomplete tasks from Claude Code.

Supports both:
- Legacy TodoWrite tool (parsed from transcript)
- New Task system (read directly from ~/.claude/tasks/<session_id>/)

Exit codes:
    0 - All tasks are completed (or no tasks exist)
    1 - There are incomplete tasks (details on stdout)
    2 - Parse error reading hook input JSON

Usage:
    echo '{"session_id": "...", "transcript_path": "/path/to/transcript.jsonl"}' | python3 check-todos-from-transcript.py
"""
import json
import re
import sys
from pathlib import Path
from typing import List, Tuple

LANE_PREFIX_PATTERN = re.compile(r"^\s*\[(mainline|blocking|queued)\](?:\s|$)", re.IGNORECASE)


def classify_lane(*parts: str) -> str:
    """Infer the task lane from content, defaulting to blocking for safety."""
    for part in parts:
        if not part:
            continue
        match = LANE_PREFIX_PATTERN.match(part)
        if match:
            return match.group(1).lower()
    return "blocking"


def extract_tool_calls_from_entry(entry: dict) -> List[Tuple[str, dict]]:
    """
    Extract tool calls from a transcript entry.

    Returns list of (tool_name, tool_input) tuples.
    """
    tool_calls = []
    entry_type = entry.get("type", "")

    # Pattern 1 & 2: Extract content list from assistant or message entries
    if entry_type == "assistant":
        content = entry.get("message", {}).get("content", [])
    elif entry_type == "message":
        content = entry.get("content", [])
    else:
        content = []

    # Extract tool calls from content list
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                tool_name = block.get("name", "")
                tool_input = block.get("input", {})
                if tool_name:
                    tool_calls.append((tool_name, tool_input))

    # Pattern 3: Direct tool_use entry
    if entry_type == "tool_use":
        tool_name = entry.get("name", "") or entry.get("tool_name", "")
        tool_input = entry.get("input", {}) or entry.get("tool_input", {})
        if tool_name:
            tool_calls.append((tool_name, tool_input))

    return tool_calls


def find_incomplete_todos_from_transcript(transcript_path: Path) -> List[dict]:
    """
    Parse transcript JSONL and find incomplete legacy todos (TodoWrite only).

    Returns list of incomplete items with 'status' and 'content' keys.
    """
    if not transcript_path.exists():
        return []

    # Legacy: track the most recent TodoWrite todos
    latest_todos = []

    with open(transcript_path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue

            # Extract all tool calls from this entry
            for tool_name, tool_input in extract_tool_calls_from_entry(entry):
                # Legacy: TodoWrite
                if tool_name == "TodoWrite":
                    todos = tool_input.get("todos", [])
                    if todos:
                        latest_todos = todos

    # Build list of incomplete items from legacy todos
    incomplete = []
    for todo in latest_todos:
        status = todo.get("status", "")
        content = todo.get("content", "")
        if status != "completed":
            lane = classify_lane(content)
            if lane == "queued":
                continue
            incomplete.append({
                "status": status,
                "content": content,
                "source": "todo",
                "lane": lane,
            })

    return incomplete


def find_incomplete_tasks_from_directory(session_id: str, tasks_base_dir: str = "") -> List[dict]:
    """
    Read task files directly from ~/.claude/tasks/<session_id>/ directory.

    This is the authoritative source for task state, as it reflects
    the actual in-memory task list that Claude Code maintains.

    Args:
        session_id: The Claude Code session ID
        tasks_base_dir: Optional override for tasks base directory (for testing)

    Returns list of incomplete items with 'status' and 'content' keys.
    """
    if tasks_base_dir:
        tasks_dir = Path(tasks_base_dir) / session_id
    else:
        tasks_dir = Path.home() / ".claude" / "tasks" / session_id

    if not tasks_dir.exists() or not tasks_dir.is_dir():
        return []

    incomplete = []
    for task_file in tasks_dir.glob("*.json"):
        try:
            with open(task_file, 'r', encoding='utf-8') as f:
                task = json.load(f)

            status = task.get("status", "pending")
            if status not in ("completed", "deleted"):
                # Task is incomplete
                subject = task.get("subject", "")
                description = task.get("description", "")
                task_id = task_file.stem  # Filename without .json
                content = subject or description or f"Task {task_id}"
                lane = classify_lane(subject, description)
                if lane == "queued":
                    continue
                incomplete.append({
                    "status": status,
                    "content": content,
                    "source": "task",
                    "task_id": task_id,
                    "lane": lane,
                })
        except (json.JSONDecodeError, OSError):
            # Skip malformed or unreadable task files
            continue

    return incomplete


def main():
    # Read hook input from stdin
    try:
        stdin_content = sys.stdin.read().strip()
        if not stdin_content:
            # Empty input - no data available, allow proceeding
            sys.exit(0)
        hook_input = json.loads(stdin_content)
    except json.JSONDecodeError as e:
        # Parse error - exit with code 2
        print(f"PARSE_ERROR: {e}", file=sys.stderr)
        sys.exit(2)

    incomplete_items = []

    # Check new Task system using external task directory (authoritative source)
    session_id = hook_input.get("session_id", "")
    tasks_base_dir = hook_input.get("tasks_base_dir", "")  # For testing
    if session_id:
        incomplete_items.extend(find_incomplete_tasks_from_directory(session_id, tasks_base_dir))

    # Check legacy TodoWrite from transcript
    transcript_path = hook_input.get("transcript_path", "")
    if transcript_path:
        transcript_path = Path(transcript_path).expanduser()
        incomplete_items.extend(find_incomplete_todos_from_transcript(transcript_path))

    if not incomplete_items:
        # No incomplete items, allow proceeding
        sys.exit(0)

    # Format output
    output_lines = []
    for item in incomplete_items:
        status = item.get("status", "unknown")
        content = item.get("content", "")
        source = item.get("source", "unknown")
        lane = item.get("lane", "blocking")
        lane_marker = f"[{lane}]"
        if source == "task":
            task_id = item.get("task_id", "?")
            output_lines.append(f"  - [{status}] {lane_marker} (Task #{task_id}) {content}")
        else:
            output_lines.append(f"  - [{status}] {lane_marker} {content}")

    # Output marker and incomplete items both to stdout
    print("INCOMPLETE_TODOS")
    print("\n".join(output_lines))
    sys.exit(1)


if __name__ == "__main__":
    main()
{
  "description": "Humanize Plugin Hooks - Validation hooks and Stop hooks for /start-rlcr-loop",
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-plan-file-validator.sh"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-write-validator.sh"
          }
        ]
      },
      {
        "matcher": "Edit",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-edit-validator.sh"
          }
        ]
      },
      {
        "matcher": "Read",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-read-validator.sh"
          }
        ]
      },
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-bash-validator.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-post-bash-hook.sh"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-codex-stop-hook.sh",
            "timeout": 7200
          }
        ]
      }
    ]
  }
}