Commit f6fe8355 authored by whlwhlwhl

Initial LightOp KernelPilot skill pack

---
name: plan-understanding-quiz
description: Analyzes a plan and generates multiple-choice technical comprehension questions to verify user understanding before RLCR loop. Use when validating user readiness for start-rlcr-loop command.
model: opus
tools: Read, Glob, Grep
---
# Plan Understanding Quiz
You are a specialized agent that analyzes an implementation plan and generates targeted multiple-choice technical comprehension questions. Your goal is to test whether the user genuinely understands HOW the plan will be implemented, not just what the plan title says.
## Your Task
When invoked, you will be given the content of a plan file. You need to:
### Analyze the Plan
1. **Read the plan thoroughly** to understand:
- What components, files, or systems are being modified
- What technical approach or mechanism is being used
- How different pieces of the implementation connect together
- What existing patterns or systems the plan builds upon
2. **Explore the repository** to add context:
- Check README.md, CLAUDE.md, or other documentation files
- Look at the directory structure and key files referenced in the plan
- Understand the existing architecture that the plan interacts with
### Generate Multiple-Choice Questions
Create exactly 2 multiple-choice questions that test the user's understanding of the plan's **technical implementation details**. Each question must have exactly 4 options (A through D), with exactly 1 correct answer.
- **QUESTION_1**: Should test whether the user knows what components/systems are being changed and how. Focus on the core technical mechanism or approach.
- **QUESTION_2**: Should test whether the user understands how different parts of the implementation connect, what existing patterns are being followed, or what the key technical constraints are.
**Good question characteristics:**
- Derived from the plan's specific content, not generic templates
- Test understanding of HOW things will be done, not just WHAT the plan describes
- Not too low-level (no exact line numbers, exact syntax, or trivial details)
- A user who has carefully read and understood the plan should pick the correct answer
- A user who just skimmed the title or blindly accepted a generated plan would likely pick wrong
- Wrong options should be plausible (not obviously absurd) but clearly incorrect to someone who read the plan
**Example good questions:**
- "How does this plan integrate the new validation step into the startup flow?" with options covering different integration approaches
- "Which components need to change and why?" with options describing different component sets
**Example bad questions (avoid these):**
- "What is the plan about?" (too vague, tests nothing)
- "What are the risks?" (generic, not about implementation)
- "On which line does function X start?" (too low-level)
### Generate Plan Summary
Write a 2-3 sentence summary explaining what the plan does and how, suitable for educating a user who showed gaps in understanding. Focus on the technical approach, not just the goal.
## Output Format
You MUST output in this exact format, with each field on its own line:
```
QUESTION_1: <your first question>
OPTION_1A: <option A text>
OPTION_1B: <option B text>
OPTION_1C: <option C text>
OPTION_1D: <option D text>
ANSWER_1: <A, B, C, or D>
QUESTION_2: <your second question>
OPTION_2A: <option A text>
OPTION_2B: <option B text>
OPTION_2C: <option C text>
OPTION_2D: <option D text>
ANSWER_2: <A, B, C, or D>
PLAN_SUMMARY: <2-3 sentence technical summary>
```
## Important Notes
- Always output all 13 fields - never skip any
- ANSWER must be exactly one letter: A, B, C, or D
- Randomize the position of the correct answer (do not always put it in A or D)
- The plan may be written in any language - generate questions and options in the same language as the plan
- Focus on substance over format
- If the plan is very short or lacks technical detail, derive questions from whatever implementation hints are available
- Questions should feel like a friendly knowledge check, not an adversarial interrogation
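The format rules above lend themselves to a mechanical check. The following is a minimal sketch, assuming the agent's output is available on stdin; `validate_quiz_output` and its `MISSING:` / `BAD_ANSWER:` messages are hypothetical names introduced here for illustration and are not part of the skill pack.

```bash
# Hypothetical helper (not part of the skill pack): validate quiz output on stdin.
# Prints "OK", or one "MISSING: <field>" / "BAD_ANSWER: <field>" line per problem.
validate_quiz_output() (
  input=$(cat)
  status=0
  # All 13 fields must be present, each on its own line with non-empty text.
  for field in QUESTION_1 OPTION_1A OPTION_1B OPTION_1C OPTION_1D ANSWER_1 \
               QUESTION_2 OPTION_2A OPTION_2B OPTION_2C OPTION_2D ANSWER_2 \
               PLAN_SUMMARY; do
    if ! printf '%s\n' "$input" | grep -q "^${field}: ."; then
      echo "MISSING: $field"
      status=1
    fi
  done
  # Each answer must be exactly one letter: A, B, C, or D.
  for field in ANSWER_1 ANSWER_2; do
    if printf '%s\n' "$input" | grep -q "^${field}: ." &&
       ! printf '%s\n' "$input" | grep -Eq "^${field}: [ABCD]$"; then
      echo "BAD_ANSWER: $field"
      status=1
    fi
  done
  [ "$status" -eq 0 ] && echo "OK"
  return "$status"
)
```

A caller could pipe the agent's reply through this function and ask for a regeneration whenever it returns a non-zero status.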
## Example Output
```
QUESTION_1: How does this plan integrate the new validation step into the existing build pipeline?
OPTION_1A: By replacing the existing lint step with a combined lint-and-validate step
OPTION_1B: By adding a new PostToolUse hook that runs between the lint step and the compilation step
OPTION_1C: By modifying the compilation step to include inline validation checks
OPTION_1D: By creating a standalone pre-build script that runs before any other steps
ANSWER_1: B
QUESTION_2: Why does the plan require changes to both the CLI parser and the state file, rather than just the CLI?
OPTION_2A: The state file stores the original CLI arguments for audit logging purposes
OPTION_2B: The CLI parser is deprecated and the state file is the new configuration mechanism
OPTION_2C: The CLI parser adds the flag, the state file persists it across loop iterations, and the stop hook reads it at exit time
OPTION_2D: Both files share a common schema and must always be updated together
ANSWER_2: C
PLAN_SUMMARY: This plan adds a build output validation step by hooking into the PostToolUse lifecycle event. It modifies the hook configuration to insert a format checker between linting and compilation, and updates the state file schema to track validation results across RLCR rounds.
```
---
description: "Cancel active RLCR loop"
allowed-tools: ["Bash(${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh)", "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh --force)", "AskUserQuestion"]
disable-model-invocation: true
---
# Cancel RLCR Loop
To cancel the active loop:
1. Run the cancel script:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh"
```
2. Check the first line of output:
- **NO_LOOP** or **NO_ACTIVE_LOOP**: Say "No active RLCR loop found."
- **CANCELLED**: Report the cancellation message from the output
- **CANCELLED_METHODOLOGY_ANALYSIS**: Report the cancellation message from the output
- **CANCELLED_FINALIZE**: Report the cancellation message from the output
- **FINALIZE_NEEDS_CONFIRM**: The loop is in Finalize Phase. Continue to step 3
3. **If FINALIZE_NEEDS_CONFIRM**:
- Use AskUserQuestion to confirm cancellation with these options:
- Question: "The loop is currently in Finalize Phase. After this phase completes, the loop will end without returning to Codex review. Are you sure you want to cancel now?"
- Header: "Cancel?"
- Options:
1. Label: "Yes, cancel now", Description: "Cancel the loop immediately, finalize-state.md will be renamed to cancel-state.md"
2. Label: "No, let it finish", Description: "Continue with the Finalize Phase, the loop will complete normally"
- **If user chooses "Yes, cancel now"**:
- Run: `"${CLAUDE_PLUGIN_ROOT}/scripts/cancel-rlcr-loop.sh" --force`
- Report the cancellation message from the output
- **If user chooses "No, let it finish"**:
- Report: "Understood. The Finalize Phase will continue. Once complete, the loop will end normally."
**Key principle**: The script handles all cancellation logic. A loop is active if `state.md` (normal loop), `methodology-analysis-state.md` (Methodology Analysis Phase), or `finalize-state.md` (Finalize Phase) exists in the newest loop directory.
The loop directory with summaries, review results, and state information will be preserved for reference.
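The dispatch in step 2 can be sketched as a `case` statement over the first output line. This is only an illustration: `next_cancel_action` and the action labels it prints are hypothetical, and the prefix matching assumes the status token starts the first line, which the text above implies but does not spell out.

```bash
# Hypothetical helper: map the script's output to the next step described above.
# $1 is the full captured output of cancel-rlcr-loop.sh; only line 1 is inspected.
next_cancel_action() {
  first_line=$(printf '%s\n' "$1" | head -n 1)
  case "$first_line" in
    NO_LOOP*|NO_ACTIVE_LOOP*) echo "report_no_loop" ;;    # "No active RLCR loop found."
    FINALIZE_NEEDS_CONFIRM*)  echo "ask_user" ;;           # continue to step 3
    CANCELLED*)               echo "report_cancelled" ;;   # also covers the _METHODOLOGY_ANALYSIS and _FINALIZE variants
    *)                        echo "unknown" ;;
  esac
}
```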
---
description: "Generate a repo-grounded idea draft via directed-swarm exploration"
argument-hint: "<idea-text-or-path> [--n <int>] [--output <path>]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-idea-io.sh:*)"
- "Read"
- "Glob"
- "Grep"
- "Task"
- "Write"
---
# Generate Idea Draft from Loose Input
Read and execute below with ultrathink.
## Hard Constraint: Draft-Only Output
This command MUST NOT implement features, modify source code, or create commits while producing the draft. Permitted writes are limited to the single output draft file produced in Phase 4; prerequisite directory creation for the default `.humanize/ideas/` path by the validation script is permitted as part of that write. All exploration subagents run read-only.
This command transforms a loose idea into a repo-grounded draft suitable as input to `/humanize:gen-plan`. It applies directed-diversity exploration: a lead picks N orthogonal directions, N parallel `Explore` subagents each develop one direction, and the lead synthesizes a draft with one primary direction plus N-1 alternatives. Each direction carries objective evidence from the repo.
## Workflow Overview
> **Sequential Execution Constraint**: All phases MUST execute strictly in order. Each phase fully completes before the next.
1. Parse Input
2. IO Validation
3. Direction Generation
4. Parallel Exploration
5. Synthesis and Write
---
## Phase 0: Parse Input
Extract from `$ARGUMENTS`:
- First positional: inline idea text or path to a `.md` file (required).
- `--n <int>`: number of directions. Default 6.
- `--output <path>`: target draft path. Default resolved by the validation script.
Do not interpret or rewrite the idea text here. Pass `$ARGUMENTS` through to Phase 1 unchanged.
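For orientation, the argument shape described above could be parsed roughly as follows. This is a sketch only: `parse_gen_idea_args` and the variable names `IDEA_ARG`, `N_DIRECTIONS`, and `OUTPUT_PATH` are hypothetical, and the validation script in Phase 1 remains the authority on what counts as valid input.

```bash
# Hypothetical sketch of the argument shape; the Phase 1 script is authoritative.
# Sets IDEA_ARG, N_DIRECTIONS (default 6), and OUTPUT_PATH (empty = script default).
parse_gen_idea_args() {
  IDEA_ARG="" N_DIRECTIONS=6 OUTPUT_PATH=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --n|--output)
        [ "$#" -ge 2 ] || { echo "Missing value for $1" >&2; return 1; }
        if [ "$1" = "--n" ]; then N_DIRECTIONS=$2; else OUTPUT_PATH=$2; fi
        shift 2
        ;;
      *)
        # Keep only the first positional argument as the idea text or path.
        [ -n "$IDEA_ARG" ] || IDEA_ARG=$1
        shift
        ;;
    esac
  done
  [ -n "$IDEA_ARG" ] || { echo "Missing idea input" >&2; return 1; }
}
```

The idea text itself is stored untouched, in keeping with the rule that Phase 0 does not interpret or rewrite it.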
---
## Phase 1: IO Validation
Run:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-idea-io.sh" $ARGUMENTS
```
Handle exit codes:
- `0`: Parse stdout to extract `INPUT_MODE`, `OUTPUT_FILE`, `SLUG`, `TEMPLATE_FILE`, `N` (each appears on its own `KEY: value` line). When `INPUT_MODE` is `file`, stdout additionally contains an `IDEA_BODY_FILE: <path>` line; extract that too. Continue to Phase 2. (`SLUG` is informational — the script has already incorporated it into `OUTPUT_FILE`, so later phases do not need to use `SLUG` directly.)
- `1`: Report "Missing or empty idea input" and stop.
- `2`: Report "Input looks like a file path but is missing, not readable, or not `.md`" and stop.
- `3`: Report "Output directory does not exist — please create it or choose a different path" and stop.
- `4`: Report "Output file already exists — choose a different path" and stop.
- `5`: Report "No write permission to output directory" and stop.
- `6`: Report "Invalid arguments" with the stdout usage text and stop.
- `7`: Report "Template file missing — plugin configuration error" and stop.
Before `VALIDATION_SUCCESS`, stdout may contain one or more lines starting with `WARNING:` (for example, `WARNING: short idea (<N> chars); proceeding` when an inline idea is under 10 characters). Surface these warnings to the user in your final report, but continue to Phase 2 normally. `WARNING:` lines are informational, not errors.
Obtain the idea body into memory as `IDEA_BODY`, based on `INPUT_MODE`:
- `inline`: stdout contains a sentinel block at the end of the success output; extract all text between the `=== IDEA_BODY_BEGIN ===` and `=== IDEA_BODY_END ===` lines (exclusive). The script emits a trailing newline after the last body line.
- `file`: read the full contents of `IDEA_BODY_FILE` using the `Read` tool.
Preserve byte-identical content in memory for later phases. No on-disk tempfile is created in inline mode — the stdout sentinel block is the authoritative source.
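The two parsing steps above (reading `KEY: value` lines and extracting the sentinel block) can be sketched with standard tools. The helper names `get_field` and `extract_idea_body` are hypothetical; the sentinel strings are the ones quoted above.

```bash
# Hypothetical helpers for reading the validation script's success output on stdin.

# get_field KEY: print the value of the first "KEY: value" line.
get_field() {
  sed -n "s/^$1: //p" | head -n 1
}

# extract_idea_body: print the lines strictly between the two sentinel lines.
# Note: capturing the result with $(...) strips trailing newlines, so a
# byte-identical copy needs the output redirected or the newline re-added.
extract_idea_body() {
  awk '
    $0 == "=== IDEA_BODY_END ===" { inside = 0 }
    inside { print }
    $0 == "=== IDEA_BODY_BEGIN ===" { inside = 1 }
  '
}
```

Because the metadata lines precede the sentinel block, taking the first match in `get_field` avoids picking up look-alike lines inside the idea body.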
---
## Phase 2: Direction Generation
Generate exactly `N` orthogonal directions for exploring the idea.
### Context to Gather
Before generating directions, read (paths relative to the project root, which is `$(git rev-parse --show-toplevel)`):
- `README.md` at the project root.
- `CLAUDE.md` at the project root (if it exists).
- `.claude/CLAUDE.md` (if it exists).
- Top-level directory listing via `Glob` with pattern `*` (one level, no recursion).
This context grounds the directions in the actual repo rather than generic brainstorming.
### Generation Rules
Produce exactly `N` direction entries. Each entry has:
- `name`: a 2-5 word short label.
- `rationale`: a single sentence explaining why this angle is distinct from the other directions.
Hard constraint: **orthogonality**. Two near-duplicate directions defeat the directed-diversity premise. Before returning:
- If two directions read as near-duplicates, replace one with a genuinely different angle.
- If a direction collapses to "just do X better" with no angle distinction, replace it.
- Do not emit directions that merely restate the idea in different words.
### Retry and Degradation
- If the first pass returns fewer than `N` entries, regenerate once with an explicit "you MUST produce `N` orthogonal directions" instruction.
- If the second pass still returns fewer than `N` but at least 2, proceed with the reduced count and emit a warning to the user: `Warning: direction generation returned <count> of <N> requested directions; proceeding with reduced count.`
- If fewer than 2 directions are produced, stop with error: `direction generation degraded; retry.`
Store the final direction list as `DIRECTIONS` (ordered; index 0..len-1).
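The retry and degradation rules reduce to a small decision function. The sketch below assumes the caller tracks how many directions came back and which pass produced them; `direction_decision` and its output labels are hypothetical.

```bash
# Hypothetical helper: decide what to do after a direction-generation pass.
# Usage: direction_decision <count> <requested_n> <pass_number>
direction_decision() {
  count=$1 requested=$2 pass=$3
  if [ "$count" -ge "$requested" ]; then
    echo "proceed"            # full set of N directions
  elif [ "$pass" -eq 1 ]; then
    echo "retry"              # regenerate once with the explicit instruction
  elif [ "$count" -ge 2 ]; then
    echo "proceed_reduced"    # warn the user and continue with fewer directions
  else
    echo "stop"               # "direction generation degraded; retry."
  fi
}
```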
---
## Phase 3: Parallel Exploration
Dispatch all directions in a **single Task-tool message** containing one Task invocation per direction. This is the W2S parallel-swarm step.
### Subagent Invocation
For each direction in `DIRECTIONS`, launch one `Explore` subagent. Each invocation prompt MUST include:
1. A verbatim copy of the idea body (`IDEA_BODY`) captured in Phase 1.
2. The assigned direction (name + rationale).
3. The following instruction block (reproduce verbatim in the subagent prompt):
> Explore this direction within the current repo. Gather OBJECTIVE EVIDENCE:
> - Specific repo paths with existing patterns worth extending.
> - Prior art or precedent in the codebase or adjacent tooling.
> - Measurable considerations (approximate complexity, LOC surface, performance implications) where discoverable from reading the code.
>
> Read-only. Do not write any files.
>
> If no concrete evidence exists for this direction, report the literal string `exploratory, no concrete precedent` once in OBJECTIVE_EVIDENCE and stop exploring further. Fabrication of references is forbidden.
>
> Return a structured proposal with exactly these fields:
> - `APPROACH_SUMMARY`: concrete design description (what to build, core mechanism, affected components).
> - `OBJECTIVE_EVIDENCE`: bullet list of repo paths, prior art, or the `exploratory, no concrete precedent` sentinel.
> - `KNOWN_RISKS`: short bullet list.
> - `CONFIDENCE`: one of `high`, `medium`, `low`.
### Collection and Degradation
Collect all subagent responses. For each response:
- Parse the four required fields. If a field is missing, mark that proposal as degraded and drop it.
- If fewer than 2 proposals survive, stop with error: `exploration phase degraded; retry.`
- Otherwise continue with the surviving proposals.
Associate each surviving proposal with its originating direction (so Phase 4 can label it with the original direction name). When numbering alternatives in Phase 4 after any drops, renumber survivors sequentially as Alt-1..Alt-K (where K is the count of surviving non-primary directions). Do not preserve gaps from dropped proposals.
---
## Phase 4: Synthesis and Write
### Step 4.1: Pick the Primary Direction
Review all surviving proposals. Choose the strongest as the primary based on:
1. Evidence density: more concrete repo references outrank fewer.
2. Fit with existing repo patterns: extending existing patterns outranks introducing unfamiliar paradigms.
3. Implementation surface area: prefer a smaller surface where quality is otherwise comparable.
4. Declared `CONFIDENCE`: `high` > `medium` > `low`, used as a tiebreaker.
Record the chosen direction as `PRIMARY`; the remaining surviving directions become the Alt-1..Alt-K list (where K is the number of non-primary survivors, K ≤ N-1), numbered sequentially in their original direction order with no gaps for any dropped proposals.
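The gap-free renumbering can be illustrated with a short filter. It assumes the surviving direction names arrive one per line in their original order, with dropped proposals already removed; `number_alternatives` is a hypothetical name.

```bash
# Hypothetical helper: label non-primary survivors as Alt-1..Alt-K.
# Usage: number_alternatives <primary_name> < survivors (one name per line)
# Caveat: awk -v interprets backslash escapes in the primary name.
number_alternatives() {
  awk -v primary="$1" '$0 != primary { i++; printf "Alt-%d: %s\n", i, $0 }'
}
```

For example, with survivors `plugin hooks`, `command-pattern history`, and `state file journal`, and `command-pattern history` as primary, the other two become Alt-1 and Alt-2 in that order.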
### Step 4.2: Infer Title
Generate a 4-10 word Title Case title that captures the primary direction, not the original input phrasing verbatim. Example: idea `add undo/redo` with primary direction `command-pattern history` yields title `Command-Pattern Undo Stack For The Editor`.
### Step 4.3: Populate the Template
Read the template file located at `TEMPLATE_FILE` (from Phase 1 stdout).
Produce the finalized draft content in memory by replacing placeholders:
- `<TITLE>` — the inferred title.
- `<ORIGINAL_IDEA>` — byte-identical value of `IDEA_BODY` captured in Phase 1. Preserve line breaks, trailing newline, and all formatting. Do NOT paraphrase or re-indent.
- `<PRIMARY_NAME>` — primary direction's short name.
- `<PRIMARY_RATIONALE>` — primary direction's rationale (from Phase 2).
- `<PRIMARY_APPROACH_SUMMARY>` — primary proposal's `APPROACH_SUMMARY`.
- `<PRIMARY_OBJECTIVE_EVIDENCE>` — primary proposal's `OBJECTIVE_EVIDENCE`, rendered as a bullet list. If the subagent returned only the literal sentinel `exploratory, no concrete precedent`, render it as a single bullet: `- exploratory, no concrete precedent`.
- `<PRIMARY_KNOWN_RISKS>` — primary proposal's `KNOWN_RISKS`, rendered as a bullet list.
- `<ALTERNATIVES>` — for each non-primary survivor at its Alt index `i` (1-based, sequential per Step 4.1), emit:
```markdown
### Alt-<i>: <name>
- Gist: <one-paragraph summary derived from APPROACH_SUMMARY>
- Objective Evidence:
- <bullet from OBJECTIVE_EVIDENCE>
- ...
- Why not primary: <one sentence stating the tradeoff vs PRIMARY>
```
Separate consecutive Alt entries with a single blank line.
- `<SYNTHESIS_NOTES>` — one paragraph describing which elements from the alternatives could fold into the primary if the user chose a different direction. This is the lead's own synthesis note, not a subagent output.
### Step 4.4: Write the Draft File
Write the finalized content to `OUTPUT_FILE` using the `Write` tool. Single write; no progressive edits.
### Step 4.5: Report
Report to the user:
- Path written (`OUTPUT_FILE`).
- Primary direction name.
- Requested `N` and the actual direction count (note if reduced due to degradation).
- Next-step hint: `To turn this draft into a plan, run: /humanize:gen-plan --input <OUTPUT_FILE> --output <plan-path>`.
---
## Error Handling
- Phase 1 validation errors stop the command with a clear message. No partial output.
- Phase 2 degradation follows the retry-once + ≥2 minimum rule stated above.
- Phase 3 degradation follows the drop-and-continue + ≥2 minimum rule stated above.
- Never fabricate repo references or prior art. The `exploratory, no concrete precedent` sentinel from subagents is preserved verbatim in the draft.
- If any phase stops with an error, do not write a partial `OUTPUT_FILE`.
---
description: "Generate implementation plan from draft document"
argument-hint: "--input <path/to/draft.md> --output <path/to/plan.md> [--auto-start-rlcr-if-converged] [--discussion|--direct]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-plan-io.sh:*)"
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/ask-codex.sh:*)"
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh:*)"
- "Read"
- "Glob"
- "Grep"
- "Task"
- "Write"
- "AskUserQuestion"
---
# Generate Plan from Draft
Read and execute below with ultrathink.
## Hard Constraint: No Coding During Plan Generation
This command MUST ONLY generate a plan document during the planning phases. It MUST NOT implement tasks, modify repository source code, or make commits/PRs while producing the plan.
Permitted writes (before any optional auto-start) are limited to:
- The plan output file (`--output`)
- Optional translated language variant (only when `ALT_PLAN_LANGUAGE` is configured)
If `--auto-start-rlcr-if-converged` is enabled, the command MAY immediately start the RLCR loop by running `/humanize:start-rlcr-loop <output-plan-path>`, but only in `discussion` mode when `PLAN_CONVERGENCE_STATUS=converged` and there are no pending user decisions. All coding happens in that subsequent command/loop, not during plan generation.
This command transforms a user's draft document into a well-structured implementation plan with clear goals, acceptance criteria (AC-X format), path boundaries, and feasibility suggestions.
## Workflow Overview
> **Sequential Execution Constraint**: All phases below MUST execute strictly in order. Do NOT parallelize tool calls across different phases. Each phase must fully complete before the next one begins.
1. **Execution Mode Setup**: Parse optional behaviors from command arguments
2. **Load Project Config**: Resolve merged Humanize config defaults for `alternative_plan_language` and `gen_plan_mode`
3. **IO Validation**: Validate input and output paths
4. **Relevance Check**: Verify draft is relevant to the repository
5. **Codex First-Pass Analysis**: Use one planning Codex before Claude synthesizes plan details
6. **Claude Candidate Plan (v1)**: Claude builds an initial plan from draft + Codex findings
7. **Iterative Convergence Loop**: Claude and a second Codex iteratively challenge/refine plan reasonability
8. **Issue and Disagreement Resolution**: Resolve unresolved opposite opinions (or skip manual review if converged, auto-start mode is enabled, and `GEN_PLAN_MODE=discussion`)
9. **Final Plan Generation**: Generate the converged structured plan.md with task routing tags
10. **Write and Complete**: Write output file, optionally write translated language variant, optionally auto-start implementation, and report results
---
## Phase 0: Execution Mode Setup
Parse `$ARGUMENTS` and set:
- `AUTO_START_RLCR_IF_CONVERGED=true` if `--auto-start-rlcr-if-converged` is present
- `AUTO_START_RLCR_IF_CONVERGED=false` otherwise
- `GEN_PLAN_MODE_DISCUSSION=true` if `--discussion` is present
- `GEN_PLAN_MODE_DIRECT=true` if `--direct` is present
- If both `--discussion` and `--direct` are present simultaneously, report error "Cannot use --discussion and --direct together" and stop
`AUTO_START_RLCR_IF_CONVERGED=true` allows skipping manual plan review and starting implementation immediately (by invoking `/humanize:start-rlcr-loop <output-plan-path>`), but only when `GEN_PLAN_MODE=discussion`, plan convergence is achieved, and no pending user decisions remain. In `direct` mode this condition is never satisfied.
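A minimal sketch of the flag handling described above follows. The function name `parse_gen_plan_flags` is hypothetical, and the sketch scans every argument, so it assumes that no `--input` or `--output` value is itself one of the flag strings.

```bash
# Hypothetical sketch: set the Phase 0 flags from the command arguments.
parse_gen_plan_flags() {
  AUTO_START_RLCR_IF_CONVERGED=false
  GEN_PLAN_MODE_DISCUSSION=false
  GEN_PLAN_MODE_DIRECT=false
  for arg in "$@"; do
    case "$arg" in
      --auto-start-rlcr-if-converged) AUTO_START_RLCR_IF_CONVERGED=true ;;
      --discussion)                   GEN_PLAN_MODE_DISCUSSION=true ;;
      --direct)                       GEN_PLAN_MODE_DIRECT=true ;;
    esac
  done
  # The two mode flags are mutually exclusive.
  if [ "$GEN_PLAN_MODE_DISCUSSION" = true ] && [ "$GEN_PLAN_MODE_DIRECT" = true ]; then
    echo "Cannot use --discussion and --direct together" >&2
    return 1
  fi
}
```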
---
## Phase 0.5: Load Project Config
After setting execution mode flags, resolve configuration using `${CLAUDE_PLUGIN_ROOT}/scripts/lib/config-loader.sh`. Reuse that behavior; do not read `.humanize/config.json` directly.
### Config Merge Semantics
1. Source `${CLAUDE_PLUGIN_ROOT}/scripts/lib/config-loader.sh`.
2. Call `load_merged_config "${CLAUDE_PLUGIN_ROOT}" "${PROJECT_ROOT}"` to obtain `MERGED_CONFIG_JSON`, where `PROJECT_ROOT` is the repository root where the command was invoked.
3. `load_merged_config` merges these layers in order:
- Required default config: `${CLAUDE_PLUGIN_ROOT}/config/default_config.json`
- Optional user config: `${XDG_CONFIG_HOME:-$HOME/.config}/humanize/config.json`
- Optional project config: `${HUMANIZE_CONFIG:-$PROJECT_ROOT/.humanize/config.json}`
4. Later layers override earlier layers. Malformed optional config files produce warnings and are ignored. A malformed required default config, missing `jq`, or any other fatal `load_merged_config` failure is a configuration error and must stop the command.
### Values to Extract
Use `get_config_value` against `MERGED_CONFIG_JSON` to read:
- `CONFIG_ALT_LANGUAGE_RAW` from `alternative_plan_language`
- `CONFIG_GEN_PLAN_MODE_RAW` from `gen_plan_mode`
- `CONFIG_CHINESE_PLAN_RAW` from `chinese_plan` (legacy fallback only)
Also detect whether `alternative_plan_language` is explicitly present in `MERGED_CONFIG_JSON` so an empty string still counts as an explicit override:
- `HAS_ALT_LANGUAGE_KEY=true` when `MERGED_CONFIG_JSON` contains the `alternative_plan_language` key
- `HAS_ALT_LANGUAGE_KEY=false` otherwise
### Alternative Language Resolution
1. Resolve the effective `alternative_plan_language` value with this priority:
- Merged config `alternative_plan_language`, when `HAS_ALT_LANGUAGE_KEY=true` (even if the value is an empty string)
- Deprecated merged config `chinese_plan`, only when `HAS_ALT_LANGUAGE_KEY=false`
- Default disabled state
2. Backward compatibility for deprecated `chinese_plan`:
- If `HAS_ALT_LANGUAGE_KEY=true` and `CONFIG_CHINESE_PLAN_RAW` is `true`, log: `Warning: deprecated "chinese_plan" field ignored; "alternative_plan_language" takes precedence. Remove "chinese_plan" from your humanize config.`
- If `HAS_ALT_LANGUAGE_KEY=false` and `CONFIG_CHINESE_PLAN_RAW` is `true`, treat the effective `alternative_plan_language` as `"Chinese"`. Log: `Warning: deprecated "chinese_plan" field detected. Replace it with "alternative_plan_language": "Chinese" in your humanize config.`
- Otherwise treat the effective `alternative_plan_language` as disabled.
3. Resolve `ALT_PLAN_LANGUAGE` and `ALT_PLAN_LANG_CODE` from the effective `alternative_plan_language` value using the built-in mapping table below. Matching is **case-insensitive**.
| Language | Code | Suffix |
|------------|------|--------|
| Chinese | zh | `_zh` |
| Korean | ko | `_ko` |
| Japanese | ja | `_ja` |
| Spanish | es | `_es` |
| French | fr | `_fr` |
| German | de | `_de` |
| Portuguese | pt | `_pt` |
| Russian | ru | `_ru` |
| Arabic | ar | `_ar` |
Matching accepts both the language name (e.g. `"Chinese"`) and the ISO 639-1 code (e.g. `"zh"`), both case-insensitive. Leading/trailing whitespace is trimmed before matching.
- If the value is empty or absent: set `ALT_PLAN_LANGUAGE=""` and `ALT_PLAN_LANG_CODE=""` (disabled).
- If the value is `"English"` or `"en"` (case-insensitive): set `ALT_PLAN_LANGUAGE=""` and `ALT_PLAN_LANG_CODE=""` (no-op; the plan is already in English).
- If the value matches a language name or code in the table: set `ALT_PLAN_LANGUAGE` to the matched language name and `ALT_PLAN_LANG_CODE` to the corresponding code.
- If the value does NOT match any language name or code in the table: set `ALT_PLAN_LANGUAGE=""` and `ALT_PLAN_LANG_CODE=""` (disabled). Log: `Warning: unsupported alternative_plan_language "<value>". Supported values: Chinese (zh), Korean (ko), Japanese (ja), Spanish (es), French (fr), German (de), Portuguese (pt), Russian (ru), Arabic (ar). Translation variant will not be generated.`
4. Resolve `CONFIG_GEN_PLAN_MODE_RAW` from the merged config:
- Valid values: `"discussion"` or `"direct"` (case-insensitive).
- Invalid or absent values: treat as absent (fall back to default) and log a warning if the value is present but invalid.
5. Resolve `GEN_PLAN_MODE` using the following priority (highest to lowest), with CLI flags taking priority over merged config:
- CLI flag: if `GEN_PLAN_MODE_DISCUSSION=true`, set `GEN_PLAN_MODE=discussion`; if `GEN_PLAN_MODE_DIRECT=true`, set `GEN_PLAN_MODE=direct`
- Merged config `gen_plan_mode` field (if valid)
- Default: `discussion`
6. Malformed optional user or project config files should be reported as warnings by `load_merged_config` and must NOT stop execution. In those cases, continue with the remaining valid layers and the same effective defaults (`ALT_PLAN_LANGUAGE=""`, `ALT_PLAN_LANG_CODE=""`, and `GEN_PLAN_MODE=discussion`) when no higher-precedence value is available.
`ALT_PLAN_LANGUAGE` and `ALT_PLAN_LANG_CODE` control whether a translated language variant of the output file is written in Phase 8. When `ALT_PLAN_LANGUAGE` is non-empty, a variant file with the `_<ALT_PLAN_LANG_CODE>` suffix is generated.
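The mapping-table lookup in step 3 can be sketched as a `case` statement over the trimmed, lowercased value. `resolve_alt_language` is a hypothetical name, and printing the result as `<Language> <code>` on one line is an assumption made for this illustration, not the plugin's actual interface.

```bash
# Hypothetical helper: resolve a raw alternative_plan_language value.
# Prints "<Language> <code>" for a supported value, or an empty line when disabled.
resolve_alt_language() {
  # Trim surrounding whitespace and lowercase before matching.
  value=$(printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | tr '[:upper:]' '[:lower:]')
  case "$value" in
    ""|english|en) echo "" ;;                 # disabled, or no-op for English
    chinese|zh)    echo "Chinese zh" ;;
    korean|ko)     echo "Korean ko" ;;
    japanese|ja)   echo "Japanese ja" ;;
    spanish|es)    echo "Spanish es" ;;
    french|fr)     echo "French fr" ;;
    german|de)     echo "German de" ;;
    portuguese|pt) echo "Portuguese pt" ;;
    russian|ru)    echo "Russian ru" ;;
    arabic|ar)     echo "Arabic ar" ;;
    *)
      echo "Warning: unsupported alternative_plan_language \"$1\". Supported values: Chinese (zh), Korean (ko), Japanese (ja), Spanish (es), French (fr), German (de), Portuguese (pt), Russian (ru), Arabic (ar). Translation variant will not be generated." >&2
      echo ""
      ;;
  esac
}
```

The plan file suffix then follows from the code, for example `_zh` for `Chinese`.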
---
## Phase 1: IO Validation
Execute the validation script with the provided arguments:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-plan-io.sh" $ARGUMENTS
```
**Handle exit codes:**
- Exit code 0: Continue to Phase 2. Parse the `TEMPLATE_FILE:` line from stdout to get the template path.
- Exit code 1: Report "Input file not found" and stop
- Exit code 2: Report "Input file is empty" and stop
- Exit code 3: Report "Output directory does not exist - please create it" and stop
- Exit code 4: Report "Output file already exists - please choose another path" and stop
- Exit code 5: Report "No write permission to output directory" and stop
- Exit code 6: Report "Invalid arguments" and show usage, then stop
- Exit code 7: Report "Plan template file not found - plugin configuration error" and stop
**Note:** The validation script is side-effect-free. It does NOT create the output file.
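The exit-code table above maps directly onto a `case` statement. The sketch below uses a hypothetical `validation_message` function; in practice it would be called with `$?` immediately after running the validation script.

```bash
# Hypothetical helper: translate a validate-gen-plan-io.sh exit code into the
# message listed above. Intended call site (not executed here):
#   "${CLAUDE_PLUGIN_ROOT}/scripts/validate-gen-plan-io.sh" $ARGUMENTS
#   validation_message "$?"
validation_message() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "Input file not found" ;;
    2) echo "Input file is empty" ;;
    3) echo "Output directory does not exist - please create it" ;;
    4) echo "Output file already exists - please choose another path" ;;
    5) echo "No write permission to output directory" ;;
    6) echo "Invalid arguments" ;;
    7) echo "Plan template file not found - plugin configuration error" ;;
    *) echo "Unexpected exit code: $1" ;;
  esac
}
```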
---
## Phase 2: Relevance Check
After IO validation passes, check if the draft is relevant to this repository.
> **Note**: Do not spend too much time on this check. The draft passes unless it is completely unrelated to the current project (as unrelated as ship design is to cake recipes).
1. Read the input draft file to get its content
2. Use the Task tool to invoke the `humanize:draft-relevance-checker` agent (haiku model):
```
Task tool parameters:
- model: "haiku"
- prompt: Include the draft content and ask the agent to:
1. Explore the repository structure (README, CLAUDE.md, main files)
2. Analyze if the draft content relates to this repository
3. Return either `RELEVANT: <reason>` or `NOT_RELEVANT: <reason>`
```
3. **If NOT_RELEVANT**:
- Report: "The draft content does not appear to be related to this repository."
- Show the reason from the relevance check
- Stop the command
4. **If RELEVANT**: Create the output plan file by copying the template and appending the draft:
```bash
cp "$TEMPLATE_FILE" "$OUTPUT_FILE" \
  && echo "" >> "$OUTPUT_FILE" \
  && echo "--- Original Design Draft Start ---" >> "$OUTPUT_FILE" \
  && echo "" >> "$OUTPUT_FILE" \
  && cat "$INPUT_FILE" >> "$OUTPUT_FILE" \
  && echo "" >> "$OUTPUT_FILE" \
  && echo "--- Original Design Draft End ---" >> "$OUTPUT_FILE"
```
Then continue to Phase 3.
---
## Phase 3: Codex First-Pass Analysis
After the relevance check, invoke Codex BEFORE Claude plan synthesis.
This pass is the first planning analysis, and its findings feed the plan details that Claude synthesizes afterwards.
1. Run:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/ask-codex.sh" "<structured prompt>"
```
2. The structured prompt MUST include:
- Repository context (project purpose, relevant files)
- Raw draft content
- Explicit request to critique assumptions, identify missing requirements, and propose stronger plan directions
3. Require Codex output to follow this format:
- `CORE_RISKS:` highest-risk assumptions and potential failure modes
- `MISSING_REQUIREMENTS:` likely omitted requirements or edge cases
- `TECHNICAL_GAPS:` feasibility or architecture gaps
- `ALTERNATIVE_DIRECTIONS:` viable alternatives with tradeoffs
- `QUESTIONS_FOR_USER:` questions that need explicit human decisions
- `CANDIDATE_CRITERIA:` candidate acceptance criteria suggestions
4. Preserve this output as **Codex Analysis v1** and feed it into Claude planning.
5. Record a concise planning summary from this analysis.
### Codex Availability Handling
If `ask-codex.sh` fails (missing Codex CLI, timeout, or runtime error), use AskUserQuestion and let the user choose:
- Retry with updated Codex settings/environment
- Continue with Claude-only planning (explicitly note reduced cross-review confidence in plan output)
---
## Phase 4: Claude Candidate Plan (v1)
Use draft content + Codex Analysis v1 to produce an initial candidate plan and issue map.
Deeply analyze the draft for potential issues. Use Explore agents to investigate the codebase.
Alongside candidate plan v1, prepare a concise implementation summary covering scope, boundaries, dependencies, and known risks.
### Analysis Dimensions
1. **Clarity**: Is the draft's intent and goals clearly expressed?
- Are objectives well-defined?
- Is the scope clear?
- Are terms and concepts unambiguous?
2. **Consistency**: Does the draft contradict itself?
- Are requirements internally consistent?
- Do different sections align with each other?
3. **Completeness**: Are there missing considerations?
- Use Explore agents to investigate parts of the codebase the draft might affect
- Identify dependencies, side effects, or related components not mentioned
- Check if the draft overlooks important edge cases
4. **Functionality**: Does the design have fundamental flaws?
- Would the proposed approach actually work?
- Are there technical limitations not addressed?
- Could the design negatively impact existing functionality?
### Exploration Strategy
Use the Task tool with `subagent_type: "Explore"` to investigate:
- Components mentioned in the draft
- Related files and directories
- Existing patterns and conventions
- Dependencies and integrations
---
## Phase 5: Iterative Convergence Loop (Claude <-> Second Codex)
If `GEN_PLAN_MODE=direct`, skip this entire phase. The plan proceeds directly from candidate plan v1 (Phase 4) to Phase 6 without convergence rounds. Since no convergence rounds or second-pass review occurred, set `PLAN_CONVERGENCE_STATUS=partially_converged` and `HUMAN_REVIEW_REQUIRED=true` (direct mode must NOT satisfy `--auto-start-rlcr-if-converged` conditions).
After Claude candidate plan v1 is ready, run iterative challenge/refine rounds with a SECOND Codex pass.
### Convergence Round Steps
1. **Second Codex Reasonability Review**
   - Run:
     ```bash
     "${CLAUDE_PLUGIN_ROOT}/scripts/ask-codex.sh" "<review current candidate plan>"
     ```
   - The prompt MUST include the current candidate plan, prior disagreements, and unresolved items
   - Require this output format:
     - `AGREE:` points accepted as reasonable
     - `DISAGREE:` points considered unreasonable, and why
     - `REQUIRED_CHANGES:` must-fix items before convergence
     - `OPTIONAL_IMPROVEMENTS:` non-blocking improvements
     - `UNRESOLVED:` opposing positions that need a user decision
2. **Claude Revision**
   - Claude updates the candidate plan to address `REQUIRED_CHANGES`
   - Claude documents accepted and rejected suggestions with rationale
3. **Convergence Assessment**
   - Update a per-round convergence matrix that records:
     - Topic
     - Claude position
     - Second Codex position
     - Resolution status (`resolved`, `needs_user_decision`, `deferred`)
     - Round-to-round delta
### Loop Termination Rules
Repeat convergence rounds until one of the following is true:
- No `REQUIRED_CHANGES` remain and no high-impact `DISAGREE` remains
- Two consecutive rounds produce no material plan changes
- The maximum of 3 rounds is reached
If the maximum is reached while opposing positions are still unresolved, carry them explicitly into the user decision steps of Phase 6.
Set convergence state explicitly:
- `PLAN_CONVERGENCE_STATUS=converged` when convergence conditions are met
- `PLAN_CONVERGENCE_STATUS=partially_converged` otherwise
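The termination rules above can be sketched as a single predicate. This is an illustration only: the function name and the four counters are hypothetical bookkeeping that the caller would maintain per round, not part of the plugin's scripts.

```bash
# Decide whether the convergence loop should stop after a round.
# Arguments (hypothetical counters kept by the caller):
#   $1  remaining REQUIRED_CHANGES items
#   $2  remaining high-impact DISAGREE items
#   $3  consecutive rounds without material plan changes
#   $4  rounds completed so far
# Returns 0 (stop) or 1 (run another round).
should_stop_convergence() {
  local required="$1" disagree="$2" quiet_rounds="$3" round="$4"
  if (( required == 0 && disagree == 0 )); then return 0; fi
  if (( quiet_rounds >= 2 )); then return 0; fi
  if (( round >= 3 )); then return 0; fi
  return 1
}
```

Stopping because of the round cap does not by itself imply `converged`; the status is still set from the convergence conditions listed above.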
---
## Phase 6: Issue and Disagreement Resolution
> **Critical**: The draft document contains the most valuable human input. During issue resolution, NEVER discard or override any original draft content. All clarifications should be treated as incremental additions that supplement the draft, not replacements. Keep track of both the original draft statements and the clarified information.
### Step 1: Manual Review Gate
Decide if manual review can be skipped:
- If `GEN_PLAN_MODE=direct`, set `HUMAN_REVIEW_REQUIRED=true`
- Else if `AUTO_START_RLCR_IF_CONVERGED=true` **and** `PLAN_CONVERGENCE_STATUS=converged`, set `HUMAN_REVIEW_REQUIRED=false`
- Otherwise set `HUMAN_REVIEW_REQUIRED=true`
If `HUMAN_REVIEW_REQUIRED=false`, run Step 1.5, then skip Steps 2-4 and continue to Phase 7.
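A minimal sketch of the review gate, assuming the three inputs are passed as plain strings; the function name `human_review_required` is hypothetical.

```bash
# Print "true" or "false" for HUMAN_REVIEW_REQUIRED.
#   $1  GEN_PLAN_MODE                 (discussion | direct)
#   $2  AUTO_START_RLCR_IF_CONVERGED  (true | false)
#   $3  PLAN_CONVERGENCE_STATUS       (converged | partially_converged)
human_review_required() {
  local mode="$1" auto_start="$2" status="$3"
  if [[ "$mode" == "direct" ]]; then
    echo true                      # direct mode always needs review
  elif [[ "$auto_start" == "true" && "$status" == "converged" ]]; then
    echo false                     # the only case where review is skipped
  else
    echo true
  fi
}
```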
### Step 1.5: Consolidate Pending User Decisions (runs unconditionally)
Before proceeding (regardless of `HUMAN_REVIEW_REQUIRED`), consolidate all user-facing questions from prior phases into the plan's `## Pending User Decisions` section:
1. Extract `QUESTIONS_FOR_USER` items from Codex Analysis v1 (Phase 3)
2. Extract items with status `needs_user_decision` from the final convergence matrix (Phase 5) — use the last round's state, not intermediate rounds
3. Deduplicate: if the same topic appears in both sources, merge into one entry
4. For each collected item, check whether it was substantively resolved during plan refinement in Phases 4-5 (i.e., Claude addressed it and the second Codex agreed in a subsequent round). Remove only items with clear evidence of resolution.
5. Write all remaining unresolved items into the plan's `## Pending User Decisions` section. Use `DEC-N` identifiers. Set `Decision Status` to `PENDING`.
- For Claude-vs-Codex disagreements: fill `Claude Position`, `Codex Position`, and `Tradeoff Summary`
- For open questions (no opposing positions): set `Claude Position` to Claude's tentative answer (if any), `Codex Position` to `N/A - open question`, and `Tradeoff Summary` to the question's context
This ensures:
- When `HUMAN_REVIEW_REQUIRED=true`: items are visible for Steps 2-4 user resolution
- When `HUMAN_REVIEW_REQUIRED=false`: items block auto-start via Phase 8 Step 5's `PENDING` check
### Step 2: Resolve Analysis Issues (when manual review is required)
If any issues are found during Codex-first analysis, Claude analysis, or convergence loop, use AskUserQuestion to clarify with the user.
For each issue category that has problems, present:
- What the issue is
- Why it matters
- Options for resolution (if applicable)
Continue this dialogue until all significant issues are resolved or acknowledged by the user.
### Step 3: Confirm Quantitative Metrics (when manual review is required)
After all analysis issues are resolved, check the draft for any quantitative metrics or numeric thresholds, such as:
- Performance targets: "less than 15GB/s", "under 100ms latency"
- Size constraints: "below 300KB", "maximum 1MB"
- Count limits: "more than 10 files", "at least 5 retries"
- Percentage goals: "95% coverage", "reduce by 50%"
For each quantitative metric found, use AskUserQuestion to explicitly confirm with the user:
- Is this a **hard requirement** that must be achieved for the implementation to be considered successful?
- Or is this describing an **optimization trend/direction** where improvement toward the target is acceptable even if the exact number is not reached?
Document the user's answer for each metric, as this distinction significantly affects how acceptance criteria should be written in the plan.
---
### Step 4: Resolve Unresolved Claude/Codex Disagreements (when manual review is required)
For every item marked `needs_user_decision`, explicitly ask the user to decide.
For each unresolved disagreement, present:
- The decision topic
- Claude's position
- Codex's position
- Tradeoffs and risks of each option
- A clear recommendation (if one option is materially safer)
If the user does not decide immediately, keep the item in the plan as `PENDING` under a dedicated user-decision section.
---
## Phase 7: Final Plan Generation
Think deeply, then generate `plan.md` following these rules:
### Plan Structure
```markdown
# <Plan Title>
## Goal Description
<Clear, direct description of what needs to be accomplished>
## Acceptance Criteria
Following TDD philosophy, each criterion includes positive and negative tests for deterministic verification.
- AC-1: <First criterion>
- Positive Tests (expected to PASS):
- <Test case that should succeed when criterion is met>
- <Another success case>
- Negative Tests (expected to FAIL):
- <Test case that should fail/be rejected when working correctly>
- <Another failure/rejection case>
- AC-1.1: <Sub-criterion if needed>
- Positive: <...>
- Negative: <...>
- AC-2: <Second criterion>
- Positive Tests: <...>
- Negative Tests: <...>
...
## Path Boundaries
Path boundaries define the acceptable range of implementation quality and choices.
### Upper Bound (Maximum Acceptable Scope)
<Affirmative description of the most comprehensive acceptable implementation>
<This represents completing the goal without over-engineering>
Example: "The implementation includes X, Y, and Z features with full test coverage"
### Lower Bound (Minimum Acceptable Scope)
<Affirmative description of the minimum viable implementation>
<This represents the least effort that still satisfies all acceptance criteria>
Example: "The implementation includes core feature X with basic validation"
### Allowed Choices
<Options that are acceptable for implementation decisions>
- Can use: <technologies, approaches, patterns that are allowed>
- Cannot use: <technologies, approaches, patterns that are prohibited>
> **Note on Deterministic Designs**: If the draft specifies a highly deterministic design with no choices (e.g., "must use JSON format", "must use algorithm X"), then the path boundaries should reflect this narrow constraint. In such cases, upper and lower bounds may converge to the same point, and "Allowed Choices" should explicitly state that the choice is fixed per the draft specification.
## Feasibility Hints and Suggestions
> **Note**: This section is for reference and understanding only. These are conceptual suggestions, not prescriptive requirements.
### Conceptual Approach
<Text description, pseudocode, or diagrams showing ONE possible implementation path>
### Relevant References
<Code paths and concepts that might be useful>
- <path/to/relevant/component> - <brief description>
## Dependencies and Sequence
### Milestones
1. <Milestone 1>: <Description>
- Phase A: <...>
- Phase B: <...>
2. <Milestone 2>: <Description>
- Step 1: <...>
- Step 2: <...>
<Describe relative dependencies between components, not time estimates>
## Task Breakdown
Each task must include exactly one routing tag:
- `coding`: implemented by Claude
- `analyze`: executed via Codex (`/humanize:ask-codex`)
| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On |
|---------|-------------|-----------|----------------------------|------------|
| task1 | <...> | AC-1 | coding | - |
| task2 | <...> | AC-2 | analyze | task1 |
## Claude-Codex Deliberation
### Agreements
- <Point both sides agree on>
### Resolved Disagreements
- <Topic>: Claude vs Codex summary, chosen resolution, and rationale
### Convergence Status
- Final Status: `converged` or `partially_converged`
## Pending User Decisions
- DEC-1: <Decision topic>
- Claude Position: <...>
- Codex Position: <...>
- Tradeoff Summary: <...>
- Decision Status: `PENDING` or `<User's final decision>`
## Implementation Notes
### Code Style Requirements
- Implementation code and comments must NOT contain plan-specific terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers
- These terms are for plan documentation only, not for the resulting codebase
- Use descriptive, domain-appropriate naming in code instead
## Output File Convention
This template is used to produce the main output file (e.g., `plan.md`).
### Translated Language Variant
When `alternative_plan_language` resolves to a supported language name through merged config loading, a translated variant of the output file is also written after the main file. Humanize loads config from merged layers in this order: default config, optional user config, then optional project config; `alternative_plan_language` may be set at any of those layers. The variant filename is constructed by inserting `_<code>` (the ISO 639-1 code from the built-in mapping table) immediately before the file extension:
- `plan.md` becomes `plan_<code>.md` (e.g. `plan_zh.md` for Chinese, `plan_ko.md` for Korean)
- `docs/my-plan.md` becomes `docs/my-plan_<code>.md`
- `output` (no extension) becomes `output_<code>`
The translated variant file contains a full translation of the main plan file's current content in the configured language. All identifiers (`AC-*`, task IDs, file paths, API names, command flags) remain unchanged, as they are language-neutral.
When `alternative_plan_language` is empty, absent, set to `"English"`, or set to an unsupported language, no translated variant is written. Humanize does not auto-create `.humanize/config.json` when no project config file is present.
```
### Generation Rules
1. **Terminology**: Use Milestone, Phase, Step, Section. Never use Day, Week, Month, Year, or time estimates.
2. **No Line Numbers**: Reference code by path only (e.g., `src/utils/helpers.ts`), never by line ranges.
3. **No Time Estimates**: Do not estimate duration, effort, or code line counts.
4. **Conceptual Not Prescriptive**: Path boundaries and suggestions guide without mandating.
5. **AC Format**: All acceptance criteria must use AC-X or AC-X.Y format.
6. **Clear Dependencies**: Show what depends on what, not when things happen.
7. **TDD-Style Tests**: Each acceptance criterion MUST include both positive tests (expected to pass) and negative tests (expected to fail). This follows Test-Driven Development philosophy and enables deterministic verification.
8. **Affirmative Path Boundaries**: Describe upper and lower bounds using affirmative language (what IS acceptable) rather than negative language (what is NOT acceptable).
9. **Respect Deterministic Designs**: If the draft specifies a fixed approach with no choices, reflect this in the plan by narrowing the path boundaries to match the user's specification.
10. **Code Style Constraint**: The generated plan MUST include a section or note instructing that implementation code and comments should NOT contain plan-specific progress terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These terms belong in the plan document, not in the resulting codebase.
11. **Draft Completeness Requirement**: The generated plan MUST incorporate ALL information from the input draft document without omission. The draft represents the most valuable human input and must be fully preserved. Any clarifications obtained through Phase 6 should be added incrementally to the draft's original content, never replacing or losing any original requirements. The final plan must be a superset of the draft information plus all clarified details.
12. **Debate Traceability**: The plan MUST include Codex-first findings, Claude/Codex agreements, resolved disagreements, and unresolved decisions. Unresolved opposing positions MUST be recorded in `## Pending User Decisions` for explicit user decision.
13. **Convergence Requirement**: The plan MUST record Claude/Codex agreements, resolved disagreements, and final convergence status in `## Claude-Codex Deliberation`. Stop only when the convergence conditions are met or the maximum number of rounds is reached with explicit carry-over decisions.
14. **Task Tag Requirement**: The plan MUST include `## Task Breakdown`, and every task MUST be tagged as either `coding` or `analyze` (no untagged tasks, no other tag values).
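Rule 14 lends itself to a mechanical check. The sketch below lists task IDs whose tag is neither `coding` nor `analyze`. The function name is hypothetical, and the parsing assumes the column order from the template (`Task ID`, `Description`, `Target AC`, `Tag`, `Depends On`) with no `|` characters inside cells.

```bash
# Print the Task ID of every row under "## Task Breakdown" whose tag
# column is not exactly "coding" or "analyze".
invalid_task_tags() {
  awk -F'|' '
    # Track whether the current level-2 section is the task table section.
    /^## / { in_section = ($0 ~ /^## Task Breakdown/); next }
    in_section && /^\|/ {
      id = $2; tag = $5
      gsub(/^[ \t]+|[ \t]+$/, "", id)
      gsub(/^[ \t]+|[ \t]+$/, "", tag)
      if (id == "Task ID" || id ~ /^-+$/) next   # header and separator rows
      if (tag != "coding" && tag != "analyze") print id
    }
  ' "$1"
}
```

Empty output means every task row carries a valid routing tag.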
---
## Phase 8: Write and Complete
The output file already contains the plan template structure and the original draft content (combined after the relevance check). Now complete the plan through the following steps:
### Step 1: Update Plan Content
Use the **Edit tool** (not Write) to update the plan file with the generated content:
- Replace template placeholders with actual plan content
- Keep the original draft section intact at the bottom of the file
- The final file should contain both the structured plan AND the original draft for reference
### Step 2: Comprehensive Review
After updating, **read the complete plan file** and verify:
- The plan is complete and comprehensive
- All sections are consistent with each other
- The structured plan aligns with the original draft content
- Claude/Codex disagreement handling is explicit and correctly reflected
- No contradictions exist between different parts of the document
If inconsistencies are found, fix them using the Edit tool.
### Step 3: Language Unification
Check if the updated plan file contains multiple languages (e.g., mixed English and Chinese content).
If multiple languages are detected:
1. Use **AskUserQuestion** to ask the user:
- Whether they want to unify the language
- Which language to use for unification
2. If the user chooses to unify:
- Translate all content to the chosen language
- Ensure the meaning and intent remain unchanged
- Use the Edit tool to apply the translations
3. If the user declines, leave the document as-is
### Step 4: Write Translated Language Variant (Conditional)
If `ALT_PLAN_LANGUAGE` is non-empty (translation enabled), write a translated variant of the output file.
**Language Unification guard**: If the main plan file was unified to `ALT_PLAN_LANGUAGE` in Step 3 (Language Unification), skip this step. Log: `Main plan file is already in <ALT_PLAN_LANGUAGE>; translated variant not needed.`
**Filename construction rule** - insert `_<ALT_PLAN_LANG_CODE>` immediately before the file extension:
- `plan.md` becomes `plan_<code>.md` (e.g. `plan_zh.md`, `plan_ko.md`)
- `docs/my-plan.md` becomes `docs/my-plan_<code>.md`
- `output` (no extension) becomes `output_<code>`
Algorithm:
1. Find the last `.` in the base filename.
2. If a `.` is found, insert `_<ALT_PLAN_LANG_CODE>` before it: `<stem>_<code>.<extension>`.
3. If no `.` is found (no extension), append `_<ALT_PLAN_LANG_CODE>` to the filename: `<filename>_<code>`.
4. The variant file is placed in the same directory as the main output file.
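The algorithm above can be sketched with shell parameter expansion. The function name `variant_path` is hypothetical; it applies the last-dot rule to the base filename only, so dots in directory names are left alone.

```bash
# Build the translated-variant path for a main output path.
#   $1  main output path, e.g. docs/my-plan.md
#   $2  language code,    e.g. zh
variant_path() {
  local path="$1" code="$2" dir="" base
  base="${path##*/}"                       # filename without directories
  if [[ "$path" == */* ]]; then
    dir="${path%/*}/"                      # keep the same directory
  fi
  if [[ "$base" == *.* ]]; then
    # Insert _<code> before the last "." of the filename.
    printf '%s%s_%s.%s\n' "$dir" "${base%.*}" "$code" "${base##*.}"
  else
    # No extension: append _<code>.
    printf '%s%s_%s\n' "$dir" "$base" "$code"
  fi
}
```

For example, `variant_path docs/my-plan.md ko` yields `docs/my-plan_ko.md`.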
**Content of the variant file**:
- Translate the main plan file's current content (after any Language Unification from Step 3) into `ALT_PLAN_LANGUAGE`. For Chinese, default to Simplified Chinese.
- Section headings, AC labels, task IDs, file paths, API names, and command flags MUST remain unchanged (identifiers are language-neutral).
- The variant file is a translated reading view of the same plan; it must not add new information not present in the main file.
- The original draft section at the bottom should be kept as-is (not re-translated).
If `ALT_PLAN_LANGUAGE` is empty (the default), do NOT create a translated variant file.
### Step 5: Optional Direct Work Start
If all of the following are true:
- `AUTO_START_RLCR_IF_CONVERGED=true`
- `PLAN_CONVERGENCE_STATUS=converged`
- `GEN_PLAN_MODE=discussion`
- There are no pending decisions with status `PENDING`
Then start work immediately by running:
```bash
/humanize:start-rlcr-loop --skip-quiz <output-plan-path>
```
The `--skip-quiz` flag is passed because the user has already demonstrated understanding of the plan through the gen-plan convergence discussion.
If the command invocation is not available in this context, fall back to the setup script:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" --skip-quiz --plan-file <output-plan-path>
```
If the auto-start attempt fails, report the failure reason and provide the exact manual command for the user to run:
```bash
/humanize:start-rlcr-loop <output-plan-path>
```
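The four auto-start conditions can be checked together before invoking the command. This is a sketch under assumptions: the function name is hypothetical, and pending decisions are detected by looking for `Decision Status:` lines whose value is exactly `PENDING` (with or without backticks), which is one plausible reading of the plan template rather than a documented contract.

```bash
# Return 0 only when every auto-start condition holds.
#   $1  AUTO_START_RLCR_IF_CONVERGED
#   $2  PLAN_CONVERGENCE_STATUS
#   $3  GEN_PLAN_MODE
#   $4  path to the generated plan file
can_auto_start() {
  local auto_start="$1" status="$2" mode="$3" plan_file="$4"
  [[ "$auto_start" == "true" ]] || return 1
  [[ "$status" == "converged" ]] || return 1
  [[ "$mode" == "discussion" ]] || return 1
  # Any decision still marked PENDING blocks the automatic start.
  if grep -Eq 'Decision Status:[[:space:]]*`?PENDING`?[[:space:]]*$' "$plan_file"; then
    return 1
  fi
  return 0
}
```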
### Step 6: Report Results
Report to the user:
- Path to the generated plan
- Summary of what was included
- Number of acceptance criteria defined
- Number of convergence rounds executed
- Number of unresolved user decisions (if any)
- Whether language was unified (if applicable)
- Whether direct work start was attempted, and its result
---
## Error Handling
If issues arise during plan generation that require user input:
- Use AskUserQuestion to clarify
- Document any user decisions in the plan's context
If auto-start mode is enabled but convergence conditions are not met:
- Explain why direct start was skipped
- Tell the user to either resolve pending decisions or run `/humanize:start-rlcr-loop <plan.md>` manually
If unable to generate a complete plan:
- Explain what information is missing
- Suggest how the user can improve their draft
---
description: "Refine an annotated implementation plan and generate a QA ledger"
argument-hint: "--input <path/to/annotated-plan.md> [--output <path/to/refined-plan.md>] [--qa-dir <path/to/qa-dir>] [--alt-language <language-or-code>] [--discussion|--direct]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/validate-refine-plan-io.sh:*)"
- "Read"
- "Glob"
- "Grep"
- "Write"
- "Edit"
- "AskUserQuestion"
hide-from-slash-command-tool: "true"
---
# Refine Annotated Plan
Read and execute below with ultrathink.
## Hard Constraint: Planning-Only Refinement
This command MUST ONLY refine plan artifacts. It MUST NOT implement repository code, modify source files unrelated to the plan outputs, start RLCR automatically, or create a new plan schema.
Permitted writes are limited to:
- The refined plan output file (`--output`, or `--input` in in-place mode)
- The QA document under `--qa-dir`
- Optional translated language variants for the refined plan and QA document
The refined plan MUST reuse the existing `gen-plan` schema. Do not invent new top-level sections. Keep required sections intact, preserve optional sections when present, and preserve any `--- Original Design Draft Start ---` appendix or other non-comment content unless a comment explicitly requires a plan-level change there.
## Workflow Overview
> **Sequential Execution Constraint**: Execute the phases strictly in order. Do NOT parallelize work across phases. Finish each phase before moving to the next one.
1. **Execution Mode Setup**: Parse CLI arguments and derive output paths
2. **Load Project Config**: Resolve `alternative_plan_language` and mode defaults using `config-loader.sh` semantics
3. **IO Validation**: Run `validate-refine-plan-io.sh`
4. **Comment Extraction**: Scan the annotated plan and extract valid comment blocks (`CMT:`/`ENDCMT`, `<cmt>`/`</cmt>`, `<comment>`/`</comment>`)
5. **Comment Classification**: Classify each extracted comment for downstream handling
6. **Comment Processing**: Answer questions, apply requested plan edits, and perform targeted research
7. **Plan Refinement**: Produce the comment-free refined plan while preserving the `gen-plan` structure
8. **QA Generation**: Populate the QA template with the comment ledger and outcomes
9. **Atomic Write**: Commit the refined plan, QA document, and optional variants as one transaction
---
## Phase 0: Execution Mode Setup
Parse `$ARGUMENTS` and set the following variables:
- `INPUT_FILE` from `--input` (required)
- `OUTPUT_FILE` from `--output`
- `QA_DIR` from `--qa-dir`
- `CLI_ALT_LANGUAGE_RAW` from `--alt-language`
- `REFINE_PLAN_MODE_DISCUSSION=true` if `--discussion` is present
- `REFINE_PLAN_MODE_DIRECT=true` if `--direct` is present
Argument rules:
1. `--input <path>` is required.
2. `--output <path>` is optional. If omitted, set `OUTPUT_FILE=INPUT_FILE` for in-place mode.
3. `--qa-dir <path>` is optional. If omitted, set `QA_DIR=.humanize/plan_qa`.
4. `--alt-language <language-or-code>` is optional. If present without a value, report `Invalid arguments: --alt-language requires a value` and stop.
5. `--discussion` and `--direct` are mutually exclusive. If both are present, report `Cannot use --discussion and --direct together` and stop.
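The argument rules above could be implemented along these lines. This is an illustrative sketch, not the command's actual parser: the function name is hypothetical, and the messages for a missing `--input`, a missing option value other than `--alt-language`, and unknown options are assumptions, since the rules above only fix the wording of two messages.

```bash
# Parse refine-plan arguments into the Phase 0 variables (globals).
parse_refine_args() {
  INPUT_FILE="" OUTPUT_FILE="" QA_DIR=".humanize/plan_qa" CLI_ALT_LANGUAGE_RAW=""
  REFINE_PLAN_MODE_DISCUSSION=false REFINE_PLAN_MODE_DIRECT=false
  while (( $# > 0 )); do
    case "$1" in
      --input|--output|--qa-dir)
        if (( $# < 2 )); then
          echo "Invalid arguments: $1 requires a value" >&2   # assumed wording
          return 1
        fi
        case "$1" in
          --input)  INPUT_FILE="$2" ;;
          --output) OUTPUT_FILE="$2" ;;
          --qa-dir) QA_DIR="$2" ;;
        esac
        shift 2 ;;
      --alt-language)
        # "Without a value" covers end of input and a following option.
        if (( $# < 2 )) || [[ "$2" == --* ]]; then
          echo "Invalid arguments: --alt-language requires a value" >&2
          return 1
        fi
        CLI_ALT_LANGUAGE_RAW="$2"; shift 2 ;;
      --discussion) REFINE_PLAN_MODE_DISCUSSION=true; shift ;;
      --direct)     REFINE_PLAN_MODE_DIRECT=true; shift ;;
      *)
        echo "Invalid arguments: unknown option $1" >&2        # assumed wording
        return 1 ;;
    esac
  done
  if [[ -z "$INPUT_FILE" ]]; then
    echo "Invalid arguments: --input is required" >&2          # assumed wording
    return 1
  fi
  if [[ "$REFINE_PLAN_MODE_DISCUSSION" == true && "$REFINE_PLAN_MODE_DIRECT" == true ]]; then
    echo "Cannot use --discussion and --direct together" >&2
    return 1
  fi
  # In-place mode: the output defaults to the input path.
  [[ -n "$OUTPUT_FILE" ]] || OUTPUT_FILE="$INPUT_FILE"
}
```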
Derived paths:
1. Compute `IN_PLACE_MODE=true` when `OUTPUT_FILE` equals `INPUT_FILE`; otherwise `false`.
2. Compute `QA_FILE` from the input basename, not the output basename:
- `plan.md` becomes `<QA_DIR>/plan-qa.md`
- `docs/my-plan.md` becomes `<QA_DIR>/my-plan-qa.md`
- `plan` becomes `<QA_DIR>/plan-qa.md`
3. Keep `--alt-language` out of the validator invocation because `validate-refine-plan-io.sh` does not accept it. Pass only:
- `--input`
- `--output` when provided
- `--qa-dir` when provided
- `--discussion` or `--direct` when provided
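As a small sketch of the `QA_FILE` derivation in item 2 above (the function name `qa_file_for` is hypothetical), note that only the input's base filename matters, and an extension-less input still yields a `.md` QA file:

```bash
# Derive QA_FILE from the input path and the QA directory.
#   $1  INPUT_FILE
#   $2  QA_DIR (defaults to .humanize/plan_qa)
qa_file_for() {
  local input="$1" qa_dir="${2:-.humanize/plan_qa}"
  local base="${input##*/}"          # strip directories
  if [[ "$base" == *.* ]]; then
    base="${base%.*}"                # strip the final extension, if any
  fi
  printf '%s/%s-qa.md\n' "$qa_dir" "$base"
}
```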
Scope rules for v1:
- Do not introduce `--language` or `--qa-output`
- Do not add new config keys
- Do not auto-start RLCR after refinement
---
## Phase 0.5: Load Project Config
Resolve configuration by following the same precedence and merge semantics defined in `${CLAUDE_PLUGIN_ROOT}/scripts/lib/config-loader.sh`. Reuse that behavior; do not invent a separate refine-plan config model.
### Config Merge Semantics
Use the same layer order as `load_merged_config`:
1. Required default config: `${CLAUDE_PLUGIN_ROOT}/config/default_config.json`
2. Optional user config: `${XDG_CONFIG_HOME:-$HOME/.config}/humanize/config.json`
3. Optional project config: `${HUMANIZE_CONFIG:-$PROJECT_ROOT/.humanize/config.json}`
Later layers override earlier layers. A malformed optional config file (user or project layer) produces a warning and is ignored. A malformed required default config is a fatal configuration error.
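For orientation, the layer paths can be listed in merge order as below. This sketch only resolves the three paths; it does not reproduce `load_merged_config`, and the function name and its `project_root` argument are hypothetical.

```bash
# Print the config layer paths in merge order (lowest precedence first).
#   $1  project root directory
config_layer_paths() {
  local project_root="$1"
  printf '%s\n' \
    "${CLAUDE_PLUGIN_ROOT}/config/default_config.json" \
    "${XDG_CONFIG_HOME:-$HOME/.config}/humanize/config.json" \
    "${HUMANIZE_CONFIG:-$project_root/.humanize/config.json}"
}
```

Setting `HUMANIZE_CONFIG` replaces the project-layer path entirely; it does not add a fourth layer.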
### Values to Extract
Read the merged config and resolve:
- `CONFIG_ALT_LANGUAGE_RAW` from `alternative_plan_language`
- `CONFIG_GEN_PLAN_MODE_RAW` from `gen_plan_mode`
### Mode Resolution
Resolve `REFINE_PLAN_MODE` with this priority:
1. CLI `--discussion` => `discussion`
2. CLI `--direct` => `direct`
3. Valid config value `gen_plan_mode` (`discussion` or `direct`, case-insensitive)
4. Default => `discussion`
If `gen_plan_mode` is present but invalid, log a warning and fall back to the next rule.
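The priority order above can be sketched as follows. The function name is hypothetical and the warning text is an assumption; the CLI flags are passed as `true`/`false` strings as set in Phase 0.

```bash
# Resolve REFINE_PLAN_MODE.
#   $1  REFINE_PLAN_MODE_DISCUSSION (true | false)
#   $2  REFINE_PLAN_MODE_DIRECT     (true | false)
#   $3  CONFIG_GEN_PLAN_MODE_RAW    (may be empty)
resolve_refine_plan_mode() {
  local cli_discussion="$1" cli_direct="$2" config_raw="$3" value
  if [[ "$cli_discussion" == "true" ]]; then echo discussion; return 0; fi
  if [[ "$cli_direct" == "true" ]]; then echo direct; return 0; fi
  # Config values are matched case-insensitively.
  value="$(printf '%s' "$config_raw" | tr '[:upper:]' '[:lower:]')"
  case "$value" in
    discussion|direct) echo "$value" ;;
    "") echo discussion ;;
    *)
      echo "Warning: invalid gen_plan_mode \"$config_raw\"; using discussion" >&2
      echo discussion ;;
  esac
}
```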
### Alternative Language Resolution
Resolve the variant language with this priority:
1. CLI `--alt-language`
2. Config `alternative_plan_language`
3. No variant
Normalize the value case-insensitively using this mapping table:
| Language | Code | Suffix |
|------------|------|--------|
| Chinese | zh | `_zh` |
| Korean | ko | `_ko` |
| Japanese | ja | `_ja` |
| Spanish | es | `_es` |
| French | fr | `_fr` |
| German | de | `_de` |
| Portuguese | pt | `_pt` |
| Russian | ru | `_ru` |
| Arabic | ar | `_ar` |
Normalization rules:
1. Trim leading and trailing whitespace before matching.
2. Accept either the full language name or the ISO code from the table.
3. Treat `English` / `en` as a no-op: no translated variant is generated.
4. If the CLI value is unsupported, report `Unsupported --alt-language "<value>"` and stop.
5. If the config value is unsupported, log a warning and disable variant generation.
Set:
- `ALT_PLAN_LANGUAGE` to the normalized language name or empty string
- `ALT_PLAN_LANG_CODE` to the normalized code or empty string
Do not depend on deprecated `chinese_plan`. `refine-plan` only uses `alternative_plan_language`.
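A minimal sketch of the normalization step, using the mapping table above. The function name and the output convention (language name and code on one line, empty output for "no variant", non-zero status for unsupported values) are assumptions; the caller decides whether an unsupported value is fatal (CLI) or only a warning (config).

```bash
# Normalize a raw language value.
# Prints "<Language> <code>" for supported languages, prints an empty line
# for English or empty input, and returns 1 for unsupported values.
normalize_alt_language() {
  local raw="$1" value
  # Trim surrounding whitespace, then lowercase for matching.
  value="$(printf '%s' "$raw" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]')"
  case "$value" in
    chinese|zh)    echo "Chinese zh" ;;
    korean|ko)     echo "Korean ko" ;;
    japanese|ja)   echo "Japanese ja" ;;
    spanish|es)    echo "Spanish es" ;;
    french|fr)     echo "French fr" ;;
    german|de)     echo "German de" ;;
    portuguese|pt) echo "Portuguese pt" ;;
    russian|ru)    echo "Russian ru" ;;
    arabic|ar)     echo "Arabic ar" ;;
    ""|english|en) echo "" ;;          # no translated variant
    *) return 1 ;;
  esac
}
```

A caller could split the result with `read -r ALT_PLAN_LANGUAGE ALT_PLAN_LANG_CODE`.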
---
## Phase 1: IO Validation
Run the validator with the parsed arguments, excluding `--alt-language`:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/validate-refine-plan-io.sh" <validated-arguments>
```
Handle exit codes exactly:
- Exit code 0: Continue to Phase 2
- Exit code 1: Report `Input file not found` and stop
- Exit code 2: Report `Input file is empty` and stop
- Exit code 3: Report `Input file has no comment blocks` and stop
- Exit code 4: Report `Input file is missing required gen-plan sections` and stop
- Exit code 5: Report `Output directory does not exist or is not writable - please fix it` and stop
- Exit code 6: Report `QA directory is not writable` and stop
- Exit code 7: Report `Invalid arguments` and show the validator usage, then stop
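The exit-code table maps naturally onto a `case` statement. In this sketch the function name is hypothetical, and the fallback branch for codes outside 0-7 is an assumption that the list above does not cover.

```bash
# Print the message to report for a validator exit code.
# Prints nothing for 0 (validation passed, continue to Phase 2).
validator_exit_message() {
  case "$1" in
    0) return 0 ;;
    1) echo "Input file not found" ;;
    2) echo "Input file is empty" ;;
    3) echo "Input file has no comment blocks" ;;
    4) echo "Input file is missing required gen-plan sections" ;;
    5) echo "Output directory does not exist or is not writable - please fix it" ;;
    6) echo "QA directory is not writable" ;;
    7) echo "Invalid arguments" ;;
    *) echo "Unexpected validator exit code: $1" ;;   # assumed fallback
  esac
}
```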
Validation notes:
1. `validate-refine-plan-io.sh` may create `QA_DIR` when it does not exist. Treat that as expected setup, not as a side effect to undo.
2. After validation succeeds, read the input file and preserve its exact contents as `ORIGINAL_PLAN_TEXT`.
3. Do not mutate the validated input yet. All writes happen in Phase 7 only.
---
## Phase 2: Comment Extraction
Extract comments using a **stateful scanner** equivalent to POSIX `awk` wrapped by `bash`, not a naive regular expression pass. The scanner behavior must match the Task 3 findings.
### Scanner Requirements
Track these states while scanning the validated input in document order:
- `IN_FENCE` with the active fence marker (` ``` ` or ` ~~~ `)
- `IN_HTML_COMMENT` for `<!-- ... -->`
- `IN_CMT_BLOCK`
- `NEAREST_HEADING`
Extraction rules:
1. Support three comment formats:
- Classic: `CMT:` as start marker and `ENDCMT` as end marker
- Short tag: `<cmt>` as start marker and `</cmt>` as end marker
- Long tag: `<comment>` as start marker and `</comment>` as end marker
2. Support both inline and multi-line blocks for all formats:
- Inline: `Text before CMT: comment text ENDCMT text after`
- Inline: `Text before <cmt>comment text</cmt> text after`
- Inline: `Text before <comment>comment text</comment> text after`
- Multi-line:
```markdown
CMT:
comment text
ENDCMT
```
```markdown
<cmt>
comment text
</cmt>
```
```markdown
<comment>
comment text
</comment>
```
3. Ignore comment markers inside fenced code blocks.
4. Ignore comment markers inside HTML comments.
5. Update `NEAREST_HEADING` whenever a Markdown heading is encountered outside fenced code and HTML comments.
6. Preserve surrounding non-comment text when removing inline comment blocks from the working plan text.
7. Assign raw comment IDs in document order as `CMT-1`, `CMT-2`, ... only for non-empty blocks.
8. If a block is empty after trimming whitespace, remove it from the working plan text but do not create a ledger item and do not consume an ID.
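To illustrate the stateful approach, here is a deliberately reduced `awk` scanner wrapped in a bash function. It is a sketch under stated assumptions, not the required scanner: it handles only multi-line blocks whose markers sit alone on a line, tracks `IN_FENCE`, `IN_CMT_BLOCK`, and `NEAREST_HEADING`, and omits inline blocks, HTML comment tracking, and column positions. The function name and the tab-separated output layout are hypothetical.

```bash
# Emit one line per non-empty comment block:
#   CMT-N <TAB> start_line <TAB> nearest_heading <TAB> normalized_text
scan_comments() {
  awk '
    function trim(s) { sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s); return s }
    function fail(msg) {
      printf "Comment parse error: %s near \"%s\"\n", msg, heading > "/dev/stderr"
      failed = 1
      exit 1
    }
    BEGIN {
      heading = "Preamble"
      tick = sprintf("%c", 96)        # backtick, built from its character code
      backticks = tick tick tick
    }
    {
      line = trim($0)
      lead = substr(line, 1, 3)
      if (in_fence) {                 # inside a fence: only look for the close
        if (lead == fence) in_fence = 0
        next
      }
      if (!in_cmt && (lead == backticks || lead == "~~~")) {
        in_fence = 1; fence = lead; next
      }
      if (line == "CMT:" || line == "<cmt>" || line == "<comment>") {
        if (in_cmt) fail("nested comment block at line " NR)
        in_cmt = 1; start = NR; text = ""
        closer = (line == "CMT:") ? "ENDCMT" : (line == "<cmt>" ? "</cmt>" : "</comment>")
        next
      }
      if (line == "ENDCMT" || line == "</cmt>" || line == "</comment>") {
        if (!in_cmt || line != closer) fail("stray comment end marker at line " NR)
        in_cmt = 0
        text = trim(text)
        # Empty blocks are dropped and do not consume an ID.
        if (text != "") printf "CMT-%d\t%d\t%s\t%s\n", ++n, start, heading, text
        next
      }
      if (in_cmt) { text = text " " line; next }
      if (line ~ /^#+ /) heading = line
    }
    END {
      if (failed) exit 1
      if (in_cmt) {
        printf "Comment parse error: missing end marker for block opened at line %d near \"%s\"\n", start, heading > "/dev/stderr"
        exit 1
      }
    }
  ' "$1"
}
```

Keeping the fence check ahead of the marker checks is what makes markers inside fenced code invisible to the scanner.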
### Extracted Metadata
For each non-empty comment block, capture:
- `id` (`CMT-N`)
- `original_text` exactly as written between the comment markers
- `normalized_text` with surrounding whitespace trimmed
- `start_line`, `start_column`
- `end_line`, `end_column`
- `nearest_heading` or `Preamble` when no heading exists yet
- `location_label` for QA output
- `form` = `inline` or `multiline`
- `context_excerpt` from the nearest non-comment source text
### Parse Errors
These are fatal extraction errors:
1. Nested comment start marker while already inside a comment block
2. A comment end marker encountered outside any comment block, or an end marker that does not match the format of the open block
3. End of file reached while still inside a comment block
Every fatal parse error MUST report:
- The error kind
- The exact line and column
- The nearest heading
- A short context excerpt
Examples of acceptable messages:
- `Comment parse error: nested comment block at line 48, column 3 near "## Acceptance Criteria" (context: "<cmt>split AC-2...")`
- `Comment parse error: stray comment end marker at line 109, column 1 near "## Task Breakdown" (context: "</comment>")`
- `Comment parse error: missing end marker for block opened at line 72, column 5 near "## Dependencies and Sequence"`
### Outputs from Phase 2
Produce:
- `EXTRACTED_COMMENTS`: ordered list of comment records
- `PLAN_WITH_COMMENTS_REMOVED`: the original plan text with every valid comment block removed and surrounding inline text preserved
If `EXTRACTED_COMMENTS` is empty after discarding empty blocks, report `No non-empty CMT blocks remain after parsing` and stop.
---
## Phase 3: Comment Classification
Classify every extracted comment for downstream handling.
### Primary Classification Set
Each raw comment block must receive exactly one primary classification:
- `question`
- `change_request`
- `research_request`
### Heuristic Rules
Use these heuristics first:
- `question`: asks why, how, what, explain, clarify, or says the plan is unclear
- `change_request`: asks to add, remove, delete, rewrite, restore, rename, split, merge, or otherwise modify the plan
- `research_request`: asks to investigate the repository, compare existing patterns, confirm current behavior, or gather evidence before deciding
When more than one intent appears in the same raw block:
1. Keep the raw ledger ID unchanged (`CMT-N`)
2. Create deterministic processing sub-items in textual order: `CMT-N.1`, `CMT-N.2`, ...
3. Assign each sub-item one of the three classifications above
4. Assign the raw block a dominant classification for the QA ledger using this precedence:
- `research_request`
- `change_request`
- `question`
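The precedence rule for the dominant classification can be sketched as a small function over the sub-item classifications. The function name is hypothetical, and rejecting unknown values is an assumption.

```bash
# Print the dominant classification for a raw block, given the
# classifications of its sub-items as arguments.
# Precedence: research_request > change_request > question.
dominant_classification() {
  local c dominant=""
  for c in "$@"; do
    case "$c" in
      research_request) echo research_request; return 0 ;;  # highest, stop early
      change_request)   dominant=change_request ;;
      question)         [[ -n "$dominant" ]] || dominant=question ;;
      *) echo "unknown classification: $c" >&2; return 1 ;;
    esac
  done
  [[ -n "$dominant" ]] && echo "$dominant"
}
```

For example, a block split into a `question` and a `change_request` is recorded in the QA ledger as `change_request`.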
### Ambiguity Handling
If classification is still ambiguous after applying the heuristics:
- In `discussion` mode: use `AskUserQuestion` to confirm the classification before continuing
- In `direct` mode: choose the most action-driving interpretation and record the assumption in the QA document
Examples:
- `Why do we need two config layers here?` => `question`
- `Delete task5 and fold its work into task4.` => `change_request`
- `Investigate how config loading works in this repo before deciding whether AC-3 should change.` => `research_request`, or split into research plus follow-up change sub-items if the block clearly contains both intents
### Classification Record
For each raw comment block and any sub-items, record:
- `id`
- `parent_id` when applicable
- `classification`
- `classification_rationale`
- `needs_user_confirmation` (`true` or `false`)
- `resolved_via_discussion` (`true` or `false`)
---
## Phase 4: Comment Processing
Process comments in document order. When a raw block has sub-items, process the sub-items in order before moving to the next raw block.
### `question`
Default behavior:
1. Answer the question in the QA document.
2. Apply only minimal clarifying plan edits when the current plan text is genuinely ambiguous or misleading.
3. Do not use a question as an excuse to expand scope, add implementation detail, or rewrite unrelated sections.
Preferred destinations for light clarification:
- `## Goal Description`
- `## Feasibility Hints and Suggestions`
- `## Dependencies and Sequence`
- `## Implementation Notes`
### `change_request`
Default behavior:
1. Apply the requested plan edits directly to the refined plan draft.
2. Keep the `gen-plan` structure intact.
3. Propagate changes across all affected sections so the plan stays internally consistent.
Consistency obligations:
- Acceptance criteria still match referenced tasks
- Task Breakdown still points to existing ACs
- Task dependencies still reference existing task IDs or `-`
- Milestones and sequencing remain aligned with the changed scope
- `Claude-Codex Deliberation` and `Pending User Decisions` reflect the new state
- Task routing tags remain exactly `coding` or `analyze`
### `research_request`
Default behavior:
1. Perform targeted repository research using only `Read`, `Glob`, and `Grep`.
2. Keep the research tightly scoped to the comment. Do not drift into implementation work.
3. Summarize the files and patterns examined in the QA document.
4. Integrate the conclusion into the refined plan if the evidence supports a clear plan update.
5. If the research narrows the issue but still requires a human choice, add or update a `DEC-N` item in `## Pending User Decisions` and record the same decision in the QA document.
### Resolution Rules
1. Every raw `CMT-N` must end with one disposition:
- `answered`
- `applied`
- `researched`
- `deferred`
- `resolved`
2. Preserve the original comment text in the QA document exactly as captured in Phase 2.
3. If a comment cannot be fully resolved without user input:
- In `discussion` mode, ask only the minimum necessary question
- In `direct` mode, make the smallest safe assumption, mark it explicitly in QA, and add a pending decision when the assumption materially affects the plan
4. If unresolved user decisions remain after processing, the plan convergence status must be `partially_converged`
5. If all comments are fully resolved and no pending decisions remain, preserve or set convergence status to `converged`
---
## Phase 5: Generate Refined Plan
Starting from `PLAN_WITH_COMMENTS_REMOVED`, apply the accepted refinements from Phase 4 and produce `REFINED_PLAN_TEXT`.
### Structural Preservation Rules
The refined plan MUST retain these required sections:
- `## Goal Description`
- `## Acceptance Criteria`
- `## Path Boundaries`
- `## Feasibility Hints and Suggestions`
- `## Dependencies and Sequence`
- `## Task Breakdown`
- `## Claude-Codex Deliberation`
- `## Pending User Decisions`
- `## Implementation Notes`
Optional sections that MUST be preserved when present in the input:
- `## Codex Team Workflow`
- `## Convergence Log`
- `--- Original Design Draft Start ---` appendix and its matching end marker
### Refinement Rules
1. Remove every resolved comment marker and all enclosed comment text from the refined plan.
2. Do not add any new top-level schema section.
3. Preserve `AC-X` / `AC-X.Y` formatting.
4. Preserve task IDs unless a comment explicitly requests a structural change.
5. If task IDs or AC IDs change, update all references consistently across the plan.
6. Keep task routing tags restricted to `coding` or `analyze`.
7. Keep the refined plan in the same main language as the input plan. Only normalize mixed-language content when the input is ambiguous and discussion-mode user input explicitly requests normalization.
### Main Language Detection
Determine the primary language of the input plan after comment removal.
Rules:
1. Use the dominant language of headings and prose as the default main language.
2. If the plan is clearly mixed-language and the dominant language is ambiguous:
- In `discussion` mode, ask the user whether to keep the current mix or normalize to the dominant language
- In `direct` mode, keep the dominant language inferred from headings and body text; if still tied, default to English
3. The QA document MUST use the same main language as the refined plan.
4. If `ALT_PLAN_LANGUAGE` resolves to the same language as the main language, skip variant generation.
### Required Validation Before Phase 6
Before generating the QA document, verify:
1. All required sections are still present
2. No comment markers remain
3. Every referenced `AC-*` exists
4. Every task dependency references an existing task ID or `-`
5. Every task row has exactly one valid routing tag: `coding` or `analyze`
6. `## Pending User Decisions` and `### Convergence Status` agree with the actual unresolved state
If a validation issue can be fixed by reconciling the plan, fix it before continuing. If it cannot be fixed without inventing requirements, stop and report the blocking inconsistency.
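A few of these checks can be expressed mechanically. The sketch below is illustrative only: it assumes the plan has already been parsed into a hypothetical `tasks` mapping (task ID to routing tag and dependency list), which is not a structure this document defines.

```python
REQUIRED_SECTIONS = [
    "## Goal Description",
    "## Acceptance Criteria",
    "## Path Boundaries",
    "## Feasibility Hints and Suggestions",
    "## Dependencies and Sequence",
    "## Task Breakdown",
    "## Claude-Codex Deliberation",
    "## Pending User Decisions",
    "## Implementation Notes",
]
VALID_TAGS = {"coding", "analyze"}


def missing_sections(plan_text):
    """Check 1: list required headings that no longer appear in the plan."""
    headings = {line.strip() for line in plan_text.splitlines()}
    return [section for section in REQUIRED_SECTIONS if section not in headings]


def dangling_dependencies(tasks):
    """Check 4: every dependency must be an existing task ID or '-'."""
    known = set(tasks)
    return sorted(
        (task_id, dep)
        for task_id, row in tasks.items()
        for dep in row["deps"]
        if dep != "-" and dep not in known
    )


def invalid_tags(tasks):
    """Check 5: every task row carries exactly one valid routing tag."""
    return sorted(t for t, row in tasks.items() if row["tag"] not in VALID_TAGS)


def expected_convergence(pending_decisions):
    """Check 6: the status implied by the number of unresolved decisions."""
    return "converged" if pending_decisions == 0 else "partially_converged"
```

Any non-empty result from the first three helpers is a validation issue to reconcile before Phase 6.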
---
## Phase 6: Generate QA Document
Read `${CLAUDE_PLUGIN_ROOT}/prompt-template/plan/refine-plan-qa-template.md` and populate it completely. The QA document is not optional.
### QA Content Requirements
Populate all template sections:
1. `## Summary`
2. `## Comment Ledger`
3. `## Answers`
4. `## Research Findings`
5. `## Plan Changes Applied`
6. `## Remaining Decisions`
7. `## Refinement Metadata`
### Ledger Rules
The `Comment Ledger` MUST contain exactly one row per raw `CMT-N` extracted in Phase 2, in document order.
Each row must include:
- `CMT-ID`
- Dominant classification
- Location
- Original text excerpt
- Final disposition
If a raw block was split into processing sub-items, keep one ledger row for the raw ID and describe the sub-item handling in the detailed sections.
### Section-Specific Rules
- `Answers`: include all `question` items and any clarifying edits made to the plan
- `Research Findings`: include all `research_request` items, the files or patterns examined, and the impact on the plan
- `Plan Changes Applied`: include all `change_request` items and cross-reference updates
- `Remaining Decisions`: include every unresolved or assumption-heavy item that still needs user choice
Language rules:
1. Write the main QA document in the same main language as `REFINED_PLAN_TEXT`
2. Keep identifiers unchanged: `AC-*`, task IDs, file paths, API names, command flags, config keys
3. Preserve the original comment text verbatim inside fenced code blocks
Metadata rules:
1. Record the resolved input path, output path, QA path, date, and counts by classification
2. Record the final convergence status as `converged` or `partially_converged`
3. Record the set of plan sections modified during refinement
---
## Phase 7: Atomic Write Transaction
Do not write any final output until all content is fully prepared.
### Files in Scope
Always prepare:
- Main refined plan at `OUTPUT_FILE`
- Main QA document at `QA_FILE`
Conditionally prepare:
- Plan variant at `OUTPUT_FILE` with `_<ALT_PLAN_LANG_CODE>` inserted before the extension
- QA variant at `QA_FILE` with `_<ALT_PLAN_LANG_CODE>` inserted before the extension
Filename construction rule for variants:
1. If the filename has an extension, insert `_<code>` before the last `.`
2. If the filename has no extension, append `_<code>`
Examples:
- `plan.md` -> `plan_zh.md`
- `feature-a-qa.md` -> `feature-a-qa_zh.md`
- `output` -> `output_zh`
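The filename rule can be sketched in a few lines. This is a hedged illustration: `variant_path` is a hypothetical helper, and treating a leading-dot name such as `.env` as having no extension is an assumption the rule above does not address.

```python
import os


def variant_path(path, code):
    """Insert _<code> before the last '.' of the filename, or append it."""
    directory, name = os.path.split(path)
    stem, dot, ext = name.rpartition(".")
    if dot and stem:
        # The filename has an extension: insert before the last dot.
        new_name = f"{stem}_{code}.{ext}"
    else:
        # No extension (or a leading-dot name, by assumption): append.
        new_name = f"{name}_{code}"
    return os.path.join(directory, new_name)
```

Only the final path component is inspected, so a dot in a parent directory name does not count as an extension.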
### Variant Content Rules
If `ALT_PLAN_LANGUAGE` is non-empty and different from the main language:
1. Translate the main refined plan into `ALT_PLAN_LANGUAGE`
2. Translate the main QA document into `ALT_PLAN_LANGUAGE`
3. Keep identifiers unchanged
4. For Chinese, default to Simplified Chinese
If `ALT_PLAN_LANGUAGE` is empty or equals the main language, do not create variant files.
### Transaction Rules
1. Prepare all final content in memory first:
- `REFINED_PLAN_TEXT`
- `QA_TEXT`
- Optional `REFINED_PLAN_VARIANT_TEXT`
- Optional `QA_VARIANT_TEXT`
2. Write each output to a temporary file in the same directory as its final destination.
3. Use temp naming patterns equivalent to:
- `.refine-plan-XXXXXX`
- `.refine-qa-XXXXXX`
- `.refine-plan-variant-XXXXXX`
- `.refine-qa-variant-XXXXXX`
4. If any temp write or translation step fails:
- Delete all temp files
- Leave existing final outputs untouched
- Report the failure
5. Only after every temp file is written successfully may you replace final outputs.
6. Replace auxiliary outputs before replacing the main in-place plan file, so the primary plan is updated last.
7. If finalization fails after any destination was replaced, restore from backups if the environment allows it; otherwise report the partial-finalization risk explicitly.
Success condition:
- Main refined plan written successfully
- Main QA document written successfully
- Every requested variant written successfully
- No stale temp files remain
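The staging-then-replace pattern in rules 2 through 6 might look like the following. This is a minimal sketch under stated assumptions: `write_all_atomically` is a hypothetical name, the caller is expected to list the main plan last, and the backup and restore behavior from rule 7 is not covered.

```python
import os
import tempfile


def write_all_atomically(outputs):
    """Stage every output as a temp file, then replace the final files.

    outputs: list of (final_path, text, temp_prefix) tuples, with auxiliary
    files first and the main plan last.
    """
    staged = []
    try:
        for final_path, text, prefix in outputs:
            # The temp file lives next to its destination so the later
            # replace stays on the same filesystem.
            directory = os.path.dirname(os.path.abspath(final_path))
            fd, tmp_path = tempfile.mkstemp(prefix=prefix, dir=directory)
            staged.append((tmp_path, final_path))
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(text)
    except Exception:
        # Any staging failure: delete all temp files, leave finals untouched.
        for tmp_path, _ in staged:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
        raise
    # Only after every temp file is written are the finals replaced.
    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)
```

Note that `tempfile.mkstemp` creates files readable only by the owner, so a real implementation may want to restore the destination's permissions after the replace.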
### Final Report
Report:
- Path to the refined plan
- Path to the QA document
- Paths to any generated variants
- Number of raw comments processed
- Counts by classification
- Whether pending decisions remain
- Final convergence status
- Whether refinement ran in `discussion` or `direct` mode
---
## Error Handling
If a blocking issue occurs:
- Report the exact phase where it failed
- Include the concrete reason
- Include any relevant line/column/context detail for parse errors
- Do not leave partially refined plan artifacts behind
If a user decision is needed in `discussion` mode:
- Ask only the narrowest question needed to proceed
- Record the decision in the QA document and, when still unresolved, in `## Pending User Decisions`
If a decision is deferred in `direct` mode:
- Make the smallest safe assumption
- Record the assumption explicitly in the QA document
- Mark the plan as `partially_converged` when the deferred item materially affects implementation direction
---
description: "Start iterative loop with Codex review"
argument-hint: "[path/to/plan.md | --plan-file path/to/plan.md] [--max N] [--codex-model MODEL:EFFORT] [--codex-timeout SECONDS] [--track-plan-file] [--push-every-round] [--base-branch BRANCH] [--full-review-round N] [--skip-impl] [--claude-answer-codex] [--agent-teams] [--yolo] [--skip-quiz] [--privacy]"
allowed-tools:
- "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh:*)"
- "Read"
- "Task"
- "AskUserQuestion"
---
# Start RLCR Loop
## Plan Compliance Pre-Check
Before running the setup script, validate the plan file for compliance. This is a fool-proofing mechanism that catches obviously wrong plan files early.
**Skip this entire pre-check if** any of these conditions are true:
- `$ARGUMENTS` contains `--skip-impl` (no plan file to validate)
- `$ARGUMENTS` contains `-h` or `--help` (just showing help)
### Extract the plan file path from arguments
Parse `$ARGUMENTS` to find the plan file path:
- If `--plan-file <path>` is present, use `<path>`
- Otherwise, use the first positional argument (the first argument that does not start with `--` and is not a value following a known flag like `--max`, `--codex-model`, `--codex-timeout`, `--base-branch`, `--full-review-round`, `--plan-file`)
- If no plan file path can be determined, skip the pre-check and let the setup script handle the error
### Basic path safety gate
Only proceed with the pre-check if the extracted path meets ALL of these conditions:
- Is a relative path (does not start with a forward slash)
- Does not contain parent directory traversal (double dot path components)
- Contains only safe path characters: letters, digits, hyphen, underscore, dot, and forward slash
If any condition fails, skip the pre-check and let the setup script handle path validation.
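The three conditions of the safety gate can be sketched as a single predicate. This is an illustration, not the setup script's actual validation; `is_safe_plan_path` is a hypothetical name, and rejecting an empty path is an added assumption.

```python
import re

# Letters, digits, hyphen, underscore, dot, and forward slash only.
SAFE_CHARS = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_plan_path(path):
    """Return True only if the path passes all three gate conditions."""
    if not path or path.startswith("/"):
        return False  # empty (assumed unsafe) or absolute
    if ".." in path.split("/"):
        return False  # parent directory traversal component
    return SAFE_CHARS.fullmatch(path) is not None
```

A `False` result means the pre-check is skipped, not that the command fails; path validation is then left to the setup script.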
### Read and validate plan content
1. Use the Read tool to read the plan file. If the file does not exist or cannot be read, skip the pre-check and let the setup script handle the error.
2. Use the Task tool to invoke the `humanize:plan-compliance-checker` agent (sonnet model):
```
Task tool parameters:
- model: "sonnet"
- prompt: Include the plan file content and ask the agent to:
1. Explore the repository structure (README, CLAUDE.md, main files)
2. Check if the plan content relates to this repository
3. Check if the plan contains branch-switching instructions
4. Return exactly one of: `PASS: <summary>`, `FAIL_RELEVANCE: <reason>`, or `FAIL_BRANCH_SWITCH: <details>`
```
3. **Parse the result** (fail-closed):
- If output contains `PASS`: continue to setup script below
- If output contains `FAIL_RELEVANCE`: report "Plan compliance check failed: the plan does not appear to be related to this repository." Show the reason. **Stop the command.**
- If output contains `FAIL_BRANCH_SWITCH`: report "Plan compliance check failed: the plan contains branch-switching instructions, which are incompatible with RLCR. The RLCR loop requires the working branch to remain constant across all rounds." Show the details. **Stop the command.**
- If output contains none of the above (malformed): report "Plan compliance check produced unexpected output. Cannot proceed." **Stop the command.**
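The fail-closed parsing above can be sketched as a small function. This is a hedged illustration with a hypothetical name; checking the failure markers before `PASS` is a design choice made here so that output mentioning both never proceeds, and it is not something the rules above specify.

```python
def parse_compliance(output):
    """Return (proceed, message) for the compliance checker's output."""
    # Failure markers are checked first (an assumption of this sketch).
    if "FAIL_RELEVANCE" in output:
        return False, ("Plan compliance check failed: the plan does not "
                       "appear to be related to this repository.")
    if "FAIL_BRANCH_SWITCH" in output:
        return False, ("Plan compliance check failed: the plan contains "
                       "branch-switching instructions, which are "
                       "incompatible with RLCR.")
    if "PASS" in output:
        return True, ""
    # Malformed output stops the command rather than letting it through.
    return False, "Plan compliance check produced unexpected output. Cannot proceed."
```

Anything other than a clean `PASS` stops the command, which is what makes the gate fail-closed.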
---
## Plan Understanding Quiz
Before running the setup script, verify the user genuinely understands what the plan will do. This is an advisory check -- it never blocks the loop, but catches "wishful thinking" users who blindly accepted a generated plan without reading it.
**Skip this entire quiz if** any of these conditions are true:
- `$ARGUMENTS` contains `--skip-impl` (no plan to quiz about)
- `$ARGUMENTS` contains `--yolo` (user explicitly opted out of all pre-flight checks)
- `$ARGUMENTS` contains `--skip-quiz` (user explicitly opted out of the quiz)
- `$ARGUMENTS` contains `-h` or `--help` (just showing help)
- No plan content is available (the compliance pre-check was skipped because no plan file path could be determined)
### Run the quiz agent
1. Reuse the plan content that was already read during the compliance pre-check above (do not re-read the file).
2. Use the Task tool to invoke the `humanize:plan-understanding-quiz` agent (opus model):
```
Task tool parameters:
- model: "opus"
- prompt: Include the plan file content and ask the agent to:
1. Explore the repository structure for context
2. Analyze the plan's technical implementation details
3. Generate 2 multiple-choice questions (4 options each) and a plan summary
4. Return in the structured format: QUESTION_1, OPTION_1A-D, ANSWER_1, QUESTION_2, OPTION_2A-D, ANSWER_2, PLAN_SUMMARY
```
3. **Parse the result**: Extract all 13 fields from the agent output (QUESTION_1, OPTION_1A through OPTION_1D, ANSWER_1, QUESTION_2, OPTION_2A through OPTION_2D, ANSWER_2, PLAN_SUMMARY). If the output is malformed (any field is missing, or either ANSWER value is not one of A, B, C, or D), warn: "Plan understanding quiz unavailable, continuing without it." and proceed to the Setup section below.
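A sketch of the field extraction follows. It assumes each field appears on its own line as `NAME: value`, which is an assumption about the agent's output format, and `parse_quiz` is a hypothetical helper.

```python
import re

# The 13 expected field names, in the order listed above.
FIELDS = (
    ["QUESTION_1"] + [f"OPTION_1{c}" for c in "ABCD"] + ["ANSWER_1"]
    + ["QUESTION_2"] + [f"OPTION_2{c}" for c in "ABCD"] + ["ANSWER_2"]
    + ["PLAN_SUMMARY"]
)


def parse_quiz(output):
    """Return a dict of all 13 fields, or None if the output is malformed."""
    found = {}
    for line in output.splitlines():
        match = re.match(r"\s*([A-Z_0-9]+)\s*:\s*(.+)", line)
        if match and match.group(1) in FIELDS:
            found[match.group(1)] = match.group(2).strip()
    if any(name not in found for name in FIELDS):
        return None  # a field is missing
    if any(found[key] not in {"A", "B", "C", "D"} for key in ("ANSWER_1", "ANSWER_2")):
        return None  # an answer is not a valid option letter
    return found
```

A `None` result corresponds to the advisory warning path: the quiz is skipped and setup continues.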
### Ask questions and evaluate
4. Use AskUserQuestion to present QUESTION_1 as a multiple-choice question with the 4 options (OPTION_1A through OPTION_1D). Compare the user's choice against ANSWER_1:
- If the user selected the correct answer, mark QUESTION_1 as **PASS**
- Otherwise, mark as **WRONG**
5. Use AskUserQuestion to present QUESTION_2 as a multiple-choice question with the 4 options (OPTION_2A through OPTION_2D). Compare the user's choice against ANSWER_2 using the same criteria.
### Decide whether to proceed
6. **If both questions PASS**: Briefly acknowledge ("Your understanding of the plan looks solid. Proceeding with setup.") and continue to the Setup section below.
7. **If one or both questions are WRONG**: Show the user the PLAN_SUMMARY, along with the correct answers to the questions they missed, to help them understand what the plan does. Then use AskUserQuestion with the question: "Would you like to proceed with the RLCR loop anyway, or stop and review the plan more carefully first?" with these choices:
- "Proceed with RLCR loop"
- "Stop and review the plan first"
- If the user chooses **"Proceed with RLCR loop"**: Continue to the Setup section below.
- If the user chooses **"Stop and review the plan first"**: Report "Stopping. Please review the plan file and re-run start-rlcr-loop when ready." and **stop the command**.
---
## Setup
If the pre-check passed (or was skipped), and the quiz passed (or was skipped or user chose to proceed), execute the setup script to initialize the loop:
```bash
"${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" $ARGUMENTS
```
This command starts an iterative development loop where:
1. You execute the implementation plan with task-tag routing
- `coding` tasks: Claude executes directly
- `analyze` tasks: execute via `/humanize:ask-codex`
2. Write a summary of your work to the specified summary file
3. When you try to exit, Codex reviews your summary
4. If Codex finds issues, you receive feedback and continue
5. If Codex outputs "COMPLETE", the loop enters **Review Phase**
6. In Review Phase, `codex review --base <branch>` performs code review
7. If code review finds issues (`[P0-9]` markers), you fix them and continue
8. When no issues are found, the loop ends with a Finalize Phase
## What Is a Round
**One round = the agent believes the entire plan is finished.** A round boundary is when the agent writes a summary and attempts to exit, triggering Codex review. This is the core definition:
- A round is NOT one task, one milestone, one stage, or one layer of the plan.
- If the plan has multiple stages or milestones, they are all completed within a single round before writing the round summary.
- Intermediate progress checks (e.g., verifying a stage before starting the next) should use manual `ask-codex` calls, not round boundaries.
- Only write `round-N-summary.md` and attempt to exit when you believe ALL tasks in the plan are done.
## Goal Tracker System
This loop uses a **Goal Tracker** to prevent goal drift across iterations:
### Structure
- **IMMUTABLE SECTION**: Ultimate Goal and Acceptance Criteria (set in Round 0, never changed)
- **MUTABLE SECTION**: Active Tasks, Completed Items, Deferred Items, Plan Evolution Log
### Key Features
1. **Acceptance Criteria**: Each task maps to a specific AC - nothing can be "forgotten"
2. **Task Tag Routing**: Every task should carry a `coding` or `analyze` tag from plan generation
- `coding -> Claude`, `analyze -> Codex`
3. **Plan Evolution Log**: If you discover the plan needs changes, document the change with justification
4. **Explicit Deferrals**: Deferred tasks require strong justification and impact analysis
5. **Full Alignment Checks**: At configurable intervals (default every 5 rounds: rounds 4, 9, 14, etc.), Codex conducts a comprehensive goal alignment audit. Use `--full-review-round N` to customize (min: 2)
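The default schedule for full alignment checks (rounds 4, 9, 14 with an interval of 5) is consistent with zero-based round numbers and a check on the last round of each interval. The sketch below generalizes that pattern to other values of `--full-review-round`; the generalization is inferred from the default example, and the function name is hypothetical.

```python
def is_full_review_round(round_number, interval=5):
    """Return True when a zero-based round is the last one in its interval."""
    if interval < 2:
        raise ValueError("interval must be at least 2")
    return round_number % interval == interval - 1


# Rounds that trigger a full alignment check with the default interval.
default_rounds = [r for r in range(15) if is_full_review_round(r)]
```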
### How to Use
1. **Round 0**: Initialize the Goal Tracker with Ultimate Goal and Acceptance Criteria
2. **Each Round**: Update task status, log plan changes, note discovered issues
3. **Before Exit**: Ensure goal-tracker.md reflects current state accurately
## Important Rules
1. **Write summaries**: Always write your work summary to the specified file before exiting
2. **Maintain Goal Tracker**: Keep goal-tracker.md up-to-date with your progress
3. **Be thorough**: Include details about what was implemented, files changed, and tests added
4. **No cheating**: Do not try to exit the loop by editing state files or running cancel commands
5. **Trust the process**: Codex's feedback helps improve the implementation
## BitLesson Workflow (Project Level)
Each project must maintain its own `.humanize/bitlesson.md` file.
If missing, `start-rlcr-loop` initializes it automatically with a strict template.
Per round requirements:
1. Read `.humanize/bitlesson.md` before execution
2. Run `bitlesson-selector` for each task/sub-task
3. Apply selected lesson IDs (or `NONE`) during implementation
4. Include `## BitLesson Delta` in the round summary with `Action: none|add|update`
If a problem is solved only after multiple rounds, add or update a precise lesson entry in `.humanize/bitlesson.md` (specific problem + specific solution).
By default, an empty `.humanize/bitlesson.md` does not block `Action: none`; use `--require-bitlesson-entry-for-none` to enforce strict blocking.
## Stopping the Loop
The loop ends when any of the following happens:
- The maximum iteration count is reached
- Codex confirms completion with "COMPLETE", followed by successful code review (no `[P0-9]` issues)
- The user runs `/humanize:cancel-rlcr-loop`
## Two-Phase System
The RLCR loop has two phases within the active loop:
1. **Implementation Phase**: Work by task tags (`coding -> Claude`, `analyze -> /humanize:ask-codex`), then Codex reviews your summary
2. **Review Phase**: After COMPLETE, `codex review` checks code quality with `[P0-9]` severity markers
The `--base-branch` option specifies the base branch for code review comparison. If not provided, it auto-detects from: remote default > local main > local master.
## Skip Implementation Mode
Use `--skip-impl` to skip the implementation phase and go directly to code review:
```bash
/humanize:start-rlcr-loop --skip-impl
```
In this mode:
- Plan file is optional
- No goal tracker initialization needed
- Immediately starts code review when you try to exit
- Useful for reviewing existing changes without an implementation plan
This is helpful when you want to:
- Review code changes made outside of an RLCR loop
- Get code quality feedback on existing work
- Skip the implementation tracking overhead for simple tasks
{
"description": "Humanize Codex Hooks - Native Stop hooks for RLCR loops",
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "{{HUMANIZE_RUNTIME_ROOT}}/hooks/loop-codex-stop-hook.sh",
"timeout": 7200,
"statusMessage": "humanize RLCR stop hook"
}
]
}
]
}
}
{
"codex_model": "gpt-5.5",
"codex_effort": "high",
"bitlesson_model": "haiku",
"agent_teams": false,
"alternative_plan_language": "",
"gen_plan_mode": "discussion"
}
# Bitter Lesson Workflow
BitLesson is the repository's Bitter Lesson-style knowledge capture system for RLCR rounds.
## Configuration
The selector reads `bitlesson_model` from the merged config hierarchy:
1. `config/default_config.json`
2. `~/.config/humanize/config.json`
3. `.humanize/config.json`
4. CLI flags where applicable
Provider routing is automatic:
- `gpt-*`, `o[N]-*` (e.g. `o1-*`, `o3-*`, `o4-*`) route to Codex
- `claude-*`, `haiku`, `sonnet`, `opus` route to Claude
If the configured provider binary is missing, the selector falls back to the default Codex model so the loop can still proceed.
When installing the Humanize runtime into Codex CLI, Humanize writes
`provider_mode: "codex-only"` into that runtime's user config. When that mode
is present, the selector forces BitLesson selection onto the Codex/OpenAI path
before provider resolution, even if an older default such as `haiku` would
otherwise route to Claude. This is not a repository-level limitation: Claude
Code and Kimi installs are supported separately.
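The routing rules in this section can be sketched as a pure function. This is an illustration rather than the selector's actual code: `route_provider` is a hypothetical name, and returning `None` for an unrecognized model is an assumption, since the text above does not say how such names are handled.

```python
import re


def route_provider(model, provider_mode=None):
    """Map a configured model name to 'codex' or 'claude'."""
    # codex-only mode is applied before normal provider resolution.
    if provider_mode == "codex-only":
        return "codex"
    # gpt-* and o[N]-* (for example o1-*, o3-*, o4-*) route to Codex.
    if model.startswith("gpt-") or re.match(r"o\d+-", model):
        return "codex"
    # claude-* and the short aliases route to Claude.
    if model.startswith("claude-") or model in {"haiku", "sonnet", "opus"}:
        return "claude"
    return None  # unrecognized model name (assumed behavior)
```

The fallback to the default Codex model when a provider binary is missing would sit outside this function, because it depends on what is installed rather than on the model name.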
## Workflow
Each project keeps its BitLesson knowledge base at `.humanize/bitlesson.md`.
When `start-rlcr-loop` begins:
1. The file is initialized from `templates/bitlesson.md` if it does not already exist
2. Each task or sub-task runs through `scripts/bitlesson-select.sh`
3. The selected lesson IDs are applied during implementation, or `NONE` is recorded when nothing matches
4. The stop gate validates a required `## BitLesson Delta` section in every round summary
## Summary Contract
Required summary shape:
```markdown
## BitLesson Delta
- Action: none|add|update
- Lesson ID(s): <IDs or NONE>
- Notes: <what changed and why>
```
Validation rules are strict:
- `Action: none` must use `Lesson ID(s): NONE` or leave the field empty
- `Action: add` and `Action: update` must reference concrete `BL-YYYYMMDD-short-name` IDs that exist in `.humanize/bitlesson.md`
- `--require-bitlesson-entry-for-none` can be used to block empty knowledge bases from repeatedly reporting `none`
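The first two validation rules can be sketched as follows. This is a hedged illustration, not the stop gate's implementation: the helper name is hypothetical, and both the comma-separated ID list and the lowercase, hyphenated short name are assumptions about the exact format.

```python
import re

# BL-YYYYMMDD-short-name; the character set of the short name is assumed.
LESSON_ID = re.compile(r"BL-\d{8}-[a-z0-9-]+")


def validate_delta(action, lesson_ids, known_ids):
    """Check an Action / Lesson ID(s) pair against the known lesson IDs."""
    text = lesson_ids.strip()
    ids = [] if text in ("", "NONE") else [part.strip() for part in text.split(",")]
    if action == "none":
        # 'none' must use NONE or leave the field empty.
        return not ids
    if action in ("add", "update"):
        # 'add' and 'update' need concrete IDs present in the knowledge base.
        return bool(ids) and all(
            LESSON_ID.fullmatch(item) is not None and item in known_ids
            for item in ids
        )
    return False
```

Here `known_ids` stands for the set of IDs found in `.humanize/bitlesson.md`; extracting them from the file is left out of the sketch.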
# Draft: Optimize viz-dashboard — Merge into `humanize monitor` as a Web View
## Background
The `feat/viz-dashboard` branch currently introduces a `/humanize:viz` Claude
slash command and a local visualization dashboard for Humanize. While the
dashboard does show some data, the visualization of a *live, dynamically
running RLCR loop* is not clear enough today: status, progress per round, and
streamed log output are hard to follow as a loop progresses.
Separately, Humanize already ships a CLI-side monitoring capability that the
user runs in another terminal (NOT inside Claude Code):
```bash
source <path/to/humanize>/scripts/humanize.sh # or add to .bashrc / .zshrc
humanize monitor rlcr # RLCR loop
humanize monitor skill # All skill invocations (codex + gemini)
humanize monitor codex # Codex invocations only
humanize monitor gemini # Gemini invocations only
```
This monitor capability already captures live state (RLCR rounds, skill / Codex
/ Gemini invocations, log output). The web dashboard does not need to invent
its own capture pipeline — it should consume what `humanize monitor` already
provides.
## Goal
Optimize the viz-dashboard branch so that:
1. The dashboard becomes a **web view** layered on top of the existing
`humanize monitor` data sources, rather than an independent capture layer.
2. The dashboard can show **multiple live RLCR loops simultaneously**, with
per-loop status and streamed log output.
3. The entry point moves out of Claude (no more `/humanize:viz` slash command)
and into the `humanize monitor` CLI command, as a new subcommand that serves
the web view.
4. The new capability targets **online / remote viewing in a browser**, not a
local-only viewer that requires the user to be on the same machine running
Claude.
5. Useful features from the existing viz-dashboard branch, notably
**cross-conversation querying** (browsing past sessions / loops across
different Claude conversations), are preserved.
## Non-goals
- Reimplementing the monitor capture pipeline
(`humanize monitor rlcr/skill/codex/gemini`). The dashboard consumes it; it
does not replace it.
- Continuing to ship `/humanize:viz` as a Claude slash command.
- Adding chart panels or features explicitly removed in commit 1b575fe
("multi-project switcher + restart + remove chart panels").
## Required behaviors
1. **CLI entry point unification**
- Remove `commands/viz.md` and any `/humanize:viz` Claude command surface.
- Add a new `humanize monitor` subcommand (name to be agreed during
planning, e.g. `humanize monitor web` or `humanize monitor dashboard`)
that starts the web dashboard server.
- The other `humanize monitor rlcr|skill|codex|gemini` subcommands must
keep working unchanged (terminal-attached live tail).
2. **Live multi-loop view**
- The web dashboard MUST be able to display 2+ concurrently running RLCR
loops at the same time, each with:
- current status (running, paused, converged, stopped, …)
- current round / phase
- live streamed log output, updated in near real time
3. **Reuse existing monitor data**
- The dashboard MUST source its data from the same files / events that
`humanize monitor rlcr/skill/codex/gemini` already read. It MUST NOT add
a parallel capture mechanism (no new hooks just for the dashboard).
4. **Online / remote-viewable**
- The dashboard MUST be reachable from a browser over the network, not
only via `localhost` on the machine running Claude. Concrete binding /
auth design to be agreed during planning.
5. **Cross-conversation history**
- Cross-conversation querying (browsing past loops from different Claude
conversations / sessions) from the existing viz-dashboard branch MUST be
preserved.
## Branch hygiene
Before implementation begins, the branch `feat/viz-dashboard` MUST be rebased
onto the latest `upstream/dev` (humania-org/humanize). Several relevant changes
have landed on `upstream/dev` after the branch diverged, including:
- `Add ask-gemini skill and tool-filtered monitor subcommands` (introduces the
`humanize monitor skill|codex|gemini` subcommands the dashboard must reuse)
- `Remove PR loop feature entirely` (the viz-dashboard branch still references
PR-loop concepts via `commands/cancel-pr-loop.md`, `commands/start-pr-loop.md`,
`hooks/pr-loop-stop-hook.sh`)
- Multiple monitor / hook fixes
The rebase is therefore both a precondition for correctness (the dashboard
consumes the new monitor subcommands) and a cleanup step (PR-loop references
must be dropped).
## Out of scope (for this plan)
- Changes to RLCR semantics, hooks, or skill behavior.
- Authentication providers, identity systems, or multi-user account models —
basic remote-access protection is in scope, but full IAM is not.
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 750 370" width="750" height="370">
<!-- Arrows (behind nodes) -->
<!-- Plan -> Implement -->
<line x1="155" y1="160" x2="225" y2="160" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Implement -> Review Summary -->
<line x1="375" y1="160" x2="445" y2="160" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Review Summary -> Code Review (COMPLETE) -->
<line x1="510" y1="190" x2="510" y2="240" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Code Review -> Done (No Issues) -->
<line x1="570" y1="275" x2="675" y2="330" stroke="#888" stroke-width="2" marker-end="url(#arrowhead)"/>
<!-- Feedback loop: Review Summary -> Implement (curved back, top) -->
<path d="M 510 135 C 510 70, 300 70, 300 135" fill="none" stroke="#e07020" stroke-width="2" stroke-dasharray="6,3" marker-end="url(#arrowOrange)"/>
<!-- Issues loop: Code Review -> Implement bottom-right corner -->
<path d="M 450 275 C 425 245, 400 215, 370 195" fill="none" stroke="#9050c0" stroke-width="2" stroke-dasharray="6,3" marker-end="url(#arrowPurple)"/>
<!-- Arrow markers -->
<defs>
<marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#888"/>
</marker>
<marker id="arrowOrange" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#e07020"/>
</marker>
<marker id="arrowPurple" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#9050c0"/>
</marker>
</defs>
<!-- Node: Your Plan -->
<rect x="30" y="130" width="120" height="60" rx="10" fill="#dbeafe" stroke="#3b82f6" stroke-width="2"/>
<text x="90" y="155" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="14" font-weight="bold">Your Plan</text>
<text x="90" y="175" text-anchor="middle" fill="#555" font-family="sans-serif" font-size="11">(plan.md)</text>
<!-- Node: Implement & Summarize (with Swarm Mode) -->
<!-- Swarm Mode: stacked instances behind main node -->
<rect x="238" y="122" width="140" height="60" rx="10" fill="#fff7ed" stroke="#f97316" stroke-width="1.5" stroke-dasharray="4,2"/>
<rect x="234" y="126" width="140" height="60" rx="10" fill="#ffedd5" stroke="#f97316" stroke-width="1.5" stroke-dasharray="4,2"/>
<!-- Main Implement node (front) -->
<rect x="230" y="130" width="140" height="60" rx="10" fill="#ffedd5" stroke="#f97316" stroke-width="2"/>
<text x="300" y="150" text-anchor="middle" fill="#b45309" font-family="sans-serif" font-size="11" font-style="italic">Claude:</text>
<text x="300" y="170" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">Working on it!</text>
<!-- Swarm workers (bigger) -->
<line x1="260" y1="190" x2="260" y2="204" stroke="#f97316" stroke-width="1.5"/>
<line x1="300" y1="190" x2="300" y2="204" stroke="#f97316" stroke-width="1.5"/>
<line x1="340" y1="190" x2="340" y2="204" stroke="#f97316" stroke-width="1.5"/>
<circle cx="260" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<circle cx="300" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<circle cx="340" cy="216" r="12" fill="#fff7ed" stroke="#f97316" stroke-width="1.5"/>
<text x="260" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T1</text>
<text x="300" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T2</text>
<text x="340" y="220" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-weight="bold">T3</text>
<text x="300" y="248" text-anchor="middle" fill="#c2410c" font-family="sans-serif" font-size="10" font-style="italic">Army of Swarm</text>
<!-- Node: Review Summary -->
<rect x="450" y="130" width="120" height="60" rx="10" fill="#f3e8ff" stroke="#a855f7" stroke-width="2"/>
<text x="510" y="150" text-anchor="middle" fill="#7c3aed" font-family="sans-serif" font-size="11" font-style="italic">Codex:</text>
<text x="510" y="170" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">You finished?</text>
<!-- Node: Code Review -->
<rect x="450" y="245" width="120" height="60" rx="10" fill="#f3e8ff" stroke="#a855f7" stroke-width="2"/>
<text x="510" y="268" text-anchor="middle" fill="#7c3aed" font-family="sans-serif" font-size="11" font-style="italic">Codex:</text>
<text x="510" y="288" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="13" font-weight="bold">Your work good?</text>
<!-- Node: Done -->
<circle cx="700" cy="330" r="25" fill="#dcfce7" stroke="#22c55e" stroke-width="2"/>
<text x="700" y="335" text-anchor="middle" fill="#333" font-family="sans-serif" font-size="14" font-weight="bold">Done</text>
<!-- Labels on arrows -->
<text x="400" y="80" text-anchor="middle" fill="#e07020" font-family="sans-serif" font-size="11" font-style="italic">No! Your work not finished!</text>
<text x="525" y="222" text-anchor="start" fill="#888" font-family="sans-serif" font-size="11">COMPLETE</text>
<text x="388" y="236" text-anchor="start" fill="#9050c0" font-family="sans-serif" font-size="11" font-style="italic">No! Bug found!</text>
<text x="625" y="295" text-anchor="start" fill="#888" font-family="sans-serif" font-size="11">No Issues</text>
</svg>
# Install LightOp KernelPilot Humanize for Claude Code
## Prerequisites
- [codex](https://github.com/openai/codex) -- OpenAI Codex CLI (for review). Verify with `codex --version`.
- `jq` -- JSON processor. Verify with `jq --version`.
- `git` -- Git version control. Verify with `git --version`.
## Option 1: LightOp KernelPilot Marketplace (Recommended)
Clone KernelPilot, add the repository root as a Claude Code marketplace, install
the Humanize plugin, and expose the LightOp/DCU knowledge base as a Claude Code
skill:
```bash
git clone https://github.com/BBuf/kernel-pilot.git
cd kernel-pilot
humanize/scripts/install-skills-claude.sh
```
The installer performs the marketplace install, links `lightop-kernel-knowledge`,
installs the query dependency, hydrates Claude Code's installed skill cache with
absolute `HUMANIZE_RUNTIME_ROOT` and `KERNELPILOT_ROOT` paths, and fails if
either placeholder remains. Rerun this installer script after manual plugin
updates too, because Claude Code does not hydrate `SKILL.md` placeholders during
`plugin install`.
Manual equivalent:
```bash
claude plugin marketplace add ./
claude plugin install humanize@KernelPilot
mkdir -p ~/.claude/skills
ln -s "$PWD/knowledge" ~/.claude/skills/lightop-kernel-knowledge
python3 -m pip install -r knowledge/requirements.txt
humanize/scripts/install-skills-claude.sh --skip-pip
```
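The hydration step described above can be pictured with a short Python sketch. It is not the installer's actual code: it assumes placeholders are written as `{{NAME}}` (the form `{{HUMANIZE_RUNTIME_ROOT}}` appears elsewhere in these guides; `{{KERNELPILOT_ROOT}}` is assumed to follow the same pattern), and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed placeholder syntax: {{NAME}}, e.g. {{HUMANIZE_RUNTIME_ROOT}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def hydrate(text: str, values: dict) -> str:
    """Replace each {{NAME}} with values[NAME]; unknown names are left as-is."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)


def remaining_placeholders(text: str) -> list:
    """Return the sorted, de-duplicated names of placeholders still in the text."""
    return sorted(set(PLACEHOLDER.findall(text)))


def hydrate_skill_file(path: Path, values: dict) -> None:
    """Hydrate one SKILL.md in place; raise (and write nothing) if any remain."""
    hydrated = hydrate(path.read_text(encoding="utf-8"), values)
    leftover = remaining_placeholders(hydrated)
    if leftover:
        raise ValueError(f"{path}: unresolved placeholders: {', '.join(leftover)}")
    path.write_text(hydrated, encoding="utf-8")
```

Checking for leftovers before writing keeps a half-hydrated `SKILL.md` from silently reaching the skill cache, which matches the "fails if either placeholder remains" behavior described above.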
Restart Claude Code after installing. If you prefer to run the marketplace
commands inside an existing Claude Code session, the equivalent slash commands
are:
```text
/plugin marketplace add /path/to/kernel-pilot
/plugin install humanize@KernelPilot
```
## Option 2: One-session Local Development
If you have the plugin cloned locally:
```bash
claude --plugin-dir /path/to/kernel-pilot/humanize \
--add-dir /path/to/kernel-pilot
```
This loads the plugin only for that Claude Code session. Add the knowledge skill
separately if you want `lightop-kernel-knowledge` discovery:
```bash
mkdir -p ~/.claude/skills
ln -s /path/to/kernel-pilot/knowledge ~/.claude/skills/lightop-kernel-knowledge
```
## Option 3: Upstream Humanize Only
If you only need generic Humanize RLCR and do not need KernelPilot's kernel
loop or knowledge pack, install the upstream Humanize marketplace instead:
```text
/plugin marketplace add PolyArch/humanize
/plugin install humanize@PolyArch
```
That upstream plugin is useful for general implementation loops, but it does
not provide `lightop-kernel-knowledge` from this repository.
## Verify Installation
After installing the LightOp KernelPilot marketplace, you should see Humanize
commands and the LightOp/DCU skills:
```text
/humanize:start-rlcr-loop
/humanize:gen-plan
/humanize:refine-plan
/humanize:ask-codex
lightop-kernel-agent-loop
lightop-kernel-knowledge
dcu-profiler-report
```
You can also inspect the installed plugin from a shell:
```bash
claude plugin list
claude plugin details humanize@KernelPilot
```
## Monitor Setup (Optional)
Add the monitoring helper to your shell for real-time progress tracking:
```bash
# Add to your .bashrc or .zshrc
source ~/.claude/plugins/cache/KernelPilot/humanize/<LATEST.VERSION>/scripts/humanize.sh
```
Then use:
```bash
humanize monitor rlcr # Monitor RLCR loop
```
## Other Install Guides
- [Install for Codex](install-for-codex.md)
- [Install for Kimi](install-for-kimi.md)
## Next Steps
See the [Usage Guide](usage.md) for detailed command reference and configuration options.
# Install Humanize Skills for Codex
This guide explains how to install Humanize for Codex CLI, including the skill runtime (`$CODEX_HOME/skills`) and the native Codex `Stop` hook (`$CODEX_HOME/hooks.json`).
## Quick Install (Recommended)
One-line install from anywhere:
```bash
tmp_dir="$(mktemp -d)" && git clone --depth 1 https://github.com/PolyArch/humanize.git "$tmp_dir/humanize" && "$tmp_dir/humanize/scripts/install-skills-codex.sh"
```
From the Humanize repo root:
```bash
./scripts/install-skills-codex.sh
```
Or use the unified installer directly:
```bash
./scripts/install-skill.sh --target codex
```
This will:
- Sync `humanize`, `humanize-gen-plan`, `humanize-refine-plan`, and `humanize-rlcr` into `${CODEX_HOME:-~/.codex}/skills`
- Copy runtime dependencies into `${CODEX_HOME:-~/.codex}/skills/humanize`
- Install/update native Humanize Stop hooks in `${CODEX_HOME:-~/.codex}/hooks.json`
- Enable the native `hooks` feature in `${CODEX_HOME:-~/.codex}/config.toml` when `codex` is available
- Seed `~/.config/humanize/config.json` with a Codex/OpenAI `bitlesson_model` when that key is not already set
- Mark that target's runtime config as `provider_mode: "codex-only"` when using `--target codex`, so helper model routing stays on the Codex/OpenAI path for that Codex installation
- Use RLCR defaults: `codex exec` with `gpt-5.5:high`, `codex review` with `gpt-5.5:high`
Requires Codex CLI `0.114.0` or newer for native hooks. The hooks feature was renamed to `hooks`; older Codex builds that still expose `codex_hooks` are not supported by the Codex install path.
## Verify
```bash
ls -la "${CODEX_HOME:-$HOME/.codex}/skills"
```
Expected directories:
- `humanize`
- `humanize-gen-plan`
- `humanize-refine-plan`
- `humanize-rlcr`
Runtime dependencies in `humanize/`:
- `scripts/`
- `hooks/`
- `prompt-template/`
- `templates/`
- `config/`
- `agents/`
Installed files/directories:
- `${CODEX_HOME:-~/.codex}/skills/humanize/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-gen-plan/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-refine-plan/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize-rlcr/SKILL.md`
- `${CODEX_HOME:-~/.codex}/skills/humanize/scripts/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/hooks/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/prompt-template/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/templates/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/config/`
- `${CODEX_HOME:-~/.codex}/skills/humanize/agents/`
- `${CODEX_HOME:-~/.codex}/hooks.json`
- `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` (created or updated only when Humanize config keys are unset)
Verify native hooks:
```bash
codex features list | grep -E '^hooks[[:space:]]'
sed -n '1,220p' "${CODEX_HOME:-$HOME/.codex}/hooks.json"
```
Expected:
- `hooks` is present in `codex features list`
- `hooks.json` contains `loop-codex-stop-hook.sh`
- `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` contains `bitlesson_model` set to a Codex/OpenAI model such as `gpt-5.5`
- for `--target codex`, `${XDG_CONFIG_HOME:-~/.config}/humanize/config.json`
also contains `provider_mode: "codex-only"` for that Codex runtime
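The checks listed above can also be scripted. The following is a minimal sketch, not part of Humanize: `check_install` is a hypothetical helper, and it assumes `bitlesson_model` is a top-level key in `config.json`. It only searches `hooks.json` as text, because the hook file's schema is not described here.

```python
import json
import os
from pathlib import Path

# Skill directories the installer is documented to create.
SKILLS = ["humanize", "humanize-gen-plan", "humanize-refine-plan", "humanize-rlcr"]


def codex_home() -> Path:
    """Mirror ${CODEX_HOME:-$HOME/.codex}."""
    return Path(os.environ.get("CODEX_HOME") or Path.home() / ".codex")


def humanize_config_path() -> Path:
    """Mirror ${XDG_CONFIG_HOME:-~/.config}/humanize/config.json."""
    base = os.environ.get("XDG_CONFIG_HOME") or Path.home() / ".config"
    return Path(base) / "humanize" / "config.json"


def check_install(home: Path, config: Path) -> list:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    for skill in SKILLS:
        if not (home / "skills" / skill / "SKILL.md").is_file():
            problems.append(f"missing skill: {skill}")
    hooks = home / "hooks.json"
    if not hooks.is_file() or "loop-codex-stop-hook.sh" not in hooks.read_text(encoding="utf-8"):
        problems.append("hooks.json does not reference loop-codex-stop-hook.sh")
    try:
        data = json.loads(config.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        data = {}
    # Assumption: bitlesson_model is a top-level key.
    if not isinstance(data, dict) or not data.get("bitlesson_model"):
        problems.append("config.json has no bitlesson_model")
    return problems
```

Calling `check_install(codex_home(), humanize_config_path())` after an install should return an empty list if the expectations above hold.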
## Optional: Install for Both Codex and Kimi
```bash
./scripts/install-skill.sh --target both
```
## Useful Options
```bash
# Preview without writing
./scripts/install-skills-codex.sh --dry-run
# Custom Codex skills dir
./scripts/install-skills-codex.sh --codex-skills-dir /custom/codex/skills
# Reinstall only the native hooks/config
./scripts/install-codex-hooks.sh
```
## Troubleshooting
If scripts are not found from installed skills:
```bash
ls -la "${CODEX_HOME:-$HOME/.codex}/skills/humanize/scripts"
```
If native exit gating does not trigger:
```bash
codex features enable hooks
sed -n '1,220p' "${CODEX_HOME:-$HOME/.codex}/hooks.json"
```
If the installer reports that your config or installed Codex still uses `codex_hooks`, upgrade Codex first, or edit `${CODEX_HOME:-~/.codex}/config.toml` so that its `[features]` table contains `hooks = true`.
# Install Humanize for Kimi CLI
This guide explains how to install the Humanize skills for [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli).
## Overview
Humanize provides four Agent Skills for kimi:
| Skill | Type | Purpose |
|-------|------|---------|
| `humanize` | Standard | General guidance for all workflows |
| `humanize-gen-plan` | Flow | Generate structured plan from draft |
| `humanize-refine-plan` | Flow | Refine annotated plan with CMT blocks |
| `humanize-rlcr` | Flow | Iterative development with Codex review |
## Installation
### Quick Install (Recommended)
From the Humanize repo root, run:
```bash
./scripts/install-skills-kimi.sh
```
This command will:
- Sync `humanize`, `humanize-gen-plan`, `humanize-refine-plan`, and `humanize-rlcr` into `~/.config/agents/skills`
- Copy runtime dependencies into `~/.config/agents/skills/humanize`
Common installer script (all targets):
```bash
./scripts/install-skill.sh --target kimi
```
### Manual Install
### 1. Clone or navigate to the humanize repository
```bash
cd /path/to/humanize
```
### 2. Copy skills and runtime bundle to kimi's skills directory
```bash
# Create the skills directory if it doesn't exist
mkdir -p ~/.config/agents/skills
# Copy all four skills
cp -r skills/humanize ~/.config/agents/skills/
cp -r skills/humanize-gen-plan ~/.config/agents/skills/
cp -r skills/humanize-refine-plan ~/.config/agents/skills/
cp -r skills/humanize-rlcr ~/.config/agents/skills/
# Copy runtime dependencies used by the skills
# (must match install-skill.sh's install_runtime_bundle)
cp -r scripts ~/.config/agents/skills/humanize/
cp -r hooks ~/.config/agents/skills/humanize/
cp -r prompt-template ~/.config/agents/skills/humanize/
cp -r templates ~/.config/agents/skills/humanize/
cp -r config ~/.config/agents/skills/humanize/
cp -r agents ~/.config/agents/skills/humanize/
# Hydrate runtime root placeholders inside SKILL.md files
for skill in humanize humanize-gen-plan humanize-refine-plan humanize-rlcr; do
sed -i.bak "s|{{HUMANIZE_RUNTIME_ROOT}}|$HOME/.config/agents/skills/humanize|g" \
"$HOME/.config/agents/skills/$skill/SKILL.md"
done
# Strip user-invocable flag from SKILL.md files for runtime visibility
# (This matches the behavior of scripts/install-skill.sh)
for skill in humanize humanize-gen-plan humanize-refine-plan humanize-rlcr; do
awk '
BEGIN { in_fm = 0; fm_done = 0 }
/^---[[:space:]]*$/ {
if (fm_done == 0) {
in_fm = !in_fm
if (in_fm == 0) {
fm_done = 1
}
}
print
next
}
in_fm && $0 ~ /^user-invocable:[[:space:]]*/ { next }
{ print }
' "$HOME/.config/agents/skills/$skill/SKILL.md" > "$HOME/.config/agents/skills/$skill/SKILL.md.tmp"
mv "$HOME/.config/agents/skills/$skill/SKILL.md.tmp" "$HOME/.config/agents/skills/$skill/SKILL.md"
done
```
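For readers who find the `awk` program above hard to follow, here is an equivalent sketch in Python. It is an illustration of the same logic, not code shipped with Humanize, and `strip_user_invocable` is a hypothetical name: only the first `---` ... `---` front-matter block is treated specially, and `user-invocable:` lines are dropped only inside it.

```python
import re

# A front-matter fence: three dashes followed only by optional whitespace.
FENCE = re.compile(r"^---\s*$")


def strip_user_invocable(text: str) -> str:
    """Drop `user-invocable:` lines from the first front-matter block only."""
    in_front_matter = False
    front_matter_done = False
    kept = []
    for line in text.splitlines(keepends=True):
        if FENCE.match(line):
            # Only the first pair of fences opens and closes the front matter.
            if not front_matter_done:
                in_front_matter = not in_front_matter
                if not in_front_matter:
                    front_matter_done = True
            kept.append(line)
            continue
        if in_front_matter and line.startswith("user-invocable:"):
            continue  # skip the flag itself
        kept.append(line)
    return "".join(kept)
```

A later `---` in the body (for example a horizontal rule) does not reopen the front matter, so body lines that happen to start with `user-invocable:` are preserved.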
### 3. Verify installation
```bash
# List installed skills
ls -la ~/.config/agents/skills/
# Should show:
# humanize/
# humanize-gen-plan/
# humanize-refine-plan/
# humanize-rlcr/
```
### 4. Restart kimi (if already running)
Skills are loaded at startup. Restart kimi to pick up the new skills:
```bash
# Exit current kimi session
/exit
# Or press Ctrl-D
# Start kimi again
kimi
```
## Usage
### List available skills
```bash
/help
```
Look for the "Skills" section in the help output.
### Use the skills
#### 1. Generate plan from draft
```bash
# Start the flow (will ask for input/output paths)
/flow:humanize-gen-plan
# Or load as standard skill
/skill:humanize-gen-plan
```
#### 2. Start RLCR development loop
```bash
# Start with plan file
/flow:humanize-rlcr path/to/plan.md
# With options
/flow:humanize-rlcr path/to/plan.md --max 20 --push-every-round
# Skip implementation, go directly to code review
/flow:humanize-rlcr --skip-impl
# Load as standard skill (no auto-execution)
/skill:humanize-rlcr
```
#### 3. Get general guidance
```bash
/skill:humanize
```
## Command Options
### RLCR Loop Options
| Option | Description | Default |
|--------|-------------|---------|
| `path/to/plan.md` | Plan file path | Required (unless --skip-impl) |
| `--max N` | Maximum iterations | 84 |
| `--codex-model MODEL:EFFORT` | Codex model | gpt-5.5:high |
| `--codex-timeout SECONDS` | Review timeout | 5400 |
| `--base-branch BRANCH` | Base for code review | auto-detect |
| `--full-review-round N` | Full alignment check interval | 5 |
| `--skip-impl` | Skip to code review | false |
| `--push-every-round` | Push after each round | false |
### Generate Plan Options
| Option | Description | Required |
|--------|-------------|----------|
| `--input <path>` | Draft file path | Yes |
| `--output <path>` | Plan output path | Yes |
## Prerequisites
Ensure you have `codex` CLI installed:
```bash
codex --version
```
The skills will use `gpt-5.5` with `high` effort level by default.
## Uninstall
To remove the skills:
```bash
rm -rf ~/.config/agents/skills/humanize
rm -rf ~/.config/agents/skills/humanize-gen-plan
rm -rf ~/.config/agents/skills/humanize-refine-plan
rm -rf ~/.config/agents/skills/humanize-rlcr
```
## Troubleshooting
### Skills not showing up
1. Check the skills directory exists:
```bash
ls ~/.config/agents/skills/
```
2. Ensure SKILL.md files are present:
```bash
head -5 ~/.config/agents/skills/humanize/SKILL.md
```
3. Restart kimi completely
### Codex not found
The skills expect `codex` to be on your `PATH`. If Codex needs an API key or proxy settings, make sure they are exported from your shell profile (for example `~/.zprofile`):
```bash
# Add to ~/.zprofile if needed
export OPENAI_API_KEY="your-api-key"
# or other proxy settings
```
### Scripts not found
If skills report missing scripts like `setup-rlcr-loop.sh`, verify:
```bash
ls -la ~/.config/agents/skills/humanize/scripts
```
### Installer options
The installer supports:
```bash
./scripts/install-skill.sh --help
```
Common examples:
```bash
# Preview only
./scripts/install-skills-kimi.sh --dry-run
# Custom skills directory
./scripts/install-skills-kimi.sh --skills-dir /custom/skills/dir
```
### Output files not found
The skills save output to:
- Cache: `~/.cache/humanize/<project>/<timestamp>/`
- Loop data: `.humanize/rlcr/<timestamp>/`
Ensure these directories are writable.
## See Also
- [Kimi CLI Documentation](https://moonshotai.github.io/kimi-cli/)
- [Agent Skills Format](https://agentskills.io/)
- [Install for Codex](./install-for-codex.md)
- [Humanize README](../README.md)
# Optimize viz-dashboard: Merge into `humanize monitor` as a Web View
## Goal Description
Optimize the `feat/viz-dashboard` branch so that the RLCR visualization becomes a web view layered on top of the existing `humanize monitor` data sources and supports multiple concurrent live RLCR loops with real-time streamed log output. The entry point moves out of Claude (no more `/humanize:viz` slash command) into a new `humanize monitor web` CLI subcommand. The dashboard is exposed for online (browser) viewing with explicit network-binding and authentication controls, and cross-conversation history browsing is preserved.
The dashboard MUST consume the same files and events that `humanize monitor rlcr|skill|codex|gemini` already read; it MUST NOT introduce a parallel capture pipeline (no new hooks just for the dashboard). The single-server-per-project model replaces the existing server-global project switcher to eliminate the cross-client mutation bug. Remote access defaults to safe (localhost-only) and requires an explicit token to expose data or actions to the network.
## Acceptance Criteria
Following TDD philosophy, each criterion includes positive and negative tests for deterministic verification.
- AC-1: CLI entry-point migration from Claude command to `humanize monitor web`.
- Positive Tests (expected to PASS):
- `humanize monitor web --project <p>` starts the dashboard server and prints the bound URL.
- `humanize monitor rlcr`, `humanize monitor skill`, `humanize monitor codex`, `humanize monitor gemini` continue to behave exactly as before this change (verified by snapshot tests of usage text and exit behavior).
- `humanize monitor` (no subcommand) prints usage that includes `web` alongside `rlcr|skill|codex|gemini`.
- Negative Tests (expected to FAIL/be rejected):
- The Claude slash command `/humanize:viz` is no longer registered (`commands/viz.md` removed); attempting to invoke it through Claude does not resolve.
- `humanize monitor unknownsub` exits non-zero with usage; it does NOT silently fall through to a default.
- AC-2: Data-source reuse — no parallel capture pipeline.
- Positive Tests:
- With an active RLCR loop, `viz/server/parser.py` reads session metadata from `.humanize/rlcr/<session>/{state.md,goal-tracker.md,round-*-summary.md,round-*-review-result.md}` AND streamed bytes from `~/.cache/humanize/<sanitized-project>/<session>/round-*-codex-{run,review}.log`.
- A test that intercepts file opens shows the dashboard reading from the same paths the RLCR monitor uses (parity test against the cache lookup logic in `scripts/humanize.sh`, around lines 284-368).
- Negative Tests:
- Grep over `hooks/` shows no new `*-viz-*.sh` or dashboard-only hook script added.
- Grep over `viz/` shows no path writing to `.humanize/rlcr/` (the dashboard is a reader, not a writer of session state).
- AC-3: Multi-loop concurrent view enumerates all sessions, not only the newest.
- Positive Tests:
- With two concurrent active RLCR loops in the same project, the home page renders both session cards simultaneously, each showing session id, status, current round/max, current phase, and an independently updating live log pane.
- Session enumeration covers ALL directories under `.humanize/rlcr/`, partitioned into "active" (state.md present) vs "historical" (terminal `*-state.md` present).
- Negative Tests:
- The dashboard does NOT auto-switch to the newest session (the single-session behavior of `monitor_find_latest_session` in `scripts/lib/monitor-common.sh` MUST NOT leak into the web view).
- Adding a new active session while another is running does NOT remove or hide the existing one in the UI.
- AC-4: Live-log latency budget — append visible in browser within 2 seconds (HARD requirement).
- Positive Tests:
- An automated test appends N bytes to an active `round-*-codex-run.log`; the browser-side stream client receives those bytes within 2 seconds (measured end-to-end on the test harness).
- The streaming protocol delivers an initial snapshot followed by byte-offset append events (snapshot + offset tail).
- Truncation/rotation of the underlying log triggers a documented resync path (e.g. detect size shrink, restart from snapshot at offset 0).
- Negative Tests:
- The active-log path does NOT use a polling loop that re-fetches the full file body on every update.
- Median measured append-to-render latency under nominal load does NOT exceed 2.0s; failure of this assertion fails CI.
- AC-5: Cross-conversation / historical browsing preserved.
- Positive Tests:
- Completed sessions stored under `.humanize/rlcr/` from prior Claude conversations are listed in the "Historical" section and individually browsable.
- Ending an active loop transitions that session card from "Active" to "Historical" without removing it from view.
- Negative Tests:
- A finished session does NOT disappear from the dashboard after its terminal `*-state.md` appears.
- Switching between active and historical views does NOT clear the other list.
- AC-6: Remote-reachable + access controlled across ALL data surfaces.
- Positive Tests:
- With default flags, the server binds to `127.0.0.1` only.
- With `--host 0.0.0.0` (or any non-localhost host), startup REQUIRES a non-empty `--auth-token` (or the equivalent env var); otherwise the process exits non-zero with a clear error.
- In remote mode, every endpoint (session list, session detail, per-session log SSE stream, control endpoints) requires a valid token; missing/invalid token returns 401.
- Negative Tests:
- Starting the server with `--host 0.0.0.0` without a token does NOT start; it errors out.
- An unauthenticated remote request to `/api/sessions/<id>` or the per-session SSE stream is rejected with 401, not served.
- The server does NOT bind to `0.0.0.0` by default under any path of `humanize monitor web`.
- AC-7: Session-targeted cancel built and tested (per DEC-2 = build session-scoped cancel).
- Positive Tests:
- A new session-scoped cancel shell helper (next to `scripts/cancel-rlcr-loop.sh`) accepts a session id and cancels only that session.
- The dashboard cancel UI hits a per-session API; cancelling session A does not affect session B.
- Negative Tests:
- Calling the per-session cancel endpoint without specifying a session id returns 400, not a project-wide cancel.
- The dashboard does NOT directly call the existing project-global `scripts/cancel-rlcr-loop.sh` without a session id.
- AC-8: Multi-instance / project-isolation cleanups (per DEC-3 = CLI-fixed single project).
- Positive Tests:
- `viz/scripts/viz-start.sh` (or its replacement) uses a per-project tmux session name so starting a second project's dashboard does NOT kill the first.
- The port file `.humanize/viz.port` is likewise scoped to the project, so two projects' dashboards never collide on it.
- The server binds to one project chosen at startup via `--project`; there is no runtime project switch endpoint.
- Negative Tests:
- `viz/server/app.py` no longer exposes `/api/projects/switch` (or it returns 410/501 with a deprecation message).
- `viz/static/js/app.js` and `viz/static/js/actions.js` no longer render or wire a project switcher / "+ Add" UI; tests grep for these handlers and assert their removal.
- Starting `humanize monitor web --project A` while a `--project B` instance is already running does NOT terminate the project-B server.
- AC-9: Test coverage matrix.
- Positive Tests (the suite must include and pass):
- Two concurrent active RLCR sessions render and stream independently.
- Session with `.humanize/rlcr/<session>` metadata but no cache logs yet (startup race) renders without crashing and recovers when logs appear.
- Cache-log truncation/rotation triggers a documented resync rather than silent stall.
- Remote-mode auth enforcement: missing/invalid token => 401 on every data and control endpoint.
- Project-isolation: starting a second `humanize monitor web --project <other>` does NOT affect the first.
- Backward-compat: `humanize monitor rlcr|skill|codex|gemini` outputs unchanged (snapshot tests).
- Cache-path / session-mapping parity tests against `scripts/humanize.sh` (the source of truth, around lines 284-368).
- Negative Tests:
- Tests do NOT write into the user's real `~/.humanize` or `~/.cache/humanize`; all fixtures live under a tmp dir or repo `tests/` fixture tree.
- No test depends on network access to the public internet.
- AC-10: Code style compliance.
- Positive Tests:
- Grep over `viz/`, `scripts/`, and changed `commands/`/`hooks/` files for the literal substrings `AC-`, `Milestone`, `Step `, `Phase ` (with trailing space) returns zero matches in implementation code or comments (matches in plan/doc files do not count).
- Negative Tests:
- Adding new code with any of those workflow markers fails the style check.
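The session partition required by AC-3 (active when `state.md` is present, historical when a terminal `*-state.md` is present) could be sketched as follows. This is an illustration under assumptions, not the planned implementation: the function names and the extra `unknown` bucket for directories with neither file are hypothetical.

```python
from pathlib import Path


def classify_session(session_dir: Path) -> str:
    """Classify one session directory by which state file it contains."""
    if (session_dir / "state.md").is_file():
        return "active"
    # "*-state.md" needs a hyphen, so it never matches plain "state.md".
    if any(session_dir.glob("*-state.md")):
        return "historical"
    return "unknown"  # assumption: e.g. a directory still being created


def enumerate_sessions(project_root: Path) -> dict:
    """List every session under .humanize/rlcr/, grouped by classification."""
    result = {"active": [], "historical": [], "unknown": []}
    rlcr_dir = project_root / ".humanize" / "rlcr"
    if not rlcr_dir.is_dir():
        return result
    for entry in sorted(rlcr_dir.iterdir()):
        if entry.is_dir():
            result[classify_session(entry)].append(entry.name)
    return result
```

Because every directory is visited and none is preferred, two concurrent active sessions both land in the `active` list, which is the behavior AC-3 asks for in place of a "newest session wins" lookup.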
## Path Boundaries
Path boundaries define the acceptable range of implementation quality and choices.
### Upper Bound (Maximum Acceptable Scope)
The implementation provides:
- An RLCR-specific Python helper (e.g. `viz/server/rlcr_sources.py`) that owns session enumeration and cache-log path discovery, with parity tests against `scripts/humanize.sh` (lines around 284-368).
- A frozen one-page event-protocol contract document (output of T2 architecture review) that fixes snapshot+byte-offset semantics, truncation/rotation handling, and the per-session vs project channel scoping.
- Per-session SSE streams over HTTP(S), each carrying an initial snapshot followed by append events identified by file path + byte offset.
- Bearer-token auth via query parameter on SSE streams and via `Authorization` header on standard HTTP endpoints; flask_sock WebSocket retained ONLY for localhost-bound deployments.
- Session-targeted cancel: a new `scripts/cancel-rlcr-session.sh` (or named equivalent) helper plus a per-session API endpoint, fully tested.
- A multi-loop UI grid that always shows every active session at once, with an inline expand-to-detail per-session log pane (no full-page navigation required to see live logs).
- A single-project-per-server CLI model: `humanize monitor web --project <path>`. The `/api/projects/switch` endpoint and the `+ Add` / Switch UI elements in `viz/static/js/app.js` and `viz/static/js/actions.js` are fully removed.
- Per-project tmux session naming and per-project port file for the optional `--daemon` mode (per DEC-1).
- Documentation for two remote-deployment patterns (SSH tunnel example FIRST, LAN bind example SECOND) plus an upgrade note explaining the `/humanize:viz` removal.
- Full test matrix per AC-9.
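To make the snapshot plus byte-offset semantics concrete, here is a minimal sketch of a per-file reader. The names `LogTail`, `LogEvent`, and `read_new`, and the event kinds, are assumptions for illustration; the real contract is whatever the T2 document freezes. The reader is meant to be driven by a file-change notification and only ever reads bytes past its last offset.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEvent:
    kind: str    # "snapshot", "append", or "resync"
    offset: int  # byte offset in the file where `data` starts
    data: bytes


class LogTail:
    """Tracks one log file and yields snapshot/append/resync events."""

    def __init__(self, path: str):
        self.path = path
        self.offset = 0
        self.started = False

    def read_new(self) -> Optional[LogEvent]:
        try:
            size = os.path.getsize(self.path)
        except OSError:
            return None  # log not created yet (startup race)
        if not self.started:
            kind, start = "snapshot", 0
        elif size < self.offset:
            kind, start = "resync", 0  # file shrank: truncation or rotation
        elif size == self.offset:
            return None  # nothing new
        else:
            kind, start = "append", self.offset
        with open(self.path, "rb") as handle:
            handle.seek(start)
            data = handle.read(size - start)
        self.started = True
        self.offset = start + len(data)
        return LogEvent(kind, start, data)
```

Detecting truncation by size alone misses the case where a file is truncated and regrows past the old offset between two reads; a fuller design might also compare inode numbers or modification times.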
### Lower Bound (Minimum Acceptable Scope)
The implementation provides:
- Extensions to the existing `viz/server/parser.py` and `viz/server/watcher.py` so they additionally ingest cache round logs (`codex-run.log`, `codex-review.log`, gemini variants when present) and emit append events with byte offsets.
- A new per-session SSE endpoint in `viz/server/app.py` that supports the snapshot+offset protocol agreed in the T2 contract document, including a documented resync path for truncation.
- A new `humanize monitor web` dispatch entry in `scripts/humanize.sh` (alongside `rlcr|skill|codex|gemini`) that runs the dashboard in the foreground by default; an optional `--daemon` flag launches the existing tmux-managed server with a per-project tmux name and port file.
- `--host`, `--port`, `--auth-token` flags in `viz/server/app.py` (and forwarded by `humanize monitor web`); the server binds to `127.0.0.1` by default; non-localhost binding requires a non-empty token; unauthenticated remote requests are rejected on EVERY data and control endpoint, not just mutators.
- Removal of the server-global project switch: `/api/projects/switch` and the `+ Add` / Switch UI flows in `viz/static/js/app.js` and `viz/static/js/actions.js` are removed. `viz-projects.json` is no longer mutated by the server in v1.
- Removal of `/humanize:viz`: `commands/viz.md` and `skills/humanize-viz/SKILL.md` are deleted; a brief upgrade note is added to `README.md` (or equivalent) pointing users at `humanize monitor web`.
- The session-targeted cancel helper and per-session cancel API (per DEC-2 = build session-scoped cancel).
- All tests in AC-9 are present and pass in CI.
- Documentation: at minimum, the SSH tunnel deployment pattern.
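The bind and auth rules above (localhost by default, a token required for any other host, every request checked) can be sketched with the standard library alone. The helper names are hypothetical, and treating unresolvable hostnames as remote is an assumption chosen to fail safe.

```python
import hmac
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """True for localhost and loopback IPs; anything else counts as remote."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # assumption: unknown hostnames are treated as remote


def check_bind(host: str = "127.0.0.1", token: Optional[str] = None) -> None:
    """Refuse to start when a non-localhost bind has no token."""
    if not is_loopback(host) and not token:
        raise ValueError(f"binding to {host} requires a non-empty auth token")


def is_authorized(auth_header: Optional[str], expected: Optional[str]) -> bool:
    """Validate an `Authorization: Bearer <token>` header value."""
    if not expected:
        return True  # localhost mode started without a token
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    supplied = auth_header[len(prefix):]
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A server would call `check_bind` once at startup and answer 401 whenever `is_authorized` returns `False`, on data endpoints as well as control endpoints.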
### Allowed Choices
- Can use:
- The existing Flask + flask_sock stack (retained for localhost) plus a new SSE endpoint for per-session log streams.
- Reusing or extracting helper logic from `scripts/humanize.sh` for RLCR-specific cache-path discovery (RLCR-only — do not merge skill-monitor cache rules).
- Per-session byte offsets, file-path-keyed event streams.
- Either `python -m venv` (current `viz-start.sh` model) or system python for the foreground CLI invocation.
- Token sources: CLI flag `--auth-token <value>`, env var `HUMANIZE_VIZ_TOKEN`, or a token file at `${XDG_CONFIG_HOME:-$HOME/.config}/humanize/viz-token`.
- Cannot use:
- New Claude hooks added solely to capture data for the dashboard.
- Default network bind to `0.0.0.0` (must be opt-in).
- OAuth / OIDC / external IAM providers in v1.
- A cross-language shared "monitor-core" library that conflates the RLCR session model with the skill-invocation model.
- WebSocket as the remote-mode transport for log streams (browser WS cannot set `Authorization` headers; remote streams must be SSE per DEC-4). flask_sock WS may remain for localhost-bound use.
- Project-global cancel paths wired to per-session UI without explicit user warnings (per DEC-2 the dashboard MUST use a session-scoped cancel helper).
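The three allowed token sources could be resolved as in the sketch below. The plan does not state a precedence, so the order used here (CLI flag, then `HUMANIZE_VIZ_TOKEN`, then the token file) is an assumption, as are the function names.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def token_file_path(env: Mapping[str, str]) -> Path:
    """Mirror ${XDG_CONFIG_HOME:-$HOME/.config}/humanize/viz-token."""
    base = env.get("XDG_CONFIG_HOME") or os.path.join(
        env.get("HOME") or str(Path.home()), ".config"
    )
    return Path(base) / "humanize" / "viz-token"


def resolve_token(cli_value: Optional[str],
                  env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token: flag, then env var, then token file."""
    if cli_value:
        return cli_value
    env_value = env.get("HUMANIZE_VIZ_TOKEN")
    if env_value:
        return env_value
    try:
        text = token_file_path(env).read_text(encoding="utf-8").strip()
    except OSError:
        return None  # no token file, or it is unreadable
    return text or None
```

Passing the environment as a parameter keeps the function testable without touching the real `~/.config`, which fits the fixture rule in AC-9.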
> **Note on Deterministic Designs**: DEC-1, DEC-2, DEC-3, and DEC-4 have already been fixed by user decision (recorded under `## Pending User Decisions`). The path boundaries above already reflect those choices and do not leave room for alternative interpretations of those four points.
## Feasibility Hints and Suggestions
> **Note**: This section is for reference and understanding only. These are conceptual suggestions, not prescriptive requirements.
### Conceptual Approach
One viable path:
1. Branch hygiene as a parallel preflight track. Rebase `feat/viz-dashboard` onto `upstream/dev` (currently 9 commits ahead). Conflicts are expected to be small because the branch already includes upstream commits 338b4dd (PR-loop removal) and 016caca (monitor split).
2. Add a small, RLCR-specific Python module (e.g. `viz/server/rlcr_sources.py`) that owns:
- listing all session directories under the project's `.humanize/rlcr/`,
- mapping each session to its cache-log directory under `~/.cache/humanize/<sanitized-project>/<session>/`,
- returning per-session live log file paths (`round-N-codex-run.log`, `round-N-codex-review.log`, gemini variants).
Cover this module with parity tests that compare its outputs against the discovery logic in `scripts/humanize.sh` (around lines 284-368).
3. Run a focused architecture-review consultation (T2, `analyze` task via `/humanize:ask-codex`) to freeze the streaming protocol contract: snapshot+offset semantics, truncation/rotation behavior, per-session vs project channel scoping. Output a one-page contract document that subsequent code refers to.
4. Extend `viz/server/parser.py` to use the new helper and to read cache round logs (with graceful fallback when files are missing/partial). Extend `viz/server/watcher.py` to also watch the cache log directory and emit append events with `(path, offset, len)`.
5. Add a per-session SSE endpoint in `viz/server/app.py` keyed by session id; it serves a snapshot then appends; it survives truncation by detecting size shrink and restarting from offset 0 with a documented resync event.
6. Add `humanize monitor web` to the dispatch in `scripts/humanize.sh` next to `rlcr|skill|codex|gemini`. Foreground default; pass-through `--host`, `--port`, `--auth-token`, `--project`, `--daemon`. The `--daemon` path delegates to a refactored `viz/scripts/viz-start.sh` that uses a per-project tmux name and per-project port file.
7. Delete `commands/viz.md` and `skills/humanize-viz/SKILL.md`; add a one-line note in `README.md` directing users to `humanize monitor web`.
8. Replace the project switcher backend by a CLI-fixed model: remove `/api/projects/switch` from `viz/server/app.py`; remove the switch / + Add UI from `viz/static/js/app.js` and `viz/static/js/actions.js`. The frontend reads only the project the server was started against.
9. Add `--host`, `--port`, `--auth-token`. Default `--host=127.0.0.1`. If host is non-localhost, require a non-empty token. Apply auth middleware to ALL data and control endpoints (session list, session detail, SSE streams, cancel/report). Token propagation in the frontend: `Authorization: Bearer <t>` for fetch; `?token=<t>` query parameter for `EventSource`.
10. Build the session-targeted cancel helper (e.g. `scripts/cancel-rlcr-session.sh`) and wire a `POST /api/sessions/<id>/cancel` route to it. Mirror the existing project-global script's safety conventions.
11. Multi-loop UI: render all active sessions on the home page in a grid, each with an inline live-log pane that opens an SSE stream when expanded. Historical sessions are listed below.
12. Build the test matrix per AC-9. Use a tmp `.humanize/rlcr/` and tmp `~/.cache/humanize/` fixture tree per test.
13. Document the SSH tunnel deployment pattern first; add a LAN bind example second.
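The startup rule and token check from step 9 can be sketched without any web framework. This is a minimal illustration under stated assumptions: the function names `validate_bind`, `extract_token`, and `is_authorized`, and the set of host strings treated as local, are hypothetical and do not come from the codebase.

```python
import hmac
from typing import Mapping, Optional

# Assumption: the plan does not enumerate which host strings count as local.
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def validate_bind(host: str, token: Optional[str]) -> None:
    """Refuse to start on a non-local host unless a non-empty token is set."""
    if host not in LOCAL_HOSTS and not token:
        raise ValueError(f"--auth-token is required when --host={host}")


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Read the token from 'Authorization: Bearer <t>' (fetch) or '?token=<t>' (EventSource)."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    return query.get("token")


def is_authorized(expected: Optional[str],
                  headers: Mapping[str, str],
                  query: Mapping[str, str]) -> bool:
    """With no token configured (local mode) allow the request; otherwise compare in constant time."""
    if not expected:
        return True
    presented = extract_token(headers, query) or ""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

A real middleware would call `is_authorized` once per request for every data and control route, so that read endpoints and SSE streams are gated the same way as cancel.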
### Relevant References
- `scripts/humanize.sh:1196` — `humanize` dispatcher; this is where `monitor web` is added.
- `scripts/humanize.sh` (around lines 284-368) — current RLCR cache-log discovery logic; source of truth for parity tests.
- `scripts/lib/monitor-common.sh` — shared shell helpers (single-session by design); reused for terminal monitor only.
- `scripts/lib/monitor-skill.sh` — skill cache discovery (separate model from RLCR); deliberately NOT merged into the RLCR helper.
- `scripts/cancel-rlcr-loop.sh` — existing project-global cancel; the new session-scoped helper sits next to it.
- `viz/server/parser.py` — RLCR session parser; extended to read cache logs.
- `viz/server/watcher.py` — watchdog observer; extended to watch cache log dirs and emit append events.
- `viz/server/app.py` — Flask routes; gains `--host/--port/--auth-token`, per-session SSE, session-scoped cancel; loses `/api/projects/switch`.
- `viz/scripts/viz-start.sh` — tmux launcher; refactored for per-project naming and `--daemon` mode.
- `viz/static/js/app.js` and `viz/static/js/actions.js` — UI; loses project switcher; gains multi-session grid + per-session SSE client with token propagation.
- `commands/viz.md`, `skills/humanize-viz/SKILL.md` — deleted.
- `tests/test-viz.sh` — extended with the AC-9 matrix.
- `README.md`, `docs/usage.md` — gain monitor `web` entry and the remote-deploy guide.
## Dependencies and Sequence
### Milestones
1. M0 Branch hygiene (preflight, parallel track):
- Sub-step A: Fetch `upstream/dev`, list the 9 commits ahead, rebase `feat/viz-dashboard`, resolve conflicts.
- Sub-step B: Re-run existing tests (`tests/test-viz.sh` and any monitor smoke test).
- This milestone is NOT a hard gate for design tasks; T1+ may proceed once conflicts are mechanically resolved.
2. M1 Discovery and ingestion:
- Sub-step A: RLCR-specific session+cache-log discovery helper (T1).
- Sub-step B: Parser and watcher extensions to ingest cache round logs (T3, T4).
3. M2 Streaming protocol freeze (architecture gate):
- Sub-step A: Architecture review (T2, analyze) producing a one-page contract document for snapshot+offset semantics, truncation handling, channel scoping.
- This milestone gates T3/T4/T5 implementation details that depend on the contract.
4. M3 Live multi-loop streaming:
- Sub-step A: Per-session SSE endpoint (T5).
- Sub-step B: Multi-loop UI with independent live log panes (T6).
5. M4 CLI consolidation:
- Sub-step A: Add `humanize monitor web` to dispatch (T8).
- Sub-step B: Per-project tmux + port file refactor (T9).
- Sub-step C: Remove `/humanize:viz` (T12).
6. M5 Remote access + safety:
- Sub-step A: `--host/--port/--auth-token` + auth middleware on all surfaces (T11).
- Sub-step B: Remove server-global project switch and frontend switcher (T10).
- Sub-step C: Session-targeted cancel helper + endpoint (T7).
7. M6 Tests + docs:
- Sub-step A: Test matrix per AC-9 (T13).
- Sub-step B: Documentation: README monitor section + remote-deploy guide (T14).
Relative dependencies: M2 must precede the streaming-shape decisions in M1's parser/watcher work and all of M3. M5 access-control work (T11) depends on the basic streaming endpoints (M3) being available so it can layer auth on top. M6 tests depend on M3 + M4 + M5 being feature-complete. M0 is independent and can run alongside M1 until conflicts are mechanically resolved.
## Task Breakdown
Each task includes exactly one routing tag:
- `coding`: implemented by Claude
- `analyze`: executed via Codex (`/humanize:ask-codex`)
| Task ID | Description | Target AC | Tag | Depends On |
|---------|-------------|-----------|-----|------------|
| T0 | Preflight (parallel track): rebase `feat/viz-dashboard` onto `upstream/dev` (9 commits), resolve conflicts, rerun existing tests. NOT a hard gate for T1+. | AC-9 | coding | - |
| T1 | RLCR-specific session + cache-log discovery helper (e.g. `viz/server/rlcr_sources.py`); RLCR-only (do NOT merge skill-monitor cache rules); enumerates ALL sessions under `.humanize/rlcr/`. | AC-2, AC-3 | coding | - |
| T2 | Architecture review: select event protocol shape (snapshot + byte-offset tail, truncation/rotation behavior, per-session vs project channels) and confirm transport (SSE for remote streams + retained flask_sock for localhost only). Output: one-page contract document committed under `docs/`. | AC-4 | analyze | T1 |
| T3 | Extend `viz/server/parser.py` to ingest cache round logs (`codex-run.log`, `codex-review.log`, gemini variants); fall back gracefully when missing or partially written. | AC-2, AC-4 | coding | T2 |
| T4 | Extend `viz/server/watcher.py` to also watch the cache log directory; emit per-file append events `(path, offset, length)` per the T2 contract. | AC-4 | coding | T2 |
| T5 | Per-session SSE endpoint in `viz/server/app.py` per the T2 contract; supports initial snapshot then append; handles rotation/truncation resync. | AC-4 | coding | T3, T4 |
| T6 | Multi-loop UI in `viz/static/js/app.js`: list ALL sessions, partition into Active vs Historical, render every active session simultaneously with an independent live log pane (no fallback to single-session detail view for active loops). | AC-3, AC-5 | coding | T5 |
| T7 | Session-scoped cancel: new `scripts/cancel-rlcr-session.sh` helper + `POST /api/sessions/<id>/cancel` route + UI wiring; do NOT delegate to the project-global `scripts/cancel-rlcr-loop.sh`. | AC-7 | coding | T5 |
| T8 | Add `humanize monitor web` to the dispatch in `scripts/humanize.sh` next to `rlcr|skill|codex|gemini`; foreground default; pass-through `--host/--port/--auth-token/--project/--daemon`; preserve existing subcommands and usage text. | AC-1 | coding | - |
| T9 | Refactor `viz/scripts/viz-start.sh`: per-project tmux session name (no more global `humanize-viz`); per-project port file; only invoked by the `--daemon` path of `humanize monitor web`. | AC-8 | coding | T8 |
| T10 | Remove server-global project mutation in `viz/server/app.py`: remove `/api/projects/switch` (or convert to read-only listing); remove project switcher / + Add flows in `viz/static/js/app.js` and `viz/static/js/actions.js`; do not mutate `viz-projects.json` from server. | AC-5, AC-8 | coding | T8 |
| T11 | Add `--host`, `--port`, `--auth-token` to `viz/server/app.py` + propagate through `viz/scripts/viz-start.sh` and `humanize monitor web`; default `--host=127.0.0.1`; reject non-local startup without token; gate ALL data/control endpoints (session list, session detail, SSE stream, cancel) behind token in remote mode; frontend token propagation: `Authorization: Bearer` for fetch + `?token=...` for SSE `EventSource`. | AC-6 | coding | T5, T10 |
| T12 | Remove `/humanize:viz`: delete `commands/viz.md` and `skills/humanize-viz/SKILL.md`; add a one-line upgrade note in `README.md` pointing users at `humanize monitor web`. | AC-1 | coding | T8 |
| T13 | Test matrix per AC-9: concurrent active loops, missing-cache-log startup, log rotation/truncation recovery, remote auth on every endpoint, project isolation, monitor backward-compat, per-project port-file collision avoidance, parity tests for cache-path/session mapping vs `scripts/humanize.sh`. | AC-9 | coding | T6, T7, T11 |
| T14 | Docs: README monitor section update; remote-deploy guide (SSH tunnel example FIRST, LAN bind example SECOND); upgrade note for `/humanize:viz` removal. | AC-1, AC-6 | coding | T13 |
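The "Depends On" column can be checked mechanically. The sketch below transcribes that column into a dictionary and uses the standard-library `graphlib` module (Python 3.9+) to derive one valid execution order; the helper names `execution_order` and `ready_now` are illustrative only.

```python
from graphlib import TopologicalSorter

# Transcribed from the "Depends On" column: task -> prerequisites.
DEPENDS_ON = {
    "T0": [], "T1": [], "T2": ["T1"], "T3": ["T2"], "T4": ["T2"],
    "T5": ["T3", "T4"], "T6": ["T5"], "T7": ["T5"], "T8": [],
    "T9": ["T8"], "T10": ["T8"], "T11": ["T5", "T10"], "T12": ["T8"],
    "T13": ["T6", "T7", "T11"], "T14": ["T13"],
}


def execution_order(graph):
    """Return one valid ordering; graphlib raises CycleError if the table has a cycle."""
    return list(TopologicalSorter(graph).static_order())


def ready_now(graph, done):
    """Tasks not yet done whose prerequisites are all done, sorted by task number."""
    return sorted(
        (t for t, deps in graph.items() if t not in done and set(deps) <= done),
        key=lambda t: int(t[1:]),
    )
```

With nothing completed, `ready_now(DEPENDS_ON, set())` returns `T0`, `T1`, and `T8`, which matches the table: the preflight, the discovery helper, and the CLI dispatch have no prerequisites.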
## Claude-Codex Deliberation
### Agreements
- Reusing the existing `humanize monitor` data sources (the `.humanize/rlcr/<session>/*` files plus `~/.cache/humanize/<project>/<session>/round-*-codex-{run,review}.log`) is the correct architecture; the dashboard is a reader, not a parallel capture pipeline.
- Moving the entry point into the `humanize monitor` dispatch in `scripts/humanize.sh` and removing `/humanize:viz` is a natural extension of the existing CLI shape and avoids a stranded slash-command surface.
- Tightening network exposure with localhost default plus explicit `--host` + `--auth-token` for remote opt-in is the right baseline given the unauthenticated mutators in the current `viz/server/app.py`.
- The current global `humanize-viz` tmux session name in `viz/scripts/viz-start.sh` is a real collision bug; per-project naming is required.
- The `feat/viz-dashboard` branch already includes upstream commits 338b4dd (PR-loop removal) and 016caca (monitor split). The rebase is therefore drift cleanup (9 commits), not a missing prerequisite.
- The streaming protocol must support snapshot + byte-offset append + truncation/rotation resync; "no full-file refetch loop" was tightened from "append-only forever" to allow legitimate snapshot/resync paths.
### Resolved Disagreements
- Topic: Should the rebase be the dependency root for the entire plan (M0/T0 as a hard gate)?
- Claude (v1): yes, M0 first, T0 blocks all other tasks.
- Codex: no, branch hygiene already includes the critical upstream commits; making T0 a hard gate turns unrelated upstream drift into a blocker for design.
- Resolution: M0/T0 is a parallel preflight track. T1+ may proceed once rebase conflicts are mechanically resolved. Recorded in M0 description and in T0's wording.
- Topic: Should there be a single shared "monitor-core" library consumed by both terminal and web monitors?
- Claude (v1): yes, extract a shared module to keep terminal and web in lockstep.
- Codex: no, the shell `monitor-common.sh` is single-session by design and the web side is Python; forcing a cross-language core conflates models.
- Resolution: do NOT build a shared cross-language core. Keep terminal helpers in shell where they help; build a separate small RLCR-specific Python helper for the web side (`viz/server/rlcr_sources.py`) and validate it via parity tests against `scripts/humanize.sh` cache logic.
- Topic: Should T2 (extract shared cache-discovery helper) merge logic from `scripts/humanize.sh` (RLCR) with `scripts/lib/monitor-skill.sh` (skill invocations)?
- Claude (v1): yes, factor the cache-discovery patterns into one helper.
- Codex: no, RLCR session caches and skill invocation caches are adjacent but different models; merging conflates them.
- Resolution: T1 helper is RLCR-specific only. Skill-monitor cache rules stay separate.
- Topic: When should the architecture review for the streaming protocol shape happen?
- Claude (v1): T13 at the end, after watcher and endpoint code.
- Codex: backwards; it has to gate watcher and endpoint design.
- Resolution: T2 is now an `analyze` task that runs BEFORE T3/T4/T5 and outputs a one-page contract document.
- Topic: Should the streaming protocol forbid full-file refetch entirely?
- Claude (v1): yes, append-only.
- Codex: append-only forever breaks late-joining clients and rotation recovery.
- Resolution: AC-4 reworded to "snapshot + byte-offset append + documented resync" and "no polling loop that re-fetches the full file body on every update." Both intents preserved.
- Topic: Is removing `/api/projects/switch` enough to fix the multi-project bug?
- Claude (v1): yes.
- Codex: no, the frontend switcher / + Add flows in `viz/static/js/app.js` and `viz/static/js/actions.js` would still be wired.
- Resolution: T10 expanded to also remove the frontend switcher chrome; AC-8 expanded to test for the absence of these UI elements.
- Topic: Does remote auth need to cover read endpoints, or just mutators?
- Claude (v2): just mutators.
- Codex: no, read endpoints serve session data too; remote unauth must be blocked everywhere.
- Resolution: AC-6 expanded; T11 expanded to cover ALL data and control surfaces, plus token propagation in the frontend (`Authorization` for fetch, `?token=...` for SSE).
- Topic: Cancel semantics in the multi-loop UI.
- Claude (v1/v2): keep cancel + report.
- Codex: the existing `scripts/cancel-rlcr-loop.sh` is project-global, not session-targeted; either build a session-scoped path or freeze v1 with cancel disabled.
- Resolution: User chose DEC-2 = build session-scoped cancel. T7 builds a new `scripts/cancel-rlcr-session.sh` helper plus a per-session API and tests it.
- Topic: Auth transport for live log streams (browser WebSocket cannot set `Authorization` header).
- Claude (v2): bearer token via `--auth-token`, transport unspecified.
- Codex: WS in browsers cannot send arbitrary auth headers; either define a precise WS auth handshake or drop WS for remote.
- Resolution: User chose DEC-4 = SSE over HTTPS with token query-param for remote streams; flask_sock WS retained for localhost only.
### Convergence Status
- Final Status: `converged`
- Convergence rounds executed: 3 (round 1 surfaced 7 required changes; round 2 surfaced 5 tighteners; round 3 returned no required changes and no high-impact disagreements).
## Pending User Decisions
All decisions raised during planning have been resolved by the user. None remain `PENDING`.
- DEC-1: How should `humanize monitor web` be launched (lifecycle)?
- Claude Position: Foreground default + optional `--daemon` flag; matches CLI monitor UX and avoids hidden processes.
- Codex Position: Either foreground or daemon is defensible, but the v1 plan must pick one to avoid mixed ownership of `viz/scripts/viz-start.sh`.
- Tradeoff Summary: Foreground = matches `humanize monitor rlcr` UX, no orphan tmux sessions, simpler test harness. Daemon = "always on" convenience, but hidden processes and tmux name collisions to manage.
- Decision Status: `Foreground default + --daemon opt-in` (user-confirmed).
- DEC-2: Cancel button policy in the multi-loop dashboard for v1?
- Claude Position: Build a session-scoped cancel.
- Codex Position: Either build a session-scoped path or freeze v1 with cancel disabled; the existing `scripts/cancel-rlcr-loop.sh` is project-global and unsafe in multi-loop mode.
- Tradeoff Summary: Build = correct UX, more work (new shell helper + API + tests). Disable = smaller v1, defers the cancel feature. Keep-global = correctness bug.
- Decision Status: `Build session-scoped cancel` (user-confirmed). T7 builds `scripts/cancel-rlcr-session.sh`.
- DEC-3: How should the dashboard handle multiple projects?
- Claude Position: CLI-fixed single project per server (`humanize monitor web --project <path>`); multi-project means run multiple processes.
- Codex Position: Either CLI-fixed, per-client state, or separate instances per project; ambiguity blocks AC-5/AC-8.
- Tradeoff Summary: CLI-fixed = clean isolation, simple backend, removes the server-global mutation bug, costs the in-server switcher convenience. Per-client = complex backend. Server-global = current bug.
- Decision Status: `CLI-fixed single project per server` (user-confirmed). `/api/projects/switch` is removed; frontend switcher chrome is removed.
- DEC-4: Remote auth transport for live log streaming?
- Claude Position: Bearer token; transport open.
- Codex Position: Browser WebSocket clients cannot set `Authorization` header; pick SSE for remote, or define a precise WS handshake.
- Tradeoff Summary: SSE = clean browser auth via query-param token over HTTPS, append-shaped traffic matches SSE strength, drops bidirectional control. WS = bidirectional but auth requires custom subprotocol/handshake.
- Decision Status: `SSE over HTTPS with token query-param for remote streams; flask_sock WS retained for localhost only` (user-confirmed).
- AC-4 latency budget: hard requirement vs directional target?
- Claude Position: Hard requirement (<=2s) to give "live" a precise meaning.
- Codex Position: Either is defensible; the plan must record the choice.
- Tradeoff Summary: Hard = strict CI assertion, sharper failure mode. Directional = looser SLA, easier to pass under load.
- Decision Status: `Hard requirement (<=2s end-to-end)` (user-confirmed). AC-4 negative tests fail CI when median latency exceeds 2.0s under nominal load.
## Implementation Notes
### Code Style Requirements
- Implementation code and comments must NOT contain plan-specific terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These belong only in plan documentation.
- Use descriptive, domain-appropriate naming in code instead. For example, prefer `RLCRSessionEnumerator` / `cache_log_discovery` / `live_log_stream` over names that reference plan task ids.
- All implementation, comments, tests, and documentation must be in English. No emoji or CJK characters in code or comments (per project rules in `.claude/CLAUDE.md`).
- Per project rules in `.claude/CLAUDE.md`: any commit on `main` must include a version bump in `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, and `README.md` (the "Current Version" line). For commits on `feat/viz-dashboard`, the branch's `version` in those three files must already be ahead of `main`'s version. Implementation work must respect that policy.
### Branch and Rebase Note
- Implementation begins on `feat/viz-dashboard` (NOT the current `feat/rlcr-integral-context` branch).
- T0 rebases `feat/viz-dashboard` onto `upstream/dev` (9 commits ahead). It is a parallel preflight, not a hard gate for design tasks.
- `gen-plan` itself does not perform any git operation. The rebase happens at the start of the implementation loop (`/humanize:start-rlcr-loop`).
--- Original Design Draft Start ---
# Draft: Optimize viz-dashboard — Merge into `humanize monitor` as a Web View
## Background
The `feat/viz-dashboard` branch currently introduces a `/humanize:viz` Claude
slash command and a local visualization dashboard for Humanize. While the
dashboard does show some data, the visualization of a *live, dynamically
running RLCR loop* is not clear enough today: status, progress per round, and
streamed log output are hard to follow as a loop progresses.
Separately, Humanize already ships a CLI-side monitoring capability that the
user runs in another terminal (NOT inside Claude Code):
```bash
source <path/to/humanize>/scripts/humanize.sh # or add to .bashrc / .zshrc
humanize monitor rlcr # RLCR loop
humanize monitor skill # All skill invocations (codex + gemini)
humanize monitor codex # Codex invocations only
humanize monitor gemini # Gemini invocations only
```
This monitor capability already captures live state (RLCR rounds, skill / Codex
/ Gemini invocations, log output). The web dashboard does not need to invent
its own capture pipeline — it should consume what `humanize monitor` already
provides.
## Goal
Optimize the viz-dashboard branch so that:
1. The dashboard becomes a **web view** layered on top of the existing
`humanize monitor` data sources, rather than an independent capture layer.
2. The dashboard can show **multiple live RLCR loops simultaneously**, with
per-loop status and streamed log output.
3. The entry point moves out of Claude (no more `/humanize:viz` slash command)
and into the `humanize monitor` CLI command, as a new web-online viewing
subcommand.
4. The new capability targets **online / remote viewing in a browser**, not a
local-only viewer that requires the user to be on the same machine running
Claude.
5. Useful features from the existing viz-dashboard branch — notably
**cross-conversation querying** (browsing past sessions / loops across
different Claude conversations) — are preserved.
## Non-goals
- Reimplementing the monitor capture pipeline
(`humanize monitor rlcr/skill/codex/gemini`). The dashboard consumes it; it
does not replace it.
- Continuing to ship `/humanize:viz` as a Claude slash command.
- Adding chart panels or features explicitly removed in commit 1b575fe
("multi-project switcher + restart + remove chart panels").
## Required behaviors
1. **CLI entry point unification**
- Remove `commands/viz.md` and any `/humanize:viz` Claude command surface.
- Add a new `humanize monitor` subcommand (name to be agreed during
planning, e.g. `humanize monitor web` or `humanize monitor dashboard`)
that starts the web dashboard server.
- The other `humanize monitor rlcr|skill|codex|gemini` subcommands must
keep working unchanged (terminal-attached live tail).
2. **Live multi-loop view**
- The web dashboard MUST be able to display 2+ concurrently running RLCR
loops at the same time, each with:
- current status (running, paused, converged, stopped, …)
- current round / phase
- live streamed log output, updated in near real time
3. **Reuse existing monitor data**
- The dashboard MUST source its data from the same files / events that
`humanize monitor rlcr/skill/codex/gemini` already read. It MUST NOT add
a parallel capture mechanism (no new hooks just for the dashboard).
4. **Online / remote-viewable**
- The dashboard MUST be reachable from a browser over the network, not
only via `localhost` on the machine running Claude. Concrete binding /
auth design to be agreed during planning.
5. **Cross-conversation history**
- Cross-conversation querying (browsing past loops from different Claude
conversations / sessions) from the existing viz-dashboard branch MUST be
preserved.
## Branch hygiene
Before implementation begins, the branch `feat/viz-dashboard` MUST be rebased
onto the latest `upstream/dev` (humania-org/humanize). Several relevant changes
have landed on `upstream/dev` after the branch diverged, including:
- `Add ask-gemini skill and tool-filtered monitor subcommands` (introduces the
`humanize monitor skill|codex|gemini` subcommands the dashboard must reuse)
- `Remove PR loop feature entirely` (the viz-dashboard branch still references
PR-loop concepts via `commands/cancel-pr-loop.md`, `commands/start-pr-loop.md`,
`hooks/pr-loop-stop-hook.sh`)
- Multiple monitor / hook fixes
The rebase is therefore both a precondition for correctness (the dashboard
consumes the new monitor subcommands) and a cleanup step (PR-loop references
must be dropped).
## Out of scope (for this plan)
- Changes to RLCR semantics, hooks, or skill behavior.
- Authentication providers, identity systems, or multi-user account models —
basic remote-access protection is in scope, but full IAM is not.
--- Original Design Draft End ---
# Streaming Protocol Contract
## Status
Frozen on April 17, 2026. Any change requires a new dated revision section appended below.
## Scope
This contract governs live streaming of RLCR round log files discovered for a single server project from `XDG_CACHE_HOME` or `HOME/.cache/humanize/SANITIZED/SID/round-N-{codex,gemini}-{run,review}.log`, where `SANITIZED` follows the rule implemented in `viz/server/rlcr_sources.py`. Session identity and liveness are derived from `.humanize/rlcr/SID/` metadata, but this contract does not define polling, parsing, or REST retrieval of frontmatter status files, goal-tracker files, round summaries, or review-result files.
## Channel Model
Streams are per-session, per-file. A stream is identified by `GET /api/sessions/SID/logs/FNAME`, where `SID` is the RLCR session id and `FNAME` is the exact cache-log basename such as `round-3-codex-run.log`. Each URL maps to one logical byte stream for one file generation within one session. Multiple sessions MAY be active concurrently, and clients MAY open multiple such channels in parallel.
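As a rough sketch of how a channel URL could map to a file, the code below validates `FNAME` against the naming pattern from the Scope section and joins it onto the cache root. Several things here are assumptions: the helper names, the reading of the scope sentence as "`XDG_CACHE_HOME` if set, else `HOME/.cache`", and the fact that `sanitized` and `sid` arrive already validated. The sanitizing rule itself lives in `viz/server/rlcr_sources.py` and is not reproduced.

```python
import os
import re
from pathlib import Path
from typing import Mapping, NamedTuple, Optional

# Pattern taken from the Scope section: round-N-{codex,gemini}-{run,review}.log
LOG_NAME = re.compile(r"round-(\d+)-(codex|gemini)-(run|review)\.log")


class LogName(NamedTuple):
    round_number: int
    tool: str
    kind: str


def parse_log_name(fname: str) -> Optional[LogName]:
    """Return the parsed parts of FNAME, or None if it is not an exact cache-log basename."""
    m = LOG_NAME.fullmatch(fname)
    if m is None:
        return None
    return LogName(int(m.group(1)), m.group(2), m.group(3))


def cache_log_path(env: Mapping[str, str], sanitized: str, sid: str, fname: str) -> Path:
    """Resolve the on-disk log for one channel; rejects any FNAME outside the pattern."""
    if parse_log_name(fname) is None:
        raise ValueError(f"not a cache round log name: {fname!r}")
    # Assumption: XDG_CACHE_HOME wins when set and non-empty, else HOME/.cache.
    base = env.get("XDG_CACHE_HOME") or os.path.join(env["HOME"], ".cache")
    return Path(base) / "humanize" / sanitized / sid / fname
```

Because only exact basenames pass `fullmatch`, a request such as `../secret` never reaches the filesystem join.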
## Event Shape
The live-log transport is Server-Sent Events. Every SSE frame MUST include `event: TYPE`, `id: N`, and one `data:` line containing exactly one JSON object. `TYPE` MUST equal the JSON `type` field. `id` MUST be a strictly increasing decimal string within the stream. `path` MUST be the canonical `FNAME` for the channel, not an absolute filesystem path. Raw file bytes MUST be base64 encoded into `bytes_b64` with standard RFC 4648 base64 and no line breaks. `offset` is the starting byte offset represented by `bytes_b64`. Payloads are:
- `snapshot` = `{ "type": "snapshot", "path": "...", "offset": 0, "bytes_b64": "...", "eof": false }`
- `append` = `{ "type": "append", "path": "...", "offset": N, "bytes_b64": "..." }`
- `resync` = `{ "type": "resync", "path": "...", "reason": "truncated|rotated|recreated|missing|overflow" }`
- `eof` = `{ "type": "eof", "path": "..." }`
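A minimal sketch of frame construction under these rules follows. The helper names `sse_frame`, `snapshot_event`, and `append_event` are hypothetical; the field names and the base64 encoding come from the contract text.

```python
import base64
import json


def snapshot_event(path: str, offset: int, chunk: bytes, eof: bool = False) -> dict:
    """Build a snapshot payload; bytes are base64 encoded without line breaks."""
    return {"type": "snapshot", "path": path, "offset": offset,
            "bytes_b64": base64.b64encode(chunk).decode("ascii"), "eof": eof}


def append_event(path: str, offset: int, chunk: bytes) -> dict:
    """Build an append payload starting at the given byte offset."""
    return {"type": "append", "path": path, "offset": offset,
            "bytes_b64": base64.b64encode(chunk).decode("ascii")}


def sse_frame(event_id: int, payload: dict) -> str:
    """Serialize one SSE frame: event, id, a single data line, then a blank line."""
    data = json.dumps(payload, separators=(",", ":"))  # one line, no embedded newlines
    return f"event: {payload['type']}\nid: {event_id}\ndata: {data}\n\n"
```

Deriving the `event:` field from `payload["type"]` keeps the two from drifting apart, which is the invariant the contract requires.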
## Truncation and Rotation Resync
The server MUST track the last emitted byte offset for each stream and, on POSIX, MUST also track `(st_dev, st_ino)` for the currently open file. If observed size shrinks below the last known offset, or `(st_dev, st_ino)` changes, or the file disappears, the server MUST emit `resync` and MUST restart the channel at offset `0` with a fresh `snapshot` as soon as the current file generation is readable again.
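The detection rule above can be sketched as a pure function over the tracked state and a fresh `os.stat` observation. The type and function names are illustrative, and the mapping from each condition to a `reason` string is an assumption: the contract lists the allowed reasons but does not say which condition yields which.

```python
import os
from typing import NamedTuple, Optional


class Tracked(NamedTuple):
    dev: int
    ino: int
    offset: int  # last emitted byte offset for this stream


class Observed(NamedTuple):
    dev: int
    ino: int
    size: int


def probe(path: str) -> Optional[Observed]:
    """Stat the file; None means it has disappeared."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return Observed(st.st_dev, st.st_ino, st.st_size)


def resync_reason(state: Tracked, now: Optional[Observed]) -> Optional[str]:
    """Return a resync reason, or None when plain appends can continue."""
    if now is None:
        return "missing"        # assumed mapping: file disappeared
    if (now.dev, now.ino) != (state.dev, state.ino):
        return "rotated"        # assumed mapping: a different file now sits at the path
    if now.size < state.offset:
        return "truncated"      # assumed mapping: same file, size shrank below our offset
    return None
```

When a reason is returned, the caller would emit `resync`, reset its tracked offset to `0`, and send a fresh `snapshot` once the file is readable again.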
## Snapshot vs Append Semantics
A late-joining client MUST receive `snapshot` first. After that, only `append` events flow until a resync condition fires. Initial snapshots MUST be chunked at a maximum of `64 KiB` raw bytes per event; large files therefore produce multiple ordered `snapshot` events with increasing `offset` values until current EOF. `snapshot.eof=true` MAY be used only when the file is already terminal at snapshot time.
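The 64 KiB chunking rule can be illustrated with a small generator that yields `(offset, bytes)` pairs. The name `snapshot_chunks` is hypothetical, and emitting one empty chunk for an empty file is an assumption made so that a late joiner still receives a `snapshot` first.

```python
from typing import Iterator, Tuple

SNAPSHOT_CHUNK = 64 * 1024  # maximum raw bytes per snapshot event


def snapshot_chunks(data: bytes, chunk_size: int = SNAPSHOT_CHUNK) -> Iterator[Tuple[int, bytes]]:
    """Yield ordered (offset, chunk) pairs covering data up to current EOF."""
    if not data:
        # Assumption: an empty file still produces one empty snapshot event.
        yield 0, b""
        return
    for offset in range(0, len(data), chunk_size):
        yield offset, data[offset:offset + chunk_size]
```

For a 150 KiB file this produces three events at offsets `0`, `65536`, and `131072`; the first `append` then starts at offset `153600`.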
## Transport Mapping
When the server host is not `127.0.0.1`, live logs MUST be delivered only as SSE over HTTPS, and clients MUST authenticate with `?token=BEARER` on the stream URL. In that mode, WebSocket endpoints MUST be disabled or otherwise unreachable. When the server host equals `127.0.0.1`, SSE remains the live-log transport; `flask_sock` WebSocket MAY serve coarse session-level notifications such as `session-list-changed`, but MUST NOT carry per-file append data.
## Reconnect Behavior
On disconnect, the client SHOULD reconnect to the same stream URL and send `Last-Event-ID`. The server MUST retain the last `256` events per stream and MUST replay all events newer than that id when available. If the requested id is older than retained history or invalid for the current file generation, the server MUST recover by emitting `resync` and then a fresh `snapshot` from offset `0`.
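One way to picture the retention rule is a bounded deque per stream. The class below is a sketch, not the server's implementation: `ReplayBuffer` is a hypothetical name, it assumes event ids are consecutive integers, and it does not track file generations, so a generation change would still need the separate resync path.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

RETAINED_EVENTS = 256


class ReplayBuffer:
    """Keeps the most recent frames of one stream for reconnecting clients."""

    def __init__(self, maxlen: int = RETAINED_EVENTS) -> None:
        self._events: Deque[Tuple[int, str]] = deque(maxlen=maxlen)

    def record(self, event_id: int, frame: str) -> None:
        self._events.append((event_id, frame))  # oldest entry is evicted when full

    def replay_after(self, last_event_id: str) -> Optional[List[str]]:
        """Frames newer than last_event_id, or None when the caller must resync."""
        try:
            last = int(last_event_id)
        except ValueError:
            return None
        if not self._events:
            return None
        oldest, newest = self._events[0][0], self._events[-1][0]
        # Too old (events were evicted in between) or from the future: resync.
        if last > newest or last < oldest - 1:
            return None
        return [frame for eid, frame in self._events if eid > last]
```

A `None` result corresponds to the recovery branch in the contract: emit `resync`, then a fresh `snapshot` from offset `0`.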
## Latency Budget
Under nominal load of one project, up to `5` concurrent active sessions, and append rate not exceeding `100 KB/s` per stream, median append-to-render latency MUST be `<= 2.0s`. Tail `p95` latency MUST be `<= 5.0s`. Failure of the median assertion in CI MUST fail the build.
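A CI assertion for this budget might look like the sketch below. The helper names are illustrative, and the nearest-rank percentile definition is an assumption, since the contract does not say how `p95` is computed.

```python
import statistics
from typing import List, Sequence

MEDIAN_BUDGET_S = 2.0
P95_BUDGET_S = 5.0


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile (assumed definition)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def check_latency(samples: Sequence[float]) -> List[str]:
    """Return the names of violated budgets; an empty list means the run passes."""
    if not samples:
        raise ValueError("no latency samples collected")
    failures = []
    if statistics.median(samples) > MEDIAN_BUDGET_S:
        failures.append("median")
    if p95(samples) > P95_BUDGET_S:
        failures.append("p95")
    return failures
```

Under the contract only a `median` failure is required to fail the build; how a `p95` failure is reported is left to the test harness.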
## Backpressure
If a client cannot keep up, the server MAY drop the oldest pending or retained `append` events for that stream, but it MUST emit a final `resync` with reason `overflow` and then provide a fresh `snapshot`. Silent data loss is forbidden.
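The overflow rule can be sketched as a bounded per-client queue. `PendingQueue` and its `limit` parameter are hypothetical, and the sketch simplifies the contract: on overflow it drops all pending events rather than only the oldest, on the grounds that the snapshot that follows supersedes them.

```python
from collections import deque
from typing import Deque, List


class PendingQueue:
    """Bounded queue of events waiting to be written to one slow client."""

    def __init__(self, path: str, limit: int) -> None:
        self.path = path
        self.limit = limit
        self.needs_snapshot = False
        self._pending: Deque[dict] = deque()

    def push(self, event: dict) -> None:
        if self.needs_snapshot:
            if event["type"] != "snapshot":
                return  # superseded by the snapshot that must follow the resync
            self.needs_snapshot = False
        elif len(self._pending) >= self.limit:
            # Overflow: replace the backlog with an explicit resync marker.
            self._pending.clear()
            self._pending.append(
                {"type": "resync", "path": self.path, "reason": "overflow"})
            self.needs_snapshot = True
            return
        self._pending.append(event)

    def drain(self) -> List[dict]:
        """Hand over everything queued so far and empty the queue."""
        out = list(self._pending)
        self._pending.clear()
        return out
```

The client always sees either the full sequence of appends or an explicit `resync` followed by a `snapshot`, so data is never lost silently.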
## Out of Scope
This contract does not define the cancel control channel at `POST /api/sessions/SID/cancel`, project switching, daemon lifecycle, token issuance or validation, coarse session-list events, or any non-log REST payloads. Those surfaces require their own specifications.
## Example Event Stream
```text
event: snapshot
id: 101
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"U3RhcnQK","eof":false}

event: append
id: 102
data: {"type":"append","path":"round-3-codex-run.log","offset":6,"bytes_b64":"TW9yZQo="}

event: append
id: 103
data: {"type":"append","path":"round-3-codex-run.log","offset":11,"bytes_b64":"RGF0YQo="}

event: resync
id: 104
data: {"type":"resync","path":"round-3-codex-run.log","reason":"rotated"}

event: snapshot
id: 105
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"TmV3IGZpbGUK","eof":false}
```
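To show how a client might consume this stream, the sketch below parses the frames (separated by blank lines, as SSE requires) and rebuilds the log bytes, checking each `offset` against the bytes received so far. The function names are illustrative; the production client is JavaScript, so this Python version demonstrates only the decoding logic.

```python
import base64
import json

STREAM = """\
event: snapshot
id: 101
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"U3RhcnQK","eof":false}

event: append
id: 102
data: {"type":"append","path":"round-3-codex-run.log","offset":6,"bytes_b64":"TW9yZQo="}

event: append
id: 103
data: {"type":"append","path":"round-3-codex-run.log","offset":11,"bytes_b64":"RGF0YQo="}

event: resync
id: 104
data: {"type":"resync","path":"round-3-codex-run.log","reason":"rotated"}

event: snapshot
id: 105
data: {"type":"snapshot","path":"round-3-codex-run.log","offset":0,"bytes_b64":"TmV3IGZpbGUK","eof":false}
"""


def parse_sse(text):
    """Yield (event, id, payload) for each blank-line-separated frame."""
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            fields[name] = value[1:] if value.startswith(" ") else value
        yield fields["event"], int(fields["id"]), json.loads(fields["data"])


def apply_events(text):
    """Rebuild the current file contents from a sequence of frames."""
    buf = bytearray()
    for event, _event_id, data in parse_sse(text):
        if event == "resync":
            buf.clear()  # discard everything; a fresh snapshot follows
        elif event in ("snapshot", "append"):
            if event == "snapshot" and data["offset"] == 0:
                buf.clear()
            if data["offset"] != len(buf):
                raise ValueError("offset gap; reconnect and resync")
            buf += base64.b64decode(data["bytes_b64"])
    return bytes(buf)
```

Applying only the first three frames yields `Start`, `More`, and `Data` on separate lines; after the `resync` and second `snapshot` the buffer holds only `New file`.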
# Humanize Usage Guide
Detailed usage documentation for the Humanize plugin. For installation, see [Install for Claude Code](install-for-claude.md).
## How It Works
Humanize creates an iterative feedback loop with two phases:
1. **Implementation Phase**: Claude works on your plan, Codex reviews summaries until COMPLETE
2. **Review Phase**: `codex review --base <branch>` checks code quality with `[P0-9]` severity markers
The loop continues until all acceptance criteria are met or no issues remain.
## Begin with the End in Mind
Before the RLCR loop starts any work, Humanize runs a **Plan Understanding Quiz** -- a brief pre-flight check that verifies you genuinely understand the plan you are about to execute.
### Why This Exists
The most expensive failure in AI-assisted development is not a bug. It is running a 40-round RLCR loop on a plan you never actually read. We call this **wishful coding**: treating a generated plan like a wish -- toss it in, hope for the best, check back later.
The problem is structural. An RLCR loop is an amplifier: it will faithfully execute whatever plan you give it. If the plan is wrong, the loop makes it wrong faster and at scale. If the plan is right but you do not understand it, you cannot course-correct when Codex raises questions, and the loop drifts.
Understanding your plan before execution is not optional overhead. It is the single highest-leverage thing you can do to ensure the loop succeeds.
### How the Quiz Works
When you run `start-rlcr-loop`, an independent agent analyzes the plan and generates two multiple-choice questions about the plan's technical implementation details:
1. **What components are changing and how?** -- Tests whether you know the core mechanism.
2. **How do the pieces connect?** -- Tests whether you understand the architecture being modified.
If you answer both correctly, the loop proceeds immediately. If you miss one or both, Humanize explains what the plan actually does and offers a choice: proceed anyway, or stop and review.
The quiz is advisory, not a gate. You always have the option to proceed. But that moment of friction -- the two seconds it takes to read the question and realize you do not know the answer -- is the entire point.
### Skipping the Quiz
- `--skip-quiz` -- Skip the quiz only. The rest of the RLCR loop behaves normally.
- `--yolo` -- Skip the quiz AND let Claude answer Codex's open questions directly (`--claude-answer-codex`). This is full automation mode for users who have already reviewed the plan and want to hand over complete control.
- Plans started via `gen-plan --auto-start-rlcr-if-converged` skip the quiz automatically, because the gen-plan convergence discussion already verified the user's understanding.
## Typical Planning Flow
1. Generate the initial implementation plan:
```bash
/humanize:gen-plan --input draft.md --output docs/plan.md
```
2. If the plan is reviewed with comment annotations, refine it and generate a QA ledger:
```bash
/humanize:refine-plan --input docs/plan.md
```
3. Start the RLCR loop on the refined plan:
```bash
/humanize:start-rlcr-loop docs/plan.md
```
## Commands
| Command | Purpose |
|---------|---------|
| `/start-rlcr-loop <plan.md>` | Start iterative development with Codex review |
| `/cancel-rlcr-loop` | Cancel active loop |
| `/gen-plan --input <draft.md> --output <plan.md>` | Generate structured plan from draft |
| `/refine-plan --input <annotated-plan.md>` | Refine an annotated plan and generate a QA ledger |
| `/ask-codex [question]` | One-shot consultation with Codex |
## Command Reference
### start-rlcr-loop
```
/humanize:start-rlcr-loop [path/to/plan.md | --plan-file path/to/plan.md] [OPTIONS]
OPTIONS:
  --plan-file <path>       Explicit plan file path (alternative to positional arg)
  --max <N>                Maximum iterations before auto-stop (default: 84)
  --codex-model <MODEL:EFFORT>
                           Codex model and reasoning effort (default from config, fallback gpt-5.5:high)
  --codex-timeout <SECONDS>
                           Timeout for each Codex review in seconds (default: 5400)
  --track-plan-file        Indicate plan file should be tracked in git (must be clean)
  --push-every-round       Require git push after each round (default: commits stay local)
  --base-branch <BRANCH>   Base branch for code review phase (default: auto-detect)
                           Priority: user input > remote default > main > master
  --full-review-round <N>
                           Interval for Full Alignment Check rounds (default: 5, min: 2)
                           Full Alignment Checks occur at rounds N-1, 2N-1, 3N-1, etc.
  --skip-impl              Skip implementation phase, go directly to code review
                           Plan file is optional when using this flag
  --claude-answer-codex    When Codex finds Open Questions, let Claude answer them
                           directly instead of asking user via AskUserQuestion
  --agent-teams            Enable Claude Code Agent Teams mode for parallel development.
                           Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable.
                           Claude acts as team leader, splitting tasks among team members.
  --yolo                   Skip Plan Understanding Quiz and let Claude answer Codex Open
                           Questions directly. Alias for --skip-quiz --claude-answer-codex.
  --skip-quiz              Skip the Plan Understanding Quiz only (without other changes).
  -h, --help               Show help message
```
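The `--full-review-round` schedule (rounds N-1, 2N-1, 3N-1, and so on) can be illustrated with a small sketch. The function name `full_alignment_rounds` is hypothetical, and the sketch assumes rounds are counted from 0 and that `max_rounds` is an exclusive upper bound; neither detail is stated in the help text.

```python
from typing import List


def full_alignment_rounds(interval: int, max_rounds: int) -> List[int]:
    """Return the rounds N-1, 2N-1, 3N-1, ... that fall below max_rounds."""
    if interval < 2:
        # The help text gives 2 as the minimum interval.
        raise ValueError("--full-review-round must be at least 2")
    return list(range(interval - 1, max_rounds, interval))


print(full_alignment_rounds(5, 20))  # [4, 9, 14, 19]
```

With the default interval of 5, the first Full Alignment Check in this sketch lands on round 4.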
### gen-plan
```
/humanize:gen-plan --input <path/to/draft.md> --output <path/to/plan.md> [OPTIONS]
OPTIONS:
  --input        Path to the input draft file (required)
  --output       Path to the output plan file (required)
  --auto-start-rlcr-if-converged
                 Start the RLCR loop automatically when the plan is converged
                 (discussion mode only; ignored in --direct)
  --discussion   Use discussion mode (iterative Claude/Codex convergence rounds)
  --direct       Use direct mode (skip convergence rounds, proceed immediately to plan)
  -h, --help     Show help message
```
The gen-plan command transforms rough draft documents into structured implementation plans.
Workflow:
1. Validates input/output paths
2. Checks if draft is relevant to the repository
3. Analyzes draft for clarity, consistency, completeness, and functionality
4. Engages user to resolve any issues found
5. Generates a structured plan.md with acceptance criteria
6. Optionally starts `/humanize:start-rlcr-loop` if `--auto-start-rlcr-if-converged` conditions are met
If reviewers later annotate the generated plan with comment blocks, run
`/humanize:refine-plan --input <plan.md>` before starting or resuming implementation.
### refine-plan
```
/humanize:refine-plan --input <path/to/annotated-plan.md> [OPTIONS]
OPTIONS:
  --input <path>   Path to the annotated plan file (required)
  --output <path>  Path to the refined plan output file
                   Defaults to refining --input in place
  --qa-dir <path>  Directory for QA document output
                   Default: .humanize/plan_qa
  --alt-language <LANG>
                   Generate translated plan and QA variants
                   Supported: zh, ko, ja, es, fr, de, pt, ru, ar
                   Full language names are also accepted; en/English is a no-op
  --discussion     Interactive mode for ambiguous comment classification
  --direct         Non-interactive mode; makes minimal safe assumptions
  -h, --help       Show help message
```
The refine-plan command reads an annotated `gen-plan` document, processes embedded review
comments, removes those comment blocks from the final plan, and writes a QA ledger that records
how each comment was handled.
**Usage examples:**
```bash
# Refine a plan in place and write QA output to the default directory
/humanize:refine-plan --input docs/plan.md
# Write the refined plan to a new file and store QA output in a custom directory
/humanize:refine-plan --input docs/plan.annotated.md --output docs/plan.refined.md --qa-dir docs/plan-qa
# Run in direct mode and generate translated variants
/humanize:refine-plan --input docs/plan.md --direct --alt-language zh
```
**Annotated comment block format:**
`refine-plan` supports three comment formats for reviewer annotations. Both inline
and multi-line comment blocks are supported in all formats:
**Classic format (CMT:/ENDCMT):**
```markdown
Text before CMT: clarify why AC-3 is split here ENDCMT text after
```
```markdown
CMT:
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
ENDCMT
```
**Short tag format (<cmt></cmt>):**
```markdown
Text before <cmt>clarify why AC-3 is split here</cmt> text after
```
```markdown
<cmt>
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
</cmt>
```
**Long tag format (<comment></comment>):**
```markdown
Text before <comment>clarify why AC-3 is split here</comment> text after
```
```markdown
<comment>
Please investigate whether this task should depend on task4 or task5.
If the dependency is unclear, add a pending decision instead of guessing.
</comment>
```
Rules:
- At least one non-empty comment block must exist in the input file.
- Comment markers inside fenced code blocks or HTML comments are ignored.
- Empty comment blocks are removed but do not create QA ledger entries.
- The input plan must still follow the `gen-plan` section schema.
- All three formats can be mixed within the same file.
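As a rough illustration of the rules above, the following sketch pulls comment bodies out of an annotated plan. It is not the plugin's parser: the regular expressions, the `extract_comments` name, and the choice to strip fenced blocks and HTML comments before matching are all assumptions made for this example.

```python
import re

# Hypothetical patterns for the three annotation formats; not the plugin's own parser.
COMMENT_PATTERNS = [
    re.compile(r"CMT:(.*?)ENDCMT", re.DOTALL),
    re.compile(r"<cmt>(.*?)</cmt>", re.DOTALL),
    re.compile(r"<comment>(.*?)</comment>", re.DOTALL),
]
FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block
FENCED_BLOCK = re.compile(rf"^{FENCE}.*?^{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def extract_comments(text):
    """Return non-empty comment bodies in document order."""
    # Drop regions where comment markers must be ignored.
    visible = HTML_COMMENT.sub("", FENCED_BLOCK.sub("", text))
    found = []
    for pattern in COMMENT_PATTERNS:
        for match in pattern.finditer(visible):
            body = match.group(1).strip()
            if body:  # empty blocks create no ledger entry
                found.append((match.start(), body))
    return [body for _, body in sorted(found)]


sample = (
    "AC-3 CMT: why is this split? ENDCMT stays.\n"
    "<cmt>\nDoes task5 depend on task4?\n</cmt>\n"
    "<comment>  </comment>\n"
    f"{FENCE}\nCMT: ignored inside a fence ENDCMT\n{FENCE}\n"
)
print(extract_comments(sample))
# ['why is this split?', 'Does task5 depend on task4?']
```

The sample mixes the classic and short tag formats in one file, includes an empty long-tag block that yields no entry, and shows a marker inside a fenced block being skipped.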
**QA output structure:**
For an input plan named `plan.md`, the default QA output path is `.humanize/plan_qa/plan-qa.md`.
The generated QA document includes:
- `## Summary`: overall refinement outcome and comment counts
- `## Comment Ledger`: one row per raw `CMT-N` block with classification, location, excerpt, and disposition
- `## Answers`: responses to question comments and any clarifying edits
- `## Research Findings`: repository research performed for `research_request` comments
- `## Plan Changes Applied`: changes made for `change_request` comments and cross-reference updates
- `## Remaining Decisions`: unresolved items or assumption-heavy decisions that still need user input
- `## Refinement Metadata`: input/output paths, QA path, classification counts, modified sections, convergence status, and date
Disposition values in the ledger are `answered`, `applied`, `researched`, `deferred`, or
`resolved`.
If `--alt-language` is set to a supported non-English language, the command also generates
translated plan and QA variants by inserting `_<code>` before the file extension, such as
`plan_zh.md` and `plan-qa_zh.md`.
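The naming rules for the QA document and its translated variants can be sketched with `pathlib`. The helper names `default_qa_path` and `translated_variant` are hypothetical, and the `<stem>-qa` rule is generalized from the single `plan.md` example, so treat it as an assumption.

```python
from pathlib import Path


def default_qa_path(plan_path, qa_dir=".humanize/plan_qa"):
    """Map plan.md to <qa_dir>/plan-qa.md (rule inferred from one example)."""
    plan = Path(plan_path)
    return Path(qa_dir) / f"{plan.stem}-qa{plan.suffix}"


def translated_variant(path, code):
    """Insert _<code> before the file extension, e.g. plan.md -> plan_zh.md."""
    p = Path(path)
    return p.with_name(f"{p.stem}_{code}{p.suffix}")


qa = default_qa_path("docs/plan.md")
print(qa.as_posix())                                         # .humanize/plan_qa/plan-qa.md
print(translated_variant("docs/plan.md", "zh").as_posix())   # docs/plan_zh.md
print(translated_variant(qa, "zh").name)                     # plan-qa_zh.md
```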
### ask-codex
```
/humanize:ask-codex [OPTIONS] <question or task>
OPTIONS:
  --codex-model <MODEL:EFFORT>
                 Codex model and reasoning effort (default from config, fallback gpt-5.5:high)
  --codex-timeout <SECONDS>
                 Timeout for the Codex query in seconds (default: 3600)
  -h, --help     Show help message
```
The ask-codex skill sends a one-shot question or task to Codex and returns the response
inline. Unlike the RLCR loop, this is a single consultation without iteration -- useful
for getting a second opinion, reviewing a design, or asking domain-specific questions.
Responses are saved to `.humanize/skill/<timestamp>/` with `input.md`, `output.md`,
and `metadata.md` for reference.
## Configuration
Humanize uses a 4-layer config hierarchy (lowest to highest priority):
1. **Plugin defaults**: `config/default_config.json`
2. **User config**: `~/.config/humanize/config.json`
3. **Project config**: `.humanize/config.json`
4. **CLI flags**: Command-line arguments (where available)
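A minimal sketch of how the four layers might combine, assuming a shallow key-by-key merge in which later layers win and `None` means "not set". The `merge_layers` helper is hypothetical; the actual merge logic is not shown in this document.

```python
def merge_layers(*layers):
    """Merge config dicts from lowest to highest priority; later layers win."""
    merged = {}
    for layer in layers:
        # Assumption: a None value means the layer did not set that key.
        merged.update({key: value for key, value in layer.items() if value is not None})
    return merged


plugin_defaults = {"codex_model": "gpt-5.5", "codex_effort": "high"}
user_config = {}
project_config = {"codex_effort": "xhigh"}
cli_flags = {"codex_model": None}  # flag not passed on the command line

print(merge_layers(plugin_defaults, user_config, project_config, cli_flags))
# {'codex_model': 'gpt-5.5', 'codex_effort': 'xhigh'}
```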
Current built-in keys:
| Key | Default | Description |
|-----|---------|-------------|
| `codex_model` | `gpt-5.5` | Shared default model for Codex-backed review and analysis |
| `codex_effort` | `high` | Shared default reasoning effort (`xhigh`, `high`, `medium`, `low`) |
| `bitlesson_model` | `haiku` | Model used by the BitLesson selector agent |
| `provider_mode` | unset | Optional runtime mode hint such as `codex-only` |
| `agent_teams` | `false` | Project-level default for agent teams workflow |
| `alternative_plan_language` | `""` | Optional translated plan variant language; supported values include `Chinese`, `Korean`, `Japanese`, `Spanish`, `French`, `German`, `Portuguese`, `Russian`, `Arabic`, or ISO codes like `zh` |
| `gen_plan_mode` | `discussion` | Default plan-generation mode |
### Codex Model Configuration
All Codex-using features (RLCR loop, ask-codex) share the same model configuration:
| Key | Default | Description |
|-----|---------|-------------|
| `codex_model` | `gpt-5.5` | Model used for Codex operations (reviews, analysis, queries) |
| `codex_effort` | `high` | Reasoning effort (`xhigh`, `high`, `medium`, `low`) |
To override, add to `.humanize/config.json`:
```json
{
  "codex_model": "gpt-5.2",
  "codex_effort": "xhigh",
  "bitlesson_model": "sonnet"
}
```
When installing the Humanize runtime into Codex CLI, Humanize also seeds
`${XDG_CONFIG_HOME:-~/.config}/humanize/config.json` with a Codex/OpenAI
`bitlesson_model` and `provider_mode: "codex-only"` when those keys are unset.
That flag is only a routing hint for that Codex runtime; the repository also
supports Claude Code and Kimi installs.
Codex model is resolved with this precedence:
1. CLI `--codex-model` flag (highest priority)
2. Feature-specific defaults
3. Config-backed defaults from the 4-layer hierarchy above
4. Hardcoded fallback (`gpt-5.5:high`)
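The precedence list above can be read as "first source that supplies a value wins". The sketch below illustrates that reading; `resolve_codex_model` is a hypothetical helper, and defaulting a missing `:EFFORT` suffix to `high` is an assumption rather than documented behavior.

```python
HARDCODED_FALLBACK = "gpt-5.5:high"


def resolve_codex_model(cli_flag=None, feature_default=None, config=None):
    """Return (model, effort) from the first source that supplies a value."""
    config = config or {}
    config_default = None
    if config.get("codex_model"):
        effort = config.get("codex_effort", "high")
        config_default = f"{config['codex_model']}:{effort}"
    for candidate in (cli_flag, feature_default, config_default, HARDCODED_FALLBACK):
        if candidate:
            model, _, effort = candidate.partition(":")
            # Assumption: a value without ":EFFORT" falls back to "high".
            return model, effort or "high"


print(resolve_codex_model(cli_flag="gpt-5.2:xhigh"))          # ('gpt-5.2', 'xhigh')
print(resolve_codex_model(config={"codex_model": "gpt-5.2"}))  # ('gpt-5.2', 'high')
print(resolve_codex_model())                                   # ('gpt-5.5', 'high')
```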
**Migration note**: If your `.humanize/config.json` contains the legacy keys
`loop_reviewer_model` or `loop_reviewer_effort`, they are silently ignored.
Use `codex_model` and `codex_effort` instead.
## Monitoring
Set up the monitoring helper for real-time progress tracking:
```bash
# Add to your .bashrc or .zshrc
source ~/.claude/plugins/cache/PolyArch/humanize/<LATEST.VERSION>/scripts/humanize.sh
# Terminal monitors (one project per terminal):
humanize monitor rlcr # latest RLCR loop log
humanize monitor skill # all skill invocations (codex + gemini)
humanize monitor codex # ask-codex skill invocations only
humanize monitor gemini # ask-gemini skill invocations only
# Browser dashboard (multiple loops at once, foreground default):
humanize monitor web --project /path/to/project
```
Progress data is stored in `.humanize/rlcr/<timestamp>/` for each loop session.
### Browser dashboard (`humanize monitor web`)
The web dashboard layers on top of the same `.humanize/rlcr/<session>/`
metadata and `~/.cache/humanize/<sanitized-project>/<session>/round-*-codex-{run,review}.log`
cache logs that the terminal monitors read. There is no parallel
capture pipeline; the dashboard is a reader, not a writer.
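Because the dashboard only reads existing files, a reader for one session's cache logs can be as simple as a glob. This sketch assumes the caller already has the resolved `~/.cache/humanize/<sanitized-project>/<session>/` directory (the sanitization rule is not described here), and `list_round_logs` is a hypothetical name.

```python
from pathlib import Path


def list_round_logs(session_cache_dir):
    """Return the names of round-*-codex-{run,review}.log files, sorted lexicographically."""
    root = Path(session_cache_dir)
    logs = []
    for kind in ("run", "review"):
        # Read-only lookup: nothing is created or modified.
        logs.extend(root.glob(f"round-*-codex-{kind}.log"))
    return sorted(path.name for path in logs)
```

Calling `list_round_logs` on a directory with no matching files returns an empty list.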
Lifecycle (per DEC-1, DEC-3):
- Foreground default (`humanize monitor web --project <path>`). Press
Ctrl+C to stop. The server is CLI-fixed to one project at startup;
to monitor several projects simultaneously, run multiple instances
(one per project) with different `--port` values.
- `--daemon` runs the same server inside a per-project tmux session
(`humanize-viz-<8-hex>`); use `viz-stop.sh --project <path>` or
the project's own tmux kill command to stop it.
Per-session inline live log panes appear on the home page for every
active session, driven by Server-Sent Events from
`/api/sessions/<session_id>/logs/<basename>`. Multiple loops stream
in parallel without leaving the home page.
### Remote browser access
The dashboard binds to `127.0.0.1` by default. To expose it over the
network, supply `--host` and an authentication token. The token is
required for any non-loopback host; the server refuses to start
otherwise.
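The start-up rule can be sketched with the standard `ipaddress` module. The helper names are hypothetical, and treating unresolved hostnames other than `localhost` as non-loopback is a conservative assumption of this example, not documented behavior.

```python
import ipaddress


def is_loopback_host(host):
    """True for localhost or a loopback IP literal such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: treat unknown hostnames as remote to stay safe.
        return False


def validate_bind(host="127.0.0.1", token=None):
    """Raise ValueError when a non-loopback bind has no auth token."""
    if not is_loopback_host(host) and not token:
        raise ValueError(f"refusing to bind {host} without an auth token")
    return host
```

Here `validate_bind("0.0.0.0")` raises, while `validate_bind("0.0.0.0", token="...")` and the default loopback bind pass.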
Token-aware endpoints honor `Authorization: Bearer <tok>` for normal
fetch requests and `?token=<tok>` query parameters for the SSE stream
(per DEC-4: browsers cannot set arbitrary headers on EventSource).
WebSocket transport is rejected entirely in remote mode.
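From a client's point of view, the two token channels look roughly like this. The sketch only builds requests and URLs without sending anything; the helper names are hypothetical, and the only server route used is the `/api/sessions/<session_id>/logs/<basename>` path mentioned above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def log_path(session_id, basename):
    """Path of the per-session log stream endpoint."""
    return f"/api/sessions/{quote(session_id, safe='')}/logs/{quote(basename, safe='')}"


def fetch_request(base_url, path, token):
    """Build (not send) a normal request carrying the bearer header."""
    return Request(base_url + path, headers={"Authorization": f"Bearer {token}"})


def sse_url(base_url, session_id, basename, token):
    """SSE clients cannot set headers, so the token travels as a query parameter."""
    return f"{base_url}{log_path(session_id, basename)}?{urlencode({'token': token})}"


print(sse_url("http://server:18000", "s1", "round-0-codex-run.log", "abc"))
# http://server:18000/api/sessions/s1/logs/round-0-codex-run.log?token=abc
```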
#### Pattern 1 (recommended): SSH tunnel
The safest remote pattern keeps the server bound to localhost and
forwards the port over SSH:
```bash
# On the server machine:
humanize monitor web --project /path/to/project --port 18000
# On your laptop:
ssh -N -L 18000:localhost:18000 user@server.example.com
# Then open http://localhost:18000 in the local browser.
```
No token is required because the server still binds to loopback. The
SSH tunnel provides authentication and encryption.
#### Pattern 2: Direct LAN bind
For trusted-network deployments where SSH tunneling is impractical:
```bash
# Generate a strong random token (one-time):
TOKEN="$(openssl rand -hex 32)"
# Start the dashboard:
humanize monitor web \
  --project /path/to/project \
  --host 0.0.0.0 \
  --port 18000 \
  --auth-token "$TOKEN"
# Or supply the token via env var instead of CLI:
HUMANIZE_VIZ_TOKEN="$TOKEN" humanize monitor web \
  --project /path/to/project --host 0.0.0.0 --port 18000
```
Open the dashboard with `http://server:18000/?token=<TOKEN>` once;
the browser caches the token in `sessionStorage` and propagates it
on subsequent fetches and SSE reconnects.
## Cancellation
- **RLCR loop**: `/humanize:cancel-rlcr-loop`
## Environment Variables
### HUMANIZE_CODEX_BYPASS_SANDBOX
**WARNING: This is a dangerous option that disables security protections. Use only if you understand the implications.**
- **Purpose**: Controls whether Codex runs with sandbox protection
- **Default**: Not set (uses `--full-auto` with sandbox protection)
- **Values**:
- `true` or `1`: Bypasses Codex sandbox and approvals (uses `--dangerously-bypass-approvals-and-sandbox`)
- Any other value or unset: Uses safe mode with sandbox
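The value handling above amounts to a two-way switch. In this sketch the `codex_sandbox_flag` helper is hypothetical, and the exact, case-sensitive comparison is an assumption; only the two Codex flags and the accepted values come from the description above.

```python
import os


def codex_sandbox_flag(env=None):
    """Pick the Codex flag implied by HUMANIZE_CODEX_BYPASS_SANDBOX."""
    env = os.environ if env is None else env
    # Assumption: only the exact strings "true" and "1" enable the bypass.
    if env.get("HUMANIZE_CODEX_BYPASS_SANDBOX") in ("true", "1"):
        return "--dangerously-bypass-approvals-and-sandbox"
    return "--full-auto"


print(codex_sandbox_flag({}))                                        # --full-auto
print(codex_sandbox_flag({"HUMANIZE_CODEX_BYPASS_SANDBOX": "1"}))    # --dangerously-bypass-approvals-and-sandbox
```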
**When to use this**:
- Linux servers without landlock kernel support (where Codex sandbox fails)
- Automated CI/CD pipelines in trusted environments
- Development environments where you have full control
**When NOT to use this**:
- Public or shared development servers
- When reviewing untrusted code or pull requests
- Production systems
- Any environment where unauthorized system access could cause damage
**Security implications**:
- Codex will have unrestricted access to your filesystem
- Codex can execute arbitrary commands without approval prompts
- Review all code changes carefully when using this mode
**Usage example**:
```bash
# Export before starting Claude Code
export HUMANIZE_CODEX_BYPASS_SANDBOX=true
# Or set for a single session
HUMANIZE_CODEX_BYPASS_SANDBOX=true claude --plugin-dir /path/to/humanize
```
#!/usr/bin/env python3
"""
Helper script to check for incomplete tasks from Claude Code.

Supports both:
- Legacy TodoWrite tool (parsed from transcript)
- New Task system (read directly from ~/.claude/tasks/<session_id>/)

Exit codes:
    0 - All tasks are completed (or no tasks exist)
    1 - There are incomplete tasks (details on stdout)
    2 - Parse error reading hook input JSON

Usage:
    echo '{"session_id": "...", "transcript_path": "/path/to/transcript.jsonl"}' | python3 check-todos-from-transcript.py
"""

import json
import re
import sys
from pathlib import Path
from typing import List, Tuple


LANE_PREFIX_PATTERN = re.compile(r"^\s*\[(mainline|blocking|queued)\](?:\s|$)", re.IGNORECASE)


def classify_lane(*parts: str) -> str:
    """Infer the task lane from content, defaulting to blocking for safety."""
    for part in parts:
        if not part:
            continue
        match = LANE_PREFIX_PATTERN.match(part)
        if match:
            return match.group(1).lower()
    return "blocking"


def extract_tool_calls_from_entry(entry: dict) -> List[Tuple[str, dict]]:
    """
    Extract tool calls from a transcript entry.

    Returns list of (tool_name, tool_input) tuples.
    """
    tool_calls = []
    entry_type = entry.get("type", "")

    # Pattern 1 & 2: Extract content list from assistant or message entries
    if entry_type == "assistant":
        content = entry.get("message", {}).get("content", [])
    elif entry_type == "message":
        content = entry.get("content", [])
    else:
        content = []

    # Extract tool calls from content list
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                tool_name = block.get("name", "")
                tool_input = block.get("input", {})
                if tool_name:
                    tool_calls.append((tool_name, tool_input))

    # Pattern 3: Direct tool_use entry
    if entry_type == "tool_use":
        tool_name = entry.get("name", "") or entry.get("tool_name", "")
        tool_input = entry.get("input", {}) or entry.get("tool_input", {})
        if tool_name:
            tool_calls.append((tool_name, tool_input))

    return tool_calls


def find_incomplete_todos_from_transcript(transcript_path: Path) -> List[dict]:
    """
    Parse transcript JSONL and find incomplete legacy todos (TodoWrite only).

    Returns list of incomplete items with 'status' and 'content' keys.
    """
    if not transcript_path.exists():
        return []

    # Legacy: track the most recent TodoWrite todos
    latest_todos = []

    with open(transcript_path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue

            # Extract all tool calls from this entry
            for tool_name, tool_input in extract_tool_calls_from_entry(entry):
                # Legacy: TodoWrite
                if tool_name == "TodoWrite":
                    todos = tool_input.get("todos", [])
                    if todos:
                        latest_todos = todos

    # Build list of incomplete items from legacy todos
    incomplete = []
    for todo in latest_todos:
        status = todo.get("status", "")
        content = todo.get("content", "")
        if status != "completed":
            lane = classify_lane(content)
            if lane == "queued":
                continue
            incomplete.append({
                "status": status,
                "content": content,
                "source": "todo",
                "lane": lane,
            })

    return incomplete


def find_incomplete_tasks_from_directory(session_id: str, tasks_base_dir: str = "") -> List[dict]:
    """
    Read task files directly from ~/.claude/tasks/<session_id>/ directory.

    This is the authoritative source for task state, as it reflects
    the actual in-memory task list that Claude Code maintains.

    Args:
        session_id: The Claude Code session ID
        tasks_base_dir: Optional override for tasks base directory (for testing)

    Returns list of incomplete items with 'status' and 'content' keys.
    """
    if tasks_base_dir:
        tasks_dir = Path(tasks_base_dir) / session_id
    else:
        tasks_dir = Path.home() / ".claude" / "tasks" / session_id

    if not tasks_dir.exists() or not tasks_dir.is_dir():
        return []

    incomplete = []
    for task_file in tasks_dir.glob("*.json"):
        try:
            with open(task_file, 'r', encoding='utf-8') as f:
                task = json.load(f)

            status = task.get("status", "pending")
            if status not in ("completed", "deleted"):
                # Task is incomplete
                subject = task.get("subject", "")
                description = task.get("description", "")
                task_id = task_file.stem  # Filename without .json
                content = subject or description or f"Task {task_id}"
                lane = classify_lane(subject, description)
                if lane == "queued":
                    continue
                incomplete.append({
                    "status": status,
                    "content": content,
                    "source": "task",
                    "task_id": task_id,
                    "lane": lane,
                })
        except (json.JSONDecodeError, OSError):
            # Skip malformed or unreadable task files
            continue

    return incomplete


def main():
    # Read hook input from stdin
    try:
        stdin_content = sys.stdin.read().strip()
        if not stdin_content:
            # Empty input - no data available, allow proceeding
            sys.exit(0)
        hook_input = json.loads(stdin_content)
    except json.JSONDecodeError as e:
        # Parse error - exit with code 2
        print(f"PARSE_ERROR: {e}", file=sys.stderr)
        sys.exit(2)

    incomplete_items = []

    # Check new Task system using external task directory (authoritative source)
    session_id = hook_input.get("session_id", "")
    tasks_base_dir = hook_input.get("tasks_base_dir", "")  # For testing
    if session_id:
        incomplete_items.extend(find_incomplete_tasks_from_directory(session_id, tasks_base_dir))

    # Check legacy TodoWrite from transcript
    transcript_path = hook_input.get("transcript_path", "")
    if transcript_path:
        transcript_path = Path(transcript_path).expanduser()
        incomplete_items.extend(find_incomplete_todos_from_transcript(transcript_path))

    if not incomplete_items:
        # No incomplete items, allow proceeding
        sys.exit(0)

    # Format output
    output_lines = []
    for item in incomplete_items:
        status = item.get("status", "unknown")
        content = item.get("content", "")
        source = item.get("source", "unknown")
        lane = item.get("lane", "blocking")
        lane_marker = f"[{lane}]"
        if source == "task":
            task_id = item.get("task_id", "?")
            output_lines.append(f"  - [{status}] {lane_marker} (Task #{task_id}) {content}")
        else:
            output_lines.append(f"  - [{status}] {lane_marker} {content}")

    # Output marker and incomplete items both to stdout
    print("INCOMPLETE_TODOS")
    print("\n".join(output_lines))
    sys.exit(1)


if __name__ == "__main__":
    main()
{
"description": "Humanize Plugin Hooks - Validation hooks and Stop hooks for /start-rlcr-loop",
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-plan-file-validator.sh"
}
]
}
],
"PreToolUse": [
{
"matcher": "Write",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-write-validator.sh"
}
]
},
{
"matcher": "Edit",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-edit-validator.sh"
}
]
},
{
"matcher": "Read",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-read-validator.sh"
}
]
},
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-bash-validator.sh"
}
]
}
],
"PostToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-post-bash-hook.sh"
}
]
}
],
"Stop": [
{
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/loop-codex-stop-hook.sh",
"timeout": 7200
}
]
}
]
}
}