# hytop - monitoring tools

## Quick start

```bash
uv pip install -e .
hytop gpu --help
```

## Prerequisites

- Python >= 3.10
- Python packages: `rich`, `typer`
- Passwordless SSH for remote monitoring

## `hytop gpu`

A lightweight script for live `hy-smi` polling with rolling averages across multiple hosts. It features a modern terminal UI and can be used as a blocking scheduler for GPU jobs.

### Usage

Simple examples:

```bash
# Local node, all GPUs, 5-second rolling window
hytop gpu -n 1 --window 5

# Two nodes, monitor only GPU 0 and 1
hytop gpu -H node01,node02 --devices 0,1 -n 1

# Exit with code 0 when all monitored GPUs are available
hytop gpu --devices 0,1 --wait-idle

# Wait at most 300s for availability (exit 124 on timeout)
hytop gpu --devices 0,1 --wait-idle --timeout 300

# Fine-grained columns (output order follows show-flag order)
hytop gpu --showtemp --showpower
hytop gpu --showpower --showtemp
```

Queue jobs in shared environments:

```bash
if hytop gpu -H node01,node02 --wait-idle --timeout 300; then
    echo "GPUs available, starting workload..."
    # YOUR COMMAND HERE (e.g., python train.py)
else
    echo "Error: GPUs not available in time, aborting pipeline."
    exit 1
fi
```

### Exit Codes

Designed to be script-friendly:

* `0`: Availability condition met (GPUs are idle).
* `124`: Timeout reached before the availability condition was met.
* `130`: Interrupted by the user (Ctrl+C).
* `2`: Argument or input error.

### Fine-grained metric flags

`hytop gpu` uses formatted `hy-smi --json` output and supports a subset of `hy-smi` `--show*` flags:

- `--showtemp`: GPU core temperature (`Temp`)
- `--showpower`: average package power (`AvgPwr`, plus `AvgPwr@window`)
- `--showhcuclocks`: sclk frequency (`sclk`)
- `--showmemuse`: VRAM usage (`VRAM%`)
- `--showuse`: GPU utilization (`GPU%`, plus `GPU%@window`)

If no `--show*` flags are specified, hytop defaults to: `--showtemp --showpower --showhcuclocks --showmemuse --showuse`.

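
### Handling exit codes in scripts

The `if`/`else` pattern in the Usage section treats every non-zero status the same way. When a pipeline needs to tell a timeout apart from a Ctrl+C or a bad argument, the codes listed under Exit Codes can be mapped explicitly. The following is a minimal sketch; the helper name `describe_hytop_exit` and its labels are hypothetical and not part of hytop.

```bash
# Hypothetical helper (not shipped with hytop): translate an exit status
# from `hytop gpu --wait-idle` into a short label, using the codes
# documented in the Exit Codes section.
describe_hytop_exit() {
  case "$1" in
    0)   echo "idle" ;;         # availability condition met
    124) echo "timeout" ;;      # --timeout elapsed first
    130) echo "interrupted" ;;  # Ctrl+C
    2)   echo "usage-error" ;;  # argument or input error
    *)   echo "unknown" ;;      # anything not documented
  esac
}

# Example wiring (commented out so the snippet runs without hytop installed):
#   hytop gpu --devices 0,1 --wait-idle --timeout 300
#   status=$?
#   echo "hytop finished: $(describe_hytop_exit "$status")"
```

Capture `$?` immediately after the `hytop` call, before any other command overwrites it. A scheduler could then retry on `timeout` and abort on `usage-error`.
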
## Development

### Version bump

Version is sourced from `src/hytop/__init__.py` (`__version__`).

```bash
# patch: 0.1.0 -> 0.1.1
python scripts/bump_version.py patch

# minor: 0.1.1 -> 0.2.0
python scripts/bump_version.py minor

# major: 0.2.0 -> 1.0.0
python scripts/bump_version.py major

# set an explicit version
python scripts/bump_version.py set 1.2.3
```
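
The transitions shown in the comments above (`0.1.0 -> 0.1.1`, `0.1.1 -> 0.2.0`, `0.2.0 -> 1.0.0`) match the usual `major.minor.patch` rule: bumping one component resets every component to its right. The sketch below illustrates that arithmetic only. It is not the contents of `scripts/bump_version.py`, and the function name `bump_semver` is hypothetical.

```bash
# Hypothetical illustration of the bump rule; assumes a plain
# MAJOR.MINOR.PATCH string with no pre-release or build suffix.
# The body runs in a subshell ( ... ) so its variables do not leak.
bump_semver() (
  part="$1"
  version="$2"

  # Split "X.Y.Z" using POSIX parameter expansion.
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  patch=${rest#*.}

  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $part" >&2; exit 2 ;;
  esac

  echo "${major}.${minor}.${patch}"
)
```

For example, `bump_semver minor 0.1.1` prints `0.2.0`, the same transition as the `minor` comment above.
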