// GPUInfo contains discovered GPU configuration from cluster nodes
type GPUInfo struct {
	NodeName string // Name of the node with this GPU configuration
...
...
@@ -89,17 +155,20 @@ type GPUInfo struct {
	System nvidiacomv1beta1.GPUSKUType // AIC hardware system identifier (e.g., "h100_sxm", "h200_sxm"), empty if unknown
	MIGEnabled bool // True if MIG is enabled (inferred from model or additional labels, not implemented in this version)
	MIGProfiles map[string]int // Optional: map of MIG profile name to count (requires additional label parsing, not implemented in this version)
	CloudProvider string // NEW: aws | gcp | aks | other | unknown
	CloudProvider string // aws | gcp | aks | other | unknown
	RDMAEnabled bool // Indicates whether RDMA is enabled for this node (e.g., via InfiniBand, RoCE, or similar high-speed networking)
	RDMAType string // Type of RDMA transport detected (e.g., "infiniband", "roce", "rdma", "sriov", or "none")
	Interconnect string // Primary GPU-to-GPU interconnect technology used within the node (e.g., "nvlink" for high-bandwidth links or "pcie" for standard bus-based communication)
	InterconnectTier string // Qualitative or platform-specific classification of the interconnect (e.g., NVLink generation, topology tier, or vendor-defined performance level)
	NVLinkLinks int // Number of NVLink connections per GPU (0 if NVLink is not present or interconnect is PCIe-only)
| `gpuSku` _[GPUSKUType](#gpuskutype)_ | GPUSKU is the AIC hardware system identifier for the GPU.<br />When omitted, the operator auto-detects this via InferHardwareSystem from cluster GPU node labels. | | Enum: [gb200_sxm h200_sxm h100_sxm b200_sxm a100_sxm l40s] <br />Optional: \{\} <br /> |
| `gpuSku` _[GPUSKUType](#gpuskutype)_ | GPUSKU is the AIC hardware system identifier for the GPU.<br />When omitted, the operator auto-detects this via InferHardwareSystem from cluster GPU node labels. | | Enum: [gb200_sxm b200_sxm h200_sxm h100_sxm h100_pcie a100_sxm a100_pcie l40s l40 l4 v100_sxm v100_pcie t4 mi200 mi300] <br />Optional: \{\} <br /> |
| `vramMb` _float_ | VRAMMB is the VRAM per GPU in MiB. | | Optional: \{\} <br /> |
| `totalGpus` _integer_ | TotalGPUs is the total number of GPUs available in the cluster. | | Optional: \{\} <br /> |
| `numGpusPerNode` _integer_ | NumGPUsPerNode is the number of GPUs per node. | | Optional: \{\} <br /> |
| `interconnect` _string_ | Interconnect describes the GPU interconnect type within a node.<br />Examples: "pcie", "nvlink", "infiniband". | | Optional: \{\} <br /> |
| `rdma` _boolean_ | RDMA indicates whether RDMA is available on the cluster. | | Optional: \{\} <br /> |
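
The `gpuSku` row says the operator falls back to `InferHardwareSystem` when the field is omitted. The actual implementation is not shown in this hunk, so the following is only a minimal sketch of how such label-based inference might look. It assumes the GPU Feature Discovery label `nvidia.com/gpu.product` is present on the node; the function name `inferSKUSketch`, the rule table, and the substring-matching strategy are all hypothetical, and only a subset of the enum values is covered.

```go
package main

import (
	"fmt"
	"strings"
)

// skuRule maps a product-name substring to SKU identifiers.
// Hypothetical helper type; not part of the operator's API.
type skuRule struct {
	substr string // lowercase substring to look for in the product label
	sxm    string // identifier used by default
	pcie   string // identifier used when the label mentions PCIe ("" if none)
}

// Order matters: "gb200" must precede "b200", and "l40s" must precede
// "l40" and "l4", because the shorter names are substrings of the longer ones.
var skuRules = []skuRule{
	{"gb200", "gb200_sxm", ""},
	{"b200", "b200_sxm", ""},
	{"h200", "h200_sxm", ""},
	{"h100", "h100_sxm", "h100_pcie"},
	{"a100", "a100_sxm", "a100_pcie"},
	{"l40s", "l40s", ""},
	{"l40", "l40", ""},
	{"l4", "l4", ""},
}

// inferSKUSketch is an illustrative stand-in for InferHardwareSystem.
// It returns "" when the label is missing or no rule matches.
func inferSKUSketch(labels map[string]string) string {
	product := strings.ToLower(labels["nvidia.com/gpu.product"])
	if product == "" {
		return ""
	}
	for _, r := range skuRules {
		if strings.Contains(product, r.substr) {
			if r.pcie != "" && strings.Contains(product, "pcie") {
				return r.pcie
			}
			return r.sxm
		}
	}
	return ""
}

func main() {
	labels := map[string]string{"nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3"}
	fmt.Println(inferSKUSketch(labels)) // h100_sxm
}
```

An ordered rule slice is used instead of a map so that overlapping names resolve deterministically; an empty result mirrors the "empty if unknown" convention of the `System` field.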