Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
Menu
Open sidebar
OpenDAS
dynamo
Commits
b230980f
"lib/vscode:/vscode.git/clone" did not exist on "9e5407f20d2027c2ca2aa33d9e898760e3d97393"
Unverified
Commit
b230980f
authored
Feb 23, 2026
by
Ran Rubin
Committed by
GitHub
Feb 23, 2026
Browse files
ci: modify buildkit router (#6480)
parent
61c67804
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
136 additions
and
87 deletions
+136
-87
.github/scripts/route_buildkit.sh
.github/scripts/route_buildkit.sh
+136
-87
No files found.
.github/scripts/route_buildkit.sh
View file @
b230980f
...
...
@@ -6,44 +6,42 @@
# route_buildkit.sh - Discover and route BuildKit pods for CI builds
# =============================================================================
#
# ROUTING LOGIC:
# --------------
# Routing is optimized for Docker layer caching based on shared base images:
# - vLLM and SGLang share the same base image (cuda-dl-base) when CUDA versions match
# - TensorRT-LLM uses a different base (pytorch), so it's isolated
# - General builds have no framework, grouped with trtllm for isolation
# ROUTING LOGIC (Coverage-Aware Ranked Rendezvous Hashing with SHA-256):
# ---------------------------------------------------------
# Routing is optimized for Docker layer caching, linear scaling, and
# 100% pod utilization across any number of BuildKit pods.
#
# Pool assignment is also optimized for uneven uptime: pod 0 is the only pod
# running outside business hours (via KEDA), so it accumulates fallback cache
# for all flavors overnight. To compensate, pool 0 is assigned the LIGHTEST
# daytime workload (trtllm + general), while pool 2 (only active during
# business hours) gets the HEAVIEST workload (vllm/sglang-cuda12).
# CACHE GROUPS (3 distinct groups to maximize layer reuse):
# - Group 0 (cuda-dl-base-13): vLLM & SGLang (CUDA 13.x)
# - Group 1 (cuda-dl-base-12): vLLM & SGLang (CUDA 12.x)
# - Group 2 (general-trt-combined): TRT-LLM & General Builds
#
# Flavors are routed to BuildKit pods using modulo 3 on the pod index:
# - Pool 0 (idx % 3 == 0): trtllm (any CUDA), general (lightest - offsets overnight fallback load)
# - Pool 1 (idx % 3 == 1): vllm-cuda13, sglang-cuda13 (share cuda-dl-base + wheel_builder cache)
# - Pool 2 (idx % 3 == 2): vllm-cuda12, sglang-cuda12 (heaviest - only active during business hours)
# Note: Unrecognized route keys (e.g. trtllm-cuda12) fall through to pool 0 via wildcard.
# ALGORITHM:
# 1. SCORING: Each group key is hashed with every active pod index (SHA-256)
# to produce a uniformly distributed score per (group, arch, pod) triple.
# 2. RANKING: Pods are sorted by score (descending) per group. StatefulSet
# pod names are constant, so rankings are stable across invocations.
# 3. POOL SIZING: Pool Size = ceil(Active Pods / 3) ensures even distribution.
# 4. COVERAGE-AWARE SELECTION: Pools are built round-by-round across all 3
# groups simultaneously. In each round, each group picks its highest-ranked
# pod that is NOT YET in any group's pool (preferring uncovered pods).
# This guarantees every active pod appears in at least one group's pool.
# 5. RANDOM PICK: ONE pod is randomly selected from the candidate pool.
#
# SELECTION: From the candidate pool, ONE pod is randomly selected and its
# tcp:// address is written to $GITHUB_OUTPUT.
#
# FALLBACK: If no pods match the target pool, the highest available index is used.
#
# CANDIDATE POOL TABLE (one pod is randomly selected from the candidate set):
# +------+---------------------+---------+---------------+---------------+---------------+---------------+
# | Pods | trtllm (any cuda) | general | vllm-cuda13 | sglang-cuda13 | vllm-cuda12 | sglang-cuda12 |
# | | (pool 0, mod 0) | (pool 0)| (pool 1,mod 1)| (pool 1,mod 1)| (pool 2,mod 2)| (pool 2,mod 2)|
# +------+---------------------+---------+---------------+---------------+---------------+---------------+
# | 1 | {0} | {0} | {0} (fb) | {0} (fb) | {0} (fb) | {0} (fb) |
# | 2 | {0} | {0} | {1} | {1} | {1} (fb) | {1} (fb) |
# | 3 | {0} | {0} | {1} | {1} | {2} | {2} |
# | 4 | {0, 3} | {0, 3} | {1} | {1} | {2} | {2} |
# | 5 | {0, 3} | {0, 3} | {1, 4} | {1, 4} | {2} | {2} |
# | 6 | {0, 3} | {0, 3} | {1, 4} | {1, 4} | {2, 5} | {2, 5} |
# +------+---------------------+---------+---------------+---------------+---------------+---------------+
# {x, y} = candidate pool; ONE pod is randomly selected from this set
# (fb) = no pods in target pool; falls back to highest available index
# LOAD DISTRIBUTION (cksum-based, all pods utilized):
# +------+------+-------------------+-------------------+---------------------+
# | Pods | Pool | G0: vLLM/SGL C13 | G1: vLLM/SGL C12 | G2: TRT-LLM/General |
# +------+------+-------------------+-------------------+---------------------+
# | 1 | 1 | {0} | {0} | {0} |
# | 2 | 1 | {0} | {1} | {1} |
# | 3 | 1 | {0} | {2} | {1} |
# | 4 | 2 | {0, 3} | {2, 1} | {1, 2} |
# | 5 | 2 | {0, 3} | {2, 4} | {1, 2} |
# | 6 | 2 | {0, 3} | {5, 1} | {2, 4} |
# | 7 | 3 | {0, 3, 4} | {5, 1, 2} | {2, 6, 5} |
# | 8 | 3 | {7, 0, 3} | {5, 1, 4} | {2, 6, 5} |
# | 9 | 3 | {7, 0, 3} | {8, 5, 1} | {2, 6, 4} |
# +------+------+-------------------+-------------------+---------------------+
#
# =============================================================================
...
...
@@ -138,7 +136,7 @@ fi
# --- CONFIGURATION ---
NAMESPACE
=
"buildkit"
PORT
=
"1234"
MAX_POD_CHECK
=
10
# How many pod indices to probe (e.g., 0 to 3)
MAX_POD_CHECK
=
10
# ---------------------
if
!
command
-v
nslookup &> /dev/null
;
then
...
...
@@ -146,6 +144,11 @@ if ! command -v nslookup &> /dev/null; then
exit
1
fi
if
!
command
-v
sha256sum
&> /dev/null
;
then
echo
"❌ Error: sha256sum not found. Please install coreutils."
exit
1
fi
# --- RETRY CONFIGURATION ---
MAX_RETRIES
=
${
MAX_RETRIES
:-
2
}
RETRY_DELAY
=
${
RETRY_DELAY
:-
30
}
...
...
@@ -171,64 +174,109 @@ get_active_indices() {
echo
"
${
active_indices
[@]
}
"
}
# Function to route flavors to specific active indices based on Modulo 3
get_target_indices
()
{
local
flavor
=
$1
local
cuda_version
=
$2
# Read remaining arguments as an array of available indices
local
-a
available_indices
=(
"
${
@
:3
}
"
)
if
[
${#
available_indices
[@]
}
-eq
0
]
;
then
echo
""
return
fi
local
cuda_major
=
${
cuda_version
%%.*
}
local
route_key
=
"
${
flavor
}
-cuda
${
cuda_major
}
"
local
target_mod
GROUP_KEYS
=(
"cuda-dl-base-13"
"cuda-dl-base-12"
"general-trt-combined"
)
case
"
$route_key
"
in
# --- POOL 0: Isolated builds — lightest load offsets overnight fallback accumulation ---
trtllm-cuda13|general-
*
)
target_mod
=
0
;;
# --- POOL 1: CUDA 13 builds (vLLM + SGLang share cuda-dl-base:cuda13.0) ---
vllm-cuda13|sglang-cuda13
)
target_mod
=
1
;;
# --- POOL 2: CUDA 12 builds — heaviest load, only active during business hours ---
vllm-cuda12|sglang-cuda12
)
target_mod
=
2
;;
# --- FALLBACK ---
*
)
target_mod
=
0
# Map a flavor + CUDA version to a group index (0, 1, or 2)
flavor_to_group
()
{
local
flavor
=
$1
local
cuda_major
=
${
2
%%.*
}
case
"
$flavor
"
in
vllm|sglang
)
case
"
$cuda_major
"
in
13
)
echo
0
;;
*
)
echo
1
;;
esac
;;
trtllm|general|
*
)
echo
2
;;
esac
}
echo
" [DEBUG] Routing Key: '
$route_key
' -> Worker Index Modulo:
$target_mod
"
>
&2
# Compute coverage-aware pool assignments for all 3 groups.
# Outputs pipe-separated pools: "pool0|pool1|pool2"
compute_group_pools
()
{
local arch
=
$1
local
-a
available_indices
=(
"
${
@
:2
}
"
)
local
count
=
${#
available_indices
[@]
}
local
final_targets
=()
if
[
"
$count
"
-eq
0
]
;
then
echo
"||"
return
fi
# Filter the AVAILABLE indices (not just 0..count)
for
idx
in
"
${
available_indices
[@]
}
"
;
do
if
[
$((
idx
%
3
))
-eq
"
$target_mod
"
]
;
then
final_targets+
=(
"
$idx
"
)
fi
done
local
pool_size
=
$((
(
count
+
2
)
/
3
))
# If no pods match the specific modulo, fallback to the highest available index
if
[
"
${#
final_targets
[@]
}
"
-eq
"0"
]
;
then
local
max_idx
=
${
available_indices
[0]
}
local
rank0
=
""
rank1
=
""
rank2
=
""
for
g
in
0 1 2
;
do
local
scored_list
=()
for
idx
in
"
${
available_indices
[@]
}
"
;
do
if
[
"
$idx
"
-gt
"
$max_idx
"
]
;
then
max_idx
=
$idx
local
combo
=
"
${
GROUP_KEYS
[
$g
]
}
-buildkit-
${
arch
}
-
${
idx
}
"
local
score
=
$(
echo
-n
"
$combo
"
|
sha256sum
|
awk
'{print $1}'
)
scored_list+
=(
"
${
score
}
:
${
idx
}
"
)
done
local
sorted_str
=
$(
printf
"%s
\n
"
"
${
scored_list
[@]
}
"
|
sort
-r
|
cut
-d
':'
-f2
|
tr
'\n'
' '
)
if
[
"
$g
"
-eq
0
]
;
then
rank0
=
"
$sorted_str
"
;
fi
if
[
"
$g
"
-eq
1
]
;
then
rank1
=
"
$sorted_str
"
;
fi
if
[
"
$g
"
-eq
2
]
;
then
rank2
=
"
$sorted_str
"
;
fi
done
local
pool0
=
" "
pool1
=
" "
pool2
=
" "
local
covered
=
" "
for
((
round
=
0
;
round<pool_size
;
round++
))
;
do
for
g
in
0 1 2
;
do
local
current_rank
=
""
current_pool
=
""
if
[
"
$g
"
-eq
0
]
;
then
current_rank
=
"
$rank0
"
;
current_pool
=
"
$pool0
"
;
fi
if
[
"
$g
"
-eq
1
]
;
then
current_rank
=
"
$rank1
"
;
current_pool
=
"
$pool1
"
;
fi
if
[
"
$g
"
-eq
2
]
;
then
current_rank
=
"
$rank2
"
;
current_pool
=
"
$pool2
"
;
fi
local
picked
=
""
for
candidate
in
$current_rank
;
do
[[
"
$current_pool
"
==
*
"
$candidate
"
*
]]
&&
continue
if
[[
"
$covered
"
!=
*
"
$candidate
"
*
]]
;
then
picked
=
$candidate
;
break
fi
done
if
[
-z
"
$picked
"
]
;
then
for
candidate
in
$current_rank
;
do
[[
"
$current_pool
"
==
*
"
$candidate
"
*
]]
&&
continue
picked
=
$candidate
;
break
done
fi
if
[
-n
"
$picked
"
]
;
then
current_pool
=
"
${
current_pool
}${
picked
}
"
covered
=
"
${
covered
}${
picked
}
"
if
[
"
$g
"
-eq
0
]
;
then
pool0
=
"
$current_pool
"
;
fi
if
[
"
$g
"
-eq
1
]
;
then
pool1
=
"
$current_pool
"
;
fi
if
[
"
$g
"
-eq
2
]
;
then
pool2
=
"
$current_pool
"
;
fi
fi
done
echo
"
$max_idx
"
else
echo
"
${
final_targets
[@]
}
"
done
pool0
=
$(
echo
"
$pool0
"
| xargs
)
pool1
=
$(
echo
"
$pool1
"
| xargs
)
pool2
=
$(
echo
"
$pool2
"
| xargs
)
echo
"
${
pool0
}
|
${
pool1
}
|
${
pool2
}
"
}
# Route a flavor to its group's pre-computed pool.
get_target_indices
()
{
local
flavor
=
$1
local
cuda_version
=
$2
local arch
=
$3
local
-a
available_indices
=(
"
${
@
:4
}
"
)
if
[
${#
available_indices
[@]
}
-eq
0
]
;
then
echo
""
return
fi
local
group
=
$(
flavor_to_group
"
$flavor
"
"
$cuda_version
"
)
local
cuda_major
=
${
cuda_version
%%.*
}
echo
" [DEBUG] Routing Key: '
$flavor
-cuda
$cuda_major
' -> Group:
$group
(
${
GROUP_KEYS
[
$group
]
}
)"
>
&2
local
all_pools
=
$(
compute_group_pools
"
$arch
"
"
${
available_indices
[@]
}
"
)
echo
"
$all_pools
"
|
cut
-d
'|'
-f
$((
group
+
1
))
}
# Process each architecture
...
...
@@ -275,11 +323,11 @@ for ARCH in "${ARCHS[@]}"; do
# Iterate over flavors and set outputs
for
flavor
in
"
${
FLAVORS
[@]
}
"
;
do
# Pass the discovered ACTIVE_INDICES to the routing function
TARGET_INDICES
=(
$(
get_target_indices
"
$flavor
"
"
$CUDA_VERSION
"
"
${
ACTIVE_INDICES
[@]
}
"
)
)
# Pass the discovered ACTIVE_INDICES to the routing function
to get the candidate pool
TARGET_INDICES
=(
$(
get_target_indices
"
$flavor
"
"
$CUDA_VERSION
"
"
$ARCH
"
"
${
ACTIVE_INDICES
[@]
}
"
)
)
ADDRS
=
""
# 2. Get the number of elements in the array
# 2. Get the number of elements in the
candidate pool
array
TARGET_INDICES_LENGTH
=
${#
TARGET_INDICES
[@]
}
# 3. Generate a random index between 0 and length-1
...
...
@@ -288,7 +336,8 @@ for ARCH in "${ARCHS[@]}"; do
RANDOM_VALUE
=
"
${
TARGET_INDICES
[
$RANDOM_INDEX
]
}
"
POD_NAME
=
"
${
POD_PREFIX
}
-
${
RANDOM_VALUE
}
"
ADDRS
=
"tcp://
${
POD_NAME
}
.
${
SERVICE_NAME
}
.
${
NAMESPACE
}
.svc.cluster.local:
${
PORT
}
"
echo
" -> Routing
${
flavor
}
_
${
ARCH
}
to pod indices:
${
TARGET_INDICES
[*]
}
"
echo
" -> Routing
${
flavor
}
_
${
ARCH
}
to Candidate Pool: {
${
TARGET_INDICES
[*]
}
} | Selected:
${
RANDOM_VALUE
}
"
# Write to GitHub Output
echo
"
${
flavor
}
_
${
ARCH
}
=
$ADDRS
"
>>
"
$GITHUB_OUTPUT
"
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment