evt_fugx1 / dcu_megatron
Commit e3cce568
authored May 19, 2025 by silencealiang

add launch_with_binding

parent 48b2dafd
Showing 11 changed files with 23 additions and 260 deletions

examples/deepseek_v3/train_deepseekv3_671B_128nodes.sh  +1 -26
examples/deepseek_v3/train_deepseekv3_671B_1nodes.sh  +1 -26
examples/deepseek_v3/train_deepseekv3_671B_4nodes.sh  +1 -26
examples/gpt3/train_gpt_567B_128nodes.sh  +1 -26
examples/gpt3/train_gpt_567B_1nodes.sh  +1 -26
examples/llama/train_llama2_7b_1nodes.sh  +1 -26
examples/mixtral/train_mixtral_8x22B_1nodes.sh  +1 -26
examples/mixtral/train_mixtral_8x22B_8nodes.sh  +1 -26
examples/mixtral/train_mixtral_8x7B_1nodes.sh  +1 -26
examples/mixtral/train_mixtral_8x7B_4nodes.sh  +1 -26
requirements/launch_with_binding.sh  +13 -0
examples/deepseek_v3/train_deepseekv3_671B_128nodes.sh
@@ -430,29 +430,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
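The replacement line passes `${LOCAL_RANK}` as the first argument and leaves `${APP}` unquoted, so the shell splits the command string into separate words before the wrapper sees them. The sketch below illustrates that splitting with a stand-in function. The name `show_args` and the value assigned to `APP` are hypothetical and are not taken from the training scripts.

```shell
#!/bin/bash
# Hypothetical stand-in for the wrapper: the first argument is the rank,
# everything after it is the command to run.
show_args() {
    local rank=$1
    shift                      # drop the rank; "$@" now holds only the command
    echo "rank=$rank argc=$# first=$1"
}

# Illustrative value only; the real APP is assembled earlier in each script.
APP="python3 -u train.py --flag value"

# Unquoted ${APP} is split on whitespace into five separate arguments.
show_args 2 ${APP}
```

Because the split happens on whitespace, this calling convention assumes that no single argument inside `APP` contains spaces.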
examples/deepseek_v3/train_deepseekv3_671B_1nodes.sh
@@ -430,29 +430,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/deepseek_v3/train_deepseekv3_671B_4nodes.sh
@@ -430,29 +430,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/gpt3/train_gpt_567B_128nodes.sh
@@ -165,29 +165,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/gpt3/train_gpt_567B_1nodes.sh
@@ -165,29 +165,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/llama/train_llama2_7b_1nodes.sh
@@ -159,29 +159,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/mixtral/train_mixtral_8x22B_1nodes.sh
@@ -168,29 +168,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/mixtral/train_mixtral_8x22B_8nodes.sh
@@ -168,29 +168,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/mixtral/train_mixtral_8x7B_1nodes.sh
@@ -168,29 +168,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
examples/mixtral/train_mixtral_8x7B_4nodes.sh
@@ -168,29 +168,4 @@ elif [[ $profiling == "hip" ]]; then
 fi
 #for hygon cpu
-case ${LOCAL_RANK} in
-0)
-    export HIP_VISIBLE_DEVICES=0
-    numactl --cpunodebind=0 --membind=0 ${APP} ;;
-1)
-    export HIP_VISIBLE_DEVICES=1
-    numactl --cpunodebind=1 --membind=1 ${APP} ;;
-2)
-    export HIP_VISIBLE_DEVICES=2
-    numactl --cpunodebind=2 --membind=2 ${APP} ;;
-3)
-    export HIP_VISIBLE_DEVICES=3
-    numactl --cpunodebind=3 --membind=3 ${APP} ;;
-4)
-    export HIP_VISIBLE_DEVICES=4
-    numactl --cpunodebind=4 --membind=4 ${APP} ;;
-5)
-    export HIP_VISIBLE_DEVICES=5
-    numactl --cpunodebind=5 --membind=5 ${APP} ;;
-6)
-    export HIP_VISIBLE_DEVICES=6
-    numactl --cpunodebind=6 --membind=6 ${APP} ;;
-7)
-    export HIP_VISIBLE_DEVICES=7
-    numactl --cpunodebind=7 --membind=7 ${APP} ;;
-esac
\ No newline at end of file
+${MEGATRON_PATH}/requirements/launch_with_binding.sh ${LOCAL_RANK} ${APP}
\ No newline at end of file
requirements/launch_with_binding.sh
0 → 100755
+#!/bin/bash
+LOCAL_RANK=$1
+shift
+gpu_map=(0 1 2 3 4 5 6 7)
+numa_map=(0 1 2 3 4 5 6 7)
+GPU_ID=${gpu_map[$LOCAL_RANK]}
+NUMA_ID=${numa_map[$LOCAL_RANK]}
+export HIP_VISIBLE_DEVICES=${GPU_ID}
+numactl --cpunodebind=${NUMA_ID} --membind=${NUMA_ID} "$@"
\ No newline at end of file
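The new script indexes `gpu_map` and `numa_map` directly with `$LOCAL_RANK`. If the rank is empty or falls outside 0..7, the lookups yield empty strings and `numactl` receives an empty node list. The following is a minimal sketch of the same table lookup with a range check added. The function names `resolve_binding` and `launch_bound` are hypothetical, and the identity maps are copied from the commit; they would need adjusting for any other GPU or NUMA topology.

```shell
#!/bin/bash
# Lookup tables copied from the commit: local rank N uses GPU N and NUMA node N.
gpu_map=(0 1 2 3 4 5 6 7)
numa_map=(0 1 2 3 4 5 6 7)

# resolve_binding RANK (hypothetical helper)
# Prints "<gpu_id> <numa_id>" for a valid rank; returns 1 otherwise.
resolve_binding() {
    local rank=$1
    # Reject empty or non-numeric input.
    [[ $rank =~ ^[0-9]+$ ]] || return 1
    # Force base 10 so a value such as "08" is not parsed as octal.
    rank=$((10#$rank))
    # Reject ranks beyond the end of the table.
    (( rank < ${#gpu_map[@]} )) || return 1
    echo "${gpu_map[rank]} ${numa_map[rank]}"
}

# launch_bound RANK CMD... (hypothetical helper)
# Exports HIP_VISIBLE_DEVICES and runs CMD under numactl, as the commit does.
launch_bound() {
    local rank=$1 binding gpu_id numa_id
    shift
    binding=$(resolve_binding "$rank") || {
        echo "invalid LOCAL_RANK: '$rank'" >&2
        return 1
    }
    read -r gpu_id numa_id <<< "$binding"
    export HIP_VISIBLE_DEVICES=$gpu_id
    numactl --cpunodebind="$numa_id" --membind="$numa_id" "$@"
}
```

With these definitions, a call such as `launch_bound "${LOCAL_RANK}" ${APP}` would behave like the committed script for ranks 0 through 7 and fail with a message for any other value.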