OpenDAS / TransformerEngine
Commit 27ddce40, authored Oct 11, 2025 by wenjh
Merge branch 'nv_main'
Parents: d262ef4c, 5b3092a0
Changes: 208
Showing 20 changed files with 3620 additions and 12 deletions
.github/workflows/trigger-ci.yml  +3 -0
.gitmodules  +0 -3
3rdparty/cutlass  +1 -0
README.rst  +3 -3
build_tools/VERSION.txt  +1 -1
build_tools/utils.py  +2 -2
docs/api/pytorch.rst  +4 -1
docs/examples/attention/attention.ipynb  +1 -1
docs/examples/onnx/onnx_export.ipynb  +1 -1
docs/examples/te_gemma/media/calibration.svg  +620 -0
docs/examples/te_gemma/media/calibration_1_half.svg  +415 -0
docs/examples/te_gemma/media/calibration_2_half.svg  +401 -0
docs/examples/te_gemma/media/fp8_model_init.svg  +500 -0
docs/examples/te_gemma/media/fp8_model_init_1_half.svg  +358 -0
docs/examples/te_gemma/media/fp8_model_init_2_half.svg  +371 -0
docs/examples/te_gemma/media/generation_animation.gif  +0 -0
docs/examples/te_gemma/media/graphs.svg  +232 -0
docs/examples/te_gemma/media/transformer_cuda_graphed.png  +0 -0
docs/examples/te_gemma/requirements.txt  +4 -0
docs/examples/te_gemma/te_gemma.py  +703 -0
.github/workflows/trigger-ci.yml
@@ -55,6 +55,9 @@ jobs:
           || github.actor == 'pstjohn'
           || github.actor == 'vcherepanov-nv'
           || github.actor == 'tdophung'
+          || github.actor == 'vthumbe1503'
+          || github.actor == 'janekb04'
+          || github.actor == 'shengfangd'
         )
     steps:
       - name: Check if comment is issued by authorized person
.gitmodules
 [submodule "3rdparty/googletest"]
 	path = 3rdparty/googletest
 	url = https://github.com/google/googletest.git
-[submodule "3rdparty/cudnn-frontend"]
-	path = 3rdparty/cudnn-frontend
-	url = https://github.com/NVIDIA/cudnn-frontend.git
 [submodule "3rdparty/hipify_torch"]
 	path = 3rdparty/hipify_torch
 	url = https://github.com/ROCm/hipify_torch.git
3rdparty/cutlass @ 57e3cfb4
+Subproject commit 57e3cfb47a2d9e0d46eb6335c3dc411498efa198
README.rst
@@ -176,15 +176,15 @@ For example to use the NGC PyTorch container interactively,
 .. code-block:: bash
-    docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:25.04-py3
+    docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:25.08-py3
 For example to use the NGC JAX container interactively,
 .. code-block:: bash
-    docker run --gpus all -it --rm nvcr.io/nvidia/jax:25.04-py3
+    docker run --gpus all -it --rm nvcr.io/nvidia/jax:25.08-py3
-Where 25.04 (corresponding to April 2025 release) is the container version.
+Where 25.08 (corresponding to August 2025 release) is the container version.
 **Benefits of using NGC containers:**
build_tools/VERSION.txt
-2.8.0.dev0
+2.9.0.dev0
build_tools/utils.py
@@ -13,7 +13,7 @@ import shutil
 import subprocess
 import sys
 from pathlib import Path
-from importlib.metadata import version
+from importlib.metadata import version as get_version
 from subprocess import CalledProcessError
 from typing import List, Optional, Tuple, Union
@@ -307,7 +307,7 @@ def cuda_version() -> Tuple[int, ...]:
         return tuple(int(v) for v in version)
     try:
-        version_str = version("nvidia-cuda-runtime-cu12")
+        version_str = get_version("nvidia-cuda-runtime-cu12")
         version_tuple = tuple(int(part) for part in version_str.split(".") if part.isdigit())
         return version_tuple
     except importlib.metadata.PackageNotFoundError:
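The rename in build_tools/utils.py looks motivated by name shadowing: the context line `for v in version` shows that `cuda_version()` also uses a local variable called `version`, which would hide an imported function of the same name. A minimal Python sketch of the aliased lookup and the digit-only parsing follows. `parse_version_tuple` and `installed_version_tuple` are hypothetical helper names used only for illustration; they are not functions in `build_tools/utils.py`, and the shadowing explanation is an assumption drawn from the diff context rather than from a commit message.

```python
from importlib.metadata import PackageNotFoundError
from importlib.metadata import version as get_version
from typing import Optional, Tuple


def parse_version_tuple(version_str: str) -> Tuple[int, ...]:
    """Keep only the purely numeric dot-separated parts, as the diff does."""
    return tuple(int(part) for part in version_str.split(".") if part.isdigit())


def installed_version_tuple(package: str) -> Optional[Tuple[int, ...]]:
    """Return the installed version of `package` as ints, or None if absent."""
    try:
        # The alias keeps this call unambiguous even though a local
        # variable named `version` is bound in the same function.
        version = get_version(package)
    except PackageNotFoundError:
        return None
    return parse_version_tuple(version)
```

Under this sketch, `parse_version_tuple("2.9.0.dev0")` drops the `dev0` suffix because it is not all digits, and a package that is not installed yields `None` instead of raising.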
docs/api/pytorch.rst
@@ -49,7 +49,7 @@ pyTorch
 .. autoapifunction:: transformer_engine.pytorch.moe_permute
 .. autoapifunction:: transformer_engine.pytorch.moe_permute_with_probs
 .. autoapifunction:: transformer_engine.pytorch.moe_unpermute
@@ -62,3 +62,6 @@ pyTorch
 .. autoapifunction:: transformer_engine.pytorch.initialize_ub
 .. autoapifunction:: transformer_engine.pytorch.destroy_ub
+.. autoapiclass:: transformer_engine.pytorch.UserBufferQuantizationMode
+   :members: FP8, NONE
\ No newline at end of file
docs/examples/attention/attention.ipynb
@@ -390,7 +390,7 @@
     "| Attention Backend | Precision | Architecture | Sliding Window Attention | MQA/GQA | Multi-Latent Attention | Context Parallelism | Determinism Possible |\n",
     "| :---------------- | :-------- | :----------- | :----------------------- | :------ | :--------------------- | :------------------ | :------------ |\n",
     "| cuDNN attention (all frameworks) | BF16, FP16, FP8 (PyTorch only) | sm80+ | No | Yes | Yes | Yes (`bshd`,`sbhd`, `thd`) | Yes |\n",
-    "| flash-attention (PyTorch) | BF16, FP16 | sm80+ | Yes | Yes | No | Yes (`bshd`,`thd`) | Yes |\n",
+    "| flash-attention (PyTorch) | BF16, FP16 | sm80+ | Yes | Yes | Yes | Yes (`bshd`,`thd`) | Yes |\n",
     "| Framework-native attention | BF16, FP16, FP32 | Any | No, unless used as a mask | Yes | Yes (PyTorch only) | No | Yes |\n",
     "\n",
     "Some unit tests are provided to serve as a starting point for integrating such features into users' models. For example,\n",
docs/examples/onnx/onnx_export.ipynb
@@ -10,7 +10,7 @@
     "\n",
     "<b>Note:</b>\n",
     "\n",
-    "Currently, export to ONNX is supported only for high precision, FP8 delayed scaling and MXFP8.\n",
+    "Currently, export to ONNX is supported only for high precision, FP8 delayed scaling, FP8 current scaling and MXFP8.\n",
     "\n",
     "</div>\n",
     "\n",
docs/examples/te_gemma/media/calibration.svg
0 → 100644
This diff is collapsed.
docs/examples/te_gemma/media/calibration_1_half.svg
0 → 100755
This diff is collapsed.
docs/examples/te_gemma/media/calibration_2_half.svg
0 → 100644
This diff is collapsed.
docs/examples/te_gemma/media/fp8_model_init.svg
0 → 100644
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width=
"1280"
height=
"379.66562"
overflow=
"hidden"
version=
"1.1"
id=
"svg31"
sodipodi:docname=
"fp8_model_init.svg"
inkscape:version=
"1.4.2 (f4327f4, 2025-05-13)"
xmlns:inkscape=
"http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi=
"http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns=
"http://www.w3.org/2000/svg"
xmlns:svg=
"http://www.w3.org/2000/svg"
>
<sodipodi:namedview
id=
"namedview1"
pagecolor=
"#ffffff"
bordercolor=
"#000000"
borderopacity=
"0.25"
inkscape:showpageshadow=
"2"
inkscape:pageopacity=
"0.0"
inkscape:pagecheckerboard=
"0"
inkscape:deskcolor=
"#d1d1d1"
inkscape:zoom=
"1.8208"
inkscape:cx=
"685.41302"
inkscape:cy=
"184.80888"
inkscape:window-width=
"3440"
inkscape:window-height=
"1369"
inkscape:window-x=
"-8"
inkscape:window-y=
"-8"
inkscape:window-maximized=
"1"
inkscape:current-layer=
"g31"
/>
<defs
id=
"defs31"
>
<clipPath
clipPathUnits=
"userSpaceOnUse"
id=
"clipPath31"
>
<rect
style=
"fill:none"
id=
"rect32"
width=
"1390.9491"
height=
"379.66562"
x=
"-54.734409"
y=
"146.82722"
ry=
"36.489601"
/>
</clipPath>
</defs>
<g
id=
"g31"
clip-path=
"url(#clipPath31)"
transform=
"translate(0,-146.82722)"
>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
id=
"text1"
x=
"153.29384"
y=
"195.21265"
>
FP32/BF16
</text>
<path
d=
"M 821,170 V 513.312"
stroke=
"#000000"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"none"
fill-rule=
"evenodd"
id=
"path1"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
id=
"text2"
x=
"616.69165"
y=
"194.66344"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
id=
"text3"
x=
"908.73199"
y=
"193.56503"
>
FP8 with fp8_model_init()
</text>
<rect
x=
"868"
y=
"326"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect3"
/>
<rect
x=
"882.45081"
y=
"381.1239"
width=
"101"
height=
"45"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect4"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text4"
x=
"920.40778"
y=
"400.1239"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text5"
x=
"911.3208"
y=
"416.1239"
>
weight
</text>
<rect
x=
"1078.4508"
y=
"381.1239"
width=
"82"
height=
"45"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text6"
x=
"1107.5007"
y=
"400.1239"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text7"
x=
"1098.8308"
y=
"416.1239"
>
GEMM
</text>
<path
d=
"m 983.45079,403.1239 h 89.04001 v 2 h -89.04001 z m 87.71001,-3 8,4 -8,4 z"
id=
"path7"
/>
<path
d=
"M 422,170 V 513.312"
stroke=
"#000000"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"none"
fill-rule=
"evenodd"
id=
"path9"
/>
<rect
x=
"54"
y=
"326"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect9"
/>
<rect
x=
"67.45079"
y=
"367.47629"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect10"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text10"
x=
"104.84079"
y=
"390.47629"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text11"
x=
"91.087494"
y=
"406.47629"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text12"
x=
"98.087494"
y=
"422.47629"
>
weight
</text>
<rect
x=
"270.45081"
y=
"240.47627"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect12"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text13"
x=
"307.6308"
y=
"263.47629"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text14"
x=
"293.87778"
y=
"279.47629"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text15"
x=
"305.79779"
y=
"295.47629"
>
input
</text>
<rect
x=
"270.45081"
y=
"367.47629"
width=
"103"
height=
"70"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect15"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text16"
x=
"307.6308"
y=
"390.47629"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text17"
x=
"293.87778"
y=
"406.47629"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text18"
x=
"301.29779"
y=
"422.47629"
>
GEMM
</text>
<path
d=
"m 170.4572,404.11625 93.11279,-0.59724 -0.0128,-1.99996 -93.11281,0.59725 z m 91.79869,2.41125 7.9742,-4.05123 -8.0255,-3.9486 z"
id=
"path18"
/>
<path
d=
"m 323.45079,311.47627 v 49.395 h -2 v -49.395 z m 3,48.061 -4,8 -4,-8 z"
id=
"path19"
/>
<rect
x=
"447"
y=
"326"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect19"
/>
<rect
x=
"460.90158"
y=
"368.57471"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect20"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text20"
x=
"497.76358"
y=
"392.57471"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text21"
x=
"484.01059"
y=
"408.57471"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text22"
x=
"491.01059"
y=
"424.57471"
>
weight
</text>
<rect
x=
"604.90161"
y=
"381.57471"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23"
x=
"633.21356"
y=
"399.57471"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24"
x=
"622.71356"
y=
"415.57471"
>
Weight
</text>
<g
id=
"g33"
transform=
"translate(70.847981,7.139719)"
>
<rect
x=
"638.21271"
y=
"302.41418"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22-2"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-7"
x=
"666.52472"
y=
"320.41418"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24-6"
x=
"662.06604"
y=
"336.96341"
>
Input
</text>
</g>
<rect
x=
"708.90161"
y=
"381.57471"
width=
"82"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect26"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text27"
x=
"737.56158"
y=
"399.57471"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text28"
x=
"728.89557"
y=
"415.57471"
>
GEMM
</text>
<path
d=
"m 563.91732,405.21457 34.00266,-0.5351 -0.0314,-1.99976 -34.00273,0.53511 z m 32.71676,2.4855 7.9361,-4.12538 -8.062,-3.87362 z"
id=
"path28"
/>
<path
d=
"m 685.90158,402.57469 h 15.791 v 2 h -15.791 z m 14.458,-3 8,4 -8,4 z"
id=
"path29"
/>
<path
d=
"m 750.90158,284.49209 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30"
style=
"stroke-width:0.53037"
/>
<path
d=
"m 751.17135,355.90367 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-2"
style=
"stroke-width:0.53037"
/>
<rect
x=
"701.05359"
y=
"215.25253"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect23"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text29"
x=
"738.23358"
y=
"238.25253"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text32"
x=
"724.48059"
y=
"254.25255"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text33"
x=
"736.40057"
y=
"270.25253"
>
input
</text>
<g
id=
"g33-9"
transform=
"translate(441.10986,7.0509646)"
>
<rect
x=
"638.21271"
y=
"302.41418"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22-2-5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-7-4"
x=
"666.52472"
y=
"320.41418"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24-6-3"
x=
"662.06604"
y=
"336.96341"
>
Input
</text>
</g>
<path
d=
"m 1121.1635,284.40334 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-1"
style=
"stroke-width:0.53037"
/>
<path
d=
"m 1121.4332,355.81492 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-2-2"
style=
"stroke-width:0.53037"
/>
<rect
x=
"1071.3154"
y=
"215.16379"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect23-3"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text29-3"
x=
"1108.4955"
y=
"238.16379"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text32-4"
x=
"1094.7424"
y=
"254.1638"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text33-1"
x=
"1106.6625"
y=
"270.16376"
>
input
</text>
</g>
</svg>
docs/examples/te_gemma/media/fp8_model_init_1_half.svg
0 → 100644
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width=
"960"
height=
"373.58408"
overflow=
"hidden"
version=
"1.1"
id=
"svg23"
sodipodi:docname=
"fp8_model_init_1_half.svg"
inkscape:version=
"1.4.2 (f4327f4, 2025-05-13)"
xmlns:inkscape=
"http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi=
"http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns=
"http://www.w3.org/2000/svg"
xmlns:svg=
"http://www.w3.org/2000/svg"
>
<sodipodi:namedview
id=
"namedview1"
pagecolor=
"#ffffff"
bordercolor=
"#000000"
borderopacity=
"0.25"
inkscape:showpageshadow=
"2"
inkscape:pageopacity=
"0.0"
inkscape:pagecheckerboard=
"0"
inkscape:deskcolor=
"#d1d1d1"
inkscape:zoom=
"3.1237948"
inkscape:cx=
"479.86506"
inkscape:cy=
"186.79204"
inkscape:window-width=
"3440"
inkscape:window-height=
"1369"
inkscape:window-x=
"-8"
inkscape:window-y=
"-8"
inkscape:window-maximized=
"1"
inkscape:current-layer=
"g23"
/>
<defs
id=
"defs23"
>
<clipPath
clipPathUnits=
"userSpaceOnUse"
id=
"clipPath23"
>
<rect
style=
"fill:none"
id=
"rect24"
width=
"997.38257"
height=
"373.58408"
x=
"-11.584002"
y=
"41.702408"
ry=
"36.489601"
/>
</clipPath>
</defs>
<g
id=
"g23"
clip-path=
"url(#clipPath23)"
transform=
"translate(0,-41.702408)"
>
<rect
x=
"0"
y=
"0"
width=
"960"
height=
"480"
fill=
"#ffffff"
id=
"rect1"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"22px"
transform=
"translate(195.4,93)"
id=
"text1"
>
FP32/BF16
</text>
<path
d=
"M 461,61 V 404.312"
stroke=
"#000000"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"none"
fill-rule=
"evenodd"
id=
"path1"
/>
<rect
x=
"92"
y=
"217"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect2"
/>
<rect
x=
"105.07926"
y=
"266.32938"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect3"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text3"
x=
"142.27226"
y=
"289.32938"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text4"
x=
"128.51926"
y=
"305.32938"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text5"
x=
"135.51926"
y=
"321.32938"
>
weight
</text>
<rect
x=
"308.07925"
y=
"138.32938"
width=
"103"
height=
"72"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text6"
x=
"345.06326"
y=
"162.32938"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text7"
x=
"331.31027"
y=
"178.32938"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text8"
x=
"343.23026"
y=
"194.32938"
>
input
</text>
<rect
x=
"308.07925"
y=
"266.32938"
width=
"103"
height=
"70"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect8"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text9"
x=
"345.06326"
y=
"289.32938"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text10"
x=
"331.30927"
y=
"305.32938"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text11"
x=
"338.72925"
y=
"321.32938"
>
GEMM
</text>
<path
d=
"m 208.08567,302.96936 93.11279,-0.59724 -0.0128,-1.99996 -93.11281,0.59724 z m 91.79869,2.41125 7.9742,-4.05123 -8.0255,-3.9486 z"
id=
"path11"
/>
<path
d=
"m 360.07926,210.32938 v 49.395 h -2 v -49.395 z m 3,48.061 -4,8 -4,-8 z"
id=
"path12"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"22px"
transform=
"translate(645.181,91)"
id=
"text23"
>
FP8
</text>
<rect
x=
"495.63504"
y=
"222.57803"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect19"
/>
<rect
x=
"509.53662"
y=
"265.15271"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect20"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text20"
x=
"546.39862"
y=
"289.15274"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text21"
x=
"532.64563"
y=
"305.15274"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text22"
x=
"539.64563"
y=
"321.15274"
>
weight
</text>
<rect
x=
"653.53668"
y=
"278.15274"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-3"
x=
"681.84863"
y=
"296.15274"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24"
x=
"671.34863"
y=
"312.15274"
>
Weight
</text>
<g
id=
"g33"
transform=
"translate(119.48305,-96.282252)"
>
<rect
x=
"638.21271"
y=
"302.41418"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22-2"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-7"
x=
"666.52472"
y=
"320.41418"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24-6"
x=
"662.06604"
y=
"336.96341"
>
Input
</text>
</g>
<rect
x=
"757.53668"
y=
"278.15274"
width=
"82"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect26"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text27"
x=
"786.19666"
y=
"296.15274"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text28"
x=
"777.53064"
y=
"312.15274"
>
GEMM
</text>
<path
d=
"m 612.55239,301.7926 34.00266,-0.5351 -0.0314,-1.99976 -34.00273,0.53511 z m 32.71676,2.4855 7.9361,-4.12538 -8.062,-3.87362 z"
id=
"path28"
/>
<path
d=
"m 734.53665,299.15272 h 15.791 v 2 h -15.791 z m 14.458,-3 8,4 -8,4 z"
id=
"path29"
/>
<path
d=
"m 799.53665,181.07012 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30"
style=
"stroke-width:0.53037"
/>
<path
d=
"m 799.80642,252.4817 v 21.98469 h -2 V 252.4817 Z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-2"
style=
"stroke-width:0.53037"
/>
<rect
x=
"749.68866"
y=
"111.83057"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect23"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text29"
x=
"786.86865"
y=
"134.83058"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text32"
x=
"773.11566"
y=
"150.83058"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text33"
x=
"785.03564"
y=
"166.83057"
>
input
</text>
</g>
</svg>
docs/examples/te_gemma/media/fp8_model_init_2_half.svg
0 → 100644
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width=
"960"
height=
"379.95526"
overflow=
"hidden"
version=
"1.1"
id=
"svg19"
sodipodi:docname=
"fp8_model_init_2_half.svg"
inkscape:version=
"1.4.2 (f4327f4, 2025-05-13)"
xmlns:inkscape=
"http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi=
"http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns=
"http://www.w3.org/2000/svg"
xmlns:svg=
"http://www.w3.org/2000/svg"
>
<sodipodi:namedview
id=
"namedview1"
pagecolor=
"#ffffff"
bordercolor=
"#000000"
borderopacity=
"0.25"
inkscape:showpageshadow=
"2"
inkscape:pageopacity=
"0.0"
inkscape:pagecheckerboard=
"0"
inkscape:deskcolor=
"#d1d1d1"
inkscape:zoom=
"2.1718178"
inkscape:cx=
"502.34416"
inkscape:cy=
"194.07705"
inkscape:window-width=
"3440"
inkscape:window-height=
"1369"
inkscape:window-x=
"-8"
inkscape:window-y=
"-8"
inkscape:window-maximized=
"1"
inkscape:current-layer=
"svg19"
/>
<defs
id=
"defs19"
>
<clipPath
clipPathUnits=
"userSpaceOnUse"
id=
"clipPath20"
>
<rect
style=
"fill:none"
id=
"rect21"
width=
"1014.7587"
height=
"379.95526"
x=
"-21.430403"
y=
"44.598408"
/>
</clipPath>
</defs>
<g
id=
"g19"
clip-path=
"url(#clipPath20)"
transform=
"translate(-76.837568,-52.086815)"
/>
<path
d=
"M 434.81331,26.957307 V 370.26931"
stroke=
"#000000"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"none"
fill-rule=
"evenodd"
id=
"path1"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
id=
"text2"
x=
"216.69165"
y=
"33.663437"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
id=
"text3"
x=
"508.73199"
y=
"32.565033"
>
FP8 with fp8_model_init()
</text>
<rect
x=
"481.81332"
y=
"182.95731"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect3"
/>
<rect
x=
"496.26413"
y=
"238.08121"
width=
"101"
height=
"45"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect4"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text4"
x=
"534.22107"
y=
"257.08121"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text5"
x=
"525.13409"
y=
"273.08121"
>
weight
</text>
<rect
x=
"692.2641"
y=
"238.08121"
width=
"82"
height=
"45"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text6"
x=
"721.31403"
y=
"257.08121"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text7"
x=
"712.6441"
y=
"273.08121"
>
GEMM
</text>
<path
d=
"m 597.2641,260.08121 h 89.04001 v 2 H 597.2641 Z m 87.71001,-3 8,4 -8,4 z"
id=
"path7"
/>
<rect
x=
"60.813313"
y=
"182.95731"
width=
"129"
height=
"164"
stroke=
"#042433"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"#e8e8e8"
id=
"rect19"
/>
<rect
x=
"74.714897"
y=
"225.53201"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect20"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text20"
x=
"111.5769"
y=
"249.53201"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text21"
x=
"97.823906"
y=
"265.53201"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text22"
x=
"104.82391"
y=
"281.53201"
>
weight
</text>
<rect
x=
"218.71492"
y=
"238.53201"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23"
x=
"247.02687"
y=
"256.53201"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24"
x=
"236.52687"
y=
"272.53201"
>
Weight
</text>
<g
id=
"g33"
transform=
"translate(-315.33871,-135.90297)"
>
<rect
x=
"638.21271"
y=
"302.41418"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22-2"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-7"
x=
"666.52472"
y=
"320.41418"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24-6"
x=
"662.06604"
y=
"336.96341"
>
Input
</text>
</g>
<rect
x=
"322.71494"
y=
"238.53201"
width=
"82"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#c1e5f5"
id=
"rect26"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text27"
x=
"351.37491"
y=
"256.53201"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text28"
x=
"342.70889"
y=
"272.53201"
>
GEMM
</text>
<path
d=
"m 177.73063,262.17188 34.00266,-0.5351 -0.0314,-1.99976 -34.00273,0.53511 z m 32.71676,2.4855 7.9361,-4.12538 -8.062,-3.87362 z"
id=
"path28"
/>
<path
d=
"m 299.71489,259.532 h 15.791 v 2 h -15.791 z m 14.458,-3 8,4 -8,4 z"
id=
"path29"
/>
<path
d=
"m 364.71489,141.4494 v 21.98469 h -2 V 141.4494 Z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30"
style=
"stroke-width:0.53037"
/>
<path
d=
"m 364.98466,212.86098 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-2"
style=
"stroke-width:0.53037"
/>
<rect
x=
"314.86691"
y=
"72.209839"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect23"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text29"
x=
"352.04691"
y=
"95.209839"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text32"
x=
"338.29391"
y=
"111.20985"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text33"
x=
"350.2139"
y=
"127.20984"
>
input
</text>
<g
id=
"g33-9"
transform=
"translate(54.923173,-135.99173)"
>
<rect
x=
"638.21271"
y=
"302.41418"
width=
"81"
height=
"44"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#fbe3d6"
id=
"rect22-2-5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text23-7-4"
x=
"666.52472"
y=
"320.41418"
>
FP8
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text24-6-3"
x=
"662.06604"
y=
"336.96341"
>
Input
</text>
</g>
<path
d=
"m 734.97681,141.36065 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-1"
style=
"stroke-width:0.53037"
/>
<path
d=
"m 735.24651,212.77223 v 21.98469 h -2 v -21.98469 z m 3,21.60945 -4,2.25033 -4,-2.25033 z"
id=
"path30-2-2"
style=
"stroke-width:0.53037"
/>
<rect
x=
"685.12872"
y=
"72.121094"
width=
"103"
height=
"71"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect23-3"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text29-3"
x=
"722.30878"
y=
"95.121094"
>
High
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text32-4"
x=
"708.55573"
y=
"111.12111"
>
precision
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"13px"
id=
"text33-1"
x=
"720.47577"
y=
"127.12106"
>
input
</text>
</svg>
docs/examples/te_gemma/media/generation_animation.gif
0 → 100644
132 KB
docs/examples/te_gemma/media/graphs.svg
0 → 100644
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width=
"1280"
height=
"303.21127"
overflow=
"hidden"
version=
"1.1"
id=
"svg12"
xmlns=
"http://www.w3.org/2000/svg"
xmlns:svg=
"http://www.w3.org/2000/svg"
>
<defs
id=
"defs12"
>
<clipPath
clipPathUnits=
"userSpaceOnUse"
id=
"clipPath16"
>
<rect
style=
"fill:none;stroke-width:0.96471"
id=
"rect16"
width=
"1344.0338"
height=
"303.21124"
x=
"-32.356411"
y=
"174.8833"
/>
</clipPath>
</defs>
<g
id=
"g12"
transform=
"translate(1.1556091e-7,-174.8833)"
clip-path=
"url(#clipPath16)"
>
<rect
x=
"0"
y=
"0"
width=
"1280"
height=
"720"
fill=
"#ffffff"
id=
"rect1"
/>
<path
d=
"M 645,209 V 446.818"
stroke=
"#000000"
stroke-width=
"2"
stroke-miterlimit=
"8"
fill=
"none"
fill-rule=
"evenodd"
id=
"path1"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
transform=
"translate(201.111,246)"
id=
"text1"
>
Without CUDA Graphs
</text>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"700"
font-size=
"24px"
transform=
"translate(855.749,246)"
id=
"text2"
>
With CUDA Graphs
</text>
<rect
x=
"64"
y=
"319"
width=
"91"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#f2f2f2"
id=
"rect2"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(75.6135,349)"
id=
"text3"
>
Launch 1
</text>
<rect
x=
"155"
y=
"371"
width=
"90"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect3"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(169.288,401)"
id=
"text4"
>
Kernel 1
</text>
<rect
x=
"245"
y=
"319"
width=
"91"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#f2f2f2"
id=
"rect4"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(256.462,349)"
id=
"text5"
>
Launch 2
</text>
<rect
x=
"336"
y=
"371"
width=
"90"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect5"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(350.136,401)"
id=
"text6"
>
Kernel 2
</text>
<rect
x=
"426"
y=
"319"
width=
"91"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#f2f2f2"
id=
"rect6"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(437.31,349)"
id=
"text7"
>
Launch 3
</text>
<rect
x=
"517"
y=
"371"
width=
"90"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect7"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(530.984,401)"
id=
"text8"
>
Kernel 3
</text>
<path
d=
"m 47,368 h 574.291 v 4 H 47 Z m 572.291,-4 12,6 -12,6 z"
id=
"path8"
/>
<rect
x=
"680"
y=
"319"
width=
"145"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#f2f2f2"
id=
"rect8"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(694.058,349)"
id=
"text9"
>
Launch Graph 1
</text>
<rect
x=
"830"
y=
"370"
width=
"91"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect9"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(844.463,400)"
id=
"text10"
>
Kernel 1
</text>
<rect
x=
"924"
y=
"370"
width=
"90"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect10"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(938.451,400)"
id=
"text11"
>
Kernel 2
</text>
<rect
x=
"1018"
y=
"370"
width=
"90"
height=
"49"
stroke=
"#000000"
stroke-width=
"2"
stroke-linejoin=
"round"
stroke-miterlimit=
"10"
fill=
"#d9f2d0"
id=
"rect11"
/>
<text
font-family=
"'NVIDIA Sans', 'NVIDIA Sans_MSFontService', sans-serif"
font-weight=
"400"
font-size=
"16px"
transform=
"translate(1032.44,400)"
id=
"text12"
>
Kernel 3
</text>
<path
d=
"m 663,368 h 574.29 v 4 H 663 Z m 572.29,-4 12,6 -12,6 z"
id=
"path12"
/>
</g>
</svg>
docs/examples/te_gemma/media/transformer_cuda_graphed.png
0 → 100644
361 KB
docs/examples/te_gemma/requirements.txt
0 → 100755
transformers==4.55.0
accelerate==1.10.0
datasets==4.0.0
sentencepiece==0.2.1
docs/examples/te_gemma/te_gemma.py
0 → 100755
This diff is collapsed.