Unverified commit 48b31ca9 authored by galagam, committed by GitHub

ONNX export test - BF16 support (#256)



* add bf16 subgraph tests
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* changes:
1. Add normal mode BF16 tests for all subgraphs
2. Add fake BF16 tests for low-level subgraphs
3. Separate IO serialization from validation
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* ONNX export test - BF16 support part 1

TE inference now returns a torch.Tensor, to support BF16 outputs, since
bfloat16 is currently not supported in NumPy.
Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>
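A quick sketch of why BF16 outputs are kept as torch.Tensor rather than converted to NumPy arrays. This assumes stock NumPy, without extension dtype packages such as ml_dtypes registering a bfloat16 type:

```python
import numpy as np

# Stock NumPy has no native bfloat16 dtype, so BF16 model outputs
# cannot be represented directly as ndarrays; constructing the dtype
# by name fails with a TypeError.
try:
    np.dtype("bfloat16")
    has_bf16 = True
except TypeError:
    has_bf16 = False

print(has_bf16)  # stock NumPy: False
```

Returning torch.Tensor sidesteps the conversion entirely; a caller who needs an ndarray can upcast first (e.g. `t.float().numpy()`).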

* ONNX export test - BF16 support part 2

- Separate TE infer from serialize
- Fix serialize function to use full path
- Set unique filenames for fake bf16 (avoid overriding standard bf16)
- Remove overwriting fake_bf16_io value
Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>
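The unique-filename fix above (keeping fake-BF16 exports from overwriting the standard BF16 ones) can be sketched roughly as follows. `onnx_path`, its parameters, and the suffix convention are hypothetical illustrations, not the test suite's actual helper:

```python
import os

def onnx_path(base_dir: str, test_name: str, fake_bf16_io: bool = False) -> str:
    """Build a per-variant ONNX export path so that fake-BF16 runs
    write to a distinct file instead of clobbering the standard one."""
    suffix = "_fake_bf16" if fake_bf16_io else ""
    return os.path.join(base_dir, f"{test_name}{suffix}.onnx")

print(onnx_path("/tmp/exports", "te_linear"))                    # /tmp/exports/te_linear.onnx
print(onnx_path("/tmp/exports", "te_linear", fake_bf16_io=True)) # /tmp/exports/te_linear_fake_bf16.onnx
```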

* Export test: Slight tolerance increase in test_export_gpt_generation

The previous tolerance caused sporadic failures in roughly 1% of runs.
Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>
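The effect of loosening a comparison tolerance can be illustrated with a generic NumPy check; the values and tolerances below are made up for illustration and are not the ones used in test_export_gpt_generation:

```python
import numpy as np

# A small, run-to-run numerical drift between the reference output and
# the exported model's output can trip a tight absolute tolerance while
# still passing a slightly looser one.
ref = np.array([1.0, 2.0, 3.0])
out = ref + 1e-3  # simulated nondeterministic drift

tight = np.allclose(out, ref, atol=1e-4)  # fails: drift exceeds atol
loose = np.allclose(out, ref, atol=5e-3)  # passes with a small margin

print(tight, loose)  # False True
```

Loosening atol/rtol trades a little strictness for stability against nondeterministic kernels, which is the trade-off the commit makes.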

* Remove GEMM fake-bf16 export test and patch to enable it
Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>

* Review
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

* fix
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

---------
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Signed-off-by: Gal Hubara Agam <ghubaraagam@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
parent d7704b98