"src/targets/gpu/vscode:/vscode.git/clone" did not exist on "ca8a54fe732e725f0e22ebc09187bd71faf131a5"
- 06 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Umang Yadav authored
The sigmoid approximation for GeLU was introduced in #1299 for fp16. The sigmoid approximation is known to give better performance but lower accuracy. See https://arxiv.org/pdf/1606.08415.pdf
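To make the accuracy trade-off concrete, here is a minimal Python sketch comparing the exact GeLU with the sigmoid approximation x * sigmoid(1.702 * x) described in the linked paper. The function names and the sampling range are illustrative assumptions, not code from this repository, and the sketch uses double precision rather than fp16.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation from the paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


def max_abs_error(lo: float = -6.0, hi: float = 6.0, steps: int = 1201) -> float:
    """Largest absolute gap between the two forms on an evenly spaced grid."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(gelu_exact(x) - gelu_sigmoid(x)))
    return worst


if __name__ == "__main__":
    print("max |exact - sigmoid approx| on [-6, 6]:", max_abs_error())
```

The approximation replaces the erf evaluation with a single exponential, which is where the performance gain would come from; the gap it reports is the accuracy cost.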
-
- 05 Jun, 2023 1 commit
-
-
Charlie Lin authored
Clarified the documentation for find_permutation(shape): it finds the permutation that would make the shape standard.
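As a rough illustration of what "the permutation that would make the shape standard" can mean, the Python sketch below assumes that a standard shape is a packed row-major layout (strides decreasing from the first axis to the last) and orders the axes by decreasing stride. The helper names are hypothetical and this is not the library's implementation.

```python
def standard_strides(lens):
    """Packed row-major strides for the given dimension lengths."""
    strides = [1] * len(lens)
    for i in range(len(lens) - 2, -1, -1):
        strides[i] = strides[i + 1] * lens[i + 1]
    return strides


def find_permutation_sketch(strides):
    """Hypothetical helper: axis order from largest stride to smallest."""
    return sorted(range(len(strides)), key=lambda axis: -strides[axis])


def reorder(values, perm):
    """Apply a permutation to a list of per-axis values."""
    return [values[p] for p in perm]


if __name__ == "__main__":
    # A {2, 3, 4} tensor viewed with its last two axes swapped.
    lens, strides = [2, 4, 3], [12, 1, 4]
    perm = find_permutation_sketch(strides)
    print(perm, reorder(lens, perm), reorder(strides, perm))
```

Applying the returned permutation to both the lengths and the strides yields a layout whose strides match the packed row-major strides of the permuted lengths.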
-
- 04 Jun, 2023 1 commit
-
-
Igor Mirosavljevic authored
-
- 02 Jun, 2023 1 commit
-
-
Chris Austen authored
-
- 01 Jun, 2023 1 commit
-
-
Umang Yadav authored
Converting to fp32: the fp16 3d-unet model accuracy comes out the same as the fp32 accuracy. Using the reduce_sum method on fp16: accuracy comes out ~0.9% lower than fp32, while keeping the entire model in fp16.
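The sketch below is a toy Python illustration of why the precision of an accumulator can matter for a reduction. It is not the kernel from this commit, and its numbers are unrelated to the ~0.9% model accuracy figure above. It emulates fp16 and fp32 rounding with the standard `struct` module; the function names and the input data are assumptions made for the example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_in_fp16(values):
    """Accumulate fp16 inputs with an fp16 accumulator."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v))
    return acc


def sum_in_fp32(values):
    """Accumulate fp16 inputs in fp32, then round the result back to fp16."""
    acc = 0.0
    for v in values:
        acc = to_fp32(acc + to_fp16(v))
    return to_fp16(acc)


if __name__ == "__main__":
    data = [0.1] * 4096
    print("fp16 accumulator:", sum_in_fp16(data))
    print("fp32 accumulator:", sum_in_fp32(data))
```

With an fp16 accumulator, each addend eventually becomes smaller than half the spacing between representable values near the running total, so the sum stops growing; the fp32 accumulator does not hit that limit for this input.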
-
- 31 May, 2023 2 commits
-
-
Paul Fultz II authored
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 30 May, 2023 2 commits
-
-
Paul Fultz II authored
* Use generate_argument instead of generate_literal for Python output, as generate_literal doesn't exist
* Shorten the names for variables from the main module
* Use prefix p_ for parameters
* Use shorter variable m for the main module in Python
-
Paul Fultz II authored
-
- 29 May, 2023 2 commits
-
-
Pavle Jacovic authored
-
Chris Austen authored
-
- 28 May, 2023 1 commit
-
-
Paul Fultz II authored
* Allow quantizing for both int8 and fp16
-
- 25 May, 2023 1 commit
-
-
Ted Themistokleous authored
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf
-
- 24 May, 2023 2 commits
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
kahmed10 authored
Refactor supported gfx archs
-
- 23 May, 2023 2 commits
-
-
Umang Yadav authored
back out changes for rocm-5.5
-
Djordje Petrovic authored
-
- 20 May, 2023 1 commit
-
-
Umang Yadav authored
* Use half HIP functions to compute max and min
* Add verify test for min and max
-
- 19 May, 2023 3 commits
-
-
Chris Austen authored
-
Zhuoran Yin authored
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
Chris Austen authored
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
-
- 18 May, 2023 1 commit
-
-
Umang Yadav authored
-
- 17 May, 2023 2 commits
-
-
Chris Austen authored
Move CI to support the rocm5.5 release
-
shivadbhavsar authored
Adds support for broadcasted scalars to the unsqueeze op. Specifying steps other than 1 is disallowed in this implementation, since we want the output to always be a tensor. We can support varying step sizes if we allow a broadcasted scalar output from this op.
-
- 11 May, 2023 1 commit
-
-
github-actions[bot] authored
Co-authored-by: causten <causten@users.noreply.github.com>
-
- 09 May, 2023 1 commit
-
-
Chris Austen authored
-
- 08 May, 2023 2 commits
-
-
Umang Yadav authored
-
Charlie Lin authored
Example of using the C++ API to run an ONNX model with dynamic batch
-
- 06 May, 2023 1 commit
-
-
Chris Austen authored
Remove various files not required for what we use GitHub runners for
-
- 05 May, 2023 3 commits
-
-
Charlie Lin authored
Python API with documentation updates
-
Manupa Karunaratne authored
Adds support for slice, transpose, contiguous, and reshape fusions into input tensors for a fused MLIR kernel.
-
kahmed10 authored
* Add option to print TF supported ops
* Sort both ONNX and TF ops alphabetically
-
- 04 May, 2023 2 commits
-
-
Paul Fultz II authored
When either the input or the output is multiplied across the K dimensions, the multiplier can instead be applied to the constant, which can then be folded with propagate_const.
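The algebra behind this rewrite can be sketched in plain Python for a matrix multiply with a constant weight matrix. The sketch assumes the multiplier is a per-column vector along the input's contraction axis, or a per-column vector on the output; the exact dimensions the commit handles are not stated here, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scale_columns(m, s):
    """Multiply column j of m by s[j]."""
    return [[v * s[j] for j, v in enumerate(row)] for row in m]


def scale_rows(m, s):
    """Multiply row i of m by s[i]."""
    return [[v * s[i] for v in row] for i, row in enumerate(m)]


if __name__ == "__main__":
    x = [[1, 2, 3], [4, 5, 6]]        # runtime input, shape M x K
    w = [[1, 0], [2, 1], [0, 3]]      # constant weights, shape K x N
    s_in = [2, 3, 5]                  # multiplier on the input columns
    s_out = [7, 11]                   # multiplier on the output columns

    # Multiplier on the input moves onto the rows of the constant.
    print(matmul(scale_columns(x, s_in), w) == matmul(x, scale_rows(w, s_in)))
    # Multiplier on the output moves onto the columns of the constant.
    print(matmul(x, scale_columns(w, s_out)) == scale_columns(matmul(x, w), s_out))
```

Once the multiplier sits on the constant, the scaled weights depend only on constants, so a constant-folding pass can precompute them and the runtime multiply disappears.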
-
Zhuoran Yin authored
* Exposed the mlir_enabled() call to decide the lowering pipeline's enablement
* Disabled the rewrite quantization pipeline in MLIR compilation
* Added quant convolution as anchor ops
* Fixed the return type expectations
* Added the fallback HIP implementation for quantizelinear and dequantizelinear
* Will need advice to improve the implementation for quantizelinear
-
- 03 May, 2023 1 commit
-
-
Charlie Lin authored
* Relies on "Removed split_single_dyn_dim compile flag" (#1711)
* Exposes dynamic_dimension as an opaque object with dynamic_dimensions and optimals
* Exposes ONNX dyn_input_dims and default_dyn_dim to run with dynamic batch
* Updates api.py to be able to create objects from aggregate initialization (used for dynamic_dimension)
* Uses offload copy for now
-
- 02 May, 2023 1 commit
-
-
Paul Fultz II authored
Improves constant propagation for BERT models. Larger batch sizes no longer use such large constants. Also improves the speed of model compilation.
-
- 01 May, 2023 2 commits
-
-
Pavle Jacovic authored
-
Chris Austen authored
-
- 28 Apr, 2023 1 commit
-
-
Charlie Lin authored
-