Commits · 7c4dc99a8aa7332ec4e73fc9c6cd75724eee6e40 · gaoqiong / MIGraphX

27 Mar, 2023 1 commit

[MLIR] add dot offloads with manual tuning support (#1631) · 7c4dc99a

Manupa Karunaratne authored Mar 27, 2023

* [MLIR] add dot offloads with manual tuning support
* This commit adds dot + pointwise fusion support
along with manual tuning using rocMLIR.

7c4dc99a

18 Mar, 2023 1 commit

Dynamically plug-in backend target libs (#1608) · 7a7040aa

Umang Yadav authored Mar 18, 2023

Fixes #1595

7a7040aa

31 Jan, 2023 1 commit

hipRTC fixes (#1531) · 91cc7242

Umang Yadav authored Jan 31, 2023

Added CMake flag for hipRTC: MIGRAPHX_USE_HIPRTC.
Added stages in Jenkins for hipRTC.
Fixes for some of the pending issues from hipRTC.

91cc7242

06 Dec, 2022 2 commits

Add tupleVisitor for from_gpu (#1465) · a4c2b889

Ted Themistokleous authored Dec 06, 2022

Need this when we debug and use MIGRAPHX_TRACE_EVAL() to show tuples.
Without this, reading our buffer breaks due to the use of visit().
This came up while debugging #1283.

a4c2b889

Update MLIR integration (#1451) · be70702d

jungpark-mlir authored Dec 06, 2022

Update dialect registration interface
Update 2nd build pipeline call and use full arch name

be70702d

27 Oct, 2022 1 commit

Upgrade CI environment to 5.3.0 (#1198) · 4b1c1c41

Chris Austen authored Oct 27, 2022

Upgraded Dockerfiles and fixed tidy issues to make Ubuntu 20.04 and ROCm 5.3.0 the default

4b1c1c41

18 Oct, 2022 1 commit

Add support in mlir for transposed and broadcasted shapes (#1378) · c3e02b18

Paul Fultz II authored Oct 18, 2022



* Enable non-standard shape
* Use perfdb for non xdlops
* Fix transpose+broadcast strides
Co-authored-by: jungpark-mlir <jungwook.park@amd.com>

c3e02b18

13 Oct, 2022 1 commit

Refactor dynamic padding mode (#1387) · 32f6388c

Charlie Lin authored Oct 13, 2022

Removes use_dynamic_same_auto_pad
Change padding_mode to be used for dynamic padding
Move compute_padded_shape to pad_calc.cpp as it will be used in other dynamic padding cases
Fix same_lower compute_padded_shape bug and add a test.
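
For orientation, the sketch below shows the `same_upper` / `same_lower` padding arithmetic as the ONNX operator specification defines it for `auto_pad`. It is not the code in `pad_calc.cpp`: the function name `same_padding` and its parameters are hypothetical and chosen only for illustration.

```python
def same_padding(in_len, kernel, stride=1, dilation=1, lower=False):
    """Return (pad_begin, pad_end) for one spatial axis so that the
    output length equals ceil(in_len / stride), per ONNX auto_pad rules.
    Hypothetical helper, not MIGraphX API."""
    out_len = -(-in_len // stride)              # integer ceil(in_len / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    total = max((out_len - 1) * stride + effective_kernel - in_len, 0)
    small, large = total // 2, total - total // 2
    # same_lower puts the odd extra element at the beginning,
    # same_upper puts it at the end.
    return (large, small) if lower else (small, large)
```

The two modes differ only when the total padding is odd, which is the case a `same_lower` test needs to exercise.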

32f6388c

04 Oct, 2022 1 commit

Stream sync Changeset (#1358) · f7d987ba

Ted Themistokleous authored Oct 04, 2022

Stream sync changes and associated API level changes

f7d987ba

29 Sep, 2022 1 commit

Use find_2.0 API for the convolution (#1346) · e19f78ae

Umang Yadav authored Sep 29, 2022

Improvements/additions to be made:

* Changes for quant_convolution
* Changes for deconvolution
* Macros for MIOpen status checks

e19f78ae

28 Sep, 2022 1 commit

Add compute_fp32 flag for quant_gemm tests (#1360) · 70e63960

Umang Yadav authored Sep 28, 2022

test_gpu_pack_int8_args fails on gfx908 machines because it doesn't set the compute_fp32 flag correctly. This PR fixes the test so that it checks the device name and the rocBLAS version and sets the flag accordingly.

70e63960

27 Sep, 2022 1 commit

Add onnx mod operator gpu cpu (#1306) · 40118191

Ted Themistokleous authored Sep 26, 2022

Implement operator for CPU and GPU implementations

40118191

23 Sep, 2022 1 commit

Remove unused device functions (#1394) · 8ea8473d

Paul Fultz II authored Sep 23, 2022

* Remove device functions
* Update tests

8ea8473d

16 Sep, 2022 1 commit

Fix typo for add_sigmoid (#1385) · 10f37f49

Umang Yadav authored Sep 16, 2022

* fix typo for add_sigmoid

10f37f49

15 Sep, 2022 1 commit

[mlir] Replaced `find_library` with `find_package` to locate MLIR static library (#1373) · e1e36cdc

Lixun Zhang authored Sep 15, 2022

* Replaced `find_library` with `find_package` to locate MLIR static library
* Unified the include dir for headers and remove backward compatibility
* Embedded the external/include dir into the exported library

e1e36cdc

04 Aug, 2022 1 commit

Dynamic ref convolution op (#1224) · 67f77ac1

Charlie Lin authored Aug 04, 2022



* Dynamic shape handling in shape object

* rewrite empty lens multibroadcast test

* Shape class changes to handle dynamic
* More throw errors for functions that don't make sense for dynamic shape
* Print output changes
* Serialization changes

* Fixing serialization errors

* Remove const on dyn_dim copy getters

* Dynamic shape tests

* Fix serialize errors

* Add dyn_data struct to avoid ambiguous constructor

* Tidy fix: emplace_back() over for loop

* Tidy fix: use move

* Use std::initializer_list in constructor
Reverts the dyn_data struct change
Should get around the ambiguous braced initialization list error

* avoid typedef

* element_space, min,max,opt _lens change

* formatting

* Comments fix

* dynamic bytes() test

* Serialize and reflect changes

* formatting

* Test the dynamic lens functions

* progress

* Formatting

* Dynamic conv draft progress

* Add operator<< tests for coverage

* Coverage update

* Add to conv dynamic batch test

* Dynamic image size test

* Dynamic weight handling

* Dyn image shape test change, fix dyn weight cond

* Comment update

* Dynamic weights shape test and fix

* Use ternary operator

* Tidy fixes

* Handle dynamic graph input shapes in ONNX parser

* Formatting

* Handle dynamic shape for convolution

* formatting

* cppcheck fixes

* Add onnx test files

* Fix typo

* Disable auto_pad for dynamic input shape

* check_shapes object checks for allowing dynamic shapes

* Fix any_of

* Change to maintain const objectness

* Formatting

* Check shapes allow dynamic

* Refactor compute_shape() call into op.compute()
Allows for per operator differences with handling dynamic shape
Fix operation.hpp change to use the generator

* Comment fix

* Refactor normalize_attributes() calls to use max_lens()

* Comment addition

* Update other normalize_attributes() calls

* Change to using constructor and add tests

* Use const member function

* Add more dynamic shape support

* Add tests for error code coverage

* Fix opt shape bug and add shape tests

* capture all by ref

* Fix typo with img shape calculation

* Add more tests

* dynamic auto pad attempt
Linker error with pad_calc.cpp

* Fix parse dyn auto_pad
Should only need to use dynamic auto pad when the image shape or kernel
shape are dynamic. For a dynamic batch size, the auto pad calculation is
the same.

* Fix linking error

* Fix auto_pad bug
Fixed input tensor with auto_pad setting on

* auto_pad onnx tests

* Fix auto_pad calculation, evaluate in ref_conv
add ref_ops tests

* Add shape tests, fix bugs

* Refactor first two output dynamic len calculation

* Conv MLIR test update

* i64 MLIR test fix

* Fix MLIR test typo
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

67f77ac1

29 Jul, 2022 1 commit

Avoid registering host buffer ptr multiple times during hip copies (#1245) · 7596f3f1

Umang Yadav authored Jul 29, 2022

Currently, when copying a host buffer to the device, the code first registers/maps the host buffer pointer into the address space of the device.

If the host buffer has been allocated by hipHostMalloc, it is implicitly registered in the device's address space, so there is no need to register it again. This PR adds a check for that case.

7596f3f1

03 Jul, 2022 1 commit

Add mlir fusion (#1251) · ca8a54fe

Paul Fultz II authored Jul 03, 2022

* Add mlir c api

* Formatting

* Create a type attribute

* Formatting

* Parse module

* Formatting

* Add mlir dump function

* Add test case

* Formatting

* Fix tidy issues

* Update mlir version

* Update to newer mlir

* Format

* Move mlir to the gpu and update the test

* Formatting

* Fix bug when appending module

* Format

* Remove old cmake flag

* Update message

* Add return

* Format

* Add mlir_compile

* Format

* Register dialect

* Handle unsigned integers

* Don't provide output for return instruction

* Format

* Add code to insert memrefs

* Format

* Add mlir verification

* Formatting

* Enable pointwise_fusion

* Disable eliminate_data_type

* Set kernel name

* Format

* Fix device name

* Formatting

* Fix output arg

* Format

* Updates

* Update hash

* Add fuse_mlir pass

* Format

* Add fuse mlir

* Format

* Update mlir

* Sort parameter names

* Format

* Reenable disabled passes

* Remove old mlir conv

* Remove asym default padding

* Add more verbose tracing

* Format

* Fix compilation errors

* Format

* Whitelist operators

* Format

* Add namespace

* Format

* Update triple

* Format

* Use func dialect

* Format

* Use func.return

* Format

* Upgrade mlir version

* Add comment

* Handle symmetrical padding

* Format

* Cleanup debug output

* Format

* List failed tests

* Move mlir compile to jit pipeline

* Format

* Update version

* Add source locations

* Format

* Correctly add module

* Format

* Update failed tests

* Fix failures when mlir is disabled

* Format

* Update mlir version

* Check type for fp32

* Format

* Remove failed test

* Update mlir in driver

* Tidy fixes

* Format

* Tidy fixes

* Format

* Fix const

* Remove from requirements

* Fix cmake version

* Fix tidy warning

* Use another ifdef

* Fix tidy

* Other tidy fix

* Format

* Update hash

* Add missing license files

* Format

* Format

* Fix function name

ca8a54fe

22 Jun, 2022 1 commit

Update license files (#1248) · e44cecbc

Ted Themistokleous authored Jun 22, 2022

Updated each source file in the repo with the existing license.

e44cecbc

17 Jun, 2022 1 commit

Create allocate op and replace_allocate pass (#1183) · add6fb3b

kahmed10 authored Jun 17, 2022



* add allocate op header

* formatting

* add replace_allocate pass

* formatting

* move output param to remove_allocate pass

* formatting

* fix bugs in replace_allocate pass

* formatting

* fix verify if tests

* formatting

* move if op logic

* formatting

* cleanup lowering

* cleanup lowering

* formatting

* fix tidy

* formatting

* fix tidy

* add cpu allocate check

* formatting

* change cpu allocate in pass

* formatting

* add some tests for replace_allocate pass

* formatting

* pass by ref

* fix run_pass

* formatting

* update variable name for module

* update dce to use contains() and fix tidy

* formatting

* update cppcheck

* add if test

* formatting

* add if test

* rename var to mod_output_names

* formatting

* remove conditional

* update allocate op and tests

* formatting

* update replace_allocate tests

* update create_output_names() and conditional in replace_allocate

* formatting

* remove extra variable in replace_allocate

* update tools script for allocation_model
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

add6fb3b

06 May, 2022 1 commit

Add compile tests for gpu math functions (#1182) · 6a5cda96

Paul Fultz II authored May 06, 2022

Add compile tests for gpu math functions

6a5cda96

29 Mar, 2022 1 commit

Refactor runtime compiled kernels to use the same compile_ops pipeline (#1125) · 661046c6

Paul Fultz II authored Mar 29, 2022

This adds the infrastructure so we can compile everything in parallel, whereas before only pointwise kernels were compiled in parallel. This will also directly integrate with lowering and the gpu-driver. The pointwise and roialign kernels use this infrastructure. Scatternd does not, since it requires a standard shape.

This also makes it easier to add new runtime compiled kernels in the future.
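
The general idea of pushing every runtime-compiled kernel through one parallel step can be sketched as follows. This is a toy model in Python, not the compile_ops pipeline itself: `compile_kernel`, `compile_all`, and the `.code_object` suffix are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_kernel(name):
    # Placeholder for an expensive per-kernel compile step (hypothetical).
    return f"{name}.code_object"


def compile_all(kernel_names, workers=4):
    # Submit every kernel to the same pool, regardless of kernel kind.
    # Executor.map() returns results in the order of the inputs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compile_kernel, kernel_names))
```

Because all kernels go through the same entry point, supporting a new kernel kind in this model only means supplying another compile function, not another scheduling path.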

661046c6

25 Feb, 2022 1 commit

Add get_queue to context to get the current stream (#1097) · e5242676

Paul Fultz II authored Feb 24, 2022

Wrapped in an any_ptr class so the type can be checked at runtime for a mismatch.

e5242676

09 Feb, 2022 1 commit

Enable pointwise fusion by default (#1082) · c7419a9c

Paul Fultz II authored Feb 09, 2022

There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION to disable it

c7419a9c

17 Nov, 2021 1 commit

Handle removing contiguous on operators that use modules (#1005) · 785307c3

Paul Fultz II authored Nov 17, 2021

Currently, eliminate_contiguous will never remove contiguous for operators that use module inputs due to the fact that it doesn't pass the module inputs to compute_shape.

- Update to pass the module inputs correctly to compute_shape
- Fix the overloads of compute_shape so that when passed an empty vector of module inputs it will call the overload without module inputs
- Add tests with contiguous and pointwise module function.
- Move add_pointwise function to a separate header to reuse across different tests
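
The overload fix in the second bullet amounts to a dispatch rule. It is sketched below in Python with stand-in classes; `PlainOp`, `ModuleOp`, and the free function `compute_shape` are hypothetical and do not mirror the actual C++ signatures.

```python
class PlainOp:
    """Stand-in operator that only has the inputs-only overload."""

    def compute_shape(self, inputs):
        return inputs[0]


class ModuleOp:
    """Stand-in operator whose output shape depends on its module inputs."""

    def compute_shape(self, inputs, mod_args):
        return mod_args[0]["output_shape"]


def compute_shape(op, inputs, mod_args=()):
    # An empty list of module inputs falls back to the inputs-only overload,
    # so callers can always pass the module inputs they have.
    if not mod_args:
        return op.compute_shape(inputs)
    return op.compute_shape(inputs, mod_args)
```

With this rule a pass can forward module inputs unconditionally, which is what lets it reason about operators that use modules.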

785307c3

08 Oct, 2021 1 commit

Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87

Umang Yadav authored Oct 08, 2021

Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.

The aim is to define the dot operator as C = A . B, without alpha or beta.

To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by alpha; (2) if beta is present, C is multiplied by beta and then added to the output of step 1.
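
The equivalence of the two formulations can be checked numerically. The sketch below uses plain Python lists as matrices and is not MIGraphX code: `matmul`, `scale`, `add`, `legacy_dot`, and `decomposed_dot` are hypothetical helpers, and treating "beta is present" as "a C operand was passed" is an assumption.

```python
def matmul(a, b):
    """Matrix product A . B for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scale(s, m):
    return [[s * x for x in row] for row in m]


def add(m, n):
    return [[x + y for x, y in zip(r, q)] for r, q in zip(m, n)]


def legacy_dot(a, b, c, alpha, beta):
    # Old definition: C = alpha * (A . B) + beta * C
    return add(scale(alpha, matmul(a, b)), scale(beta, c))


def decomposed_dot(a, b, alpha=1, c=None, beta=0):
    # Step 1: fold alpha into one input, then take the plain dot.
    out = matmul(scale(alpha, a), b)
    # Step 2: if a C operand is given, scale it by beta and add it.
    if c is not None:
        out = add(out, scale(beta, c))
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
assert legacy_dot(a, b, c, 2, 3) == decomposed_dot(a, b, 2, c, 3)
```

Integer inputs keep the comparison exact; with floating-point data the two orderings can differ by rounding.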

21193e87

07 Sep, 2021 1 commit

qdq for quantization and include subgraph (#891) · b45f7239

Shucai Xiao authored Sep 07, 2021



Add operators, refactor parsers, add rewrite passes, add tests
Add ref implementations
Move broadcasting of scales and zero points to onnx parser
Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
fp16 and fp8 quantization to include subgraph and parameters
fix unit test to use qdq operators for int8 quantization
Co-authored-by: turneram <alturner@amd.com>

b45f7239

31 Aug, 2021 1 commit

Fix debug assert (#930) · bd85a76c

Shucai Xiao authored Aug 31, 2021

* fix two asserts for debug build

* add unit test for copy parameters

* clang format

* add a unit test for reorder_dims

* change transpose to always require perm not be empty

* clang format

* remove an unnecessary line

* fix tidy error

* fix review comments

bd85a76c

24 Aug, 2021 1 commit

Change attributes names to be more consistent and reflect better meaning (#916) · 0d2606bb

Umang Yadav authored Aug 24, 2021

* rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same

* change the reshape attribute from dims to out_lens

* change transpose attribute's name from dims to perm to reflect better meaning

* use permutation instead of perm for transpose

clang formatting

* use dims instead of out_lens for reshape

clang formatting
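
Read in order, the bullets above leave reshape with `dims` (renamed to `out_lens`, then changed back) and transpose with `permutation`. The sketch below records that net effect as a lookup table; `RENAMES` and `migrate_attributes` are hypothetical names for illustration and are not part of MIGraphX.

```python
# Net effect of the renames listed in this commit, keyed by operator name.
RENAMES = {
    "broadcast": {"output_lens": "out_lens"},
    "multibroadcast": {"output_lens": "out_lens"},
    "transpose": {"dims": "permutation"},
    # reshape keeps "dims": it was renamed to out_lens and then changed back.
}


def migrate_attributes(op_name, attrs):
    """Return attrs with old attribute names replaced by the new ones."""
    mapping = RENAMES.get(op_name, {})
    return {mapping.get(key, key): value for key, value in attrs.items()}
```

For example, `migrate_attributes("transpose", {"dims": [1, 0]})` yields a dictionary keyed by `permutation`, while a reshape's attributes pass through unchanged.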

0d2606bb

19 Aug, 2021 1 commit

Enable warnings when jit compiling (#913) · ccff6beb

Paul Fultz II authored Aug 19, 2021

* Enable warnings when jit compiling

* Formatting

ccff6beb

10 Aug, 2021 1 commit

Add option to compile with hiprtc (#892) · 91c9ebbc

Paul Fultz II authored Aug 10, 2021

* Add hiprtc compile option
* Add cross compile test
* Update error reporting
* Add tests for errors and warnings
* Fix tidy warning
* Add comment to ifdefs
* Skip null character at end of log
* Assert there is null at the end

91c9ebbc

05 Aug, 2021 1 commit

Add gpu driver and improvements to pointwise codegen (#851) · 29fa2666

Paul Fultz II authored Aug 05, 2021



* Add method to compile pointwise

* Formatting

* Add lambda

* Add semicolon

* Rename variable

* Add driver to run jit kernels

* Formatting

* Add context

* Formatting

* Make separate driver folder

* Add more general gpu driver

* Formatting

* Print out wall time

* Formatting

* Run multiple times and skip first run

* Formatting

* Separate time_op

* Run an op for comparison

* Formatting

* Add debug asserts

* Formatting

* Change parameter name

* Formatting

* Fix argument order

* Formatting

* Add preloading

* Formatting

* Allow a different data type

* Formatting

* Pipeline transformations

* Formatting

* Add vectorization

* Formatting

* Reduce dims

* Formatting

* Compile with launch params as constant

* Formatting

* Make sure buffer can be vectorized

* Formatting

* Enable vectorization and preloading

* Formatting

* Add print header

* Formatting

* Avoid allocating too large an LDS

* Formatting

* Add some vec functions to a separate header

* Formatting

* Add stride loops

* Formatting

* Improve the transform pipeline

* Formatting

* Add const

* Fix shape check

* Formatting

* Just check stride axis is zero

* Remove extra find_vector_axis overload

* Simplify some more functions

* Formatting

* Remove some more extra functions

* Formatting

* Simplify more decltypes

* Add another const

* Fix test

* Get buffer pointer different for older compilers
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

29fa2666

14 Jul, 2021 1 commit

Use the same device name function in the unit tests (#881) · 0b04fc80

Paul Fultz II authored Jul 14, 2021



* Unify device_name function

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0b04fc80

15 Jun, 2021 1 commit

Int8 gemm support (#811) · 39bc6161

Shucai Xiao authored Jun 15, 2021



* add a flag to indicate int8x4 input format

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* remove log info

* remove unnecessary changes

* fix cppcheck error

* add unit tests to have more code coverage

* clang format

* add debug info

* remove log info

* fix cppcheck error

* clang format

* clang format

* add one more unit tests for more scenarios

* fix cppcheck error

* clang format

* fix review comments

* clang format

* rename p to m

* fix review comments

* refine unit tests

* clang format

* refine unit tests and fixed a bug

* clang format

* fix build error related to rocm4.2

* fix a bug related to alpha and beta

* refine two unit tests related to int8_gemm

* fix cppcheck error

* refine unit test to pass on mi100

* add unit test for packing int8 args

* clang format

* change unit tests back

* disable some unit tests for gpu

* clang format

* refine unit tests to run on mi100

* clang format

* refine unit tests

* refine unit tests

* clang format

* change back a unit test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

39bc6161

03 May, 2021 1 commit

Reduce types generated for hip kernels (#814) · 3becd974

Paul Fultz II authored May 03, 2021



* Remove unused data types

* Formatting

* Reduce types generated for hip kernels

* Formatting

* Fix onnx tests

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3becd974

22 Apr, 2021 1 commit

Cpu fusions using post_ops (#781) · f7befe50

Paul Fultz II authored Apr 22, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Add post op field

* Formatting

* Make post_ops serializable

* Formatting

* Add eltwise fusions

* Formatting

* Fix null conversions

* Formatting

* Add fuse_ops source files

* Formatting

* Set binary post op index correctly

* Formatting

* Fix serialization bugs

* Check if used once

* Formatting

* Fix error in get_primitive_attr

* Formatting

* Add compile function

* Formatting

* Limit fusions

* Formatting

* Disable with env variable instead of using compile arg

* Formatting

* Fix implicit conversion to bool

* Declare on separate lines

* Formatting

* Fix cppcheck issues

* Fix ICE in pack_join

* Formatting

* Use const ref

* Make enum hashable

* Formatting

* Add explicit this

* Fix merge issues

* Fix dangling ref

* Formatting

* Add test for compile

* Formatting

* Add more value tests

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f7befe50

19 Apr, 2021 1 commit

Add code generation for pointwise operators (#780) · 35d1bcc2

Paul Fultz II authored Apr 19, 2021

* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fix gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* Remove .o files

35d1bcc2

26 Mar, 2021 1 commit

Add initial code generation (#762) · 581d31b0

Paul Fultz II authored Mar 26, 2021



* Add code object op

* Formatting

* Add more value tests

* Formatting

* Fix from_value conversion from binary

* Formatting

* Don't use offload copy

* Remove iostream header

* Fix compilation errors

* Formatting

* Rename var

* Add missing files

* Formatting

* Remove duplicate variable

* Remove comment

* Template the function so sfinae will work

* Formatting

* Use template specialization since ADL is broken on hcc

* Formatting

* Annotate the constructor with HD for hcc

* Make variable const
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

581d31b0

26 Feb, 2021 1 commit

Add more supported operators and optimizations for the cpu backend (#746) · a0b570b2

Paul Fultz II authored Feb 26, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a0b570b2

25 Feb, 2021 1 commit

Add code object custom op (#744) · 7220dd18

Paul Fultz II authored Feb 24, 2021



* Add code object op

* Formatting

* Add more value tests

* Formatting

* Fix from_value conversion from binary

* Formatting

* Don't use offload copy

* Remove iostream header
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

7220dd18