Commits · 87fb7ed05be1af47688162f7f77fc72b2f3f6390 · OpenDAS / dgl

15 Mar, 2023 1 commit

[Config] Enable libxsmm by default for AVX cpu (#5165) · 87fb7ed0

Daniil Sizov authored Mar 15, 2023

* Enable AVX by default

* Fix linting errors

* Fix win64 build (libxsmm not linked)

Libxsmm is not linked on Win64, so it should be disabled by default there

* Fix clang format issues

* Change lower supported cpu version to LIBXSMM_X86_AVX2

Change lower supported cpu version to LIBXSMM_X86_AVX2 to address issue https://github.com/dmlc/dgl/issues/3459

* Fix unit test

Remove the assumption that libxsmm is enabled in the config by default (only true for Intel CPUs with AVX2 instructions)

---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-15-137.us-west-2.compute.internal>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

87fb7ed0

01 Mar, 2023 1 commit
- removed pragma omp for (#5334) · 308bd6f5
  Kacper Pietkun authored Mar 01, 2023
  
  308bd6f5
23 Feb, 2023 1 commit
- [Bugfix] fixed leak in SpMMCreateBlocks (#5210) · 99937422
  Kacper Pietkun authored Feb 23, 2023
```
* fixed leak in SpMMCreateBlocks

* clang format
```
  99937422
21 Feb, 2023 1 commit
- [Enhancement] Change id hash map (#5304) · ed2e5409
  peizhou001 authored Feb 21, 2023
```
* change concurrent id hash map
```
  ed2e5409
16 Feb, 2023 1 commit
- [Misc] Fix build warnings (#5303) · 1329be96
  Songqing Zhang authored Feb 16, 2023
```
Co-authored-by: songqing.zhang <songqing.zhang@shopee.com>
```
  1329be96
09 Feb, 2023 1 commit
- [Performance]Add concurrent cpu id hashmap (#5241) · f0b7cc96
  peizhou001 authored Feb 09, 2023
```
Add Id hash map
```
  f0b7cc96
06 Jan, 2023 1 commit
- [Performance] Fix for number of threads in COOToCSR (#5017) · 6069f34c
  Andrzej Kotłowski authored Jan 06, 2023
```
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
```
  6069f34c
01 Dec, 2022 1 commit

[Feature] replace dgl PRNG with pcg32 (#4807) · b1e2695f

Muhammed Fatih BALIN authored Nov 30, 2022



* replace dgl PRNG with pcg32

* remove pcg submodule, add a simple implementation

* replace pcg32 with std::mt19937_64

* fix include order

* change RandomEngine to pcg32

* Remove custom pcg32 implementation, use the submodule provided by the original author.

* minor bug

* move include for linting

* include pcg for tests too

Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>

b1e2695f

22 Nov, 2022 1 commit

[Feature] (La)yer-Neigh(bor) sampling implementation (#4668) · bf264d00

Muhammed Fatih BALIN authored Nov 21, 2022



* adding LABOR sampling

* add ladies and pladies samplers

* fix compile error after rebase

* add reference for ladies sampler

* Improve ladies implementation.

* weighted labor sampling initial implementation draft
fix indentation and small bug in ladies script

* importance_sampling currently doesn't work with weights

* fix weighted importance sampling

* move labor example into its own folder

* lint fixes

* Improve documentation

* remove examples from the main PR

* fix linting by not using c++17 features

* fix documentation of labor_sampler.py

* update documentation for labor.py

* reformat the labor.py file with black

* fix linting errors

* replace exception use with if

* fix typo in error comment

* fixing win64 build for ci

* fixing weighted implementation, works now.

* fix bug in the weighted case and importance_sampling==0

* address part of the reviews

* remove unused code paths from cuda

* remove unused code path from cpu side

* remove extra features of labor making use of random seed.

* fix exclude_edges bug

* remove pcg and seed logic from cpu implementation, seed logic should still work for cuda.

* minor style change

* refactor CPU implementation, take out the importance_sampling probability computation into a function.

* improve CUDAWorkspaceAllocator

* refactor importance_sampling part out to a function

* minor optimization

* fix linting issue

* Revert "remove pcg and seed logic from cpu implementation, seed logic should still work for cuda."

This reverts commit c250e07ac6d7e13f57e79e8a2c2f098d777378c2.

* Revert "remove extra features of labor making use of random seed."

This reverts commit 7f99034353080308f4783f27d9a08bea343fb796.

* fix the documentation

* disable NIDs

* improve the documentation in the code

* use the stream argument in pcg32 instead of skipping ahead t times; the hashmap can be discarded now since this way is faster.

* fix linting issue

* address another round of reviews

* further optimize CPU LABOR sampling implementation

* fix linting error

* update the comment

* reformat

* rename and rephrase comment

* fix formatting according to new linting specs

* fix compile error due to renaming, fix linting.

* lint

* rename DGLHeteroGraph to DGLGraph to match master

* replace other occurrences of DGLHeteroGraph with DGLGraph

Co-authored-by: Muhammed Fatih BALIN <m.f.balin@gmail.com>
Co-authored-by: Kaan Sancak <kaansnck@gmail.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

bf264d00
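
The LABOR commit above notes that it uses the stream argument of pcg32 instead of skipping ahead t times. The following is a minimal Python sketch of that idea, not DGL's code: `Pcg32`, `draw_by_skipping`, and `draw_by_stream` are hypothetical names, and the generator follows the publicly documented PCG32 (XSH-RR) recurrence. Skipping costs O(t) work per item, while selecting stream t costs a constant-time seeding step. The two approaches give different values for the same t; what matters is that each is reproducible per (seed, t).

```python
MASK64 = (1 << 64) - 1
MULT = 6364136223846793005  # LCG multiplier used by the reference PCG32


class Pcg32:
    """Minimal PCG32 (XSH-RR) generator with a selectable stream (illustrative only)."""

    def __init__(self, seed: int, stream: int = 0) -> None:
        # The increment must be odd; each stream id selects a different sequence.
        self.inc = ((stream << 1) | 1) & MASK64
        self.state = 0
        self._step()
        self.state = (self.state + seed) & MASK64
        self._step()

    def _step(self) -> None:
        # Advance the underlying 64-bit linear congruential generator.
        self.state = (self.state * MULT + self.inc) & MASK64

    def next_u32(self) -> int:
        old = self.state
        self._step()
        # Output permutation: xorshift the high bits, then rotate right by the top 5 bits.
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF


def draw_by_skipping(seed: int, t: int) -> int:
    """Reach item t's random value by discarding t outputs of one sequence: O(t) work."""
    rng = Pcg32(seed)
    for _ in range(t):
        rng.next_u32()
    return rng.next_u32()


def draw_by_stream(seed: int, t: int) -> int:
    """Use t as the stream id instead: constant-time setup, no discarded outputs."""
    return Pcg32(seed, stream=t).next_u32()
```

With the stream approach, the value for item t depends only on the seed and t, so no per-item bookkeeping is needed to make repeated lookups agree.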

15 Nov, 2022 4 commits
- Revert "[Kernel] Parallel find edges (#4878)" (#4899) · ca144886
  Quan (Andy) Gan authored Nov 15, 2022
```
This reverts commit 00c27cb2.
```
  ca144886
- Revert "[Performance] Make IdHashMap parallel (#4881)" (#4898) · 5b193f9b
  Quan (Andy) Gan authored Nov 15, 2022
```
This reverts commit 56962858.
```
  5b193f9b
- [Performance] Make IdHashMap parallel (#4881) · 56962858
  Quan (Andy) Gan authored Nov 15, 2022
```
* make IdHashMap parallel

* fix

* Update array_utils.h
```
  56962858
- [Kernel] Parallel find edges (#4878) · 00c27cb2
  Quan (Andy) Gan authored Nov 15, 2022
```
* use runtime parallel_for

* grain size

* Update array_index_select.cc
```
  00c27cb2
08 Nov, 2022 1 commit

[Misc] Add // NOLINT for the very long code. (#4834) · 0d687968

Hongzhi (Steve), Chen authored Nov 08, 2022



* alternative

* fix

* remove_todo

* blabl

* ablabl

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

0d687968

07 Nov, 2022 3 commits

[Misc] Minor code style fix. (#4825) · df089424

Hongzhi (Steve), Chen authored Nov 07, 2022



* blabla

* more

* blabla

* blabla

* ablabla

* blabla

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

df089424

[Misc] clang-format auto fix. (#4824) · 8ac27dad

Hongzhi (Steve), Chen authored Nov 07, 2022



* [Misc] clang-format auto fix.

* blabla

* ablabla

* blabla

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

8ac27dad

[Misc] Replace /*! with /**. (#4823) · bcd37684

Hongzhi (Steve), Chen authored Nov 07, 2022



* replace

* blabla

* balbla

* blabla

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

bcd37684

06 Nov, 2022 2 commits

[Misc] Replace \xxx with @XXX in structured comment. (#4822) · 619d735d

Hongzhi (Steve), Chen authored Nov 07, 2022



* param

* brief

* note

* return

* tparam

* brief2

* file

* return2

* return

* blabla

* all

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

619d735d

[Feature] Add bfloat16 (bf16) support (#4648) · 96297fb8

Xin Yao authored Nov 06, 2022

* add bf16 specializations

* remove SWITCH_BITS

* enable amp for bf16

* remove SWITCH_BITS for cpu kernels

* enable bf16 based on CUDART

* fix compiling for sm<80

* fix cpu build

* enable unit tests

* update doc

* disable test for CUDA < 11.0

* address comments

* address comments

96297fb8

03 Nov, 2022 1 commit

[Misc] clang-format auto fix. (#4804) · 8ae50c42

Hongzhi (Steve), Chen authored Nov 03, 2022



* [Misc] clang-format auto fix.

* manual

* manual

* manual

* manual

* todo

* fix

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

8ae50c42

02 Nov, 2022 1 commit

[Misc] clang-format auto fix. (#4803) · b2d38ca8

Hongzhi (Steve), Chen authored Nov 02, 2022



* [Misc] clang-format auto fix.

* manual

Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

b2d38ca8

29 Oct, 2022 1 commit

[Sampling] Enable sampling with edge masks in sample_etype_neighbors (#4749) · 2bca4759

Quan (Andy) Gan authored Oct 29, 2022

* sample neighbors with masks

* oops

* refactor again

* remove

* remove debug code

* rename macro

* address comments

* more stuff

* remove

* fix

* try fix unit test

* oops

* fix test

* oops

* change name

* rename a lot of stuff

* oops

* ugh

* misc fixes

* lint

* address a lot of comments

* lint

* lint

* fix

* that was silly

* fix

* fix

* fix

* oops

2bca4759

28 Oct, 2022 1 commit

[Sampling] Enable sampling with edge masks on homogeneous graph (#4748) · 72781efb

Quan (Andy) Gan authored Oct 28, 2022

* sample neighbors with masks

* oops

* refactor again

* remove

* remove debug code

* rename macro

* address comments

* address comment

* address comments

* rename a lot of stuff

* oops

72781efb

13 Oct, 2022 1 commit
- [Sampling] handle fanout=-1 differently from fanout>0 in sample_etype_neighbors() (#4716) · a5d21c2b
  Rhett Ying authored Oct 13, 2022
  
  a5d21c2b
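
The commit above separates the `fanout=-1` case from `fanout>0`. A minimal pure-Python sketch of that distinction follows, assuming the documented DGL convention that a fanout of -1 selects all neighbors. The helper `pick_neighbors` is hypothetical and is not the C++ routine the commit changed.

```python
import random
from typing import List, Sequence


def pick_neighbors(neighbors: Sequence[int], fanout: int, rng: random.Random) -> List[int]:
    """Pick neighbors of one node without replacement (illustrative only)."""
    if fanout == -1:
        # Special case: keep every neighbor, no random draw needed.
        return list(neighbors)
    if fanout < 0:
        raise ValueError("fanout must be -1 or a non-negative integer")
    if len(neighbors) <= fanout:
        # Fewer candidates than requested: nothing to sample.
        return list(neighbors)
    return rng.sample(list(neighbors), fanout)
```

Treating -1 as its own branch avoids running the sampling path when the whole neighborhood is wanted.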
21 Sep, 2022 1 commit
- [Fix] Enable lint check for cuh files and fix compiler warnings (#4585) · 880b3b1f
  Xin Yao authored Sep 21, 2022
```
* disable warning for tensorpipe

* fix warning

* enable lint check for cuh files

* resolve comments
```
  880b3b1f
19 Sep, 2022 1 commit

[Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454) · cded5b80

Xin Yao authored Sep 19, 2022

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments

cded5b80

05 Sep, 2022 1 commit

[Bug] Enable turn on/off libxsmm at runtime (#4455) · 62af41c2

peizhou001 authored Sep 05, 2022



* enable turning libxsmm on/off at runtime by adding a global config and a related API

Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

62af41c2

01 Jul, 2022 2 commits
- [BugFix] check whether etype sorted when sampling (#4198) · dcf16992
  Rhett Ying authored Jul 01, 2022
  
  dcf16992
- [Feature] extend sort_csr/csc_by_tag to edge (#4164) · 6a6597a0
  Rhett Ying authored Jul 01, 2022
```
* [Feature] extend sort_csr/csc_by_tag to edge

* fix test failure in tensorflow

* refine sorting by edges

* fix docstring

* remove unnecessary mem
Co-authored-by: Xin Yao <xiny@nvidia.com>
```
  6a6597a0
23 Jun, 2022 1 commit

[Fix] Fix compiler warnings - part 1 (#4051) · 1ad65879

Triston authored Jun 22, 2022



* Fix a cub compile error for CUDA 11.5

* Fix comparison of integer expressions of different signedness in coo_sort.cu file

* Fix comparison of integer expressions of different signedness in cuda_compact_graph.cu file

* Remove never referenced variable in spmm.cu

* Fix comparison of integer expressions of different signedness in rowwise_pick.h file

* Fix comparison of integer expressions of different signedness in choice.cc file

* Remove never referenced variable col_data in spat_op_impl_coo.cc

* Remove never referenced variable allowed in global_uniform.cc

* Fix comparison of integer expressions of different signedness in graph.cc

* Fix comparison of integer expressions of different signedness in graph_apis.cc

* Fix the un-used ctx variable in ndarray_partition.cc file for cpu only build

* Fix comparison of integer expressions of different signedness in libra_partition.cc

* Fix comparison of integer expressions of different signedness in graph_op.cc

Co-authored-by: Triston Cao <tristonc@nvidia.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

1ad65879

06 Jun, 2022 1 commit
- parallelize csr2coo (#4081) · 31a81438
  Quan (Andy) Gan authored Jun 06, 2022
```
Co-authored-by: Xin Yao <xiny@nvidia.com>
```
  31a81438
28 May, 2022 1 commit
- add sanity check (#4050) · c577dc9f
  Quan (Andy) Gan authored May 28, 2022
  
  c577dc9f
26 Apr, 2022 1 commit

[Performance][GPU] Improving Disjoint Union kernel for Graph Dataloaders (#3895) · 6e46bbf5

ayasar70 authored Apr 26, 2022



* Based on issue #3436. Improving _SegmentCopyKernel's GPU utilization by switching to nonzero-based thread assignment

* fixing lint issues

* Update cub for cuda 11.5 compatibility (#3468)

* fixing type mismatch

* tx guaranteed to be smaller than nnz. Hence removing last check

* minor: updating comment

* adding three unit tests for csr slice method to cover some corner cases

* timing repeatkernel

* clean

* clean

* clean

* updating _SegmentMaskColKernel

* Working on requests: removing sorted array check and adding comments to utility functions

* fixing lint issue

* Optimizing disjoint union kernel

* Trying to resolve compilation issue on CI

* [EMPTY] Relevant commit message here

* applying revision requests on cpu/disjoint_union.cc

* removing unnecessary casts

* remove extra space

Co-authored-by: Abdurrahman Yasar <ayasar@nvidia.com>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

6e46bbf5
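
The commit above describes moving a segment-copy kernel to nonzero-based thread assignment. The sketch below illustrates that general technique in plain Python and is not the CUDA kernel from the commit: `segment_copy_by_nonzero` is a hypothetical name, and the loop over `tx` stands in for one GPU thread per output element. Each unit of work finds the row that owns its output slot with a binary search, so the load stays balanced even when row lengths vary widely.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def segment_copy_by_nonzero(
    indptr: Sequence[int], data: Sequence[int], rows: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Gather the CSR segments of `rows` from `data`, one unit of work per output element."""
    # out_indptr[i] is the output offset where the segment of rows[i] starts.
    out_indptr = [0]
    for r in rows:
        out_indptr.append(out_indptr[-1] + indptr[r + 1] - indptr[r])

    out = [0] * out_indptr[-1]
    # On a GPU, each iteration of this loop would be handled by one thread.
    for tx in range(len(out)):
        # Binary search for the selected row owning output slot tx (skips empty rows).
        i = bisect_right(out_indptr, tx) - 1
        src = indptr[rows[i]] + (tx - out_indptr[i])
        out[tx] = data[src]
    return out_indptr, out
```

A row-based assignment would instead give each worker a whole row, which leaves most workers idle when a few rows hold most of the nonzeros.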

23 Feb, 2022 2 commits

Fixes the bug when total_nnz is > integer limit (#3766) · e7ad4c9c

sanchit-misra authored Feb 24, 2022

e7ad4c9c

[NN] Rework RelGraphConv and HGTConv (#3742) · 0227ddfb

Minjie Wang authored Feb 23, 2022

* WIP: TypedLinear and new RelGraphConv

* wip

* further simplify RGCN

* a bunch of tweaks for performance; add basic cpu support

* update on segmm

* wip: segment.cu

* new backward kernel works

* fix a bunch of bugs in kernel; leave idx_a for future

* add nn test for typed_linear

* rgcn nn test

* bugfix in corner case; update RGCN README

* doc

* fix cpp lint

* fix lint

* fix ut

* wip: hgtconv; presorted flag for rgcn

* hgt code and ut; WIP: some fix on reorder graph

* better typed linear init

* fix ut

* fix lint; add docstring

0227ddfb

15 Feb, 2022 1 commit

[Feature] Gather mm (#3641) · b3d3a2c4

Israt Nisa authored Feb 14, 2022



* init

* init

* working cublasGemm

* benchmark high-mem/low-mem, err gather_mm output

* cuda kernel for bmm like kernel

* removed cpu copy for E_per_Rel

* benchmark code from Minjie

* fixed cublas results in gathermm sorted

* use GPU shared mem in unsorted gather mm

* minor

* Added an optimal version of gather_mm_unsorted

* lint

* init gather_mm_scatter

* cublas transpose added

* fixed h_offset for multiple rel

* backward unittest

* cublas support to transpose W

* adding missed file

* forgot to add header file

* lint

* lint

* cleanup

* lint

* docstring

* lint

* added unittest

* lint

* lint

* unittest

* changed err type

* skip cpu test

* skip CPU code

* move in-len loop inside

* lint

* added check different dim length for B

* w_per_len is optional now

* moved gather_mm to pytorch/backend with backward support

* removed a_/b_trans support

* transpose op inside GEMM call

* removed out alloc from API, changed W 2D to 3D

* Added se_gather_mm, Separate API for sortedE

* Fixed gather_mm (unsorted) user interface

* unsorted gmm backward + separate CAPI for un/sorted A

* typecast to float to support atomicAdd

* lint typecast

* lint

* added gather_mm_scatter

* minor

* const

* design changes

* Added idx_a, idx_b support gmm_scatter

* dgl doc

* lint

* adding gather_mm in ops

* lint

* lint

* minor

* removed benchmark files

* minor

* empty commit

Co-authored-by: Israt Nisa <nisisrat@amazon.com>

b3d3a2c4

11 Feb, 2022 1 commit

New fused edge_softmax op (#3650) · bc8f8b0b

ranzhejiang authored Feb 11, 2022



* [feature] edge softmax refact.

* delete file

* fix backward and cmake version

* fix backward

* format function

* fix setting

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* refix

* add cuda kernel for backward and rename some function

* add benchmark for edge_softmax

* fix format

* remove cuda_backwrd

* fix code format and add comment for op on CPU

* fix lint

Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>

bc8f8b0b

17 Jan, 2022 1 commit
- [Bugfix] Fixes the redundancy parameter being used wrong in global negative sampling (#3657) · 77f4287a
  Quan (Andy) Gan authored Jan 17, 2022
```
* oops

* test
```
  77f4287a
11 Jan, 2022 1 commit

Pass the std::min argument's type, to avoid the compilation error. (#3637) · b002f8f9

MaoYuan Xian authored Jan 11, 2022



* Pass the std::min argument's type, to avoid the compilation error.

* Update parallel_for.h

* Update negative_sampling.cc

Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>

b002f8f9

07 Jan, 2022 1 commit

[Feature] Negative sampling (#3599) · 90f10b31

Quan (Andy) Gan authored Jan 07, 2022

* first commit

* a bunch of fixes

* add unique

* lint

* lint

* lint

* address comments

* Update negative_sampler.py

* fix

* description

* address comments and fix

* fix

* replace unique with replace

* test pylint

* Update negative_sampler.py

90f10b31