"src/libtorchaudio/sox/pybind/pybind.cpp" did not exist on "463a8b2c83653ce01698f542e2ff07f4947dce7e"
  1. 06 Sep, 2022 1 commit
    • [Feature] Unify the cuda stream used in core library (#4480) · 1c9d2a03
      Chang Liu authored
      
      
      * Use an internal cuda stream for CopyDataFromTo
      
      * small fix white space
      
      * Fix to compile
      
      * Make stream optional in copydata for compile
      
      * fix lint issue
      
      * Update cub functions to use internal stream
      
      * Lint check
      
      * Update CopyTo/CopyFrom/CopyFromTo to use internal stream
      
      * Address comments
      
      * Fix backward CUDA stream
      
      * Avoid overloading CopyFromTo()
      
      * Minor comment update
      
      * Overload copydatafromto in cuda device api
      Co-authored-by: xiny <xiny@nvidia.com>
  2. 27 Jun, 2022 1 commit
    • [Bug][Feature] Added more missing FP16 specializations (#4140) · a5d8460c
      ndickson-nvidia authored
      * Added missing specializations for `__half` of `DLDataTypeTraits`, `IndexSelect`, `Full`, `Scatter_`, `CSRGetData`, `CSRMM`, `CSRSum`, `IndexSelectCPUFromGPU`
      * Fixed casting issue in `_LinearSearchKernel` that was preventing it from supporting `__half`
      * Added `#if`'d out specializations of `CSRGEMM`, `CSRGEAM`, and `Xgeam`, which would require functions that aren't currently provided by cublas
      
      * Added more specific error messages for unimplemented FP16 specializations of `Xgeam`, `CSRGEMM`, and `CSRGEAM`
      
      * Added missing instantiation of `DLDataTypeTraits<__half>::dtype`
      
      * Fixed linter error
      * Added clearer comment explaining why the cast to `long long` is necessary
      
      * Worked around a compile error in some particular setup, where `__half` can't be constructed on the host side
      
      * Fixed linter formatting errors
      
      * Changes to comments as recommended
      
      * Made recommended changes to logging errors in FP16 specializations
      * Also changed the existing `Xgeam` function for unsupported data types from `LOG(INFO)` to `LOG(FATAL)`
  3. 21 Feb, 2022 1 commit
    • [Bugfix] Bug fixes in new dataloader (#3727) · 3f138eba
      Quan (Andy) Gan authored
      
      
      * fixes
      
      * fix
      
      * more fixes
      
      * update
      
      * oops
      
      * lint?
      
      * temporarily revert - will fix in another PR
      
      * more fixes
      
      * skipping mxnet test
      
      * address comments
      
      * fix DDP
      
      * fix edge dataloader exclusion problems
      
      * stupid bug
      
      * fix
      
      * use_uvm option
      
      * fix
      
      * fixes
      
      * fixes
      
      * fixes
      
      * fixes
      
      * add evaluation for cluster gcn and ddp
      
      * stupid bug again
      
      * fixes
      
      * move sanity checks to only support DGLGraphs
      
      * pytorch lightning compatibility fixes
      
      * remove
      
      * poke
      
      * more fixes
      
      * fix
      
      * fix
      
      * disable test
      
      * docstrings
      
      * why is it getting a memory leak?
      
      * fix
      
      * update
      
      * updates and temporarily disable forkingpickler
      
      * update
      
      * fix?
      
      * fix?
      
      * oops
      
      * oops
      
      * fix
      
      * lint
      
      * huh
      
      * uh
      
      * update
      
      * fix
      
      * made it memory efficient
      
      * refine exclude interface
      
      * fix tutorial
      
      * fix tutorial
      
      * fix graph duplication in CPU dataloader workers
      
      * lint
      
      * lint
      
      * Revert "lint"
      
      This reverts commit 805484dd553695111b5fb37f2125214a6b7276e9.
      
      * Revert "lint"
      
      This reverts commit 0bce411b2b415c2ab770343949404498436dc8b2.
      
      * Revert "fix graph duplication in CPU dataloader workers"
      
      This reverts commit 9e3a8cf34c175d3093c773f6bb023b155f2bd27f.
      Co-authored-by: xiny <xiny@nvidia.com>
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  4. 09 Feb, 2022 1 commit
    • [Feature] CUDA UVA sampling for MultiLayerNeighborSampler (#3674) · 738e8318
      Xin Yao authored
      
      
      * implement pin_memory/unpin_memory/is_pinned for dgl.graph
      
      * update python docstring
      
      * update c++ docstring
      
      * add test
      
      * fix the broken UnifiedTensor
      
      * XPU_SWITCH for kDLCPUPinned
      
      * a rough version ready for testing
      
      * eliminate extra context parameter for pin/unpin
      
      * update train_sampling
      
      * fix linting
      
      * fix typo
      
      * multi-gpu uva sampling case
      
      * disable new format materialization for pinned graphs
      
      * update python doc for pin_memory_
      
      * fix unit test
      
      * UVA sampling for link prediction
      
      * dispatch most csr ops
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update graphsage example to combine uva sampling and UnifiedTensor
      
      * update doc
      
      * update examples
      
      * change unitgraph and heterograph's PinMemory to in-place
      
      * update examples for multi-gpu uva sampling
      
      * update doc
      
      * fix linting
      
      * fix cpu build
      
      * fix is_pinned for DistGraph
      
      * fix is_pinned for DistGraph
      
      * update graphsage unsupervised example
      
      * update doc for gpu sampling
      
      * update some check for sampling device switching
      
      * fix linting
      
      * adapt for new dataloader
      
      * fix linting
      
      * fix
      
      * fix some name issue
      
      * adjust device check
      
      * add unit test for uva sampling & fix some zero_copy bug
      
      * fix linting
      
      * update num_threads in graphsage examples
      Co-authored-by: Quan (Andy) Gan <coin2028@hotmail.com>
      Co-authored-by: Jinjing Zhou <VoVAllen@users.noreply.github.com>
  5. 15 Oct, 2021 1 commit
  6. 20 May, 2021 1 commit
    • [Feature][Performance] Implement NCCL wrapper for communicating NodeEmbeddings and sparse gradients. (#2825) · ae8dbe6d
      nv-dlasalle authored
      
      * Split NCCL wrapper from sparse optimizer and sparse embedding
      
      * Add more unit tests for single node nccl
      
      * Fix unit test for tf
      
      * Switch to device histogram
      
      * Fix histogram issues
      
      * Finish migration to histogram
      
      * Handle cases with zero send/receive data
      
      * Start on partition object
      
      * Get compiling
      
      * Updates
      
      * Add unit tests
      
      * Switch to partition object
      
      * Fix linting issues
      
      * Rename partition file
      
      * Add python doc
      
      * Fix python assert and finish doxygen comments
      
      * Remove stubs for range based partition to satisfy pylint
      
      * Wrap unit test in GPU only
      
      * Wrap explicit cuda call in ifdef
      
      * Merge with partition.py
      
      * update docstrings
      
      * Cleanup partition_op
      
      * Add Workspace object
      
      * Switch to using workspace object
      
      * Move last remainder based function out of nccl_api
      
      * Add error messages
      
      * Update docs with examples
      
      * Fix linting errors
      Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
  7. 10 Sep, 2020 1 commit
  8. 28 Jun, 2020 1 commit
    • [CUDA][Kernel] More CUDA kernels; Standardize the behavior for sorted COO/CSR (#1704) · 870da747
      Minjie Wang authored
      * add cub; array cumsum
      
      * CSRSliceRows
      
      * fix warning
      
      * operator << for ndarray; CSRSliceRows
      
      * add CSRIsSorted
      
      * add csr_sort
      
      * inplace coosort and outplace csrsort
      
      * WIP: coo is sorted
      
      * mv cuda_utils
      
      * add AllTrue utility
      
      * csr sort
      
      * coo sort
      
      * coo2csr for sorted coo arrays
      
      * CSRToCOO from sorted
      
      * pass tests for the new kernel changes
      
      * cannot use inplace sort
      
      * lint
      
      * try fix msvc error
      
      * Fix g.copy_to and g.asnumbits; ToBlock no longer uses CSC
      
      * stash
      
      * revert some hack
      
      * revert some changes
      
      * address comments
      
      * fix
      
      * fix to_block unittest
      
      * add todo note
  9. 19 Jun, 2020 1 commit
    • [CUDA] Many CUDA operators; Prepare for DGLGraph on CUDA (#1660) · f1b19a6b
      Minjie Wang authored
      * add cuda utils; change g.to; add g.device
      
      * split array.h into several headers
      
      * cuda index select
      
      * file
      
      * three cuda kernels
      
      * add cuda elementwise arith and several others
      
      * cuda CSRIsNonZero
      
      * fix lint
      
      * lint
      
      * lint
      
      * fix bug in changing ctx to property
      
      * address comments
      
      * remove unused codes
      
      * address comments
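
For readers unfamiliar with the "coo2csr for sorted coo arrays" item in 870da747: when the COO row indices are already sorted, a CSR row-pointer array can be built from per-row counts followed by a prefix sum, and the column indices can be reused unchanged. The following is a minimal CPU sketch of that general technique in plain Python. It is not the CUDA kernel from the commit, and the function name `sorted_coo_to_csr` and its signature are hypothetical.

```python
from itertools import accumulate


def sorted_coo_to_csr(num_rows, rows, cols):
    """Convert a row-sorted COO matrix to CSR arrays (indptr, indices).

    Hypothetical helper for illustration only; not part of any library.
    """
    # The shortcut is only valid when rows are in non-decreasing order.
    if any(a > b for a, b in zip(rows, rows[1:])):
        raise ValueError("rows must be sorted in non-decreasing order")

    # Count the nonzeros in each row.
    counts = [0] * num_rows
    for r in rows:
        counts[r] += 1

    # Prefix sum of the counts gives the row-pointer array.
    indptr = [0] + list(accumulate(counts))

    # Entries are already grouped by row, so the column indices carry over as-is.
    indices = list(cols)
    return indptr, indices


indptr, indices = sorted_coo_to_csr(4, [0, 0, 2, 3, 3], [1, 3, 0, 2, 3])
```

Row `i` of the result owns the slice `indices[indptr[i]:indptr[i + 1]]`, so an empty row simply yields two equal consecutive entries in `indptr`.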