[Enhancement] Update AtomicAdd functions for BFLOAT16 in common.h (#297)
- Added conditional compilation for BFLOAT16 atomic operations to ensure compatibility with CUDA architectures greater than 7.5.
- Improved code clarity by organizing the AtomicAdd functions and adding relevant comments for better understanding.
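
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in common.h: the wrapper name `AtomicAddBF16`, its signature, the `accumulate` kernel, and the choice of the `__CUDA_ARCH__` macro for the check are all assumptions made for the example. It relies on the `atomicAdd` overload for `__nv_bfloat16` from `cuda_bf16.h`, which needs compute capability 8.0 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cuda_bf16.h>

// Hypothetical wrapper for illustration only; the name and signature are
// assumptions and do not reproduce the functions in common.h.
__device__ __forceinline__ void AtomicAddBF16(__nv_bfloat16 *address,
                                              __nv_bfloat16 val) {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 750)
  // Native bfloat16 atomicAdd overload declared in cuda_bf16.h.
  atomicAdd(address, val);
#else
  // Older architectures have no native overload. This sketch deliberately
  // does nothing here; a real fallback is out of scope for the example.
  (void)address;
  (void)val;
#endif
}

// Every thread adds 1.0 to the same bfloat16 accumulator.
__global__ void accumulate(__nv_bfloat16 *sum) {
  AtomicAddBF16(sum, __float2bfloat16(1.0f));
}

int main() {
  __nv_bfloat16 *d_sum = nullptr;
  if (cudaMalloc(&d_sum, sizeof(__nv_bfloat16)) != cudaSuccess) {
    fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }
  // All-zero bits encode +0.0 in bfloat16.
  cudaMemset(d_sum, 0, sizeof(__nv_bfloat16));

  accumulate<<<1, 64>>>(d_sum);
  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    cudaFree(d_sum);
    return 1;
  }

  __nv_bfloat16 h_sum;
  cudaMemcpy(&h_sum, d_sum, sizeof(h_sum), cudaMemcpyDeviceToHost);
  printf("sum = %f\n", __bfloat162float(h_sum));

  cudaFree(d_sum);
  return 0;
}
```

Built with something like `nvcc -arch=sm_80` and run on a matching device, the 64 additions of 1.0 should accumulate without rounding loss, since small integers are exactly representable in bfloat16. When compiled for an architecture at or below 7.5, the guarded branch is skipped and the accumulator stays at zero, which is why a guard of this kind keeps the header compiling on older targets.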