    GPU implementation of L-BFGS (#5198) · 4ab645ea
    Evan Pretti authored
    * Make reference/CPU minimizer into a kernel
    
    * Add per-platform support for GPU minimization
    
    * Initial implementation of GPU minimization
    
    * Fixes
    
    * Increase robustness when initial gradient is huge
    
    * Handle overflow leading to non-finite values gracefully
    
    * Handle large forces in single precision more robustly
    
    * Optimize kernels
    
    * Fix kernel launch size
    
    * Update banner years
    
    * Don't create MinimizeKernel until first minimization requested
    
    * Make some compile-time constants into kernel arguments
    
    * Consolidate scale calculation kernel
    
    * Condense alpha/beta reduction kernels using atomics
    
    * Condense line search dot kernels with reductions
    
    * Remove a download, and download grad norm separately
    
    * Asynchronously check L-BFGS convergence condition
    
    * Restructure line search to avoid download waiting
    
    * Start line search preemptively in case CPU evaluation is not needed
    
    * In rare cases, constraint error might not decrease after one optimization round
    
    * Better handling of unsupported 64-bit atomics, use FLT_MAX
    
    * Pick gradient mode based on GPU vs. CPU evaluation
    
    * Rework getDiff/getScale reduction, remove reduceBuffer
    
    * Older CUDA might not like float hex literals
    
    * Fix error in a comment
CommonMinimizeKernel.h 5.89 KB