    LightGBM would hang on MPI run if only some nodes have a fatal error. (#2600) · 61292080
    ashok-ponnuswami-msft authored
    * LightGBM would hang on an MPI run if only some nodes hit a fatal error. The cause is that the destructor of Application calls MPI_Finalize(), which makes the program hang and prevents it from exiting. So we move the network finalization out of the destructor and call MPI_Finalize() or MPI_Abort() depending on whether there was an unhandled exception.
    
    * Minor updates: Remove excess debug logging, whitespaces.
    
    * Add comments for new functions.
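
    The shutdown decision described in the first bullet can be sketched roughly as follows. This is an illustration only, not the actual contents of main.cpp: `FakeNetwork`, `Shutdown`, and `RunWithShutdown` are hypothetical names, and the two member functions are stand-ins that merely record which path was taken. In a real MPI build they would presumably wrap `MPI_Finalize()` and `MPI_Abort(MPI_COMM_WORLD, -1)`.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>

// Hypothetical record of which shutdown path was taken.
enum class Shutdown { kNone, kFinalize, kAbort };

// Hypothetical stand-in for the network layer; no MPI is linked here.
struct FakeNetwork {
  Shutdown last = Shutdown::kNone;
  // Assumption: a real implementation would call MPI_Finalize() here.
  void Finalize() { last = Shutdown::kFinalize; }
  // Assumption: a real implementation would call MPI_Abort(MPI_COMM_WORLD, -1) here.
  void Abort() { last = Shutdown::kAbort; }
};

// Runs `work` and picks the shutdown path explicitly instead of relying on
// a destructor: clean finalize on success, abort on any unhandled exception.
int RunWithShutdown(FakeNetwork& net, const std::function<void()>& work) {
  try {
    work();
  } catch (const std::exception&) {
    net.Abort();  // other ranks are torn down instead of waiting forever
    return -1;
  } catch (...) {
    net.Abort();
    return -1;
  }
  net.Finalize();  // every rank reached the end normally
  return 0;
}
```

    The point of the design is that a failing rank never enters a collective finalize that the healthy ranks cannot complete; aborting tears down the whole job instead.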
main.cpp 912 Bytes