Commit 501d17f6 authored by Davis King

Made find_max_global() and its overloads measure the execution speed of

the user-provided objective function.  If it's faster to evaluate than
the LIPO upper-bounding Monte Carlo model then we skip or limit the
Monte Carlo stuff and just run the objective function more often.
Previously, find_max_global() simply assumed the objective functions
were really expensive to invoke.

TLDR: this change makes find_max_global() run a lot faster on objective
functions that are themselves very fast to execute, since it will skip
the expensive Monte Carlo modeling and just call the objective function
many more times instead.
parent e82e2ceb
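
As a quick illustration of the intended use case, here is a minimal
sketch of calling find_max_global() on a cheap objective.  The
quadratic, the bounds, and the 300-call budget are made up for this
example; only find_max_global() and max_function_calls() come from dlib:

#include <dlib/global_optimization.h>
#include <iostream>

int main()
{
    using namespace dlib;

    // A trivially cheap objective.  With this change, find_max_global()
    // notices how fast it runs and spends its time budget calling it more
    // often instead of grinding on the Monte Carlo upper-bound modeling.
    auto cheap_objective = [](double x, double y)
    {
        return -(x*x + y*y);  // maximized at (0,0)
    };

    auto result = find_max_global(cheap_objective,
                                  {-10, -10},               // lower bounds (illustrative)
                                  { 10,  10},               // upper bounds (illustrative)
                                  max_function_calls(300)); // illustrative budget

    std::cout << "best x: " << trans(result.x);
    std::cout << "best y: " << result.y << std::endl;
}
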
@@ -9,7 +9,9 @@
 #include <utility>
 #include <chrono>
 #include <memory>
+#include <thread>
 #include "../threads/thread_pool_extension.h"
+#include "../statistics/statistics.h"
 
 namespace dlib
 {
@@ -157,13 +159,20 @@ template <typename T> static auto go(T&& f, const matrix<double, 0, 1>& a) -> de
         global_function_search opt(specs);
         opt.set_solver_epsilon(solver_epsilon);
-        const auto time_to_stop = std::chrono::steady_clock::now() + max_runtime;
+        running_stats_decayed<double> objective_funct_eval_time(functions.size()*5);
+        std::mutex eval_time_mutex;
+        using namespace std::chrono;
+        const auto time_to_stop = steady_clock::now() + max_runtime;
         // Now run the main solver loop.
-        for (size_t i = 0; i < num.max_calls && std::chrono::steady_clock::now() < time_to_stop; ++i)
+        for (size_t i = 0; i < num.max_calls && steady_clock::now() < time_to_stop; ++i)
         {
+            const auto get_next_x_start_time = steady_clock::now();
             auto next = std::make_shared<function_evaluation_request>(opt.get_next_x());
-            auto execute_call = [&functions,&ymult,&log_scale,next]() {
+            const auto get_next_x_runtime = steady_clock::now() - get_next_x_start_time;
+            auto execute_call = [&functions,&ymult,&log_scale,&eval_time_mutex,&objective_funct_eval_time,next]() {
                 matrix<double,0,1> x = next->x();
                 // Undo any log-scaling that was applied to the variables before we pass them
                 // to the functions being optimized.
@@ -172,11 +181,38 @@ template <typename T> static auto go(T&& f, const matrix<double, 0, 1>& a) -> de
                     if (log_scale[next->function_idx()][j])
                         x(j) = std::exp(x(j));
                 }
+                const auto funct_eval_start = steady_clock::now();
                 double y = ymult*call_function_and_expand_args(functions[next->function_idx()], x);
+                const double funct_eval_runtime = duration_cast<nanoseconds>(steady_clock::now() - funct_eval_start).count();
                 next->set(y);
+                std::lock_guard<std::mutex> lock(eval_time_mutex);
+                objective_funct_eval_time.add(funct_eval_runtime);
             };
             tp.add_task_by_value(execute_call);
+            std::lock_guard<std::mutex> lock(eval_time_mutex);
+            // If calling opt.get_next_x() is taking a long time relative to how long it takes
+            // to evaluate the objective function then we should spend less time grinding on the
+            // internal details of the optimizer and more time running the actual objective
+            // function.  E.g. if we could just run 2x more objective function calls in the same
+            // amount of time then we should just do that.  The main slowness in the solver is
+            // from the Monte Carlo sampling, which we can turn down if the objective function
+            // is really fast to evaluate.  This is because the point of the Monte Carlo part is
+            // to try really hard to avoid calls to really expensive objective functions.  But
+            // if the objective function is not expensive then we should just call it.
+            if (objective_funct_eval_time.current_n() > functions.size()*5 &&
+                objective_funct_eval_time.mean()/std::max(1ul,tp.num_threads_in_pool()) < duration_cast<nanoseconds>(get_next_x_runtime).count())
+            {
+                const size_t new_val = static_cast<size_t>(std::floor(opt.get_monte_carlo_upper_bound_sample_num()*0.8));
+                opt.set_monte_carlo_upper_bound_sample_num(std::max<size_t>(1, new_val));
+                // At this point just disable the upper bounding Monte Carlo search stuff and
+                // use only pure random search since the objective function is super cheap to
+                // evaluate, making this fancier search a waste of time.
+                if (opt.get_monte_carlo_upper_bound_sample_num() == 1)
+                {
+                    opt.set_pure_random_search_probability(1);
+                }
+            }
         }
         tp.wait_for_all_tasks();
...
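
For context, the back-off heuristic in the loop above can be reproduced
against dlib's public global_function_search interface.  The following
standalone sketch is not code from this commit; the bounds, objective,
and iteration count are invented for illustration, while the API calls
(get_next_x(), set_monte_carlo_upper_bound_sample_num(),
set_pure_random_search_probability()) and the decayed running average
mirror the diff:

#include <dlib/global_optimization.h>
#include <dlib/statistics.h>
#include <algorithm>
#include <chrono>
#include <cmath>
#include <iostream>

int main()
{
    using namespace dlib;
    using namespace std::chrono;

    // One 2D function over [-10,10]^2 (illustrative bounds).
    global_function_search opt(function_spec({-10, -10}, {10, 10}));

    // Same decayed average the commit uses: roughly the last 5 samples
    // per objective function (only one function here).
    running_stats_decayed<double> eval_time(5);

    for (int i = 0; i < 100; ++i)
    {
        const auto t0 = steady_clock::now();
        function_evaluation_request next = opt.get_next_x();
        const auto get_next_x_runtime = steady_clock::now() - t0;

        const auto t1 = steady_clock::now();
        const double y = -length_squared(next.x());  // a very cheap objective
        eval_time.add(duration_cast<nanoseconds>(steady_clock::now() - t1).count());
        next.set(y);

        // The commit's rule: once enough timings are in, if an average
        // objective call is cheaper than the solver's own get_next_x()
        // bookkeeping, shrink the Monte Carlo sample count by 20%, and fall
        // back to pure random search once it bottoms out at 1.
        if (eval_time.current_n() > 5 &&
            eval_time.mean() < duration_cast<nanoseconds>(get_next_x_runtime).count())
        {
            const size_t new_val = static_cast<size_t>(std::floor(opt.get_monte_carlo_upper_bound_sample_num()*0.8));
            opt.set_monte_carlo_upper_bound_sample_num(std::max<size_t>(1, new_val));
            if (opt.get_monte_carlo_upper_bound_sample_num() == 1)
                opt.set_pure_random_search_probability(1);
        }
    }

    std::cout << "final Monte Carlo sample count: "
              << opt.get_monte_carlo_upper_bound_sample_num() << std::endl;
}

Note that the real implementation divides the mean evaluation time by
the thread pool size, since objective calls run concurrently there; the
sketch omits that because it evaluates on a single thread.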