Unverified commit e676af23, authored by Guolin Ke, committed by GitHub

Code refactoring for ranking objective & Faster ndcg_xendcg (#2801)

* code refactoring
* update vcproject
* refine
* fix test
* Update tests/python_package_test/test_sklearn.py
* fix test

parent d20ceac7
@@ -20,7 +20,7 @@ test_that("learning-to-rank with lgb.train() works as expected", {
    objective = "lambdarank"
    , metric = "ndcg"
    , ndcg_at = ndcg_at
-   , max_position = 3L
    , lambdarank_truncation_level = 3L
    , learning_rate = 0.001
  )
  model <- lgb.train(
@@ -67,7 +67,7 @@ test_that("learning-to-rank with lgb.cv() works as expected", {
    objective = "lambdarank"
    , metric = "ndcg"
    , ndcg_at = ndcg_at
-   , max_position = 3L
    , lambdarank_truncation_level = 3L
    , label_gain = "0,1,3"
  )
  nfold <- 4L
......
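For Python users the same change is just a key rename in the parameter dict. A minimal sketch mirroring the R test configuration above (the dict values are illustrative, not taken from the repo):

```python
# Parameter dict equivalent to the updated R test; "max_position" is the old,
# removed name for "lambdarank_truncation_level".
params = {
    "objective": "lambdarank",
    "metric": "ndcg",
    "ndcg_at": [3],
    "lambdarank_truncation_level": 3,  # renamed from max_position in this commit
    "learning_rate": 0.001,
}
```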
@@ -99,7 +99,9 @@ Core Parameters

  - ``lambdarank``, `lambdarank <https://papers.nips.cc/paper/2971-learning-to-rank-with-nonsmooth-cost-functions.pdf>`__ objective. `label_gain <#label_gain>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``

-  - ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function. To obtain reproducible results, you should disable parallelism by setting ``num_threads`` to 1, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
  - ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``

  - ``rank_xendcg`` is faster than ``lambdarank`` and achieves similar performance to ``lambdarank``

  - label should be ``int`` type, and larger number represents the higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)

@@ -801,6 +803,12 @@ Convert Parameters

Objective Parameters
--------------------

- ``objective_seed`` :raw-html:`<a id="objective_seed" title="Permalink to this parameter" href="#objective_seed">&#x1F517;&#xFE0E;</a>`, default = ``5``, type = int

   - random seed for objectives, if a random process is needed

   - used in ``rank_xendcg``

- ``num_class`` :raw-html:`<a id="num_class" title="Permalink to this parameter" href="#num_class">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_classes``, constraints: ``num_class > 0``

   - used only in ``multi-class`` classification application

@@ -873,19 +881,19 @@ Objective Parameters

   - set this closer to ``1`` to shift towards a **Poisson** distribution

-- ``max_position`` :raw-html:`<a id="max_position" title="Permalink to this parameter" href="#max_position">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, constraints: ``max_position > 0``
- ``lambdarank_truncation_level`` :raw-html:`<a id="lambdarank_truncation_level" title="Permalink to this parameter" href="#lambdarank_truncation_level">&#x1F517;&#xFE0E;</a>`, default = ``20``, type = int, constraints: ``lambdarank_truncation_level > 0``

   - used only in ``lambdarank`` application

-   - optimizes `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ at this position
   - used for truncating the max DCG; refer to "truncation level" in Sec. 3 of the `LambdaMART paper <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf>`__

-- ``lambdamart_norm`` :raw-html:`<a id="lambdamart_norm" title="Permalink to this parameter" href="#lambdamart_norm">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool
- ``lambdarank_norm`` :raw-html:`<a id="lambdarank_norm" title="Permalink to this parameter" href="#lambdarank_norm">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

   - used only in ``lambdarank`` application

   - set this to ``true`` to normalize the lambdas for different queries, and improve the performance for unbalanced data

-   - set this to ``false`` to enforce the original lambdamart algorithm
   - set this to ``false`` to enforce the original lambdarank algorithm

- ``label_gain`` :raw-html:`<a id="label_gain" title="Permalink to this parameter" href="#label_gain">&#x1F517;&#xFE0E;</a>`, default = ``0,1,3,7,15,31,63,...,2^30-1``, type = multi-double

@@ -895,12 +903,6 @@ Objective Parameters

   - separate by ``,``

-- ``objective_seed`` :raw-html:`<a id="objective_seed" title="Permalink to this parameter" href="#objective_seed">&#x1F517;&#xFE0E;</a>`, default = ``5``, type = int
-
-   - used only in the ``rank_xendcg`` objective
-
-   - random seed for objectives

Metric Parameters
-----------------
......
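The ``label_gain`` default quoted above follows the pattern 2^i - 1. A small sketch of how such a table could be generated (the helper name is illustrative, not LightGBM's API):

```python
def default_label_gain(n=31):
    # documented default: gain[i] = 2^i - 1, i.e. 0, 1, 3, 7, 15, ..., 2^30 - 1
    return [(1 << i) - 1 for i in range(n)]
```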
@@ -128,7 +128,8 @@ struct Config {
  // descl2 = label is anything in interval [0, 1]
  // desc = ranking application
  // descl2 = ``lambdarank``, `lambdarank <https://papers.nips.cc/paper/2971-learning-to-rank-with-nonsmooth-cost-functions.pdf>`__ objective. `label_gain <#label_gain>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``
-  // descl2 = ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function. To obtain reproducible results, you should disable parallelism by setting ``num_threads`` to 1, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
  // descl2 = ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
  // descl2 = ``rank_xendcg`` is faster than ``lambdarank`` and achieves similar performance to ``lambdarank``
  // descl2 = label should be ``int`` type, and larger number represents the higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)
  std::string objective = "regression";
@@ -705,6 +706,10 @@ struct Config {
  #pragma region Objective Parameters

  // desc = random seed for objectives, if a random process is needed
  // desc = used in ``rank_xendcg``
  int objective_seed = 5;

  // check = >0
  // alias = num_classes
  // desc = used only in ``multi-class`` classification application
@@ -763,13 +768,13 @@ struct Config {
  // check = >0
  // desc = used only in ``lambdarank`` application
-  // desc = optimizes `NDCG <https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG>`__ at this position
-  int max_position = 20;
  // desc = used for truncating the max DCG; refer to "truncation level" in Sec. 3 of the `LambdaMART paper <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf>`__
  int lambdarank_truncation_level = 20;

  // desc = used only in ``lambdarank`` application
  // desc = set this to ``true`` to normalize the lambdas for different queries, and improve the performance for unbalanced data
-  // desc = set this to ``false`` to enforce the original lambdamart algorithm
-  bool lambdamart_norm = true;
  // desc = set this to ``false`` to enforce the original lambdarank algorithm
  bool lambdarank_norm = true;

  // type = multi-double
  // default = 0,1,3,7,15,31,63,...,2^30-1
@@ -778,10 +783,6 @@ struct Config {
  // desc = separate by ``,``
  std::vector<double> label_gain;

-  // desc = used only in the ``rank_xendcg`` objective
-  // desc = random seed for objectives
-  int objective_seed = 5;

  #pragma endregion

  #pragma region Metric Parameters
......
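``lambdarank_truncation_level`` caps the ideal DCG at a fixed position, which is what ``DCGCalculator::CalMaxDCGAtK`` computes on the C++ side. A rough pure-Python sketch of a truncated max DCG (the helper name and default gain are illustrative):

```python
import math

def max_dcg_at_k(labels, k, gain=lambda l: 2 ** l - 1):
    # ideal DCG: labels sorted descending, summed only over the top k positions
    top = sorted(labels, reverse=True)[:k]
    return sum(gain(l) / math.log2(pos + 2) for pos, l in enumerate(top))
```

The objective then caches `1.0 / max_dcg` per query (when positive), so the per-pair delta-NDCG needs only a multiplication.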
@@ -257,6 +257,7 @@ const std::unordered_set<std::string>& Config::parameter_set() {
  "output_result",
  "convert_model_language",
  "convert_model",
  "objective_seed",
  "num_class",
  "is_unbalance",
  "scale_pos_weight",
@@ -267,10 +268,9 @@ const std::unordered_set<std::string>& Config::parameter_set() {
  "fair_c",
  "poisson_max_delta_step",
  "tweedie_variance_power",
-  "max_position",
-  "lambdamart_norm",
  "lambdarank_truncation_level",
  "lambdarank_norm",
  "label_gain",
-  "objective_seed",
  "metric",
  "metric_freq",
  "is_provide_training_metric",
@@ -513,6 +513,8 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
  GetString(params, "convert_model", &convert_model);

  GetInt(params, "objective_seed", &objective_seed);

  GetInt(params, "num_class", &num_class);
  CHECK(num_class > 0);
@@ -541,17 +543,15 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
  CHECK(tweedie_variance_power >= 1.0);
  CHECK(tweedie_variance_power < 2.0);

-  GetInt(params, "max_position", &max_position);
-  CHECK(max_position > 0);
  GetInt(params, "lambdarank_truncation_level", &lambdarank_truncation_level);
  CHECK(lambdarank_truncation_level > 0);

-  GetBool(params, "lambdamart_norm", &lambdamart_norm);
  GetBool(params, "lambdarank_norm", &lambdarank_norm);

  if (GetString(params, "label_gain", &tmp_str)) {
    label_gain = Common::StringToArray<double>(tmp_str, ',');
  }

-  GetInt(params, "objective_seed", &objective_seed);

  GetInt(params, "metric_freq", &metric_freq);
  CHECK(metric_freq > 0);
@@ -675,6 +675,7 @@ std::string Config::SaveMembersToString() const {
  str_buf << "[output_result: " << output_result << "]\n";
  str_buf << "[convert_model_language: " << convert_model_language << "]\n";
  str_buf << "[convert_model: " << convert_model << "]\n";
  str_buf << "[objective_seed: " << objective_seed << "]\n";
  str_buf << "[num_class: " << num_class << "]\n";
  str_buf << "[is_unbalance: " << is_unbalance << "]\n";
  str_buf << "[scale_pos_weight: " << scale_pos_weight << "]\n";
@@ -685,10 +686,9 @@ std::string Config::SaveMembersToString() const {
  str_buf << "[fair_c: " << fair_c << "]\n";
  str_buf << "[poisson_max_delta_step: " << poisson_max_delta_step << "]\n";
  str_buf << "[tweedie_variance_power: " << tweedie_variance_power << "]\n";
-  str_buf << "[max_position: " << max_position << "]\n";
-  str_buf << "[lambdamart_norm: " << lambdamart_norm << "]\n";
  str_buf << "[lambdarank_truncation_level: " << lambdarank_truncation_level << "]\n";
  str_buf << "[lambdarank_norm: " << lambdarank_norm << "]\n";
  str_buf << "[label_gain: " << Common::Join(label_gain, ",") << "]\n";
-  str_buf << "[objective_seed: " << objective_seed << "]\n";
  str_buf << "[metric_freq: " << metric_freq << "]\n";
  str_buf << "[is_provide_training_metric: " << is_provide_training_metric << "]\n";
  str_buf << "[eval_at: " << Common::Join(eval_at, ",") << "]\n";
......
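``Common::StringToArray<double>`` in the hunk above splits a comma-separated parameter value such as ``label_gain = "0,1,3"`` into doubles. A rough Python analogue of that parsing step (illustrative, not part of LightGBM):

```python
def string_to_array(s, delimiter=","):
    # mimic Common::StringToArray<double>: split on the delimiter, convert each token
    return [float(tok) for tok in s.split(delimiter)] if s else []
```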
@@ -7,7 +7,6 @@
#include "binary_objective.hpp"
#include "multiclass_objective.hpp"
#include "rank_objective.hpp"
-#include "rank_xendcg_objective.hpp"
#include "regression_objective.hpp"
#include "xentropy_objective.hpp"
......
/*!
- * Copyright (c) 2016 Microsoft Corporation. All rights reserved.
 * Copyright (c) 2020 Microsoft Corporation. All rights reserved.
 * Licensed under the MIT License. See LICENSE file in the project root for
 * license information.
 */
#ifndef LIGHTGBM_OBJECTIVE_RANK_OBJECTIVE_HPP_
#define LIGHTGBM_OBJECTIVE_RANK_OBJECTIVE_HPP_

@@ -8,29 +9,102 @@
#include <LightGBM/metric.h>
#include <LightGBM/objective_function.h>

-#include <limits>
-#include <string>
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <cstring>
#include <limits>
#include <string>
#include <vector>

namespace LightGBM {
/*!
* \brief Objective function for Ranking
*/
class RankingObjective : public ObjectiveFunction {
public:
explicit RankingObjective(const Config& config)
: seed_(config.objective_seed) {}
explicit RankingObjective(const std::vector<std::string>&) : seed_(0) {}
~RankingObjective() {}
void Init(const Metadata& metadata, data_size_t num_data) override {
num_data_ = num_data;
// get label
label_ = metadata.label();
// get weights
weights_ = metadata.weights();
    // get boundaries
query_boundaries_ = metadata.query_boundaries();
if (query_boundaries_ == nullptr) {
Log::Fatal("Ranking tasks require query information");
}
num_queries_ = metadata.num_queries();
}
void GetGradients(const double* score, score_t* gradients,
score_t* hessians) const override {
#pragma omp parallel for schedule(guided)
for (data_size_t i = 0; i < num_queries_; ++i) {
const data_size_t start = query_boundaries_[i];
const data_size_t cnt = query_boundaries_[i + 1] - query_boundaries_[i];
GetGradientsForOneQuery(i, cnt, label_ + start, score + start,
gradients + start, hessians + start);
if (weights_ != nullptr) {
for (data_size_t j = 0; j < cnt; ++j) {
gradients[start + j] =
static_cast<score_t>(gradients[start + j] * weights_[start + j]);
hessians[start + j] =
static_cast<score_t>(hessians[start + j] * weights_[start + j]);
}
}
}
}
virtual void GetGradientsForOneQuery(data_size_t query_id, data_size_t cnt,
const label_t* label,
const double* score, score_t* lambdas,
score_t* hessians) const = 0;
virtual const char* GetName() const override = 0;
std::string ToString() const override {
std::stringstream str_buf;
str_buf << GetName();
return str_buf.str();
}
bool NeedAccuratePrediction() const override { return false; }
protected:
int seed_;
data_size_t num_queries_;
/*! \brief Number of data */
data_size_t num_data_;
/*! \brief Pointer of label */
const label_t* label_;
/*! \brief Pointer of weights */
const label_t* weights_;
  /*! \brief Query boundaries */
const data_size_t* query_boundaries_;
};
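The new base class centralizes the per-query loop: ``GetGradients`` walks ``query_boundaries_``, delegates each query to the subclass hook, and then applies per-document weights. A pure-Python sketch of that dispatch under the same conventions (function names are illustrative, not LightGBM's API):

```python
def get_gradients(scores, boundaries, weights, per_query_fn):
    # boundaries[q]:boundaries[q + 1] delimits query q, as query_boundaries_ does
    grads = [0.0] * len(scores)
    hess = [0.0] * len(scores)
    for q in range(len(boundaries) - 1):
        lo, hi = boundaries[q], boundaries[q + 1]
        g, h = per_query_fn(q, scores[lo:hi])
        for j in range(hi - lo):
            # weights are applied once, after the objective-specific hook
            w = 1.0 if weights is None else weights[lo + j]
            grads[lo + j] = g[j] * w
            hess[lo + j] = h[j] * w
    return grads, hess
```

Factoring the weight multiplication into the base class is what lets ``LambdarankNDCG`` below drop its own "if need weights" block.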
/*!
 * \brief Objective function for Lambdarank with NDCG
 */
-class LambdarankNDCG : public ObjectiveFunction {
class LambdarankNDCG : public RankingObjective {
 public:
-  explicit LambdarankNDCG(const Config& config) {
-    sigmoid_ = static_cast<double>(config.sigmoid);
-    norm_ = config.lambdamart_norm;
  explicit LambdarankNDCG(const Config& config)
      : RankingObjective(config),
        sigmoid_(config.sigmoid),
        norm_(config.lambdarank_norm),
        truncation_level_(config.lambdarank_truncation_level) {
    label_gain_ = config.label_gain;
    // initialize DCG calculator
    DCGCalculator::DefaultLabelGain(&label_gain_);
    DCGCalculator::Init(label_gain_);
-    // will optimize NDCG@optimize_pos_at_
-    optimize_pos_at_ = config.max_position;
    sigmoid_table_.clear();
    inverse_max_dcgs_.clear();
    if (sigmoid_ <= 0.0) {
@@ -38,31 +112,20 @@ class LambdarankNDCG: public ObjectiveFunction {
    }
  }

-  explicit LambdarankNDCG(const std::vector<std::string>&) {
-  }
  explicit LambdarankNDCG(const std::vector<std::string>& strs)
      : RankingObjective(strs) {}

-  ~LambdarankNDCG() {
-  }
  ~LambdarankNDCG() {}

  void Init(const Metadata& metadata, data_size_t num_data) override {
-    num_data_ = num_data;
-    // get label
-    label_ = metadata.label();
    RankingObjective::Init(metadata, num_data);
    DCGCalculator::CheckLabel(label_, num_data_);
-    // get weights
-    weights_ = metadata.weights();
-    // get boundaries
-    query_boundaries_ = metadata.query_boundaries();
-    if (query_boundaries_ == nullptr) {
-      Log::Fatal("Lambdarank tasks require query information");
-    }
-    num_queries_ = metadata.num_queries();
-    // cache inverse max DCG, avoid computation many times
    inverse_max_dcgs_.resize(num_queries_);
#pragma omp parallel for schedule(static)
    for (data_size_t i = 0; i < num_queries_; ++i) {
-      inverse_max_dcgs_[i] = DCGCalculator::CalMaxDCGAtK(optimize_pos_at_,
-          label_ + query_boundaries_[i],
-          query_boundaries_[i + 1] - query_boundaries_[i]);
      inverse_max_dcgs_[i] = DCGCalculator::CalMaxDCGAtK(
          truncation_level_, label_ + query_boundaries_[i],
          query_boundaries_[i + 1] - query_boundaries_[i]);
      if (inverse_max_dcgs_[i] > 0.0) {
        inverse_max_dcgs_[i] = 1.0f / inverse_max_dcgs_[i];
@@ -72,39 +135,25 @@ class LambdarankNDCG: public ObjectiveFunction {
    ConstructSigmoidTable();
  }
-  void GetGradients(const double* score, score_t* gradients,
-                    score_t* hessians) const override {
-#pragma omp parallel for schedule(guided)
-    for (data_size_t i = 0; i < num_queries_; ++i) {
-      GetGradientsForOneQuery(score, gradients, hessians, i);
-    }
-  }
-
-  inline void GetGradientsForOneQuery(const double* score,
-      score_t* lambdas, score_t* hessians, data_size_t query_id) const {
-    // get doc boundary for current query
-    const data_size_t start = query_boundaries_[query_id];
-    const data_size_t cnt =
-        query_boundaries_[query_id + 1] - query_boundaries_[query_id];
  inline void GetGradientsForOneQuery(data_size_t query_id, data_size_t cnt,
                                      const label_t* label, const double* score,
                                      score_t* lambdas,
                                      score_t* hessians) const override {
    // get max DCG on current query
    const double inverse_max_dcg = inverse_max_dcgs_[query_id];
-    // add pointers with offset
-    const label_t* label = label_ + start;
-    score += start;
-    lambdas += start;
-    hessians += start;
    // initialize with zero
    for (data_size_t i = 0; i < cnt; ++i) {
      lambdas[i] = 0.0f;
      hessians[i] = 0.0f;
    }
    // get sorted indices for scores
-    std::vector<data_size_t> sorted_idx;
    std::vector<data_size_t> sorted_idx(cnt);
    for (data_size_t i = 0; i < cnt; ++i) {
-      sorted_idx.emplace_back(i);
      sorted_idx[i] = i;
    }
    std::stable_sort(
        sorted_idx.begin(), sorted_idx.end(),
        [score](data_size_t a, data_size_t b) { return score[a] > score[b]; });
    // get best and worst score
    const double best_score = score[sorted_idx[0]];
    data_size_t worst_idx = cnt - 1;
@@ -118,20 +167,25 @@ class LambdarankNDCG: public ObjectiveFunction {
      const data_size_t high = sorted_idx[i];
      const int high_label = static_cast<int>(label[high]);
      const double high_score = score[high];
      if (high_score == kMinScore) {
        continue;
      }
      const double high_label_gain = label_gain_[high_label];
      const double high_discount = DCGCalculator::GetDiscount(i);
      double high_sum_lambda = 0.0;
      double high_sum_hessian = 0.0;
      for (data_size_t j = 0; j < cnt; ++j) {
        // skip same data
        if (i == j) {
          continue;
        }
        const data_size_t low = sorted_idx[j];
        const int low_label = static_cast<int>(label[low]);
        const double low_score = score[low];
        // only consider pair with different label
        if (high_label <= low_label || low_score == kMinScore) {
          continue;
        }

        const double delta_score = high_score - low_score;

@@ -144,7 +198,7 @@ class LambdarankNDCG: public ObjectiveFunction {
        // get delta NDCG
        double delta_pair_NDCG = dcg_gap * paired_discount * inverse_max_dcg;
        // regularize the delta_pair_NDCG by score distance
-        if (norm_ && high_label != low_label && best_score != worst_score) {
        if (norm_ && best_score != worst_score) {
          delta_pair_NDCG /= (0.01f + fabs(delta_score));
        }
        // calculate lambda for this pair
@@ -171,25 +225,18 @@ class LambdarankNDCG: public ObjectiveFunction {
        hessians[i] = static_cast<score_t>(hessians[i] * norm_factor);
      }
    }
-    // if need weights
-    if (weights_ != nullptr) {
-      for (data_size_t i = 0; i < cnt; ++i) {
-        lambdas[i] = static_cast<score_t>(lambdas[i] * weights_[start + i]);
-        hessians[i] = static_cast<score_t>(hessians[i] * weights_[start + i]);
-      }
-    }
  }

  inline double GetSigmoid(double score) const {
    if (score <= min_sigmoid_input_) {
      // too small, use lower bound
      return sigmoid_table_[0];
    } else if (score >= max_sigmoid_input_) {
      // too large, use upper bound
      return sigmoid_table_[_sigmoid_bins - 1];
    } else {
      return sigmoid_table_[static_cast<size_t>((score - min_sigmoid_input_) *
                                                sigmoid_table_idx_factor_)];
    }
  }

@@ -200,7 +247,7 @@ class LambdarankNDCG: public ObjectiveFunction {
    sigmoid_table_.resize(_sigmoid_bins);
    // get score to bin factor
    sigmoid_table_idx_factor_ =
        _sigmoid_bins / (max_sigmoid_input_ - min_sigmoid_input_);
    // cache
    for (size_t i = 0; i < _sigmoid_bins; ++i) {
      const double score = i / sigmoid_table_idx_factor_ + min_sigmoid_input_;
@@ -208,41 +255,20 @@ class LambdarankNDCG: public ObjectiveFunction {
    }
  }
  const char* GetName() const override { return "lambdarank"; }

-  std::string ToString() const override {
-    std::stringstream str_buf;
-    str_buf << GetName();
-    return str_buf.str();
-  }
-
-  bool NeedAccuratePrediction() const override { return false; }

 private:
  /*! \brief Sigmoid param */
  double sigmoid_;
  /*! \brief Normalize the lambdas or not */
  bool norm_;
-  /*! \brief Optimized NDCG@ */
-  int optimize_pos_at_;
  /*! \brief Truncation position for max NDCG */
  int truncation_level_;
-  /*! \brief Number of queries */
-  data_size_t num_queries_;
-  /*! \brief Number of data */
-  data_size_t num_data_;
-  /*! \brief Pointer of label */
-  const label_t* label_;
-  /*! \brief Pointer of weights */
-  const label_t* weights_;
-  /*! \brief Query boundaries */
-  const data_size_t* query_boundaries_;
  /*! \brief Cache inverse max DCG, speed up calculation */
  std::vector<double> inverse_max_dcgs_;
  /*! \brief Cache result for sigmoid transform to speed up */
  std::vector<double> sigmoid_table_;
  /*! \brief Gains for labels */
  std::vector<double> label_gain_;
  /*! \brief Number of bins in sigmoid table */
  size_t _sigmoid_bins = 1024 * 1024;
  /*! \brief Minimal input of sigmoid table */
@@ -253,5 +279,82 @@ class LambdarankNDCG: public ObjectiveFunction {
  double sigmoid_table_idx_factor_;
};
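``ConstructSigmoidTable`` and ``GetSigmoid`` trade memory for speed: the sigmoid is precomputed on a uniform grid over [min, max) and reads become a multiply plus an array index, with out-of-range scores clamped to the end bins. A simplified sketch of the same lookup-table idea (a plain logistic 1/(1+e^(s*x)) is used here; the exact transform LightGBM caches may differ):

```python
import math

def build_sigmoid_table(sigmoid=1.0, bins=1024, lo=-50.0, hi=50.0):
    # precompute the logistic over [lo, hi); factor maps a score to a bin index
    factor = bins / (hi - lo)
    table = [1.0 / (1.0 + math.exp(sigmoid * (i / factor + lo)))
             for i in range(bins)]
    return table, factor

def get_sigmoid(table, factor, lo, hi, score):
    if score <= lo:
        return table[0]        # too small, use lower bound
    if score >= hi:
        return table[-1]       # too large, use upper bound
    return table[int((score - lo) * factor)]
```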
/*!
* \brief Implementation of the learning-to-rank objective function, XE_NDCG
* [arxiv.org/abs/1911.09798].
*/
class RankXENDCG : public RankingObjective {
public:
explicit RankXENDCG(const Config& config) : RankingObjective(config) {}
explicit RankXENDCG(const std::vector<std::string>& strs)
: RankingObjective(strs) {}
~RankXENDCG() {}
void Init(const Metadata& metadata, data_size_t num_data) override {
RankingObjective::Init(metadata, num_data);
for (data_size_t i = 0; i < num_queries_; ++i) {
rands_.emplace_back(seed_ + i);
}
}
inline void GetGradientsForOneQuery(data_size_t query_id, data_size_t cnt,
const label_t* label, const double* score,
score_t* lambdas,
score_t* hessians) const override {
// Turn scores into a probability distribution using Softmax.
std::vector<double> rho(cnt, 0.0);
Common::Softmax(score, rho.data(), cnt);
// used for Phi and L1
std::vector<double> l1s(cnt);
double sum_labels = 0;
for (data_size_t i = 0; i < cnt; ++i) {
l1s[i] = Phi(label[i], rands_[query_id].NextFloat());
sum_labels += l1s[i];
}
    // sum_labels will always be a positive number
sum_labels = std::max<double>(kEpsilon, sum_labels);
// Approximate gradients and inverse Hessian.
// First order terms.
double sum_l1 = 0.0f;
for (data_size_t i = 0; i < cnt; ++i) {
l1s[i] = -l1s[i] / sum_labels + rho[i];
sum_l1 += l1s[i];
}
if (cnt <= 1) {
// when cnt <= 1, the l2 and l3 are zeros
for (data_size_t i = 0; i < cnt; ++i) {
lambdas[i] = static_cast<score_t>(l1s[i]);
hessians[i] = static_cast<score_t>(rho[i] * (1.0 - rho[i]));
}
} else {
// Second order terms.
std::vector<double> l2s(cnt, 0.0);
double sum_l2 = 0.0;
for (data_size_t i = 0; i < cnt; ++i) {
l2s[i] = (sum_l1 - l1s[i]) / (1 - rho[i]);
sum_l2 += l2s[i];
}
for (data_size_t i = 0; i < cnt; ++i) {
auto l3 = (sum_l2 - l2s[i]) / (1 - rho[i]);
lambdas[i] = static_cast<score_t>(l1s[i] + rho[i] * l2s[i] +
rho[i] * rho[i] * l3);
hessians[i] = static_cast<score_t>(rho[i] * (1.0 - rho[i]));
}
}
}
double Phi(const label_t l, double g) const {
return Common::Pow(2, static_cast<int>(l)) - g;
}
const char* GetName() const override { return "rank_xendcg"; }
private:
mutable std::vector<Random> rands_;
};
}  // namespace LightGBM

#endif  // LIGHTGBM_OBJECTIVE_RANK_OBJECTIVE_HPP_
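The ``cnt > 1`` branch of ``RankXENDCG::GetGradientsForOneQuery`` above can be mirrored directly: softmax the scores into rho, draw phi(l, g) = 2^l - g with g uniform in [0, 1), then accumulate the first- and second-order terms. A pure-Python sketch, assuming the formulas transcribe the C++ faithfully (not LightGBM code; ``eps`` stands in for ``kEpsilon``):

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def xendcg_gradients(labels, scores, rng, eps=1e-10):
    # mirrors the cnt > 1 branch: rho, l1, l2 and l3 terms
    rho = softmax(scores)
    # phi(l, g) = 2^l - g, with g ~ Uniform[0, 1)
    l1 = [2.0 ** l - rng.random() for l in labels]
    denom = max(eps, sum(l1))            # sum of labels is kept positive
    l1 = [-v / denom + r for v, r in zip(l1, rho)]
    sum_l1 = sum(l1)
    l2 = [(sum_l1 - v) / (1.0 - r) for v, r in zip(l1, rho)]
    sum_l2 = sum(l2)
    lambdas = [v + r * w + r * r * (sum_l2 - w) / (1.0 - r)
               for v, r, w in zip(l1, rho, l2)]
    hessians = [r * (1.0 - r) for r in rho]   # softmax variance term
    return lambdas, hessians
```

Seeding the per-query `rng` from ``objective_seed + query_id`` is what makes the refactored objective reproducible without disabling multithreading, since each query draws from its own generator.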
/*!
* Copyright (c) 2019 Microsoft Corporation. All rights reserved.
* Licensed under the MIT License. See LICENSE file in the project root for license information.
*/
#ifndef LIGHTGBM_OBJECTIVE_RANK_XENDCG_OBJECTIVE_HPP_
#define LIGHTGBM_OBJECTIVE_RANK_XENDCG_OBJECTIVE_HPP_
#include <LightGBM/objective_function.h>
#include <LightGBM/utils/common.h>
#include <LightGBM/utils/random.h>
#include <string>
#include <vector>
namespace LightGBM {
/*!
* \brief Implementation of the learning-to-rank objective function, XE_NDCG [arxiv.org/abs/1911.09798].
*/
class RankXENDCG: public ObjectiveFunction {
public:
explicit RankXENDCG(const Config& config) {
rand_ = new Random(config.objective_seed);
}
explicit RankXENDCG(const std::vector<std::string>&) {
rand_ = new Random();
}
~RankXENDCG() {
}
void Init(const Metadata& metadata, data_size_t) override {
// get label
label_ = metadata.label();
// get query boundaries
query_boundaries_ = metadata.query_boundaries();
if (query_boundaries_ == nullptr) {
Log::Fatal("RankXENDCG tasks require query information");
}
num_queries_ = metadata.num_queries();
}
void GetGradients(const double* score, score_t* gradients,
score_t* hessians) const override {
#pragma omp parallel for schedule(guided)
for (data_size_t i = 0; i < num_queries_; ++i) {
GetGradientsForOneQuery(score, gradients, hessians, i);
}
}
inline void GetGradientsForOneQuery(
const double* score,
score_t* lambdas, score_t* hessians, data_size_t query_id) const {
// get doc boundary for current query
const data_size_t start = query_boundaries_[query_id];
const data_size_t cnt =
query_boundaries_[query_id + 1] - query_boundaries_[query_id];
// add pointers with offset
const label_t* label = label_ + start;
score += start;
lambdas += start;
hessians += start;
// Turn scores into a probability distribution using Softmax.
std::vector<double> rho(cnt);
Common::Softmax(score, &rho[0], cnt);
// Prepare a vector of gammas, a parameter of the loss.
std::vector<double> gammas(cnt);
for (data_size_t i = 0; i < cnt; ++i) {
gammas[i] = rand_->NextFloat();
}
// Skip query if sum of labels is 0.
float sum_labels = 0;
for (data_size_t i = 0; i < cnt; ++i) {
sum_labels += static_cast<float>(phi(label[i], gammas[i]));
}
if (std::fabs(sum_labels) < kEpsilon) {
return;
}
// Approximate gradients and inverse Hessian.
// First order terms.
std::vector<double> L1s(cnt);
for (data_size_t i = 0; i < cnt; ++i) {
L1s[i] = -phi(label[i], gammas[i])/sum_labels + rho[i];
}
// Second-order terms.
std::vector<double> L2s(cnt);
for (data_size_t i = 0; i < cnt; ++i) {
for (data_size_t j = 0; j < cnt; ++j) {
if (i == j) continue;
L2s[i] += L1s[j] / (1 - rho[j]);
}
}
// Third-order terms.
std::vector<double> L3s(cnt);
for (data_size_t i = 0; i < cnt; ++i) {
for (data_size_t j = 0; j < cnt; ++j) {
if (i == j) continue;
L3s[i] += rho[j] * L2s[j] / (1 - rho[j]);
}
}
// Finally, prepare lambdas and hessians.
for (data_size_t i = 0; i < cnt; ++i) {
lambdas[i] = static_cast<score_t>(
L1s[i] + rho[i]*L2s[i] + rho[i]*L3s[i]);
hessians[i] = static_cast<score_t>(rho[i] * (1.0 - rho[i]));
}
}
double phi(const label_t l, double g) const {
return Common::Pow(2, static_cast<int>(l)) - g;
}
const char* GetName() const override {
return "rank_xendcg";
}
std::string ToString() const override {
std::stringstream str_buf;
str_buf << GetName();
return str_buf.str();
}
bool NeedAccuratePrediction() const override { return false; }
private:
/*! \brief Number of queries */
data_size_t num_queries_;
/*! \brief Pointer of label */
const label_t* label_;
/*! \brief Query boundaries */
const data_size_t* query_boundaries_;
/*! \brief Pseudo-random number generator */
Random* rand_;
};
} // namespace LightGBM
#endif // LightGBM_OBJECTIVE_RANK_XENDCG_OBJECTIVE_HPP_
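For contrast with the refactored version, the deleted file above computed the second- and third-order terms with explicit double loops over all document pairs in a query. A sketch of just those O(cnt²) loops, for illustration only:

```python
def second_third_order_terms(L1s, rho):
    """O(cnt^2) second- and third-order terms from the deleted file."""
    cnt = len(L1s)
    # Second-order terms: pairwise sum over j != i.
    L2s = [0.0] * cnt
    for i in range(cnt):
        for j in range(cnt):
            if i != j:
                L2s[i] += L1s[j] / (1 - rho[j])
    # Third-order terms: another pairwise sum, reusing L2s.
    L3s = [0.0] * cnt
    for i in range(cnt):
        for j in range(cnt):
            if i != j:
                L3s[i] += rho[j] * L2s[j] / (1 - rho[j])
    return L2s, L3s
```

The nested loops make the old objective quadratic in query length, which is the cost the "Faster ndcg_xendcg" refactoring removes.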
...@@ -129,8 +129,8 @@ class TestSklearn(unittest.TestCase):
                eval_metric='ndcg',
                callbacks=[lgb.reset_parameter(learning_rate=lambda x: max(0.01, 0.1 - 0.01 * x))])
        self.assertLessEqual(gbm.best_iteration_, 24)
        self.assertGreater(gbm.best_score_['valid_0']['ndcg@1'], 0.6382)
        self.assertGreater(gbm.best_score_['valid_0']['ndcg@3'], 0.6319)

    def test_regression_with_custom_objective(self):
        X, y = load_boston(True)
...
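A hedged usage sketch (not part of the diff): a parameter dict selecting the `rank_xendcg` objective, in the spirit of the test above. `objective_seed` is the documented seed for the objective's randomized gains; the specific values here are illustrative, not recommendations.

```python
# Illustrative LightGBM parameters for training with rank_xendcg.
params = {
    "objective": "rank_xendcg",  # aliases: xendcg, xe_ndcg, xe_ndcg_mart, xendcg_mart
    "metric": "ndcg",
    "eval_at": [1, 3],           # report NDCG@1 and NDCG@3, as the test above checks
    "objective_seed": 7,         # seeds the randomized gains used by Phi
    "learning_rate": 0.1,
}
```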
...@@ -243,7 +243,6 @@
    <ClInclude Include="..\src\network\socket_wrapper.hpp" />
    <ClInclude Include="..\src\objective\binary_objective.hpp" />
    <ClInclude Include="..\src\objective\rank_objective.hpp" />
<ClInclude Include="..\src\objective\rank_xendcg_objective.hpp" />
    <ClInclude Include="..\src\objective\regression_objective.hpp" />
    <ClInclude Include="..\src\objective\multiclass_objective.hpp" />
    <ClInclude Include="..\src\objective\xentropy_objective.hpp" />
...@@ -291,4 +290,4 @@
  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
  <ImportGroup Label="ExtensionTargets">
  </ImportGroup>
</Project>
\ No newline at end of file
...@@ -84,9 +84,6 @@
    <ClInclude Include="..\src\objective\rank_objective.hpp">
      <Filter>src\objective</Filter>
    </ClInclude>
<ClInclude Include="..\src\objective\rank_xendcg_objective.hpp">
<Filter>src\objective</Filter>
</ClInclude>
    <ClInclude Include="..\src\objective\regression_objective.hpp">
      <Filter>src\objective</Filter>
    </ClInclude>
...@@ -312,4 +309,4 @@
      <Filter>src\io</Filter>
    </ClCompile>
  </ItemGroup>
</Project>
\ No newline at end of file