Unverified Commit 6368375b authored by Nikita Titov's avatar Nikita Titov Committed by GitHub


[ci][docs] fix link checking action by switching from linkchecker to lychee and update some links (#7027)
parent ce3e3121
@@ -15,6 +15,15 @@ make -C docs html || exit 1
 if [[ $TASK == "check-links" ]]; then
     # check docs for broken links
-    pip install 'linkchecker>=10.5.0'
-    linkchecker --config=./docs/.linkcheckerrc ./docs/_build/html/*.html || exit 1
+    conda install -y -n test-env 'lychee>=0.20.1'
+    # to see all gained files add "--dump-inputs" flag
+    # to see all gained links add "--dump" flag
+    lychee \
+        "--config=./docs/.lychee.toml" \
+        "--" \
+        "**/*.rst" \
+        "**/*.md" \
+        "./R-package/**/*.Rd" \
+        "./docs/_build/html/*.html" \
+        || exit 1
 fi
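For contributors poking at this locally: the patterns after `--` are expanded by lychee itself, which is what the `--dump-inputs` flag mentioned in the script prints. As a rough illustration only (not part of the CI script, and assuming lychee's globbing behaves like recursive `**` shell globs; `collect_inputs` is a hypothetical helper, not a lychee API), this sketch shows how those patterns pick up files:

```python
import glob
import os
import tempfile

def collect_inputs(root, patterns):
    """Roughly emulate lychee expanding its input glob patterns."""
    matches = []
    for pattern in patterns:
        # recursive=True lets "**" match any number of directories
        matches.extend(glob.glob(os.path.join(root, pattern), recursive=True))
    return sorted(matches)

# demo against a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs"))
    for name in ["README.md", os.path.join("docs", "index.rst"), os.path.join("docs", "conf.py")]:
        open(os.path.join(root, name), "w").close()
    found = collect_inputs(root, ["**/*.rst", "**/*.md"])
    # conf.py is not matched by either pattern
    print([os.path.relpath(p, root) for p in found])
```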
@@ -11,6 +11,7 @@ env:
   COMPILER: gcc
   OS_NAME: 'linux'
   TASK: 'check-links'
+  GITHUB_TOKEN: ${{ github.token }}

 jobs:
   check-links:
...
@@ -27,7 +27,7 @@ Authors@R: c(
     person("Michael", "Mayer", role = c("ctb"))
   )
 Description: Tree based algorithms can be improved by introducing boosting frameworks.
-    'LightGBM' is one such framework, based on Ke, Guolin et al. (2017) <https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision>.
+    'LightGBM' is one such framework, based on Ke, Guolin et al. (2017) <https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html>.
     This package offers an R interface to work with it.
     It is designed to be distributed and efficient with the following advantages:
     1. Faster training speed and higher efficiency.
...
@@ -281,7 +281,7 @@ lightgbm <- function(data,
 #' https://archive.ics.uci.edu/ml/datasets/Mushroom
 #'
 #' Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
-#' [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
+#' [https://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
 #' School of Information and Computer Science.
 #'
 #' @docType data
@@ -305,7 +305,7 @@ NULL
 #' https://archive.ics.uci.edu/ml/datasets/Mushroom
 #'
 #' Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
-#' [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
+#' [https://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
 #' School of Information and Computer Science.
 #'
 #' @docType data
@@ -324,7 +324,7 @@ NULL
 #' randomly selected from 3 (older version of this dataset with less inputs).
 #'
 #' @references
-#' http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
+#' https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
 #'
 #' S. Moro, P. Cortez and P. Rita. (2014)
 #' A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems
...
@@ -20,7 +20,6 @@
   - [Code Coverage](#code-coverage)
 * [Updating Documentation](#updating-documentation)
 * [Preparing a CRAN Package](#preparing-a-cran-package)
-* [External Repositories](#external-unofficial-repositories)
 * [Known Issues](#known-issues)

 Installation
@@ -77,7 +76,7 @@ CXX=g++-8
 CXX11=g++-8
 ```
-### Installing from Source with CMake <a name="install"></a>
+### Installing from Source with CMake <a id="install"></a>

 You need to install git and [CMake](https://cmake.org/) first.
@@ -215,7 +214,7 @@ These packages do not require compilation, so they will be faster and easier to
 CRAN does not prepare precompiled binaries for Linux, and as of this writing neither does this project.

-### Installing from a Pre-compiled lib_lightgbm <a name="lib_lightgbm"></a>
+### Installing from a Pre-compiled lib_lightgbm <a id="lib_lightgbm"></a>

 Previous versions of LightGBM offered the ability to first compile the C++ library (`lib_lightgbm.{dll,dylib,so}`) and then build an R-package that wraps it.
...
@@ -702,11 +702,11 @@ Responded to CRAN with the following:
 The paper citation has been adjusted as requested. We were using 'glmnet' as a guide on how to include the URL but maybe they are no longer in compliance with CRAN policies: https://github.com/cran/glmnet/blob/b1a4b50de01e0cd24343959d7cf86452bac17b26/DESCRIPTION
-All authors from the original LightGBM paper have been added to Authors@R as `"aut"`. We have also added Microsoft and DropBox, Inc. as `"cph"` (copyright holders). These roles were chosen based on the guidance in https://journal.r-project.org/archive/2012-1/RJournal_2012-1_Hornik~et~al.pdf.
+All authors from the original LightGBM paper have been added to Authors@R as `"aut"`. We have also added Microsoft and DropBox, Inc. as `"cph"` (copyright holders). These roles were chosen based on the guidance in https://journal.r-project.org/archive/2012/RJ-2012-009/index.html.
 lightgbm's code does use `<<-`, but it does not modify the global environment. The uses of `<<-` in R/lgb.interprete.R and R/callback.R are in functions which are called in an environment created by the lightgbm functions that call them, and this operator is used to reach one level up into the calling function's environment.
-We chose to wrap our examples in `\donttest{}` because we found, through testing on https://builder.r-hub.io/ and in our own continuous integration environments, that their run time varies a lot between platforms, and we cannot guarantee that all examples will run in under 5 seconds. We intentionally chose `\donttest{}` over `\dontrun{}` because this item in the R 4.0.0 changelog (https://cran.r-project.org/doc/manuals/r-devel/NEWS.html) seems to indicate that \donttest will be ignored by CRAN's automated checks:
+We chose to wrap our examples in `\donttest{}` because we found, through testing on https://r-hub.github.io/rhub/ and in our own continuous integration environments, that their run time varies a lot between platforms, and we cannot guarantee that all examples will run in under 5 seconds. We intentionally chose `\donttest{}` over `\dontrun{}` because this item in the R 4.0.0 changelog (https://cran.r-project.org/doc/manuals/r-devel/NEWS.html) seems to indicate that \donttest will be ignored by CRAN's automated checks:
 > "`R CMD check --as-cran` now runs \donttest examples (which are run by example()) instead of instructing the tester to do so. This can be temporarily circumvented during development by setting environment variable `_R_CHECK_DONTTEST_EXAMPLES_` to a false value."
@@ -813,7 +813,7 @@ YEAR: 2016
 COPYRIGHT HOLDER: Microsoft Corporation
 ```
-Added a citation and link for [the main paper](https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision) in `DESCRIPTION`.
+Added a citation and link for [the main paper](https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html) in `DESCRIPTION`.

 ## v3.0.0-1 - Submission 3 - (August 12, 2020)
...
@@ -25,7 +25,7 @@ This data set is originally from the Mushroom data set,
 https://archive.ics.uci.edu/ml/datasets/Mushroom
 Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
-[http://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
+[https://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
 School of Information and Computer Science.
 }
 \keyword{datasets}
@@ -25,7 +25,7 @@ This data set is originally from the Mushroom data set,
 https://archive.ics.uci.edu/ml/datasets/Mushroom
 Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
-[http://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
+[https://archive.ics.uci.edu/ml]. Irvine, CA: University of California,
 School of Information and Computer Science.
 }
 \keyword{datasets}
@@ -18,7 +18,7 @@ This data set is originally from the Bank Marketing data set,
 randomly selected from 3 (older version of this dataset with less inputs).
 }
 \references{
-http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
+https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
 S. Moro, P. Cortez and P. Rita. (2014)
 A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems
...
@@ -10,7 +10,7 @@ Light Gradient Boosting Machine
 [![Azure Pipelines Build Status](https://lightgbm-ci.visualstudio.com/lightgbm-ci/_apis/build/status/Microsoft.LightGBM?branchName=master)](https://lightgbm-ci.visualstudio.com/lightgbm-ci/_build/latest?definitionId=1)
 [![Appveyor Build Status](https://ci.appveyor.com/api/projects/status/1ys5ot401m0fep6l/branch/master?svg=true)](https://ci.appveyor.com/project/guolinke/lightgbm/branch/master)
 [![Documentation Status](https://readthedocs.org/projects/lightgbm/badge/?version=latest)](https://lightgbm.readthedocs.io/)
-[![Link checks](https://github.com/microsoft/LightGBM/actions/workflows/linkchecker.yml/badge.svg?branch=master)](https://github.com/microsoft/LightGBM/actions/workflows/linkchecker.yml)
+[![Link checks](https://github.com/microsoft/LightGBM/actions/workflows/lychee.yml/badge.svg?branch=master)](https://github.com/microsoft/LightGBM/actions/workflows/lychee.yml)
 [![License](https://img.shields.io/github/license/microsoft/lightgbm.svg)](https://github.com/microsoft/LightGBM/blob/master/LICENSE)
 [![EffVer Versioning](https://img.shields.io/badge/version_scheme-EffVer-0097a7)](https://jacobtomlinson.dev/effver)
 [![StackOverflow questions](https://img.shields.io/stackexchange/stackoverflow/t/lightgbm?logo=stackoverflow&logoColor=white&label=StackOverflow%20questions)](https://stackoverflow.com/questions/tagged/lightgbm?sort=votes)
@@ -168,9 +168,9 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope
 Reference Papers
 ----------------
-Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu. "Quantized Training of Gradient Boosting Decision Trees" ([link](https://papers.nips.cc/paper_files/paper/2022/hash/77911ed9e6e864ca1a3d165b2c3cb258-Abstract.html)). Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp. 18822-18833.
+Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu. "Quantized Training of Gradient Boosting Decision Trees" ([link](https://proceedings.neurips.cc/paper/2022/hash/77911ed9e6e864ca1a3d165b2c3cb258-Abstract.html)). Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp. 18822-18833.
-Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "[LightGBM: A Highly Efficient Gradient Boosting Decision Tree](https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree)". Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.
+Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "[LightGBM: A Highly Efficient Gradient Boosting Decision Tree](https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html)". Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.
 Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu. "[A Communication-Efficient Parallel Algorithm for Decision Tree](https://proceedings.neurips.cc/paper/2016/hash/10a5ab2db37feedfdeaab192ead4ac0e-Abstract.html)". Advances in Neural Information Processing Systems 29 (NIPS 2016), pp. 1279-1287.
...
[checking]
maxrequestspersecond=0.1
recursionlevel=1
anchors=1
sslverify=0
threads=4
[filtering]
ignore=
pythonapi/lightgbm\..*\.html.*
http.*amd.com/.*
https.*dl.acm.org/doi/.*
https.*tandfonline.com/.*
ignorewarnings=http-redirected,http-robots-denied,https-certificate-error
checkextern=1
[output]
# Set to 1 if you want to see the full output, not only warnings and errors
verbose=0
[AnchorCheck]
verbose = "info"
no_progress = false
cache = false
scheme = ["http", "https", "file"]
include_mail = false
include_fragments = true
no_ignore = true
insecure = false
require_https = true
accept = ["100..=103", "200..=299"]
user_agent = "curl/7.88.1"
header = {"User-Agent" = "curl/7.88.1"}
timeout = 30
retry_wait_time = 10
max_concurrency = 10
# remove anchors from GitHub URLs to overcome https://github.com/lycheeverse/lychee/issues/1729
remap = [
'(?P<host>^https://github\.com)/(?P<path>.*)#(?P<anchor>.*)$ $host/$path/',
]
exclude = [
'^https://www\.swig\.org/download\.html$',
'^https://proceedings\.neurips\.cc/.*',
'^https://www\.amd\.com/en/support\.html$',
'^https://www\.jstor\.org/stable/2281952$',
'^https://dl\.acm\.org/doi/10\.1145/3298689\.3347033$',
'^https://packages\.ubuntu\.com/search.*',
'^https://stackoverflow\.com/.*',
'^https://.*\.stackexchange\.com/.*',
]
exclude_path = [
"(^|/)docs/.*\\.rst",
]
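The `remap` entry above is a regular-expression rewrite (Rust regex syntax, with `$name` backreferences). As a sanity check, here is a hedged Python sketch of the same rule, using Python's `\g<name>` backreference syntax; `remap_github` is my name for the helper, not a lychee API:

```python
import re

# Same rule as in .lychee.toml, translated to Python's backreference syntax:
# drop the "#anchor" part of github.com URLs and append a trailing slash.
PATTERN = re.compile(r"(?P<host>^https://github\.com)/(?P<path>.*)#(?P<anchor>.*)$")

def remap_github(url):
    return PATTERN.sub(r"\g<host>/\g<path>/", url)

print(remap_github("https://github.com/microsoft/LightGBM/blob/master/README.md#installation"))
# non-GitHub URLs pass through unchanged
print(remap_github("https://example.com/page#section"))
```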
@@ -16,7 +16,7 @@ Categorical Feature Support
 ---------------------------
 - LightGBM offers good accuracy with integer-encoded categorical features. LightGBM applies
-  `Fisher (1958) <https://www.tandfonline.com/doi/abs/10.1080/01621459.1958.10501479>`_
+  `Fisher (1958) <https://www.jstor.org/stable/2281952>`_
   to find the optimal split over categories as
   `described here <./Features.rst#optimal-split-for-categorical-features>`_. This often performs better than one-hot encoding.
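The optimal-split trick referenced above sorts categories by their gradient statistics and then scans them as if the feature were ordered. A minimal illustrative sketch follows; this is my own simplified reimplementation (the function name and the standard squared-gradient-over-Hessian leaf score are assumptions, not LightGBM's exact internals):

```python
def best_categorical_split(stats, lambda_l2=0.0):
    """Toy search for the best many-vs-many categorical split.

    stats maps each category to its (sum_gradient, sum_hessian) pair.
    """
    # Fisher's trick: order categories by sum_gradient / sum_hessian,
    # then scan split points as if the feature were numerical.
    ordered = sorted(stats, key=lambda c: stats[c][0] / stats[c][1])
    total_g = sum(g for g, _ in stats.values())
    total_h = sum(h for _, h in stats.values())

    def leaf_score(g, h):
        return (g * g) / (h + lambda_l2)

    best_gain, best_left = float("-inf"), []
    left_g = left_h = 0.0
    for i, cat in enumerate(ordered[:-1]):
        g, h = stats[cat]
        left_g += g
        left_h += h
        gain = (leaf_score(left_g, left_h)
                + leaf_score(total_g - left_g, total_h - left_h)
                - leaf_score(total_g, total_h))
        if gain > best_gain:
            best_gain, best_left = gain, ordered[:i + 1]
    return best_left, best_gain

# categories with strongly negative gradients end up on one side
print(best_categorical_split({"a": (-4.0, 2.0), "b": (1.0, 2.0), "c": (5.0, 2.0)}))
```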
@@ -46,7 +46,7 @@ LambdaRank
 Cost Efficient Gradient Boosting
 --------------------------------
-`Cost Efficient Gradient Boosting <https://papers.nips.cc/paper/6753-cost-efficient-gradient-boosting.pdf>`_ (CEGB) makes it possible to penalise boosting based on the cost of obtaining feature values.
+`Cost Efficient Gradient Boosting <https://proceedings.neurips.cc/paper/2017/hash/4fac9ba115140ac4f1c22da82aa0bc7f-Abstract.html>`_ (CEGB) makes it possible to penalise boosting based on the cost of obtaining feature values.
 CEGB penalises learning in the following ways:
 - Each time a tree is split, a penalty of ``cegb_penalty_split`` is applied.
@@ -112,4 +112,4 @@ Currently, implemented is an approach to model position bias by using an idea of
 During the training, the compound scoring function ``s(x, pos)`` is fit with a standard ranking algorithm (e.g., LambdaMART) which boils down to jointly learning the relevance component ``f(x)`` (it is later returned as an unbiased model) and the position factors ``g(pos)`` that help better explain the observed (biased) labels.
 Similar score decomposition ideas have previously been applied for classification & pointwise ranking tasks with assumptions of binary labels and binary relevance (a.k.a. "two-tower" models, refer to the papers: `Towards Disentangling Relevance and Bias in Unbiased Learning to Rank <https://arxiv.org/abs/2212.13937>`_, `PAL: a position-bias aware learning framework for CTR prediction in live recommender systems <https://dl.acm.org/doi/10.1145/3298689.3347033>`_, `A General Framework for Debiasing in CTR Prediction <https://arxiv.org/abs/2112.02767>`_).
 In LightGBM, we adapt this idea to general pairwise Learning-to-Rank with arbitrary ordinal relevance labels.
-Besides, GAMs have been used in the context of explainable ML (`Accurate Intelligible Models with Pairwise Interactions <https://www.cs.cornell.edu/~yinlou/papers/lou-kdd13.pdf>`_) to linearly decompose the contribution of each feature (and possibly their pairwise interactions) to the overall score, for subsequent analysis and interpretation of their effects in the trained models.
+Besides, GAMs have been used in the context of explainable ML (`Accurate Intelligible Models with Pairwise Interactions <https://www.cs.cornell.edu/~yinlou/projects/gam/>`_) to linearly decompose the contribution of each feature (and possibly their pairwise interactions) to the overall score, for subsequent analysis and interpretation of their effects in the trained models.
@@ -23,7 +23,7 @@ We used 5 datasets to conduct our comparison experiments. Details of data are li
 +===========+=======================+=================================================================================+=============+==========+==============================================+
 | Higgs | Binary classification | `link <https://archive.ics.uci.edu/dataset/280/higgs>`__ | 10,500,000 | 28 | last 500,000 samples were used as test set |
 +-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
-| Yahoo LTR | Learning to rank | `link <https://webscope.sandbox.yahoo.com/catalog.php?datatype=c>`__ | 473,134 | 700 | set1.train as train, set1.test as test |
+| Yahoo LTR | Learning to rank | `link <https://proceedings.mlr.press/v14/chapelle11a.html>`__ | 473,134 | 700 | set1.train as train, set1.test as test |
 +-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
 | MS LTR | Learning to rank | `link <https://www.microsoft.com/en-us/research/project/mslr/>`__ | 2,270,296 | 137 | {S1,S2,S3} as train set, {S5} as test set |
 +-----------+-----------------------+---------------------------------------------------------------------------------+-------------+----------+----------------------------------------------+
@@ -250,4 +250,4 @@ Refer to `GPU Performance <./GPU-Performance.rst>`__.
 .. _xgboost: https://github.com/dmlc/xgboost
-.. _link: http://labs.criteo.com/2013/12/download-terabyte-click-logs/
+.. _link: https://ailab.criteo.com/download-criteo-1tb-click-logs-dataset/
@@ -287,11 +287,11 @@ References
 [11] Huan Zhang, Si Si and Cho-Jui Hsieh. "`GPU Acceleration for Large-scale Tree Boosting`_." SysML Conference, 2018.
-.. _LightGBM\: A Highly Efficient Gradient Boosting Decision Tree: https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree.pdf
+.. _LightGBM\: A Highly Efficient Gradient Boosting Decision Tree: https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html
-.. _On Grouping for Maximum Homogeneity: https://www.tandfonline.com/doi/abs/10.1080/01621459.1958.10501479
+.. _On Grouping for Maximum Homogeneity: https://www.jstor.org/stable/2281952
-.. _Optimization of collective communication operations in MPICH: https://web.cels.anl.gov/~thakur/papers/ijhpca-coll.pdf
+.. _Optimization of collective communication operations in MPICH: https://www.mpich.org/2012/10/24/optimization-of-collective-communication-operations-in-mpich/
 .. _A Communication-Efficient Parallel Algorithm for Decision Tree: https://proceedings.neurips.cc/paper/2016/hash/10a5ab2db37feedfdeaab192ead4ac0e-Abstract.html
...
@@ -200,7 +200,7 @@ Huan Zhang, Si Si and Cho-Jui Hsieh. `GPU Acceleration for Large-scale Tree Boos
 .. _link3: https://www.kaggle.com/c/bosch-production-line-performance/data
-.. _link4: https://webscope.sandbox.yahoo.com/catalog.php?datatype=c
+.. _link4: https://proceedings.mlr.press/v14/chapelle11a.html
 .. _link5: https://www.microsoft.com/en-us/research/project/mslr/
...
@@ -171,4 +171,4 @@ Known issues:
 .. _GPUCapsViewer: https://www.ozone3d.net/gpu_caps_viewer/
-.. _PoCL: http://portablecl.org/
+.. _PoCL: https://portablecl.org/
@@ -183,7 +183,7 @@ Huan Zhang, Si Si and Cho-Jui Hsieh. "`GPU Acceleration for Large-scale Tree Boo
 .. _Microsoft Azure cloud computing platform: https://azure.microsoft.com/
-.. _AMDGPU-Pro: https://www.amd.com/en/support
+.. _AMDGPU-Pro: https://www.amd.com/en/support.html
 .. _Python-package Examples: https://github.com/microsoft/LightGBM/tree/master/examples/python-guide
...
@@ -162,7 +162,7 @@ Core Parameters
 - ranking application
-  - ``lambdarank``, `lambdarank <https://proceedings.neurips.cc/paper_files/paper/2006/file/af44c4c56f385c43f2529f9b1b018f6a-Paper.pdf>`__ objective. `label_gain <#label_gain>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``
+  - ``lambdarank``, `lambdarank <https://proceedings.neurips.cc/paper/2006/hash/af44c4c56f385c43f2529f9b1b018f6a-Abstract.html>`__ objective. `label_gain <#label_gain>`__ can be used to set the gain (weight) of ``int`` label and all values in ``label`` must be smaller than number of elements in ``label_gain``
   - ``rank_xendcg``, `XE_NDCG_MART <https://arxiv.org/abs/1911.09798>`__ ranking objective function, aliases: ``xendcg``, ``xe_ndcg``, ``xe_ndcg_mart``, ``xendcg_mart``
@@ -491,7 +491,7 @@ Learning Control Parameters
 - ``linear_lambda`` :raw-html:`<a id="linear_lambda" title="Permalink to this parameter" href="#linear_lambda">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, constraints: ``linear_lambda >= 0.0``
-  - linear tree regularization, corresponds to the parameter ``lambda`` in Eq. 3 of `Gradient Boosting with Piece-Wise Linear Regression Trees <https://arxiv.org/pdf/1802.05640.pdf>`__
+  - linear tree regularization, corresponds to the parameter ``lambda`` in Eq. 3 of `Gradient Boosting with Piece-Wise Linear Regression Trees <https://arxiv.org/abs/1802.05640>`__
 - ``min_gain_to_split`` :raw-html:`<a id="min_gain_to_split" title="Permalink to this parameter" href="#min_gain_to_split">&#x1F517;&#xFE0E;</a>`, default = ``0.0``, type = double, aliases: ``min_split_gain``, constraints: ``min_gain_to_split >= 0.0``
@@ -845,7 +845,7 @@ Dataset Parameters
 - ``enable_bundle`` :raw-html:`<a id="enable_bundle" title="Permalink to this parameter" href="#enable_bundle">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``is_enable_bundle``, ``bundle``
-  - set this to ``false`` to disable Exclusive Feature Bundling (EFB), which is described in `LightGBM: A Highly Efficient Gradient Boosting Decision Tree <https://papers.nips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html>`__
+  - set this to ``false`` to disable Exclusive Feature Bundling (EFB), which is described in `LightGBM: A Highly Efficient Gradient Boosting Decision Tree <https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html>`__
   - **Note**: disabling this may cause the slow training speed for sparse datasets
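Conceptually, EFB merges sparse features that are (nearly) mutually exclusive, i.e. rarely nonzero for the same row, into one feature. A toy conflict-free sketch of the greedy bundling idea (illustrative only: `bundle_exclusive_features` is a hypothetical helper, not a LightGBM function, and the real algorithm tolerates a small conflict budget and works on histograms):

```python
def bundle_exclusive_features(columns):
    """Greedy sketch of Exclusive Feature Bundling: columns whose nonzero
    entries never overlap can share one bundle."""
    bundles = []  # each bundle: (set of occupied row indices, list of column ids)
    for col_id, values in enumerate(columns):
        nonzero_rows = {i for i, v in enumerate(values) if v != 0}
        for used_rows, members in bundles:
            if used_rows.isdisjoint(nonzero_rows):
                used_rows |= nonzero_rows  # claim these rows for the bundle
                members.append(col_id)
                break
        else:
            bundles.append((nonzero_rows, [col_id]))
    return [members for _, members in bundles]

# three sparse columns: columns 0 and 1 never overlap, column 2 conflicts with both
cols = [
    [1, 0, 0, 2],
    [0, 3, 1, 0],
    [1, 1, 0, 0],
]
print(bundle_exclusive_features(cols))
```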
@@ -1192,7 +1192,7 @@ Objective Parameters
 - used only in ``lambdarank`` application
-  - controls the number of top-results to focus on during training, refer to "truncation level" in the Sec. 3 of `LambdaMART paper <https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/MSR-TR-2010-82.pdf>`__
+  - controls the number of top-results to focus on during training, refer to "truncation level" in the Sec. 3 of `LambdaMART paper <https://www.microsoft.com/en-us/research/publication/from-ranknet-to-lambdarank-to-lambdamart-an-overview/>`__
   - this parameter is closely related to the desirable cutoff ``k`` in the metric **NDCG@k** that we aim at optimizing the ranker for. The optimal setting for this parameter is likely to be slightly higher than ``k`` (e.g., ``k + 3``) to include more pairs of documents to train on, but perhaps not too high to avoid deviating too much from the desired target metric **NDCG@k**
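Since the truncation level above is chosen relative to the ``k`` in **NDCG@k**, a compact sketch of how NDCG@k itself is computed may help. This assumes the common ``2^label - 1`` gain (which matches the default ``label_gain``); the function names are mine:

```python
import math

def dcg_at_k(labels, k):
    """DCG@k with the common (2^label - 1) / log2(rank + 1) gain."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels[:k]))

def ndcg_at_k(labels, k):
    """NDCG@k: DCG of the ranked order divided by DCG of the ideal order."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

# labels listed in the order the ranker placed the documents
print(ndcg_at_k([3, 2, 0, 1], k=3))
```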
@@ -1265,7 +1265,7 @@ Metric Parameters
 - ``binary_error``, for one sample: ``0`` for correct classification, ``1`` for error classification
-  - ``auc_mu``, `AUC-mu <http://proceedings.mlr.press/v97/kleiman19a/kleiman19a.pdf>`__
+  - ``auc_mu``, `AUC-mu <https://proceedings.mlr.press/v97/kleiman19a.html>`__
 - ``multi_logloss``, log loss for multi-class classification, aliases: ``multiclass``, ``softmax``, ``multiclassova``, ``multiclass_ova``, ``ova``, ``ovr``
...