Commit 332be2d0 authored by Carlos Riquelme

Readme changes.

parent e925af84
@@ -201,8 +201,8 @@ The Deep Bayesian Bandits library includes the following algorithms (see the
 neural networks (or more generally, models) that map contexts to rewards,
 consists in randomly perturbing a point estimate trained by Stochastic
 Gradient Descent on the data. The Parameter-Noise algorithm uses a heuristic
-to control the amount of noise ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Csigma_t%5E2) it adds independently to the
-parameters representing a neural network: ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Cbar%7B%5Ctheta%7D_t%20%3D%20%5Ctheta_t%20+%20%5Cepsilon) where ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Cepsilon%20%5Csim%20%5Cmathcal%7BN%7D%280%2C%20%5Csigma_t%5E2%20%5C%20%5Cmathrm%7BId%7D%29).
+to control the amount of noise &sigma;<sub>t</sub><sup>2</sup> it adds independently to the
+parameters representing a neural network: &theta;<sub>t</sub><sup>'</sup> where ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Cepsilon%20%5Csim%20%5Cmathcal%7BN%7D%280%2C%20%5Csigma_t%5E2%20%5C%20%5Cmathrm%7BId%7D%29).
 After using ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Cbar%7B%5Ctheta%7D_t) for decision making, the following SGD
 training steps start again from ![equation](https://latex.codecogs.com/gif.download?%5Cinline%20%5Ctheta_t). The key hyperparameters to set
 are those controlling the noise heuristic.
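
The perturbation step this README passage describes, θ̄_t = θ_t + ε with ε ~ N(0, σ_t² Id), might be sketched roughly as below. This is only an illustration under stated assumptions, not the library's implementation: the function names (`perturb_parameters`, `linear_reward`, `choose_action`) are hypothetical, a linear reward model per action stands in for the neural network, and `sigma` is passed in directly because the heuristic that sets σ_t is not described in this passage.

```python
import random


def perturb_parameters(theta, sigma, rng=random):
    """Return theta_bar = theta + eps, with eps ~ N(0, sigma^2 * Id).

    Each parameter gets its own independent Gaussian draw. The input
    list is not modified, so later SGD steps can restart from theta.
    """
    return [w + rng.gauss(0.0, sigma) for w in theta]


def linear_reward(theta, context):
    """Hypothetical stand-in for the network: a dot product of weights and context."""
    return sum(w * x for w, x in zip(theta, context))


def choose_action(theta_per_action, context, sigma, rng=random):
    """Pick the action whose noisy model predicts the highest reward."""
    noisy = [perturb_parameters(theta, sigma, rng) for theta in theta_per_action]
    scores = [linear_reward(theta_bar, context) for theta_bar in noisy]
    return max(range(len(scores)), key=scores.__getitem__)
```

Returning a new list rather than perturbing in place mirrors the statement that training resumes from θ_t, not from the noisy θ̄_t used for the decision.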