"These regularized models don't perform very differently than the base model. Let's look ar the models' weight distributions to better see the effect of the regularization:"
"These regularized models don't perform much better than the base model. Let's look at the model's weight distributions to better see the effect of the regularization:"
]
},
{
...
@@ -1332,7 +1332,7 @@
},
"cell_type": "markdown",
"source": [
"The models have many zero-vlaued weights caused by unused hash bins (There are many more hash bins than categories in some columns). We will mask these weights when viewing the weight distributions:"
"The models have many zero-valued weights caused by unused hash bins (there are many more hash bins than categories in some columns). We can mask these weights when viewing the weight distributions:"
]
},
{
...
@@ -1396,7 +1396,7 @@
},
"cell_type": "markdown",
"source": [
"Both types of regularization squeeze the distribution of weights towards zero. L2 regularization has a greater effect in the tails of the distribution eliminating extreme weights. L1 regularization produces more exactly-zero values (In this case it sets ~200 to zero)."
"Both types of regularization squeeze the distribution of weights towards zero. L2 regularization has a greater effect in the tails of the distribution eliminating extreme weights. L1 regularization produces more exactly-zero values, in this case it sets ~200 to zero."