Commit 66df4538 authored by Mark Daoust's avatar Mark Daoust

Remove Regularization section.

I can't find values of regularization strength that matter for this problem.
parent 7a210706
@@ -6,10 +6,10 @@
     "name": "wide.ipynb",
     "version": "0.3.2",
     "provenance": [],
-    "toc_visible": true,
     "collapsed_sections": [
      "MWW1TyjaecRh"
-    ]
+    ],
+    "toc_visible": true
   },
   "kernelspec": {
    "display_name": "Python 3",
@@ -33,7 +33,6 @@
     "id": "mOtR1FzCef-u",
     "colab_type": "code",
     "colab": {}
    },
    "cell_type": "code",
    "source": [
@@ -1077,9 +1076,8 @@
    " [age_buckets, 'education', 'occupation'], hash_bucket_size=1000),\n",
    "]\n",
    "\n",
-   "model_dir = tempfile.mkdtemp()\n",
    "model = tf.estimator.LinearClassifier(\n",
-   " model_dir=model_dir, feature_columns=base_columns + crossed_columns)"
+   " model_dir=tempfile.mkdtemp(), feature_columns=base_columns + crossed_columns)"
   ],
   "execution_count": 0,
   "outputs": []
@@ -1138,8 +1136,8 @@
   "source": [
    "results = model.evaluate(test_inpf)\n",
    "clear_output()\n",
-   "for key in sorted(results):\n",
-   "  print('%s: %0.2f' % (key, results[key]))"
+   "for key, value in sorted(results.items()):\n",
+   "  print('%s: %0.2f' % (key, value))"
   ],
   "execution_count": 0,
   "outputs": []
@@ -1199,67 +1197,7 @@
   "source": [
    "If you'd like to see a working end-to-end example, you can download our\n",
    "[example code](https://github.com/tensorflow/models/tree/master/official/wide_deep/census_main.py)\n",
-   "and set the `model_type` flag to `wide`.\n",
-   "\n",
-   "## Adding Regularization to Prevent Overfitting\n",
-   "\n",
-   "Regularization is a technique used to avoid **overfitting**. Overfitting happens\n",
-   "when your model does well on the data it is trained on, but worse on test data\n",
-   "that the model has not seen before, such as live traffic. Overfitting generally\n",
-   "occurs when a model is excessively complex, such as having too many parameters\n",
-   "relative to the number of observed training data. Regularization allows for you\n",
-   "to control your model's complexity and makes the model more generalizable to\n",
-   "unseen data.\n",
-   "\n",
-   "In the Linear Model library, you can add L1 and L2 regularizations to the model\n",
-   "as:"
-  ]
- },
- {
-  "metadata": {
-   "id": "cVv2HsqocYxO",
-   "colab_type": "code",
-   "colab": {}
-  },
-  "cell_type": "code",
-  "source": [
-   "#TODO(markdaoust): is the regularization strength here not working?\n",
-   "model = tf.estimator.LinearClassifier(\n",
-   " model_dir=model_dir, feature_columns=base_columns + crossed_columns,\n",
-   " optimizer=tf.train.FtrlOptimizer(\n",
-   " learning_rate=0.1,\n",
-   " l1_regularization_strength=0.1,\n",
-   " l2_regularization_strength=0.1))\n",
-   "\n",
-   "model.train(train_inpf)\n",
-   "\n",
-   "results = model.evaluate(test_inpf)\n",
-   "clear_output()\n",
-   "for key in sorted(results):\n",
-   "  print('%s: %0.2f' % (key, results[key]))"
-  ],
-  "execution_count": 0,
-  "outputs": []
- },
- {
-  "metadata": {
-   "id": "5AqvPEQwcYxU",
-   "colab_type": "text"
-  },
-  "cell_type": "markdown",
-  "source": [
-   "One important difference between L1 and L2 regularization is that L1\n",
-   "regularization tends to make model weights stay at zero, creating sparser\n",
-   "models, whereas L2 regularization also tries to make the model weights closer to\n",
-   "zero but not necessarily zero. Therefore, if you increase the strength of L1\n",
-   "regularization, you will have a smaller model size because many of the model\n",
-   "weights will be zero. This is often desirable when the feature space is very\n",
-   "large but sparse, and when there are resource constraints that prevent you from\n",
-   "serving a model that is too large.\n",
-   "\n",
-   "In practice, you should try various combinations of L1, L2 regularization\n",
-   "strengths and find the best parameters that best control overfitting and give\n",
-   "you a desirable model size."
+   "and set the `model_type` flag to `wide`."
  ]
 },
 {
@@ -1332,4 +1270,4 @@
   ]
  }
 ]
}
\ No newline at end of file
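The removed markdown cell claimed that L1 regularization drives weights exactly to zero (sparser models) while L2 only shrinks them toward zero. A hedged sketch of that distinction, using a single proximal-style shrinkage step with a hypothetical strength `lam` (these helper functions are illustrative, not the removed TensorFlow code):

```python
# Illustrative only: one shrinkage step per penalty, showing why L1 yields
# exact zeros while L2 yields small-but-nonzero weights.
def l1_shrink(w, lam):
    # Soft-thresholding: any weight with |w| <= lam becomes exactly 0.
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def l2_shrink(w, lam):
    # Multiplicative shrinkage: weights move toward 0 but never reach it.
    return w / (1.0 + lam)

weights = [0.05, -0.3, 1.2]
print([l1_shrink(w, 0.1) for w in weights])  # the small weight snaps to 0.0
print([l2_shrink(w, 0.1) for w in weights])  # every weight stays nonzero
```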