" * The first four fields are *[features](https://developers.google.com/machine-learning/glossary/#feature)*: these are characteristics of an example. Here, the fields hold float numbers representing flower measurements.\n",
" * The first four fields are *[features](https://developers.google.com/machine-learning/glossary/#feature)*: these are characteristics of an example. Here, the fields hold float numbers representing flower measurements.\n",
" * The last column is the *[label](https://developers.google.com/machine-learning/glossary/#label)*: this is the value we want to predict. For this dataset, it's an integer value of 0, 1, or 2 that corresponds to a flower name.\n",
" * The last column is the *[label](https://developers.google.com/machine-learning/glossary/#label)*: this is the value we want to predict. For this dataset, it's an integer value of 0, 1, or 2 that corresponds to a flower name.\n",
"Each label is associated with string name (for example, \"setosa\"), but machine learning typically relies on numeric values. The label numbers are mapped to a named representation, such as:\n",
"Each label is associated with string name (for example, \"setosa\"), but machine learning typically relies on numeric values. The label numbers are mapped to a named representation, such as:\n",
"\n",
"\n",
"* `0`: Iris setosa\n",
"* `0`: Iris setosa\n",
...
@@ -319,6 +315,19 @@
...
@@ -319,6 +315,19 @@
"For more information about features and labels, see the [ML Terminology section of the Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/framing/ml-terminology)."
"For more information about features and labels, see the [ML Terminology section of the Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/framing/ml-terminology)."
"TensorFlow's [Dataset API](https://www.tensorflow.org/programmers_guide/datasets) handles many common cases for feeding data into a model. This is a high-level API for reading data and transforming it into a form used for training. See the [Datasets Quick Start guide](https://www.tensorflow.org/get_started/datasets_quickstart) for more information.\n",
"\n",
"\n",
"\n",
"Since our dataset is a CSV-formatted text file, we'll parse the feature and label values into a format our Python model can use. Each line—or row—in the file is passed to the `parse_csv` function which grabs the first four feature fields and combines them into a single tensor. Then, the last field is parsed as the label. The function returns *both* the `features` and `label` tensors:"
"Since our dataset is a CSV-formatted text file, we'll use the the [`make_csv_dataset`](https://www.tensorflow.org/api_docs/python/tf/contrib/data/make_csv_dataset) function to easily parse the data into a suitable format. This function is meant to generate fata for training models so the default behavior is to shuffle the data (`shuffle=True, shuffle_buffer_size=10000`), and repeat the dataset forever (`num_epochs=None`). Note the [batch size](https://developers.google.com/machine-learning/glossary/#batch_size) parameter."
"This function returns a `tf.data.Dataset` of `(features, label)` pairs, where `features` is a `{'column_name': value}` dictionary.\n",
"\n",
"TensorFlow's [Dataset API](https://www.tensorflow.org/programmers_guide/datasets) handles many common cases for feeding data into a model. This is a high-level API for reading data and transforming it into a form used for training. See the [Datasets Quick Start guide](https://www.tensorflow.org/get_started/datasets_quickstart) for more information.\n",
"\n",
"This program uses [tf.data.TextLineDataset](https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset) to load a CSV-formatted text file and is parsed with our `parse_csv` function. A [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) represents an input pipeline as a collection of elements and a series of transformations that act on those elements. Transformation methods are chained together or called sequentially—just make sure to keep a reference to the returned `Dataset` object.\n",
"\n",
"\n",
"Training works best if the examples are in random order. Use `tf.data.Dataset.shuffle` to randomize entries, setting `buffer_size` to a value larger than the number of examples (120 in this case). To train the model faster, the dataset's *[batch size](https://developers.google.com/machine-learning/glossary/#batch_size)* is set to `32` examples to train at once."
"With eager execution enabled, these `Dataset` objects are iterable. Let's look at a batch of features:"
"To simplify the model building, let's repackage the features dictionary into an array with whape ``(batch_size,num_features)`.\n",
"\n",
"\n",
"# View a single example entry from a batch\n",
"To do this we'll write a simple function using the [`tf.stack`](https://www.tensorflow.org/api_docs/python/tf/data/dataset/map) method to pack the features into a single array. Then we'll use the [`tf.data.Dataset.map`](https://www.tensorflow.org/api_docs/python/tf/data/dataset/map) method to apply this function to each `(features,label)` pair in the dataset. :\n"
"features, label = iter(train_dataset).next()\n",
]
"print(\"example features:\", features[0])\n",
},
"print(\"example label:\", label[0])"
{
"metadata": {
"id": "jm932WINcaGU",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"def pack_features_vector(features,labels):\n",
" features = tf.stack([features[name] for name in feature_names],\n",
"The features of this dataset arrays with shape `(batch_size, num_features)`. Let's look at the first 10 examples:"
]
},
{
"metadata": {
"id": "kex9ibEek6Tr",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"features,labels = next(iter(train_dataset))\n",
" \n",
"features[:10]"
],
],
"execution_count": 0,
"execution_count": 0,
"outputs": []
"outputs": []
...
@@ -440,21 +486,16 @@
...
@@ -440,21 +486,16 @@
"source": [
"source": [
"### Create a model using Keras\n",
"### Create a model using Keras\n",
"\n",
"\n",
"The TensorFlow [tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras) API is the preferred way to create models and layers. This makes it easy to build models and experiment while Keras handles the complexity of connecting everything together. See the [Keras documentation](https://keras.io/) for details.\n",
"The TensorFlow [tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras) API is the preferred way to create models and layers. This makes it easy to build models and experiment while Keras handles the complexity of connecting everything together.\n",
"\n",
"\n",
"The [tf.keras.Sequential](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) model is a linear stack of layers. Its constructor takes a list of layer instances, in this case, two [Dense](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) layers with 10 nodes each, and an output layer with 3 nodes representing our label predictions. The first layer's `input_shape` parameter corresponds to the amount of features from the dataset, and is required."
"The [tf.keras.Sequential](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) model is a linear stack of layers. Its constructor takes a list of layer instances, in this case, two [Dense](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense) layers with 10 nodes each, and an output layer with 3 nodes representing our label predictions. The first layer's `input_shape` parameter corresponds to the number of features from the dataset, and is required."
]
]
},
},
{
{
"metadata": {
"metadata": {
"id": "2fZ6oL2ig3ZK",
"id": "2fZ6oL2ig3ZK",
"colab_type": "code",
"colab_type": "code",
"colab": {
"colab": {}
"autoexec": {
"startup": false,
"wait_interval": 0
}
}
},
},
"cell_type": "code",
"cell_type": "code",
"source": [
"source": [
...
@@ -474,11 +515,62 @@
...
@@ -474,11 +515,62 @@
},
},
"cell_type": "markdown",
"cell_type": "markdown",
"source": [
"source": [
"The *[activation function](https://developers.google.com/machine-learning/crash-course/glossary#activation_function)* determines the output of a single neuron to the next layer. This is loosely based on how brain neurons are connected. There are many [available activations](https://www.tensorflow.org/api_docs/python/tf/keras/activations), but [ReLU](https://developers.google.com/machine-learning/crash-course/glossary#ReLU) is common for hidden layers.\n",
"The *[activation function](https://developers.google.com/machine-learning/crash-course/glossary#activation_function)* determines the output shape of each node. These non-linearities are important as without them the model would be equivalent to a single layer. There are many [available activations](https://www.tensorflow.org/api_docs/python/tf/keras/activations), but [ReLU](https://developers.google.com/machine-learning/crash-course/glossary#ReLU) is common for hidden layers.\n",
"\n",
"\n",
"The ideal number of hidden layers and neurons depends on the problem and the dataset. Like many aspects of machine learning, picking the best shape of the neural network requires a mixture of knowledge and experimentation. As a rule of thumb, increasing the number of hidden layers and neurons typically creates a more powerful model, which requires more data to train effectively."
"The ideal number of hidden layers and neurons depends on the problem and the dataset. Like many aspects of machine learning, picking the best shape of the neural network requires a mixture of knowledge and experimentation. As a rule of thumb, increasing the number of hidden layers and neurons typically creates a more powerful model, which requires more data to train effectively."
]
]
},
},
{
"metadata": {
"id": "2wFKnhWCpDSS",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Let's have a quick look at what this model does to a batch of features:"
]
},
{
"metadata": {
"id": "xe6SQ5NrpB-I",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"prediction = model(features)\n",
"prediction[:5]"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "wxyXOhwVr5S3",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"For each example it returns a *[logit](https://developers.google.com/machine-learning/crash-course/glossary#logits)* score for each class. \n",
"\n",
"You can calculate the probability that the model assigns to each class using the [`tf.nn.softmax`](https://www.tensorflow.org/api_docs/python/tf/nn/softmax) function.\n",
"\n",
"The model hasn't been trained yet, so these aren't very good predictions."
]
},
{
"metadata": {
"id": "-Jzm_GoErz8B",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"tf.nn.softmax(prediction[:5])"
],
"execution_count": 0,
"outputs": []
},
{
{
"metadata": {
"metadata": {
"id": "Vzq2E5J2QMtw",
"id": "Vzq2E5J2QMtw",
...
@@ -504,45 +596,74 @@
...
@@ -504,45 +596,74 @@
"\n",
"\n",
"Both training and evaluation stages need to calculate the model's *[loss](https://developers.google.com/machine-learning/crash-course/glossary#loss)*. This measures how off a model's predictions are from the desired label, in other words, how bad the model is performing. We want to minimize, or optimize, this value.\n",
"Both training and evaluation stages need to calculate the model's *[loss](https://developers.google.com/machine-learning/crash-course/glossary#loss)*. This measures how off a model's predictions are from the desired label, in other words, how bad the model is performing. We want to minimize, or optimize, this value.\n",
"\n",
"\n",
"Our model will calculate its loss using the [tf.losses.sparse_softmax_cross_entropy](https://www.tensorflow.org/api_docs/python/tf/losses/sparse_softmax_cross_entropy) function which takes the model's prediction and the desired label. The returned loss value is progressively larger as the prediction gets worse."
"Our model will calculate its loss using the [tf.losses.sparse_softmax_cross_entropy](https://www.tensorflow.org/api_docs/python/tf/losses/sparse_softmax_cross_entropy) function which takes the model's prediction and the desired label, and returns the average loss across the examples."
"The `grad` function uses the `loss` function and the [tf.GradientTape](https://www.tensorflow.org/api_docs/python/tf/GradientTape) to record operations that compute the *[gradients](https://developers.google.com/machine-learning/crash-course/glossary#gradient)* used to optimize our model. For more examples of this, see the [eager execution guide](https://www.tensorflow.org/programmers_guide/eager)."
"To perform the optimization we will use the [`tf.GradientTape`](https://www.tensorflow.org/api_docs/python/tf/GradientTape) context to calculate the *[gradients](https://developers.google.com/machine-learning/crash-course/glossary#gradient)* used to optimize our model. For more examples of this, see the [eager execution guide](https://www.tensorflow.org/programmers_guide/eager)."
"An *[optimizer](https://developers.google.com/machine-learning/crash-course/glossary#optimizer)* applies the computed gradients to the model's variables to minimize the `loss` function. You can think of a curved surface (see Figure 3) and we want to find its lowest point by walking around. The gradients point in the direction of steepest ascent—so we'll travel the opposite way and move down the hill. By iteratively calculating the loss and gradient for each batch, we'll adjust the model during training. Gradually, the model will find the best combination of weights and bias to minimize loss. And the lower the loss, the better the model's predictions.\n",
"An *[optimizer](https://developers.google.com/machine-learning/crash-course/glossary#optimizer)* applies the computed gradients to the model's variables to minimize the `loss` function. You can think of the loss function as a curved surface (see Figure 3) and we want to find its lowest point by walking around. The gradients point in the direction of steepest ascent—so we'll travel the opposite way and move down the hill. By iteratively calculating the loss and gradient for each batch, we'll adjust the model during training. Gradually, the model will find the best combination of weights and bias to minimize loss. And the lower the loss, the better the model's predictions.\n",
"\n",
"\n",
"<table>\n",
"<table>\n",
" <tr><td>\n",
" <tr><td>\n",
...
@@ -567,20 +688,58 @@
...
@@ -567,20 +688,58 @@
"TensorFlow has many [optimization algorithms](https://www.tensorflow.org/api_guides/python/train) available for training. This model uses the [tf.train.GradientDescentOptimizer](https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer) that implements the *[stochastic gradient descent](https://developers.google.com/machine-learning/crash-course/glossary#gradient_descent)* (SGD) algorithm. The `learning_rate` sets the step size to take for each iteration down the hill. This is a *hyperparameter* that you'll commonly adjust to achieve better results."
"TensorFlow has many [optimization algorithms](https://www.tensorflow.org/api_guides/python/train) available for training. This model uses the [tf.train.GradientDescentOptimizer](https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer) that implements the *[stochastic gradient descent](https://developers.google.com/machine-learning/crash-course/glossary#gradient_descent)* (SGD) algorithm. The `learning_rate` sets the step size to take for each iteration down the hill. This is a *hyperparameter* that you'll commonly adjust to achieve better results."
]
]
},
},
{
"metadata": {
"id": "XkUd6UiZa_dF",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Let's setup the optimizer, and the `global_step` counter:"