**Intro to Neural Nets.ipynb**, created by Areshkew and last revised February 21, 2025. [Open in Colab](https://colab.research.google.com/gist/Areshkew/16abc7d8bbdcb6e35ad1720d5a89dfd8/intro-to-neural-nets.ipynb).
```python
#@title Copyright 2020 Google LLC. Double-click here for license information.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```

# Colabs

Machine Learning Crash Course uses Colaboratories (Colabs) for all programming exercises. Colab is Google's implementation of [Jupyter Notebook](https://jupyter.org/). For more information about Colabs and how to use them, go to [Welcome to Colaboratory](https://research.google.com/colaboratory).

# Introduction to Neural Nets

This Colab builds a deep neural network to perform a more sophisticated kind of regression than the linear models in the earlier Colabs.

## Learning Objectives:

After doing this Colab, you'll know how to do the following:

  * Create a simple deep neural network.
  * Tune the hyperparameters for a simple deep neural network.

## The Dataset

Like several of the previous Colabs, this Colab uses the [California Housing Dataset](https://developers.google.com/machine-learning/crash-course/california-housing-data-description).

## Import relevant modules

The following hidden code cell imports the modules needed to run the rest of this Colab.

```python
#@title Import relevant modules
!pip install tensorflow==2.15.1
import numpy as np
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt
import seaborn as sns

# The following lines adjust the granularity of reporting.
pd.options.display.max_rows = 10
pd.options.display.float_format = "{:.1f}".format

print("Imported modules.")
```

## Load the dataset

Like most of the previous Colab exercises, this exercise uses the California Housing Dataset. The following code cell loads the separate .csv files and creates the following two pandas DataFrames:

* `train_df`, which contains the training set
* `test_df`, which contains the test set

```python
train_df = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv")
train_df = train_df.reindex(np.random.permutation(train_df.index)) # shuffle the examples
test_df = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_test.csv")
```
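Before moving on, it can be worth a quick, optional check that the download and the shuffle behaved as expected. The snippet below is not part of the original exercise; it only inspects the DataFrames.

```python
# Optional sanity check (not part of the original exercise): confirm both
# DataFrames loaded and that the training examples were shuffled.
print(train_df.shape, test_df.shape)  # row/column counts of the two CSVs
print(train_df.head())                # the index should look randomized
train_df.describe()                   # per-column summary statistics
```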
## Represent data

The following code cell creates preprocessing layers outputting three features:

* `latitude` X `longitude` (a feature cross)
* `median_income`
* `population`

This code cell specifies the features that you'll ultimately train the model on and how each of those features will be represented. The transformations (collected in `preprocessing_layers`) aren't actually applied until you pass data to the model, which happens when you train it.

We'll use `preprocessing_layers` for both our linear regression model and our neural network model.

(The [`keras.FeatureSpace`](https://keras.io/examples/structured_data/structured_data_classification_with_feature_space) utility offers an alternative to building individual Keras preprocessing layers -- give it a try, if you're feeling adventurous! A sketch appears after the following code cell.)

```python
# Keras Input tensors of float values.
inputs = {
    'latitude':
        tf.keras.layers.Input(shape=(1,), dtype=tf.float32,
                              name='latitude'),
    'longitude':
        tf.keras.layers.Input(shape=(1,), dtype=tf.float32,
                              name='longitude'),
    'median_income':
        tf.keras.layers.Input(shape=(1,), dtype=tf.float32,
                              name='median_income'),
    'population':
        tf.keras.layers.Input(shape=(1,), dtype=tf.float32,
                              name='population')
}

# Create a Normalization layer to normalize the median_income data.
median_income = tf.keras.layers.Normalization(
    name='normalization_median_income',
    axis=None)
median_income.adapt(train_df['median_income'])
median_income = median_income(inputs.get('median_income'))

# Create a Normalization layer to normalize the population data.
population = tf.keras.layers.Normalization(
    name='normalization_population',
    axis=None)
population.adapt(train_df['population'])
population = population(inputs.get('population'))

# Create a list of numbers representing the bucket boundaries for latitude.
# Because we're using a Normalization layer, values for latitude and longitude
# will be in the range of approximately -3 to 3 (representing the Z score).
# We'll create 20 buckets, which requires 21 bucket boundaries (hence, 20+1).
latitude_boundaries = np.linspace(-3, 3, 20+1)

# Create a Normalization layer to normalize the latitude data.
latitude = tf.keras.layers.Normalization(
    name='normalization_latitude',
    axis=None)
latitude.adapt(train_df['latitude'])
latitude = latitude(inputs.get('latitude'))

# Create a Discretization layer to separate the latitude data into buckets.
latitude = tf.keras.layers.Discretization(
    bin_boundaries=latitude_boundaries,
    name='discretization_latitude')(latitude)

# Create a list of numbers representing the bucket boundaries for longitude.
longitude_boundaries = np.linspace(-3, 3, 20+1)

# Create a Normalization layer to normalize the longitude data.
longitude = tf.keras.layers.Normalization(
    name='normalization_longitude',
    axis=None)
longitude.adapt(train_df['longitude'])
longitude = longitude(inputs.get('longitude'))

# Create a Discretization layer to separate the longitude data into buckets.
longitude = tf.keras.layers.Discretization(
    bin_boundaries=longitude_boundaries,
    name='discretization_longitude')(longitude)

# Cross the latitude and longitude features into a single one-hot vector.
feature_cross = tf.keras.layers.HashedCrossing(
    # num_bins can be adjusted: Higher values improve accuracy, lower values
    # improve performance.
    num_bins=len(latitude_boundaries) * len(longitude_boundaries),
    output_mode='one_hot',
    name='cross_latitude_longitude')([latitude, longitude])

# Concatenate our inputs into a single tensor.
preprocessing_layers = tf.keras.layers.Concatenate()(
    [feature_cross, median_income, population])

print("Preprocessing layers defined.")
```
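As a follow-up to the `keras.FeatureSpace` suggestion above, here is a minimal, hedged sketch of how the same features might be declared declaratively instead of with hand-built layers. It is not part of the original exercise: the bucket counts, `crossing_dim`, and batch size are illustrative choices, and the exact `FeatureSpace` API may vary between Keras versions.

```python
# Hedged sketch: the same feature pipeline expressed with keras.utils.FeatureSpace.
# Bucket counts and crossing_dim are illustrative, not values from the exercise.
from tensorflow.keras.utils import FeatureSpace

feature_space = FeatureSpace(
    features={
        "latitude": FeatureSpace.float_discretized(num_bins=20),
        "longitude": FeatureSpace.float_discretized(num_bins=20),
        "median_income": FeatureSpace.float_normalized(),
        "population": FeatureSpace.float_normalized(),
    },
    crosses=[
        FeatureSpace.cross(feature_names=("latitude", "longitude"),
                           crossing_dim=400),
    ],
    output_mode="concat",
)

# adapt() expects a tf.data.Dataset of feature dictionaries (no labels).
feature_columns = ["latitude", "longitude", "median_income", "population"]
features_ds = tf.data.Dataset.from_tensor_slices(
    dict(train_df[feature_columns])).batch(32)
feature_space.adapt(features_ds)

# feature_space.get_inputs() and feature_space.get_encoded_features() could
# then stand in for `inputs` and `preprocessing_layers` in the models below.
```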
## Build a linear regression model as a baseline

Before creating a deep neural net, find a [baseline](https://developers.google.com/machine-learning/glossary/#baseline) loss by running a simple linear regression model that uses the preprocessing layers you just created.

```python
#@title Define the plotting function.

def plot_the_loss_curve(epochs, mse_training, mse_validation):
  """Plot a curve of loss vs. epoch."""

  plt.figure()
  plt.xlabel("Epoch")
  plt.ylabel("Mean Squared Error")

  plt.plot(epochs, mse_training, label="Training Loss")
  plt.plot(epochs, mse_validation, label="Validation Loss")

  # mse_training is a pandas Series, so convert it to a list first.
  merged_mse_lists = mse_training.tolist() + mse_validation
  highest_loss = max(merged_mse_lists)
  lowest_loss = min(merged_mse_lists)
  top_of_y_axis = highest_loss * 1.03
  bottom_of_y_axis = lowest_loss * 0.97

  plt.ylim([bottom_of_y_axis, top_of_y_axis])
  plt.legend()
  plt.show()

print("Defined the plot_the_loss_curve function.")
```

```python
#@title Define functions to create and train a linear regression model
def create_model(my_inputs, my_outputs, my_learning_rate):
  """Create and compile a simple linear regression model."""
  model = tf.keras.Model(inputs=my_inputs, outputs=my_outputs)

  # Construct the layers into a model that TensorFlow can execute.
  model.compile(
      optimizer=tf.keras.optimizers.Adam(learning_rate=my_learning_rate),
      loss="mean_squared_error",
      metrics=[tf.keras.metrics.MeanSquaredError()])

  return model


def train_model(model, dataset, epochs, batch_size, label_name, validation_split=0.1):
  """Feed a dataset into the model in order to train it."""

  # Split the dataset into features and label.
  features = {name: np.array(value) for name, value in dataset.items()}
  # Normalize the label using the Normalization layer adapted in the next
  # code cell (that cell must run before train_model is called).
  label = train_median_house_value_normalized(
      np.array(features.pop(label_name)))
  history = model.fit(x=features, y=label, batch_size=batch_size,
                      epochs=epochs, shuffle=True,
                      validation_split=validation_split)

  # Get details that will be useful for plotting the loss curve.
  epochs = history.epoch
  hist = pd.DataFrame(history.history)
  mse = hist["mean_squared_error"]

  return epochs, mse, history.history

print("Defined the create_model and train_model functions.")
```

```python
#@title Define normalized label columns
# Create Normalization layers to normalize the median_house_value data.
# Because median_house_value is our label (i.e., the target value we're
# predicting), these layers won't be added to our model.
train_median_house_value_normalized = tf.keras.layers.Normalization(axis=None)
train_median_house_value_normalized.adapt(
    np.array(train_df['median_house_value']))

test_median_house_value_normalized = tf.keras.layers.Normalization(axis=None)
test_median_house_value_normalized.adapt(
    np.array(test_df['median_house_value']))
```
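Because the label is converted to a Z-score in the cell above, predictions and loss values come out in standard-deviation units rather than dollars. The snippet below is an optional, hedged sketch of how to map a normalized prediction back to the original scale; reading `.mean` and `.variance` off the adapted layer is an assumption about the TF 2.x `Normalization` internals, not something the exercise relies on.

```python
# Hedged sketch: convert a Z-score prediction back to dollars using the
# statistics learned by the label's Normalization layer during adapt().
# Accessing .mean and .variance is an assumption about TF 2.x internals.
label_mean = float(train_median_house_value_normalized.mean)
label_std = float(train_median_house_value_normalized.variance) ** 0.5

def to_dollars(normalized_prediction):
  """Map a normalized (Z-score) prediction back to median_house_value units."""
  return normalized_prediction * label_std + label_mean

# A prediction of 0.5 is half a standard deviation above the training mean.
print(to_dollars(0.5))
```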
```python
#@title Define linear regression model outputs
def get_outputs_linear_regression():
  # Create the Dense output layer.
  dense_output = tf.keras.layers.Dense(units=1,
                                       name='dense_output')(preprocessing_layers)

  # Define an output dictionary we'll send to the model constructor.
  outputs = {
    'dense_output': dense_output
  }
  return outputs
```

Run the following code cell to invoke the functions defined in the preceding code cells.

**Note:** Because we've scaled all the input data, **including the label**, the resulting loss values will be *much less* than models in previous Colabs (e.g., [Representation with a Feature Cross](https://colab.sandbox.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/representation_with_a_feature_cross.ipynb)).

**Note:** Depending on the version of TensorFlow, running this cell might generate WARNING messages. Please ignore these warnings.

```python
# The following variables are the hyperparameters.
learning_rate = 0.01
epochs = 15
batch_size = 1000
label_name = "median_house_value"

# Split the original training set into a reduced training set and a
# validation set.
validation_split = 0.2

outputs = get_outputs_linear_regression()

# Establish the model's topography.
my_model = create_model(inputs, outputs, learning_rate)

# Train the model on the normalized training set.
epochs, mse, history = train_model(my_model, train_df, epochs, batch_size,
                                   label_name, validation_split)
plot_the_loss_curve(epochs, mse, history["val_mean_squared_error"])

test_features = {name: np.array(value) for name, value in test_df.items()}
test_label = test_median_house_value_normalized(
    np.array(test_features.pop(label_name)))  # isolate the label
print("\n Evaluate the linear regression model against the test set:")
my_model.evaluate(x=test_features, y=test_label, batch_size=batch_size,
                  return_dict=True)
```

## Define a deep neural net model

The `get_outputs_dnn` function defines the topography of the deep neural net (DNN), specifying the following:

* The number of [layers](https://developers.google.com/machine-learning/glossary/#layer) in the deep neural net.
* The number of [nodes](https://developers.google.com/machine-learning/glossary/#node) in each layer.

The `get_outputs_dnn` function also defines the [activation function](https://developers.google.com/machine-learning/glossary/#activation_function) of each layer.

The first `Dense` layer takes our previously defined `preprocessing_layers` as input.

```python
def get_outputs_dnn():
  # Create a Dense layer with 20 nodes.
  dense_output = tf.keras.layers.Dense(units=20,
                                       activation='relu',
                                       name='hidden_dense_layer_1')(preprocessing_layers)
  # Create a Dense layer with 12 nodes.
  dense_output = tf.keras.layers.Dense(units=12,
                                       activation='relu',
                                       name='hidden_dense_layer_2')(dense_output)
  # Create the Dense output layer.
  dense_output = tf.keras.layers.Dense(units=1,
                                       name='dense_output')(dense_output)

  # Define an output dictionary we'll send to the model constructor.
  outputs = {
    'dense_output': dense_output
  }

  return outputs
```
## Call the functions to build and train a deep neural net

Okay, it is time to actually train the deep neural net. If time permits, experiment with the three hyperparameters to see if you can reduce the loss against the test set.

```python
# The following variables are the hyperparameters.
learning_rate = 0.01
epochs = 20
batch_size = 1000

# Specify the label
label_name = "median_house_value"

# Split the original training set into a reduced training set and a
# validation set.
validation_split = 0.2

dnn_outputs = get_outputs_dnn()

# Establish the model's topography.
my_model = create_model(
    inputs,
    dnn_outputs,
    learning_rate)

# Train the model on the normalized training set. We're passing the entire
# normalized training set, but the model will only use the features
# defined in our inputs.
epochs, mse, history = train_model(my_model, train_df, epochs,
                                   batch_size, label_name, validation_split)
plot_the_loss_curve(epochs, mse, history["val_mean_squared_error"])

# After building a model against the training set, test that model
# against the test set.
test_features = {name: np.array(value) for name, value in test_df.items()}
test_label = test_median_house_value_normalized(
    np.array(test_features.pop(label_name)))  # isolate the label
print("\n Evaluate the new model against the test set:")
my_model.evaluate(x=test_features, y=test_label, batch_size=batch_size,
                  return_dict=True)
```

## Task 1: Compare the two models

How did the deep neural net perform against the baseline linear regression model?

```python
#@title Double-click to view a possible answer

# Assuming that the linear model converged and
# the deep neural net model also converged, please
# compare the test set loss for each.
# In our experiments, the loss of the deep neural
# network model was consistently lower than
# that of the linear regression model, which
# suggests that the deep neural network model
# will make better predictions than the
# linear regression model.
```
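Because both training cells assign to the same `my_model` variable, the baseline's test metrics are overwritten by the time you reach this task. One hedged way to make the comparison explicit is to rebuild and retrain both models side by side; the variable names and loop below are illustrative, not part of the original notebook, and the exact numbers will vary from run to run.

```python
# Hedged sketch (illustrative, not from the original notebook): train both
# models in one place and print their test losses side by side.
linear_model = create_model(inputs, get_outputs_linear_regression(), learning_rate)
dnn_model = create_model(inputs, get_outputs_dnn(), learning_rate)

for name, model in [("linear regression", linear_model),
                    ("deep neural net", dnn_model)]:
  train_model(model, train_df, epochs=15, batch_size=batch_size,
              label_name=label_name, validation_split=validation_split)
  metrics = model.evaluate(x=test_features, y=test_label,
                           batch_size=batch_size, return_dict=True)
  print(f"{name}: test MSE = {metrics['mean_squared_error']:.4f}")
```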
## Task 2: Optimize the deep neural network's topography

Experiment with the number of layers of the deep neural network and the number of nodes in each layer. Aim to achieve both of the following goals:

* Lower the loss against the test set.
* Minimize the overall number of nodes in the deep neural net.

The two goals may be in conflict.

```python
#@title Double-click to view a possible answer

# Many answers are possible. We noticed the
# following trends:
#   * Two layers outperformed one layer, but
#     three layers did not perform significantly
#     better than two layers.
#     In other words, two layers seemed best.
#   * Setting the topography as follows produced
#     reasonably good results with relatively few
#     nodes:
#       * 10 nodes in the first layer.
#       * 6 nodes in the second layer.
#     As the number of nodes in each layer dropped
#     below the preceding values, test loss increased.
#     However, depending on your application, hardware
#     constraints, and the relative pain inflicted
#     by a less accurate model, a smaller network
#     (for example, 6 nodes in the first layer and
#     4 nodes in the second layer) might be
#     acceptable.
```
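For the experiments this task asks for, it can be convenient to parameterize the topography instead of editing `get_outputs_dnn` by hand each time. The helper and candidate layer sizes below are a hedged sketch: the `build_dnn_outputs` name and the size list are illustrative, not part of the original notebook.

```python
# Hedged sketch: build the DNN outputs from a list of hidden-layer sizes so
# different topographies can be compared quickly. build_dnn_outputs and the
# candidate sizes are illustrative, not part of the original notebook.
def build_dnn_outputs(hidden_layer_sizes):
  x = preprocessing_layers
  for i, units in enumerate(hidden_layer_sizes, start=1):
    x = tf.keras.layers.Dense(units=units, activation='relu',
                              name=f'hidden_dense_layer_{i}')(x)
  return {'dense_output': tf.keras.layers.Dense(units=1, name='dense_output')(x)}

for sizes in [(20, 12), (10, 6), (6, 4)]:
  candidate = create_model(inputs, build_dnn_outputs(sizes), learning_rate)
  train_model(candidate, train_df, epochs=20, batch_size=batch_size,
              label_name=label_name, validation_split=validation_split)
  metrics = candidate.evaluate(x=test_features, y=test_label,
                               batch_size=batch_size, return_dict=True)
  print(sizes, "test MSE:", round(metrics['mean_squared_error'], 4))
```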