Forked from zhaleh-rahimi/2.1Prediction1Dregression_v3.ipynb
Created August 17, 2022 16:02
Revisions: Zhaleh Rahimi created this gist on Apr 21, 2021.
# Linear Regression 1D: Prediction

## Objective

- How to make predictions for multiple inputs.
- How to use the linear class to build more complex models.
- How to build a custom module.

## Table of Contents

In this lab, we will review how to make a prediction in several different ways by using PyTorch.

- Prediction
- Class Linear
- Build Custom Modules

Estimated Time Needed: **15 min**

## Preparation

The following are the libraries we are going to use for this lab.

```python
# These are the libraries that will be used for this lab.
!pip install torch
import torch
```

## Prediction

Let us create the following expressions:

$b=-1,\ w=2$

$\hat{y}=-1+2x$

First, define the parameters:

```python
# Define w = 2 and b = -1 for y = wx + b
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(-1.0, requires_grad=True)
```

Then define the function `forward(x)`, which makes the prediction:

```python
# Function forward(x) for prediction
def forward(x):
    yhat = w * x + b
    return yhat
```

Let's make the following prediction at *x = 1*:

$\hat{y}=-1+2(1)=1$

```python
# Predict y = 2x - 1 at x = 1
x = torch.tensor([[1.0]])
yhat = forward(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[1.]], grad_fn=<AddBackward0>)
```

Now, let us make the prediction for multiple inputs:

![Linear Regression Multiple Input Samples](https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.1.2.png)

Let us construct the `x` tensor first and check its shape:

```python
# Create the x tensor and check its shape
x = torch.tensor([[1.0], [2.0]])
print("The shape of x: ", x.shape)
# The shape of x:  torch.Size([2, 1])
```

Now make the prediction:

```python
# Make the prediction of y = 2x - 1 at x = [1, 2]
yhat = forward(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[1.],
#         [3.]], grad_fn=<AddBackward0>)
```

The result is the same as in the image above.

### Practice

Make a prediction with the following `x` tensor, using the `w` and `b` from above.

```python
# Practice: Make a prediction of y = 2x - 1 at x = [[1.0], [2.0], [3.0]]
x = torch.tensor([[1.0], [2.0], [3.0]])
yhat = forward(x)
print("The prediction: ", yhat)
# tensor([[1.],
#         [3.],
#         [5.]], grad_fn=<AddBackward0>)
```

## Class Linear

The `Linear` class can be used to make a prediction. We can also use it to build more complex models. Let's import the module:

```python
# Import the Linear class
from torch.nn import Linear
```

Set the random seed, because the parameters are randomly initialized:

```python
# Set the random seed
torch.manual_seed(1)
```

Let us create the linear object by using the constructor. The parameters are randomly created; let us print them out to see *w* and *b*. The parameters of a `torch.nn.Module` model are accessed with `lr.parameters()`:

```python
# Create a linear regression model and print out the parameters
lr = Linear(in_features=1, out_features=1, bias=True)
print("Parameters w and b: ", list(lr.parameters()))
# Parameters w and b:  [Parameter containing:
# tensor([[0.5153]], requires_grad=True), Parameter containing:
# tensor([-0.4414], requires_grad=True)]
```

This is equivalent to the following expression:

$b=-0.4414,\ w=0.5153$

$\hat{y}=-0.4414+0.5153x$

The method `state_dict()` returns a Python dictionary mapping each parameter name to its tensor:

```python
print("Python dictionary: ", lr.state_dict())
print("keys: ", lr.state_dict().keys())
print("values: ", lr.state_dict().values())
# Python dictionary:  OrderedDict([('weight', tensor([[0.5153]])), ('bias', tensor([-0.4414]))])
# keys:  odict_keys(['weight', 'bias'])
# values:  odict_values([tensor([[0.5153]]), tensor([-0.4414])])
```

The keys correspond to the names of the attributes, and the values correspond to the parameter values:

```python
print("weight:", lr.weight)
print("bias:", lr.bias)
```

Now let us make a single prediction at *x = [[1.0]]*:

```python
# Make the prediction at x = [[1.0]]
x = torch.tensor([[1.0]])
yhat = lr(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[0.0739]], grad_fn=<AddmmBackward>)
```

Similarly, you can make multiple predictions:

![Linear Class Sample with Multiple Inputs](https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.1.2vector_function.png)

Use the model `lr(x)` to predict the result:

```python
# Create the prediction using the linear model
x = torch.tensor([[1.0], [2.0]])
yhat = lr(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[0.0739],
#         [0.5891]], grad_fn=<AddmmBackward>)
```

### Practice

Make a prediction with the following `x` tensor using the linear regression model `lr`:

```python
# Practice: Use the linear regression model object lr to make the prediction
x = torch.tensor([[1.0], [2.0], [3.0]])
yhat = lr(x)
print("The prediction: ", yhat)
# tensor([[0.0739],
#         [0.5891],
#         [1.1044]], grad_fn=<AddmmBackward>)
```

## Build Custom Modules

Now, let's build a custom module. We can use this method to build more complex models later on.

First, import the following library:

```python
# Library for this section
from torch import nn
```

Now, let us define the class:

```python
# Custom linear regression class
class LR(nn.Module):

    # Constructor
    def __init__(self, input_size, output_size):
        # Inherit from the parent class
        super(LR, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    # Prediction function
    def forward(self, x):
        out = self.linear(x)
        return out
```

Create an object by using the constructor. Print out the parameters and the model:

```python
# Create the linear regression model and print out the parameters
lr = LR(1, 1)
print("The parameters: ", list(lr.parameters()))
print("Linear model: ", lr.linear)
# The parameters:  [Parameter containing:
# tensor([[-0.1939]], requires_grad=True), Parameter containing:
# tensor([0.4694], requires_grad=True)]
# Linear model:  Linear(in_features=1, out_features=1, bias=True)
```

Let us try to make a prediction for a single input sample:

```python
# Try our custom linear regression model with a single input
x = torch.tensor([[1.0]])
yhat = lr(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[0.2755]], grad_fn=<AddmmBackward>)
```

Now, let us try another example with multiple samples:

```python
# Try our custom linear regression model with multiple inputs
x = torch.tensor([[1.0], [2.0]])
yhat = lr(x)
print("The prediction: ", yhat)
# The prediction:  tensor([[0.2755],
#         [0.0816]], grad_fn=<AddmmBackward>)
```

The parameters are also stored in an ordered dictionary:

```python
print("Python dictionary: ", lr.state_dict())
print("keys: ", lr.state_dict().keys())
print("values: ", lr.state_dict().values())
# Python dictionary:  OrderedDict([('linear.weight', tensor([[-0.1939]])), ('linear.bias', tensor([0.4694]))])
# keys:  odict_keys(['linear.weight', 'linear.bias'])
# values:  odict_values([tensor([[-0.1939]]), tensor([0.4694])])
```

### Practice

Create an object `lr1` from the class we created before and make a prediction with the following tensor:

```python
# Practice: Use the LR class to create a model and make a prediction for the following tensor
x = torch.tensor([[1.0], [2.0], [3.0]])
lr1 = LR(1, 1)
yhat = lr1(x)
yhat
```

## About the Authors

[Joseph Santarcangelo](https://www.linkedin.com/in/joseph-s-50398b136/) has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.

Other contributors: [Michelle Carey](https://www.linkedin.com/in/michelleccarey/), [Mavis Zhou](www.linkedin.com/in/jiahui-mavis-zhou-a4537814a)

## Change Log

| Date (YYYY-MM-DD) | Version | Changed By | Change Description |
| ----------------- | ------- | ---------- | ------------------ |
| 2020-09-21 | 2.0 | Shubham | Migrated Lab to Markdown and added to course repo in GitLab |

© IBM Corporation 2020. All rights reserved.
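As a side note, the random parameters of a `Linear` object can be overwritten in place so that it reproduces the manual model $\hat{y}=2x-1$ from the Prediction section. This is a small sketch that is not part of the original lab: `fill_` and `torch.no_grad()` are standard PyTorch, while the specific values come from the earlier cells.

```python
import torch
from torch.nn import Linear

# Build a Linear layer and overwrite its random parameters
# with the w = 2, b = -1 used in the Prediction section.
lr = Linear(in_features=1, out_features=1, bias=True)
with torch.no_grad():  # suspend autograd while editing parameters in place
    lr.weight.fill_(2.0)
    lr.bias.fill_(-1.0)

x = torch.tensor([[1.0], [2.0], [3.0]])
print(lr(x))  # same predictions as forward(x): 1., 3., 5.
```

Equivalently, a state dict with keys `'weight'` and `'bias'` could be loaded with `lr.load_state_dict(...)`, which is how saved parameters are usually restored.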
# Linear Regression 1D: Training Two Parameters

## Objective

- How to train the model and visualize the loss results.

## Table of Contents

In this lab, you will train a model with PyTorch by using the data that we created. The model will have a slope and a bias.

- Make Some Data
- Create the Model and Cost Function (Total Loss)
- Train the Model

Estimated Time Needed: **20 min**

## Preparation

We'll need the following libraries:

```python
# These are the libraries we are going to use in the lab.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
```

The class `plot_error_surfaces` is just to help you visualize the data space and the parameter space during training; it has nothing to do with PyTorch.

```python
# The class for plotting the diagrams

class plot_error_surfaces(object):

    # Constructor
    def __init__(self, w_range, b_range, X, Y, n_samples=30, go=True):
        W = np.linspace(-w_range, w_range, n_samples)
        B = np.linspace(-b_range, b_range, n_samples)
        w, b = np.meshgrid(W, B)
        Z = np.zeros((n_samples, n_samples))
        count1 = 0
        self.y = Y.numpy()
        self.x = X.numpy()
        for w1, b1 in zip(w, b):
            count2 = 0
            for w2, b2 in zip(w1, b1):
                # Mean squared error of the line w2*x + b2 on the data
                Z[count1, count2] = np.mean((self.y - (w2 * self.x + b2)) ** 2)
                count2 += 1
            count1 += 1
        self.Z = Z
        self.w = w
        self.b = b
        self.W = []
        self.B = []
        self.LOSS = []
        self.n = 0
        if go == True:
            plt.figure()
            plt.figure(figsize=(7.5, 5))
            plt.axes(projection='3d').plot_surface(self.w, self.b, self.Z,
                                                   rstride=1, cstride=1,
                                                   cmap='viridis', edgecolor='none')
            plt.title('Cost/Total Loss Surface')
            plt.xlabel('w')
            plt.ylabel('b')
            plt.show()
            plt.figure()
            plt.title('Cost/Total Loss Surface Contour')
            plt.xlabel('w')
            plt.ylabel('b')
            plt.contour(self.w, self.b, self.Z)
            plt.show()

    # Setter
    def set_para_loss(self, W, B, loss):
        self.n = self.n + 1
        self.W.append(W)
        self.B.append(B)
        self.LOSS.append(loss)

    # Plot diagram
    def final_plot(self):
        ax = plt.axes(projection='3d')
        ax.plot_wireframe(self.w, self.b, self.Z)
        ax.scatter(self.W, self.B, self.LOSS, c='r', marker='x', s=200, alpha=1)
        plt.figure()
        plt.contour(self.w, self.b, self.Z)
        plt.scatter(self.W, self.B, c='r', marker='x')
        plt.xlabel('w')
        plt.ylabel('b')
        plt.show()

    # Plot diagram
    def plot_ps(self):
        plt.subplot(121)
        plt.plot(self.x, self.y, 'ro', label="training points")
        plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label="estimated line")
        plt.xlabel('x')
        plt.ylabel('y')
        plt.ylim((-10, 15))
        plt.title('Data Space Iteration: ' + str(self.n))

        plt.subplot(122)
        plt.contour(self.w, self.b, self.Z)
        plt.scatter(self.W, self.B, c='r', marker='x')
        plt.title('Total Loss Surface Contour Iteration: ' + str(self.n))
        plt.xlabel('w')
        plt.ylabel('b')
        plt.show()
```

## Make Some Data

Import PyTorch:

```python
# Import the PyTorch library
import torch
```

Start by generating values from -3 to 3 that create a line with a slope of 1 and a bias of -1. This is the line that you need to estimate.

```python
# Create f(X) with a slope of 1 and a bias of -1
X = torch.arange(-3, 3, 0.1).view(-1, 1)
f = 1 * X - 1
```

Now, add some noise to the data:

```python
# Add noise
Y = f + 0.1 * torch.randn(X.size())
```

Plot the line and `Y` with noise:

```python
# Plot the line and the points with noise
plt.plot(X.numpy(), Y.numpy(), 'rx', label='y')
plt.plot(X.numpy(), f.numpy(), label='f')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
```

## Create the Model and Cost Function (Total Loss)

Define the `forward` function:

```python
# Define the forward function
def forward(x):
    return w * x + b
```

Define the cost or criterion function (MSE):

```python
# Define the MSE loss function
def criterion(yhat, y):
    return torch.mean((yhat - y) ** 2)
```

Create a `plot_error_surfaces` object to visualize the data space and the parameter space during training:

```python
# Create plot_error_surfaces for viewing the data
get_surface = plot_error_surfaces(15, 15, X, Y, 30)
```

## Train the Model

Create the model parameters `w` and `b`, setting `requires_grad` to `True` because we must learn them from the data:

```python
# Define the parameters w, b for y = wx + b
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)
```

Set the learning rate to 0.1 and create an empty list `LOSS` for storing the loss at each iteration:

```python
# Define the learning rate and an empty list for the loss at each iteration
lr = 0.1
LOSS = []
```

Define the `train_model` function to train the model:

```python
# The function for training the model
def train_model(iter):

    for epoch in range(iter):

        # make a prediction
        Yhat = forward(X)

        # calculate the loss
        loss = criterion(Yhat, Y)

        # section for plotting
        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())
        if epoch % 3 == 0:
            get_surface.plot_ps()

        # store the scalar loss value in the list LOSS
        LOSS.append(loss.item())

        # backward pass: compute the gradient of the loss
        # with respect to all learnable parameters
        loss.backward()

        # update the slope and bias
        w.data = w.data - lr * w.grad.data
        b.data = b.data - lr * b.grad.data

        # zero the gradients before running the next backward pass
        w.grad.data.zero_()
        b.grad.data.zero_()
```

Run 15 iterations of gradient descent (**bug:** the data-space plot is one iteration ahead of the parameter-space plot):

```python
# Train the model with 15 iterations
train_model(15)
```

Plot the total loss/cost surface, with the loss values for the visited parameters in red:

```python
# Plot the loss result
get_surface.final_plot()
plt.plot(LOSS)
plt.tight_layout()
plt.xlabel("Epoch/Iterations")
plt.ylabel("Cost")
```

### Practice

Experiment with a learning rate of 0.2 and the following parameters. Run 15 iterations.

```python
# Practice: train and plot the result with lr = 0.2 and the following parameters
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)
lr = 0.2
LOSS2 = []
```

Solution:

```python
def my_train_model(iter):
    for epoch in range(iter):
        Yhat = forward(X)
        loss = criterion(Yhat, Y)
        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())
        if epoch % 3 == 0:
            get_surface.plot_ps()
        LOSS2.append(loss.item())
        loss.backward()
        w.data = w.data - lr * w.grad.data
        b.data = b.data - lr * b.grad.data
        w.grad.data.zero_()
        b.grad.data.zero_()

my_train_model(15)
```

Plot `LOSS` and `LOSS2` to compare the total loss:

```python
# Practice: Plot LOSS and LOSS2 to compare the total loss
plt.plot(LOSS, label="LOSS")
plt.plot(LOSS2, label="LOSS2")
plt.tight_layout()
plt.xlabel("Epoch/Iterations")
plt.ylabel("Cost")
plt.legend()
```
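Stripped of the plotting calls, the training loop of this lab condenses into a short self-contained sketch. This is our own abridgement of the cells above (same data, starting parameters, and update rule), not additional course code; the seed and iteration count are our choices. Run long enough, the parameters approach the true slope 1 and bias -1.

```python
import torch

torch.manual_seed(0)  # seed is our choice; the lab does not set one here

# Data: the line f = 1*x - 1 plus noise, as in "Make Some Data"
X = torch.arange(-3, 3, 0.1).view(-1, 1)
Y = 1 * X - 1 + 0.1 * torch.randn(X.size())

# Parameters and learning rate from "Train the Model"
w = torch.tensor(-15.0, requires_grad=True)
b = torch.tensor(-10.0, requires_grad=True)
lr = 0.1

def criterion(yhat, y):
    return torch.mean((yhat - y) ** 2)

for epoch in range(60):
    loss = criterion(w * X + b, Y)
    loss.backward()          # accumulate gradients in w.grad, b.grad
    with torch.no_grad():    # update parameters outside the autograd graph
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()       # reset gradients for the next iteration
        b.grad.zero_()

print(w.item(), b.item())  # close to the true slope 1 and bias -1
```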
{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<center>\n", " <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n", "</center>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h1>Linear Regression 1D: Training Two Parameter Stochastic Gradient Descent (SGD)</h1>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2>Objective</h2><ul><li> How to use SGD (Stochastic Gradient Descent) to train the model.</li></ul> \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2>Table of Contents</h2>\n", "<p>In this lab, you will practice training a model by using stochastic gradient descent.</p>\n", "\n", "<ul>\n", " <li><a href=\"#Makeup_Data\">Make Some Data</a></li>\n", " <li><a href=\"#Model_Cost\">Create the Model and Cost Function (Total Loss)</a></li>\n", " <li><a href=\"#BGD\">Train the Model: Batch Gradient Descent</a></li>\n", " <li><a href=\"#SGD\">Train the Model: Stochastic Gradient Descent</a></li>\n", " <li><a href=\"#SGD_Loader\">Train the Model: Stochastic Gradient Descent with DataLoader</a></li>\n", "</ul>\n", "<p>Estimated Time Needed: <strong>30 min</strong></p>\n", "\n", "<hr>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2>Preparation</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll need the following libraries: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# These are the libraries we are going to use in the lab.\n", "\n", "import torch\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "from mpl_toolkits import mplot3d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 
class <code>plot_error_surfaces</code> is just to help you visualize the data space and the parameter space during training and has nothing to do with PyTorch.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The class for plotting the diagram\n", "\n", "class plot_error_surfaces(object):\n", " \n", " # Constructor\n", " def __init__(self, w_range, b_range, X, Y, n_samples = 30, go = True):\n", " W = np.linspace(-w_range, w_range, n_samples)\n", " B = np.linspace(-b_range, b_range, n_samples)\n", " w, b = np.meshgrid(W, B) \n", " Z = np.zeros((30, 30))\n", " count1 = 0\n", " self.y = Y.numpy()\n", " self.x = X.numpy()\n", " for w1, b1 in zip(w, b):\n", " count2 = 0\n", " for w2, b2 in zip(w1, b1):\n", " Z[count1, count2] = np.mean((self.y - (w2 * self.x + b2)) ** 2)\n", " count2 += 1\n", " count1 += 1\n", " self.Z = Z\n", " self.w = w\n", " self.b = b\n", " self.W = []\n", " self.B = []\n", " self.LOSS = []\n", " self.n = 0\n", " if go == True:\n", " plt.figure()\n", " plt.figure(figsize = (7.5, 5))\n", " plt.axes(projection = '3d').plot_surface(self.w, self.b, self.Z, rstride = 1, cstride = 1, cmap = 'viridis', edgecolor = 'none')\n", " plt.title('Loss Surface')\n", " plt.xlabel('w')\n", " plt.ylabel('b')\n", " plt.show()\n", " plt.figure()\n", " plt.title('Loss Surface Contour')\n", " plt.xlabel('w')\n", " plt.ylabel('b')\n", " plt.contour(self.w, self.b, self.Z)\n", " plt.show()\n", " \n", " # Setter\n", " def set_para_loss(self, W, B, loss):\n", " self.n = self.n + 1\n", " self.W.append(W)\n", " self.B.append(B)\n", " self.LOSS.append(loss)\n", " \n", " # Plot diagram\n", " def final_plot(self): \n", " ax = plt.axes(projection = '3d')\n", " ax.plot_wireframe(self.w, self.b, self.Z)\n", " ax.scatter(self.W, self.B, self.LOSS, c = 'r', marker = 'x', s = 200, alpha = 1)\n", " plt.figure()\n", " plt.contour(self.w, self.b, self.Z)\n", " plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n", " plt.xlabel('w')\n", " 
plt.ylabel('b')\n", " plt.show()\n", " \n", " # Plot diagram\n", " def plot_ps(self):\n", " plt.subplot(121)\n", " plt.plot(self.x, self.y, 'ro', label = \"training points\")\n", " plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label = \"estimated line\")\n", " plt.xlabel('x')\n", " plt.ylabel('y')\n", " plt.ylim((-10, 15))\n", " plt.title('Data Space Iteration: ' + str(self.n))\n", " plt.subplot(122)\n", " plt.contour(self.w, self.b, self.Z)\n", " plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n", " plt.title('Loss Surface Contour Iteration: ' + str(self.n))\n", " plt.xlabel('w')\n", " plt.ylabel('b')\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2 id=\"Makeup_Data\">Make Some Data</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set random seed: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set random seed\n", "\n", "torch.manual_seed(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate values from <i>-3</i> to <i>3</i> that create a line with a slope of <i>1</i> and a bias of <i>-1</i>. This is the line that you need to estimate. 
Add some noise to the data:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Setup the actual data and simulated data\n", "\n", "X = torch.arange(-3, 3, 0.1).view(-1, 1)\n", "f = 1 * X - 1\n", "Y = f + 0.1 * torch.randn(X.size())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the results:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot out the data dots and line\n", "\n", "plt.plot(X.numpy(), Y.numpy(), 'rx', label = 'y')\n", "plt.plot(X.numpy(), f.numpy(), label = 'f')\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2 id=\"Model_Cost\">Create the Model and Cost Function (Total Loss)</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the <code>forward</code> function:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the forward function\n", "\n", "def forward(x):\n", " return w * x + b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the cost or criterion function (MSE): \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the MSE Loss function\n", "\n", "def criterion(yhat, y):\n", " return torch.mean((yhat - y) ** 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a <code> plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create plot_error_surfaces for viewing the data\n", "\n", "get_surface = plot_error_surfaces(15, 13, X, Y, 30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space 
for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2 id=\"BGD\">Train the Model: Batch Gradient Descent</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create model parameters <code>w</code>, <code>b</code> by setting the argument <code>requires_grad</code> to True because the system must learn them.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the parameters w, b for y = wx + b\n", "\n", "w = torch.tensor(-15.0, requires_grad = True)\n", "b = torch.tensor(-10.0, requires_grad = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the learning rate to 0.1 and create an empty list <code>LOSS_BGD</code> for storing the loss for each iteration.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define learning rate and create an empty list for containing the loss for each iteration.\n", "\n", "lr = 0.1\n", "LOSS_BGD = []" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the <code>train_model</code> function for training the model.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The function for training the model\n", "\n", "def train_model(iter):\n", " \n", " # Loop\n", " for epoch in range(iter):\n", " \n", " # make a prediction\n", " Yhat = forward(X)\n", " \n", " # calculate the loss \n", " loss = criterion(Yhat, Y)\n", "\n", " # Section for plotting\n", " get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n", " get_surface.plot_ps()\n", " \n", " # store the loss in the list LOSS_BGD\n", " LOSS_BGD.append(loss.tolist())\n", " \n", " # backward pass: compute gradient of the loss with respect to all the learnable parameters\n", " loss.backward()\n", " \n", " # update parameters slope and bias\n", " w.data = w.data - lr * w.grad.data\n", " b.data = b.data - lr * b.grad.data\n", " \n", " # zero the 
gradients before running the backward pass\n", " w.grad.data.zero_()\n", " b.grad.data.zero_()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run 10 epochs of batch gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Train the model with 10 iterations\n", "\n", "train_model(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2 id=\"SGD\">Train the Model: Stochastic Gradient Descent</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create plot_error_surfaces for viewing the data\n", "\n", "get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the <code>train_model_SGD</code> function for training the model.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The function for training the model\n", "\n", "LOSS_SGD = []\n", "w = torch.tensor(-15.0, requires_grad = True)\n", "b = torch.tensor(-10.0, requires_grad = True)\n", "\n", "def train_model_SGD(iter):\n", " \n", " # Loop\n", " for epoch in range(iter):\n", " \n", " # SGD is an approximation of our true total loss/cost; in this line of code we calculate our true loss/cost and store it\n", " Yhat = forward(X)\n", "\n", " # store the loss \n", " LOSS_SGD.append(criterion(Yhat, Y).tolist())\n", " \n", " for x, y in zip(X, Y):\n", " \n", " # make a prediction\n", " yhat = forward(x)\n", " \n", " # calculate the loss \n", " loss = criterion(yhat, y)\n", "\n", " # Section for plotting\n", " 
get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n", " \n", " # backward pass: compute gradient of the loss with respect to all the learnable parameters\n", " loss.backward()\n", " \n", " # update parameters slope and bias\n", " w.data = w.data - lr * w.grad.data\n", " b.data = b.data - lr * b.grad.data\n", "\n", " # zero the gradients before running the backward pass\n", " w.grad.data.zero_()\n", " b.grad.data.zero_()\n", " \n", " #plot surface and data space after each epoch \n", " get_surface.plot_ps()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run 10 epochs of stochastic gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Train the model with 10 iterations\n", "\n", "train_model_SGD(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the loss of batch gradient descent with that of SGD.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot out the LOSS_BGD and LOSS_SGD\n", "\n", "plt.plot(LOSS_BGD,label = \"Batch Gradient Descent\")\n", "plt.plot(LOSS_SGD,label = \"Stochastic Gradient Descent\")\n", "plt.xlabel('epoch')\n", "plt.ylabel('Cost/ total loss')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2 id=\"SGD_Loader\">SGD with Dataset DataLoader</h2>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the module for building a dataset class: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the library for DataLoader\n", "\n", "from torch.utils.data import Dataset, DataLoader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a dataset class:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Dataset Class\n", "\n", "class Data(Dataset):\n", " \n", " # Constructor\n", " def __init__(self):\n", " self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n", " self.y = 1 * self.x - 1\n", " self.len = self.x.shape[0]\n", " \n", " # Getter\n", " def __getitem__(self,index): \n", " return self.x[index], self.y[index]\n", " \n", " # Return the length\n", " def __len__(self):\n", " return self.len" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a dataset object and check the length of the dataset.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create the dataset and check the length\n", "\n", "dataset = Data()\n", "print(\"The length of dataset: \", len(dataset))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Obtain the first training point: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Print the first point\n", "\n", "x, y = dataset[0]\n", "print(\"(\", x, \", \", y, \")\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, obtain the first three training points: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Print the first 3 points\n", "\n", "x, y = dataset[0:3]\n", "print(\"The first 3 x: \", x)\n", "print(\"The first 3 y: \", y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create plot_error_surfaces for viewing the data\n", "\n", "get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a <code>DataLoader</code> object by using the constructor: \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create DataLoader\n", "\n", "trainloader = DataLoader(dataset = dataset, batch_size = 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the <code>train_model_DataLoader</code> function for training the model.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The function for training the model\n", "\n", "w = torch.tensor(-15.0,requires_grad=True)\n", "b = torch.tensor(-10.0,requires_grad=True)\n", "LOSS_Loader = []\n", "\n", "def train_model_DataLoader(epochs):\n", " \n", " # Loop\n", " for epoch in range(epochs):\n", " \n", " # SGD is an approximation of our true total loss/cost; in this line of code we calculate our true loss/cost and store it\n", " Yhat = forward(X)\n", " \n", " # store the loss \n", " LOSS_Loader.append(criterion(Yhat, Y).tolist())\n", " \n", " for x, y in trainloader:\n", " \n", " # make a prediction\n", " yhat = forward(x)\n", " \n", " # calculate the loss\n", " loss = criterion(yhat, y)\n", " \n", " # Section for plotting\n", " get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n", " \n", " # Backward pass: compute gradient of the loss with respect to all the learnable parameters\n", " loss.backward()\n", " \n", " # Update parameters slope and bias\n", " w.data = w.data - lr * w.grad.data\n", " b.data = b.data - lr * b.grad.data\n", " \n", " # Clear gradients \n", " w.grad.data.zero_()\n", " b.grad.data.zero_()\n", " \n", " #plot surface and data space after each epoch \n", " get_surface.plot_ps()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run 10 epochs of stochastic gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. 
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Run 10 iterations\n", "\n", "train_model_DataLoader(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the loss of batch gradient descent with that of SGD. Note that SGD converges to a minimum faster; that is, it decreases faster. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot the LOSS_BGD and LOSS_Loader\n", "\n", "plt.plot(LOSS_BGD,label=\"Batch Gradient Descent\")\n", "plt.plot(LOSS_Loader,label=\"Stochastic Gradient Descent with DataLoader\")\n", "plt.xlabel('epoch')\n", "plt.ylabel('Cost/ total loss')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h3>Practice</h3>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For practice, try to use SGD with DataLoader to train the model with 10 iterations. Store the total loss in <code>LOSS</code>. We are going to use it in the next question.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Practice: Use SGD with trainloader to train the model and store the total loss in LOSS\n", "\n", "LOSS = []\n", "w = torch.tensor(-12.0, requires_grad = True)\n", "b = torch.tensor(-10.0, requires_grad = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click <b>here</b> for the solution.\n", "\n", "<!-- \n", "def my_train_model(epochs):\n", " for epoch in range(epochs):\n", " Yhat = forward(X)\n", " LOSS.append(criterion(Yhat, Y).tolist())\n", " for x, y in trainloader:\n", " yhat = forward(x)\n", " loss = criterion(yhat, y)\n", " get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n", " loss.backward()\n", " w.data = w.data - lr * w.grad.data\n", " b.data = b.data - lr * b.grad.data\n", " w.grad.data.zero_()\n", " b.grad.data.zero_()\n", " get_surface.plot_ps()\n", "my_train_model(10)\n", "-->\n" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "Plot the total loss\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Practice: Plot the total loss using LOSS\n", "\n", "plt.plot(LOSS,label = \"Stochastic Gradient Descent\")\n", "plt.xlabel('iteration')\n", "plt.ylabel('Cost/ total loss')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Double-click **here** for the solution.\n", "\n", "<!-- \n", "plt.plot(LOSS,label = \"Stochastic Gradient Descent\")\n", "plt.xlabel('iteration')\n", "plt.ylabel('Cost/ total loss')\n", "plt.legend()\n", "plt.show()\n", "-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--Empty Space for separating topics-->\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<h2>About the Authors:</h2> \n", "\n", "<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. 
Joseph has been working for IBM since he completed his PhD.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n", "\n", "Thanks to: Andrew Kin, Alessandro Barboza\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Change Log\n", "\n", "| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n", "| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n", "| 2020-09-23 | 2.0 | Shubham | Migrated Lab to Markdown and added to course repo in GitLab |\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<hr>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## <h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. </h3>\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 2 }
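The SGD lab above updates <code>w</code> and <code>b</code> by hand inside the DataLoader loop. The same training is commonly written with <code>torch.nn.Linear</code>, <code>torch.nn.MSELoss</code>, and <code>torch.optim.SGD</code>; these modules are not used in the lab itself, so the following is only a hedged, self-contained sketch of that standard alternative on the lab's toy data:

```python
import torch
from torch.utils.data import Dataset, DataLoader

torch.manual_seed(1)

# Same toy data as the lab: y = 1*x - 1 plus Gaussian noise
class Data(Dataset):
    def __init__(self):
        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)
        self.y = 1 * self.x - 1 + 0.1 * torch.randn(self.x.size())
        self.len = self.x.shape[0]

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return self.len

dataset = Data()
trainloader = DataLoader(dataset=dataset, batch_size=1)  # batch_size=1 -> SGD

model = torch.nn.Linear(1, 1)                 # replaces the manual w, b
criterion = torch.nn.MSELoss()                # replaces the handwritten MSE
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

LOSS_SGD = []
for epoch in range(10):
    # record the total loss over the whole dataset once per epoch
    LOSS_SGD.append(criterion(model(dataset.x), dataset.y).item())
    for x, y in trainloader:
        optimizer.zero_grad()                 # replaces w.grad.data.zero_() etc.
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()                      # replaces the manual w.data/b.data updates
```

`optimizer.step()` applies exactly the update the lab writes out by hand (parameter minus learning rate times gradient), so after 10 epochs the learned weight and bias land near the true slope 1 and bias -1.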
{"cells":[{"cell_type":"markdown","metadata":{},"source":["<center>\n"," <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n","</center>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h1>Linear Regression 1D: Training Two Parameter Mini-Batch Gradient Descent</h1>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Objective</h2><ul><li> How to use Mini-Batch Gradient Descent to train a model.</li></ul> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Table of Contents</h2>\n","<p>In this Lab, you will practice training a model by using Mini-Batch Gradient Descent.</p>\n","\n","<ul>\n"," <li><a href=\"#Makeup_Data\">Make Some Data</a></li>\n"," <li><a href=\"#Model_Cost\">Create the Model and Cost Function (Total Loss)</a></li>\n"," <li><a href=\"#BGD\">Train the Model: Batch Gradient Descent</a></li>\n"," <li><a href=\"#SGD\">Train the Model: Stochastic Gradient Descent with Dataset DataLoader</a></li>\n"," <li><a href=\"#Mini5\">Train the Model: Mini Batch Gradient Descent: Batch Size Equals 5</a></li>\n"," <li><a href=\"#Mini10\">Train the Model: Mini Batch Gradient Descent: Batch Size Equals 10</a></li>\n","</ul>\n","<p>Estimated Time Needed: <strong>30 min</strong></p>\n","\n","<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Preparation</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["We'll need the following libraries:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import the libraries we need for this lab\n","\n","import numpy as np\n","import matplotlib.pyplot as plt\n","from mpl_toolkits import mplot3d"]},{"cell_type":"markdown","metadata":{},"source":["The class 
<code>plot_error_surfaces</code> is just to help you visualize the data space and the parameter space during training and has nothing to do with PyTorch. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# The class for plotting the diagrams\n","\n","class plot_error_surfaces(object):\n"," \n"," # Constructor\n"," def __init__(self, w_range, b_range, X, Y, n_samples = 30, go = True):\n"," W = np.linspace(-w_range, w_range, n_samples)\n"," B = np.linspace(-b_range, b_range, n_samples)\n"," w, b = np.meshgrid(W, B) \n"," Z = np.zeros((30, 30))\n"," count1 = 0\n"," self.y = Y.numpy()\n"," self.x = X.numpy()\n"," for w1, b1 in zip(w, b):\n"," count2 = 0\n"," for w2, b2 in zip(w1, b1):\n"," Z[count1, count2] = np.mean((self.y - (w2 * self.x + b2)) ** 2)\n"," count2 += 1\n"," count1 += 1\n"," self.Z = Z\n"," self.w = w\n"," self.b = b\n"," self.W = []\n"," self.B = []\n"," self.LOSS = []\n"," self.n = 0\n"," if go == True:\n"," plt.figure()\n"," plt.figure(figsize = (7.5, 5))\n"," plt.axes(projection = '3d').plot_surface(self.w, self.b, self.Z, rstride = 1, cstride = 1, cmap = 'viridis', edgecolor = 'none')\n"," plt.title('Loss Surface')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()\n"," plt.figure()\n"," plt.title('Loss Surface Contour')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.show()\n"," \n"," # Setter\n"," def set_para_loss(self, W, B, loss):\n"," self.n = self.n + 1\n"," self.W.append(W)\n"," self.B.append(B)\n"," self.LOSS.append(loss)\n"," \n"," # Plot diagram\n"," def final_plot(self): \n"," ax = plt.axes(projection = '3d')\n"," ax.plot_wireframe(self.w, self.b, self.Z)\n"," ax.scatter(self.W, self.B, self.LOSS, c = 'r', marker = 'x', s = 200, alpha = 1)\n"," plt.figure()\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()\n"," \n"," # Plot diagram\n"," def 
plot_ps(self):\n"," plt.subplot(121)\n"," plt.ylim()\n"," plt.plot(self.x, self.y, 'ro', label = \"training points\")\n"," plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label = \"estimated line\")\n"," plt.xlabel('x')\n"," plt.ylabel('y')\n"," plt.title('Data Space Iteration: '+ str(self.n))\n"," plt.subplot(122)\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n"," plt.title('Loss Surface Contour')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Makeup_Data\">Make Some Data </h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Import PyTorch and set random seed:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import PyTorch library\n","\n","import torch\n","torch.manual_seed(1)"]},{"cell_type":"markdown","metadata":{},"source":["Generate values from -3 to 3 that create a line with a slope of 1 and a bias of -1. This is the line that you need to estimate. 
Add some noise to the data:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Generate the data with noise and the line\n","\n","X = torch.arange(-3, 3, 0.1).view(-1, 1)\n","f = 1 * X - 1\n","Y = f + 0.1 * torch.randn(X.size())"]},{"cell_type":"markdown","metadata":{},"source":["Plot the results:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the line and the data\n","\n","plt.plot(X.numpy(), Y.numpy(), 'rx', label = 'y')\n","plt.plot(X.numpy(), f.numpy(), label = 'f')\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()\n","plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Model_Cost\">Create the Model and Cost Function (Total Loss) </h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>forward</code> function: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the prediction function\n","\n","def forward(x):\n"," return w * x + b"]},{"cell_type":"markdown","metadata":{},"source":["Define the cost or criterion function: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the cost function\n","\n","def criterion(yhat, y):\n"," return torch.mean((yhat - y) ** 2)"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code> plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Train the Model: Batch Gradient Descent 
(BGD)</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Define <code>train_model_BGD</code> function.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the function for training model\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","lr = 0.1\n","LOSS_BGD = []\n","\n","def train_model_BGD(epochs):\n"," for epoch in range(epochs):\n"," Yhat = forward(X)\n"," loss = criterion(Yhat, Y)\n"," LOSS_BGD.append(loss)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n"," get_surface.plot_ps()\n"," loss.backward()\n"," w.data = w.data - lr * w.grad.data\n"," b.data = b.data - lr * b.grad.data\n"," w.grad.data.zero_()\n"," b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of batch gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_BGD with 10 iterations\n","\n","train_model_BGD(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"SGD\"> Stochastic Gradient Descent (SGD) with Dataset DataLoader</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Import <code>Dataset</code> and <code>DataLoader</code> libraries\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import libraries\n","\n","from torch.utils.data import Dataset, 
DataLoader"]},{"cell_type":"markdown","metadata":{},"source":["Create the <code>Data</code> class:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create class Data\n","\n","class Data(Dataset):\n"," \n"," # Constructor\n"," def __init__(self):\n"," self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n"," self.y = 1 * self.x - 1\n"," self.len = self.x.shape[0]\n"," \n"," # Getter\n"," def __getitem__(self, index): \n"," return self.x[index], self.y[index]\n"," \n"," # Get length\n"," def __len__(self):\n"," return self.len"]},{"cell_type":"markdown","metadata":{},"source":["Create a dataset object and a dataloader object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data object and DataLoader object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 1)"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>train_model_SGD</code> function for training the model.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_SGD function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_SGD = []\n","lr = 0.1\n","def train_model_SGD(epochs):\n"," for epoch in range(epochs):\n"," Yhat = forward(X)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n"," get_surface.plot_ps()\n"," LOSS_SGD.append(criterion(forward(X), Y).tolist())\n"," for x, y in trainloader:\n"," yhat = forward(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n"," loss.backward()\n"," w.data = w.data - lr * w.grad.data\n"," b.data = b.data - lr * b.grad.data\n"," w.grad.data.zero_()\n"," b.grad.data.zero_()\n"," get_surface.plot_ps()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of stochastic gradient descent: <b>bug</b> data space is 1 iteration ahead 
of parameter space. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_SGD(iter) with 10 iterations\n","\n","train_model_SGD(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Mini5\">Mini Batch Gradient Descent: Batch Size Equals 5</h2> \n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code> plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Create <code>Data</code> object and create a <code>Dataloader</code> object where the batch size equals 5:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create DataLoader object and Data object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 5)"]},{"cell_type":"markdown","metadata":{},"source":["Define <code>train_model_Mini5</code> function to train the model.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_Mini5 function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_MINI5 = []\n","lr = 0.1\n","\n","def train_model_Mini5(epochs):\n"," for epoch in range(epochs):\n"," Yhat = forward(X)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n"," get_surface.plot_ps()\n"," LOSS_MINI5.append(criterion(forward(X), Y).tolist())\n"," for x, y in trainloader:\n"," yhat = forward(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n"," loss.backward()\n"," 
w.data = w.data - lr * w.grad.data\n"," b.data = b.data - lr * b.grad.data\n"," w.grad.data.zero_()\n"," b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of mini-batch gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_Mini5 with 10 iterations.\n","\n","train_model_Mini5(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Mini10\">Mini Batch Gradient Descent: Batch Size Equals 10</h2> \n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code> plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>Data</code> object and a <code>Dataloader</code> object where the batch size equals 10:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create DataLoader object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 10)"]},{"cell_type":"markdown","metadata":{},"source":["Define <code>train_model_Mini10</code> function for training the model.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_Mini10 function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_MINI10 = []\n","lr = 0.1\n","\n","def train_model_Mini10(epochs):\n"," for epoch in range(epochs):\n"," Yhat = forward(X)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n"," 
get_surface.plot_ps()\n"," LOSS_MINI10.append(criterion(forward(X),Y).tolist())\n"," for x, y in trainloader:\n"," yhat = forward(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n"," loss.backward()\n"," w.data = w.data - lr * w.grad.data\n"," b.data = b.data - lr * b.grad.data\n"," w.grad.data.zero_()\n"," b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of mini-batch gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_Mini10 with 10 iterations.\n","\n","train_model_Mini10(10)"]},{"cell_type":"markdown","metadata":{},"source":["Plot the loss for each epoch: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot out the LOSS for each method\n","\n","plt.plot(LOSS_BGD,label = \"Batch Gradient Descent\")\n","plt.plot(LOSS_SGD,label = \"Stochastic Gradient Descent\")\n","plt.plot(LOSS_MINI5,label = \"Mini-Batch Gradient Descent, Batch size: 5\")\n","plt.plot(LOSS_MINI10,label = \"Mini-Batch Gradient Descent, Batch size: 10\")\n","plt.legend()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3>Practice</h3>\n"]},{"cell_type":"markdown","metadata":{},"source":["Perform mini-batch gradient descent with a batch size of 20. Store the total loss for each epoch in the list LOSS_MINI20. 
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Perform mini batch gradient descent with a batch size of 20.\n","\n","dataset = Data()"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","trainloader = DataLoader(dataset = dataset, batch_size = 20)\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","\n","LOSS_MINI20 = []\n","lr = 0.1\n","\n","def my_train_model(epochs):\n"," for epoch in range(epochs):\n"," Yhat = forward(X)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n"," get_surface.plot_ps()\n"," LOSS_MINI20.append(criterion(forward(X), Y).tolist())\n"," for x, y in trainloader:\n"," yhat = forward(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n"," loss.backward()\n"," w.data = w.data - lr * w.grad.data\n"," b.data = b.data - lr * b.grad.data\n"," w.grad.data.zero_()\n"," b.grad.data.zero_()\n","\n","my_train_model(10)\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["Plot a graph that shows the LOSS results for all the methods.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Plot a graph to show all the LOSS functions\n","\n","# Type your code here"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","plt.plot(LOSS_BGD, label = \"Batch Gradient Descent\")\n","plt.plot(LOSS_SGD, label = \"Stochastic Gradient Descent\")\n","plt.plot(LOSS_MINI5, label = \"Mini-Batch Gradient Descent,Batch size:5\")\n","plt.plot(LOSS_MINI10, label = \"Mini-Batch Gradient Descent,Batch size:10\")\n","plt.plot(LOSS_MINI20, label = \"Mini-Batch Gradient Descent,Batch size:20\")\n","plt.legend()\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<a 
href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>About the Authors:</h2> \n","\n","<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD. \n"]},{"cell_type":"markdown","metadata":{},"source":["Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n"]},{"cell_type":"markdown","metadata":{},"source":["## Change Log\n","\n","| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n","| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n","| 2020-09-23 | 2.0 | Shubham | Migrated Lab to Markdown and added to course repo in GitLab |\n"]},{"cell_type":"markdown","metadata":{},"source":["<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["## <h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. 
<h3/>\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"}},"nbformat":4,"nbformat_minor":2} This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode charactersOriginal file line number Diff line number Diff line change @@ -0,0 +1 @@ {"cells":[{"cell_type":"markdown","metadata":{},"source":["<center>\n"," <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n","</center>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h1>Linear Regression 1D: Training Two Parameter Mini-Batch Gradient Descent </h1> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Objective</h2><ul><li> How to use PyTorch build-in functions to create a model.</li></ul> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Table of Contents</h2>\n","<p>In this lab, you will create a model the PyTroch way, this will help you as models get more complicated</p>\n","\n","<ul>\n"," <li><a href=\"#Makeup_Data\">Make Some Data </a></li>\n"," <li><a href=\"#Model_Cost\">Create the Model and Cost Function the PyTorch way </a></li>\n"," <li><a href=\"#BGD\">Train the Model: Batch Gradient Descent</a></li>\n","</ul>\n","\n","<p>Estimated Time Needed: <strong>30 min</strong></p>\n","\n","<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Preparation</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["We'll need the following libraries: 
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# These are the libraries we are going to use in the lab.\n","\n","import numpy as np\n","import matplotlib.pyplot as plt\n","from mpl_toolkits import mplot3d"]},{"cell_type":"markdown","metadata":{},"source":["The class <code>plot_error_surfaces</code> is just to help you visualize the data space and the parameter space during training and has nothing to do with PyTorch. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Class for plotting\n","\n","class plot_error_surfaces(object):\n"," \n"," # Constructor\n"," def __init__(self, w_range, b_range, X, Y, n_samples = 30, go = True):\n"," W = np.linspace(-w_range, w_range, n_samples)\n"," B = np.linspace(-b_range, b_range, n_samples)\n"," w, b = np.meshgrid(W, B) \n"," Z = np.zeros((30, 30))\n"," count1 = 0\n"," self.y = Y.numpy()\n"," self.x = X.numpy()\n"," for w1, b1 in zip(w, b):\n"," count2 = 0\n"," for w2, b2 in zip(w1, b1):\n"," Z[count1, count2] = np.mean((self.y - w2 * self.x - b2) ** 2)\n"," count2 += 1\n"," count1 += 1\n"," self.Z = Z\n"," self.w = w\n"," self.b = b\n"," self.W = []\n"," self.B = []\n"," self.LOSS = []\n"," self.n = 0\n"," if go == True:\n"," plt.figure()\n"," plt.figure(figsize = (7.5, 5))\n"," plt.axes(projection = '3d').plot_surface(self.w, self.b, self.Z, rstride = 1, cstride = 1, cmap = 'viridis', edgecolor = 'none')\n"," plt.title('Loss Surface')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()\n"," plt.figure()\n"," plt.title('Loss Surface Contour')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.show()\n"," \n"," # Setter\n"," def set_para_loss(self, model, loss):\n"," self.n = self.n + 1\n"," self.LOSS.append(loss)\n"," self.W.append(list(model.parameters())[0].item())\n"," self.B.append(list(model.parameters())[1].item())\n"," \n"," # Plot diagram\n"," def final_plot(self): \n"," ax = plt.axes(projection 
= '3d')\n"," ax.plot_wireframe(self.w, self.b, self.Z)\n"," ax.scatter(self.W, self.B, self.LOSS, c = 'r', marker = 'x', s = 200, alpha = 1)\n"," plt.figure()\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()\n"," \n"," # Plot diagram \n"," def plot_ps(self):\n"," plt.subplot(121)\n"," plt.ylim()\n"," plt.plot(self.x, self.y, 'ro', label = \"training points\")\n"," plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label = \"estimated line\")\n"," plt.xlabel('x')\n"," plt.ylabel('y')\n"," plt.ylim((-10, 15))\n"," plt.title('Data Space Iteration: ' + str(self.n))\n"," plt.subplot(122)\n"," plt.contour(self.w, self.b, self.Z)\n"," plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n"," plt.title('Loss Surface Contour Iteration' + str(self.n) )\n"," plt.xlabel('w')\n"," plt.ylabel('b')\n"," plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Makeup_Data\">Make Some Data</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Import libraries and set random seed.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import libraries and set random seed\n","\n","import torch\n","from torch.utils.data import Dataset, DataLoader\n","torch.manual_seed(1)"]},{"cell_type":"markdown","metadata":{},"source":["Generate values from -3 to 3 that create a line with a slope of 1 and a bias of -1. This is the line that you need to estimate. 
Add some noise to the data:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data Class\n","\n","class Data(Dataset):\n"," \n"," # Constructor\n"," def __init__(self):\n"," self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n"," self.f = 1 * self.x - 1\n"," self.y = self.f + 0.1 * torch.randn(self.x.size())\n"," self.len = self.x.shape[0]\n"," \n"," # Getter\n"," def __getitem__(self,index): \n"," return self.x[index],self.y[index]\n"," \n"," # Get Length\n"," def __len__(self):\n"," return self.len"]},{"cell_type":"markdown","metadata":{},"source":["Create a dataset object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create dataset object\n","\n","dataset = Data()"]},{"cell_type":"markdown","metadata":{},"source":["Plot out the data and the line.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the data\n","\n","plt.plot(dataset.x.numpy(), dataset.y.numpy(), 'rx', label = 'y')\n","plt.plot(dataset.x.numpy(), dataset.f.numpy(), label = 'f')\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Model_Cost\">Create the Model and Total Loss Function (Cost)</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a linear regression class \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a linear regression model class\n","\n","from torch import nn, optim\n","\n","class linear_regression(nn.Module):\n"," \n"," # Constructor\n"," def __init__(self, input_size, output_size):\n"," super(linear_regression, self).__init__()\n"," self.linear = nn.Linear(input_size, output_size)\n"," \n"," # Prediction\n"," def forward(self, x):\n"," yhat = self.linear(x)\n"," return yhat"]},{"cell_type":"markdown","metadata":{},"source":["We will use 
PyTorch built-in functions to create a criterion function; this calculates the total loss or cost. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Built-in cost function\n","\n","criterion = nn.MSELoss()"]},{"cell_type":"markdown","metadata":{},"source":["Create a linear regression object and an optimizer object; the optimizer will update the parameters of the linear regression object.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create optimizer\n","\n","model = linear_regression(1,1)\n","optimizer = optim.SGD(model.parameters(), lr = 0.01)"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["list(model.parameters())"]},{"cell_type":"markdown","metadata":{},"source":["Remember: to construct an optimizer you have to give it an iterable containing the parameters, i.e. provide <code>model.parameters()</code> as an input to the object constructor. \n"]},{"cell_type":"markdown","metadata":{},"source":["<img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4model_optmiz.png\" width=\"100\" alt=\"Model Optimizer\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["Similar to the model, the optimizer has a state dictionary:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["optimizer.state_dict()"]},{"cell_type":"markdown","metadata":{},"source":["Many of the keys correspond to more advanced optimizers.\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>Dataloader</code> object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Dataloader object\n","\n","trainloader = DataLoader(dataset = dataset, batch_size = 1)"]},{"cell_type":"markdown","metadata":{},"source":["PyTorch randomly initializes your model parameters. 
If we use those parameters, the result will not be very insightful as convergence will be extremely fast. So we will initialize the parameters such that they will take longer to converge, i.e. look cool. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Customize the weight and bias\n","\n","model.state_dict()['linear.weight'][0] = -15\n","model.state_dict()['linear.bias'][0] = -10"]},{"cell_type":"markdown","metadata":{},"source":["Create a plotting object, not part of PyTorch, just used to help visualize: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create plot surface object\n","\n","get_surface = plot_error_surfaces(15, 13, dataset.x, dataset.y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"BGD\">Train the Model via Batch Gradient Descent</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of stochastic gradient descent: <b>bug</b> data space is 1 iteration ahead of parameter space. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Train Model\n","\n","def train_model_BGD(iter):\n"," for epoch in range(iter):\n"," for x,y in trainloader:\n"," yhat = model(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(model, loss.tolist()) \n"," optimizer.zero_grad()\n"," loss.backward()\n","\n"," optimizer.step()\n"," get_surface.plot_ps()\n","\n","\n","train_model_BGD(10)"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["model.state_dict()"]},{"cell_type":"markdown","metadata":{},"source":["Let's use the following diagram to help clarify the process. 
The model takes <code>x</code> to produce an estimate <code>yhat</code>, it will then be compared to the actual <code>y</code> with the loss function.\n"]},{"cell_type":"markdown","metadata":{},"source":["<img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4get_loss.png\" width=\"400\" alt=\"Old Model Cost diagram\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["When we call <code>backward()</code> on the loss function, it will handle the differentiation. Calling the method step on the optimizer object it will update the parameters as they were inputs when we constructed the optimizer object. The connection is shown in the following figure :\n"]},{"cell_type":"markdown","metadata":{},"source":["<img src = \"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4update_param.png\" width=\"500\" alt=\"Model Cost with optimizer\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3>Practice</h3>\n"]},{"cell_type":"markdown","metadata":{},"source":["Try to train the model via BGD with <code>lr = 0.1</code>. 
Use <code>optimizer</code> and the following given variables.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Train the model via BGD using optimizer\n","\n","model = linear_regression(1,1)\n","model.state_dict()['linear.weight'][0] = -15\n","model.state_dict()['linear.bias'][0] = -10\n","get_surface = plot_error_surfaces(15, 13, dataset.x, dataset.y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","optimizer = optim.SGD(model.parameters(), lr = 0.1)\n","trainloader = DataLoader(dataset = dataset, batch_size = 1)\n","\n","def my_train_model(iter):\n"," for epoch in range(iter):\n"," for x,y in trainloader:\n"," yhat = model(x)\n"," loss = criterion(yhat, y)\n"," get_surface.set_para_loss(model, loss.tolist()) \n"," optimizer.zero_grad()\n"," loss.backward()\n"," optimizer.step()\n"," get_surface.plot_ps()\n","\n","my_train_model(10)\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<a href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>About the Authors:</h2> \n","\n","<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering, his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. 
Joseph has been working for IBM since he completed his PhD.\n"]},{"cell_type":"markdown","metadata":{},"source":["Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n"]},{"cell_type":"markdown","metadata":{},"source":["## Change Log\n","\n","| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n","| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n","| 2020-09-23 | 2.0 | Shubham | Migrated Lab to Markdown and added to course repo in GitLab |\n"]},{"cell_type":"markdown","metadata":{},"source":["<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["## <h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. </h3>\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"}},"nbformat":4,"nbformat_minor":2}
{"cells":[{"cell_type":"markdown","metadata":{},"source":["<center>\n"," <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\" />\n","</center>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h1>Linear regression: Training and Validation Data</h1> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Objective</h2><ul><li> How to use the learning rate hyperparameter to improve your model's results.</li></ul> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Table of Contents</h2>\n","<p>In this lab, you will learn to select the best learning rate by using validation data.</p>\n","\n","<ul>\n"," <li><a href=\"#Makeup_Data\">Make Some Data</a></li>\n"," <li><a href=\"#LR_Loader_Cost\">Create a Linear Regression Object, Data Loader and Criterion Function</a></li>\n"," <li><a href=\"#LR_Hyper\">Different learning rates and Data Structures to Store results for Different Hyperparameters</a></li>\n"," <li><a href=\"#Model\">Train different models for different Hyperparameters</a></li>\n"," <li><a href=\"#Result\">View Results</a></li>\n","</ul>\n","\n","<p>Estimated Time Needed: <strong>30 min</strong></p>\n","\n","<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Preparation</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["We'll need the following libraries and set the random seed.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import libraries we need for this lab, and set the random seed\n","\n","import torch\n","import numpy as np\n","import matplotlib.pyplot as plt\n","from torch import nn, optim\n","\n","torch.manual_seed(1)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating 
topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Makeup_Data\">Make Some Data</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["First, we'll create some artificial data in a dataset class. The class will include the option to produce training data or validation data. The training data will include outliers.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data class\n","\n","from torch.utils.data import Dataset, DataLoader\n","\n","class Data(Dataset):\n"," \n"," # Constructor\n"," def __init__(self, train = True):\n"," self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n"," self.f = -3 * self.x + 1\n"," self.y = self.f + 0.1 * torch.randn(self.x.size())\n"," self.len = self.x.shape[0]\n"," \n"," #outliers \n"," if train == True:\n"," self.y[0] = 0\n"," self.y[50:55] = 20\n"," else:\n"," pass\n"," \n"," # Getter\n"," def __getitem__(self, index): \n"," return self.x[index], self.y[index]\n"," \n"," # Get Length\n"," def __len__(self):\n"," return self.len"]},{"cell_type":"markdown","metadata":{},"source":["Create two objects: one that contains training data and a second that contains validation data. Assume that the training data has the outliers. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create training dataset and validation dataset\n","\n","train_data = Data()\n","val_data = Data(train = False)"]},{"cell_type":"markdown","metadata":{},"source":["Overlay the training points in red over the function that generated the data. 
Notice the outliers at x=-3 and around x=2:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot out training points\n","\n","plt.plot(train_data.x.numpy(), train_data.y.numpy(), 'xr',label=\"training data \")\n","plt.plot(train_data.x.numpy(), train_data.f.numpy(),label=\"true function \")\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()\n","plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"LR_Loader_Cost\">Create a Linear Regression Object, Data Loader, and Criterion Function</h2>\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Linear Regression Class\n","\n","from torch import nn\n","\n","class linear_regression(nn.Module):\n"," \n"," # Constructor\n"," def __init__(self, input_size, output_size):\n"," super(linear_regression, self).__init__()\n"," self.linear = nn.Linear(input_size, output_size)\n"," \n"," # Prediction function\n"," def forward(self, x):\n"," yhat = self.linear(x)\n"," return yhat"]},{"cell_type":"markdown","metadata":{},"source":["Create the criterion function and a <code>DataLoader</code> object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create MSELoss function and DataLoader\n","\n","criterion = nn.MSELoss()\n","trainloader = DataLoader(dataset = train_data, batch_size = 1)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"LR_Hyper\">Different learning rates and Data Structures to Store results for different Hyperparameters</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a list with different learning rates and a tensor (can be a list) for the training and validating cost/total loss. 
Include the list MODELS, which stores the training model for every value of the learning rate. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Learning Rate list, the error lists and the MODELS list\n","\n","learning_rates=[0.0001, 0.001, 0.01, 0.1]\n","\n","train_error=torch.zeros(len(learning_rates))\n","validation_error=torch.zeros(len(learning_rates))\n","\n","MODELS=[]"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Model\">Train different models for different Hyperparameters</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Try different values of learning rates, perform stochastic gradient descent, and save the results on the training data and validation data. Finally, save each model in a list.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the train model function and train the model\n","\n","def train_model_with_lr (iter, lr_list):\n"," \n"," # iterate through different learning rates \n"," for i, lr in enumerate(lr_list):\n"," model = linear_regression(1, 1)\n"," optimizer = optim.SGD(model.parameters(), lr = lr)\n"," for epoch in range(iter):\n"," for x, y in trainloader:\n"," yhat = model(x)\n"," loss = criterion(yhat, y)\n"," optimizer.zero_grad()\n"," loss.backward()\n"," optimizer.step()\n"," \n"," # train data\n"," Yhat = model(train_data.x)\n"," train_loss = criterion(Yhat, train_data.y)\n"," train_error[i] = train_loss.item()\n"," \n"," # validation data\n"," Yhat = model(val_data.x)\n"," val_loss = criterion(Yhat, val_data.y)\n"," validation_error[i] = val_loss.item()\n"," MODELS.append(model)\n","\n","train_model_with_lr(10, learning_rates)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Result\">View the 
Results</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Plot the training loss and validation loss for each learning rate: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the training loss and validation loss\n","\n","plt.semilogx(np.array(learning_rates), train_error.numpy(), label='training cost/total loss')\n","plt.semilogx(np.array(learning_rates), validation_error.numpy(), label='validation cost/total loss')\n","plt.ylabel('Cost / Total Loss')\n","plt.xlabel('learning rate')\n","plt.legend()\n","plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["Produce a prediction by using the validation data for each model: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the predictions\n","\n","for model, learning_rate in zip(MODELS, learning_rates):\n"," yhat = model(val_data.x)\n"," plt.plot(val_data.x.numpy(), yhat.detach().numpy(), label='lr: ' + str(learning_rate))\n"," print('lr:', learning_rate, yhat.detach().numpy()[0:3])\n","plt.plot(val_data.x.numpy(), val_data.f.numpy(), 'or', label='validation data')\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()\n","plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3>Practice</h3>\n"]},{"cell_type":"markdown","metadata":{},"source":["The object <code>good_model</code> is the best-performing model. Use the train loader to get the data samples x and y. Produce an estimate for <code>yhat</code> and print it out for every sample in a for loop. 
Compare it to the actual value <code>y</code>.\n"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","for x, y in trainloader:\n"," print(\"yhat= \", good_model(x),\"y\", y)\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<a href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>About the Authors:</h2> \n","\n","<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD. \n"]},{"cell_type":"markdown","metadata":{},"source":["Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"https://www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n"]},{"cell_type":"markdown","metadata":{},"source":["## Change Log\n","\n","| Date (YYYY-MM-DD) | Version | Changed By | Change Description |\n","| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n","| 2020-09-23 | 2.0 | Shubham | Migrated Lab to Markdown and added to course repo in GitLab |\n"]},{"cell_type":"markdown","metadata":{},"source":["<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3 align=\"center\">© IBM Corporation 2020. All rights reserved. 
</h3>\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"}},"nbformat":4,"nbformat_minor":2}