{"cells":[{"cell_type":"markdown","metadata":{},"source":["<center>\n","    <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\"  />\n","</center>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h1>Linear Regression 1D: Training Two Parameter Mini-Batch Gradient Descent </h1> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Objective</h2><ul><li> How to use PyTorch built-in functions to create a model.</li></ul> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Table of Contents</h2>\n","<p>In this lab, you will create a model the PyTorch way. This will help you as models get more complicated.</p>\n","\n","<ul>\n","    <li><a href=\"#Makeup_Data\">Make Some Data </a></li>\n","    <li><a href=\"#Model_Cost\">Create the Model and Cost Function the PyTorch way </a></li>\n","    <li><a href=\"#BGD\">Train the Model: Batch Gradient Descent</a></li>\n","</ul>\n","\n","<p>Estimated Time Needed: <strong>30 min</strong></p>\n","\n","<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Preparation</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["We'll need the following libraries:  \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# These are the libraries we are going to use in the lab.\n","\n","import numpy as np\n","import matplotlib.pyplot as plt\n","from mpl_toolkits import mplot3d"]},{"cell_type":"markdown","metadata":{},"source":["The class <code>plot_error_surfaces</code> is just to help you visualize the data space and the parameter space during training and has nothing to do with PyTorch. 
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Class for plotting the loss surface and the training trajectory\n","\n","class plot_error_surfaces(object):\n","    \n","    # Constructor\n","    def __init__(self, w_range, b_range, X, Y, n_samples = 30, go = True):\n","        W = np.linspace(-w_range, w_range, n_samples)\n","        B = np.linspace(-b_range, b_range, n_samples)\n","        w, b = np.meshgrid(W, B)    \n","        # Size the grid from n_samples instead of hard-coding 30\n","        Z = np.zeros((n_samples, n_samples))\n","        count1 = 0\n","        self.y = Y.numpy()\n","        self.x = X.numpy()\n","        for w1, b1 in zip(w, b):\n","            count2 = 0\n","            for w2, b2 in zip(w1, b1):\n","                # Mean squared error of the line yhat = w2 * x + b2\n","                Z[count1, count2] = np.mean((self.y - (w2 * self.x + b2)) ** 2)\n","                count2 += 1\n","            count1 += 1\n","        self.Z = Z\n","        self.w = w\n","        self.b = b\n","        self.W = []\n","        self.B = []\n","        self.LOSS = []\n","        self.n = 0\n","        if go:\n","            plt.figure(figsize = (7.5, 5))\n","            plt.axes(projection = '3d').plot_surface(self.w, self.b, self.Z, rstride = 1, cstride = 1, cmap = 'viridis', edgecolor = 'none')\n","            plt.title('Loss Surface')\n","            plt.xlabel('w')\n","            plt.ylabel('b')\n","            plt.show()\n","            plt.figure()\n","            plt.title('Loss Surface Contour')\n","            plt.xlabel('w')\n","            plt.ylabel('b')\n","            plt.contour(self.w, self.b, self.Z)\n","            plt.show()\n","            \n","    # Record the current parameters and loss\n","    def set_para_loss(self, model, loss):\n","        self.n = self.n + 1\n","        self.LOSS.append(loss)\n","        self.W.append(list(model.parameters())[0].item())\n","        self.B.append(list(model.parameters())[1].item())\n","    \n","    # Plot the recorded trajectory on the loss surface and its contour\n","    def final_plot(self): \n","        ax = plt.axes(projection = '3d')\n","        ax.plot_wireframe(self.w, self.b, self.Z)\n","        ax.scatter(self.W, self.B, self.LOSS, c = 'r', marker = 'x', s = 200, alpha = 1)\n","        plt.figure()\n","        plt.contour(self.w, self.b, self.Z)\n","        plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n","        plt.xlabel('w')\n","        plt.ylabel('b')\n","        plt.show()\n","        \n","    # Plot the data space and the loss contour for the current iteration\n","    def plot_ps(self):\n","        plt.subplot(121)\n","        plt.plot(self.x, self.y, 'ro', label = \"training points\")\n","        plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label = \"estimated line\")\n","        plt.xlabel('x')\n","        plt.ylabel('y')\n","        plt.ylim((-10, 15))\n","        plt.title('Data Space Iteration: ' + str(self.n))\n","        plt.subplot(122)\n","        plt.contour(self.w, self.b, self.Z)\n","        plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n","        plt.title('Loss Surface Contour Iteration: ' + str(self.n))\n","        plt.xlabel('w')\n","        plt.ylabel('b')\n","        plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Makeup_Data\">Make Some Data</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Import libraries and set random seed.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import libraries and set random seed\n","\n","import torch\n","from torch.utils.data import Dataset, DataLoader\n","torch.manual_seed(1)"]},{"cell_type":"markdown","metadata":{},"source":["Generate values from -3 to 3 that create a line with a slope of 1 and a bias of -1. This is the line that you need to estimate. 
Add some noise to the data:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data Class\n","\n","class Data(Dataset):\n","    \n","    # Constructor\n","    def __init__(self):\n","        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n","        self.f = 1 * self.x - 1\n","        self.y = self.f + 0.1 * torch.randn(self.x.size())\n","        self.len = self.x.shape[0]\n","        \n","    # Getter\n","    def __getitem__(self,index):    \n","        return self.x[index],self.y[index]\n","    \n","    # Get Length\n","    def __len__(self):\n","        return self.len"]},{"cell_type":"markdown","metadata":{},"source":["Create a dataset object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create dataset object\n","\n","dataset = Data()"]},{"cell_type":"markdown","metadata":{},"source":["Plot out the data and the line.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the data\n","\n","plt.plot(dataset.x.numpy(), dataset.y.numpy(), 'rx', label = 'y')\n","plt.plot(dataset.x.numpy(), dataset.f.numpy(), label = 'f')\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Model_Cost\">Create the Model and Total Loss Function (Cost)</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a linear regression class \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a linear regression model class\n","\n","from torch import nn, optim\n","\n","class linear_regression(nn.Module):\n","    \n","    # Constructor\n","    def __init__(self, input_size, output_size):\n","        super(linear_regression, self).__init__()\n","        self.linear = nn.Linear(input_size, output_size)\n","        \n","    # Prediction\n","    def forward(self, 
x):\n","        yhat = self.linear(x)\n","        return yhat"]},{"cell_type":"markdown","metadata":{},"source":["We will use a PyTorch built-in function to create the criterion; <code>nn.MSELoss()</code> computes the mean squared error over each batch, which serves as the cost:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Built-in cost function\n","\n","criterion = nn.MSELoss()"]},{"cell_type":"markdown","metadata":{},"source":["Create a linear regression object and an optimizer object. The optimizer will update the parameters of the linear regression object.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create model and optimizer\n","\n","model = linear_regression(1,1)\n","optimizer = optim.SGD(model.parameters(), lr = 0.01)"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["list(model.parameters())"]},{"cell_type":"markdown","metadata":{},"source":["Remember that to construct an optimizer you have to give it an iterable containing the parameters, i.e. provide <code>model.parameters()</code> as an input to the constructor.\n"]},{"cell_type":"markdown","metadata":{},"source":["<img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4model_optmiz.png\" width=\"100\" alt=\"Model Optimizer\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["Similar to the model, the optimizer has a state dictionary:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["optimizer.state_dict()"]},{"cell_type":"markdown","metadata":{},"source":["Many of the keys, such as <code>momentum</code> and <code>nesterov</code>, correspond to options used by more advanced optimization methods.\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>DataLoader</code> object: \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create DataLoader object\n","\n","trainloader = DataLoader(dataset = dataset, batch_size = 1)"]},{"cell_type":"markdown","metadata":{},"source":["PyTorch 
randomly initialises your model parameters. If we use those parameters, the result will not be very insightful as convergence will be extremely fast. So we will initialise the parameters far from the true values, so that they take longer to converge and the training trajectory is easier to see:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Customize the weight and bias\n","\n","model.state_dict()['linear.weight'][0] = -15\n","model.state_dict()['linear.bias'][0] = -10"]},{"cell_type":"markdown","metadata":{},"source":["Create a plotting object. It is not part of PyTorch; it is only used to help visualize training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create plot surface object\n","\n","get_surface = plot_error_surfaces(15, 13, dataset.x, dataset.y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"BGD\">Train the Model via Batch Gradient Descent</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs. Because the <code>DataLoader</code> uses <code>batch_size = 1</code>, each update is a stochastic gradient descent step. <b>Known bug:</b> the data space plot is 1 iteration ahead of the parameter space plot. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Train Model\n","\n","def train_model_BGD(epochs):\n","    for epoch in range(epochs):\n","        for x, y in trainloader:\n","            # Forward pass and loss for the current sample\n","            yhat = model(x)\n","            loss = criterion(yhat, y)\n","            # Record the parameters and loss for plotting\n","            get_surface.set_para_loss(model, loss.item())\n","            # Clear old gradients, backpropagate, and update the parameters\n","            optimizer.zero_grad()\n","            loss.backward()\n","            optimizer.step()\n","        # Plot the data space and the loss contour after each epoch\n","        get_surface.plot_ps()\n","\n","\n","train_model_BGD(10)"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["model.state_dict()"]},{"cell_type":"markdown","metadata":{},"source":["Let's use the following diagram to help clarify the process. 
The model takes <code>x</code> and produces an estimate <code>yhat</code>, which is then compared to the actual <code>y</code> by the loss function.\n"]},{"cell_type":"markdown","metadata":{},"source":["<img src=\"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4get_loss.png\" width=\"400\" alt=\"Old Model Cost diagram\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["When we call <code>backward()</code> on the loss, PyTorch computes the gradients of the loss with respect to the model parameters. Calling <code>step()</code> on the optimizer object then updates those parameters, because they were passed in when we constructed the optimizer object. The connection is shown in the following figure:\n"]},{"cell_type":"markdown","metadata":{},"source":["<img src = \"https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0110EN/notebook_images%20/chapter2/2.4update_param.png\" width=\"500\" alt=\"Model Cost with optimizer\" />\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3>Practice</h3>\n"]},{"cell_type":"markdown","metadata":{},"source":["Try to train the model via BGD with <code>lr = 0.1</code>. 
Use an <code>optimizer</code> and the variables given below.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Train the model via BGD using optimizer\n","\n","model = linear_regression(1,1)\n","model.state_dict()['linear.weight'][0] = -15\n","model.state_dict()['linear.bias'][0] = -10\n","get_surface = plot_error_surfaces(15, 13, dataset.x, dataset.y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","optimizer = optim.SGD(model.parameters(), lr = 0.1)\n","trainloader = DataLoader(dataset = dataset, batch_size = 1)\n","\n","def my_train_model(iter):\n","    for epoch in range(iter):\n","        for x,y in trainloader:\n","            yhat = model(x)\n","            loss = criterion(yhat, y)\n","            get_surface.set_para_loss(model, loss.tolist()) \n","            optimizer.zero_grad()\n","            loss.backward()\n","            optimizer.step()\n","        get_surface.plot_ps()\n","\n","my_train_model(10)\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<a href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>About the Authors:</h2> \n","\n","<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. 
Joseph has been working for IBM since he completed his PhD.\n"]},{"cell_type":"markdown","metadata":{},"source":["Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n"]},{"cell_type":"markdown","metadata":{},"source":["## Change Log\n","\n","| Date (YYYY-MM-DD) | Version | Changed By | Change Description                                          |\n","| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n","| 2020-09-23        | 2.0     | Shubham    | Migrated Lab to Markdown and added to course repo in GitLab |\n"]},{"cell_type":"markdown","metadata":{},"source":["<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["## <h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. </h3>\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"}},"nbformat":4,"nbformat_minor":2}