{"cells":[{"cell_type":"markdown","metadata":{},"source":["<center>\n","    <img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/IDSNlogo.png\" width=\"300\" alt=\"cognitiveclass.ai logo\"  />\n","</center>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h1>Linear Regression 1D: Training Two Parameters with Mini-Batch Gradient Descent</h1>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Objective</h2><ul><li> How to use Mini-Batch Gradient Descent to train a model.</li></ul> \n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Table of Contents</h2>\n","<p>In this lab, you will practice training a model by using Mini-Batch Gradient Descent.</p>\n","\n","<ul>\n","    <li><a href=\"#Makeup_Data\">Make Some Data</a></li>\n","    <li><a href=\"#Model_Cost\">Create the Model and Cost Function (Total Loss)</a></li>\n","    <li><a href=\"#BGD\">Train the Model: Batch Gradient Descent</a></li>\n","    <li><a href=\"#SGD\">Train the Model: Stochastic Gradient Descent with Dataset DataLoader</a></li>\n","    <li><a href=\"#Mini5\">Train the Model: Mini-Batch Gradient Descent: Batch Size Equals 5</a></li>\n","    <li><a href=\"#Mini10\">Train the Model: Mini-Batch Gradient Descent: Batch Size Equals 10</a></li>\n","</ul>\n","<p>Estimated Time Needed: <strong>30 min</strong></p>\n","\n","<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>Preparation</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["We'll need the following libraries:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import the libraries we need for this lab\n","\n","import numpy as np\n","import matplotlib.pyplot as plt\n","from mpl_toolkits import mplot3d"]},{"cell_type":"markdown","metadata":{},"source":["The class <code>plot_error_surfaces</code> is just to help you visualize the data space and the parameter space during 
training and has nothing to do with PyTorch. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# The class for plotting the diagrams\n","\n","class plot_error_surfaces(object):\n","    \n","    # Constructor\n","    def __init__(self, w_range, b_range, X, Y, n_samples = 30, go = True):\n","        W = np.linspace(-w_range, w_range, n_samples)\n","        B = np.linspace(-b_range, b_range, n_samples)\n","        w, b = np.meshgrid(W, B)    \n","        Z = np.zeros((n_samples, n_samples))\n","        count1 = 0\n","        self.y = Y.numpy()\n","        self.x = X.numpy()\n","        for w1, b1 in zip(w, b):\n","            count2 = 0\n","            for w2, b2 in zip(w1, b1):\n","                # Mean squared error of the line y = w2 * x + b2\n","                Z[count1, count2] = np.mean((self.y - (w2 * self.x + b2)) ** 2)\n","                count2 += 1\n","            count1 += 1\n","        self.Z = Z\n","        self.w = w\n","        self.b = b\n","        self.W = []\n","        self.B = []\n","        self.LOSS = []\n","        self.n = 0\n","        if go:\n","            plt.figure(figsize = (7.5, 5))\n","            plt.axes(projection = '3d').plot_surface(self.w, self.b, self.Z, rstride = 1, cstride = 1, cmap = 'viridis', edgecolor = 'none')\n","            plt.title('Loss Surface')\n","            plt.xlabel('w')\n","            plt.ylabel('b')\n","            plt.show()\n","            plt.figure()\n","            plt.title('Loss Surface Contour')\n","            plt.xlabel('w')\n","            plt.ylabel('b')\n","            plt.contour(self.w, self.b, self.Z)\n","            plt.show()\n","            \n","    # Setter\n","    def set_para_loss(self, W, B, loss):\n","        self.n = self.n + 1\n","        self.W.append(W)\n","        self.B.append(B)\n","        self.LOSS.append(loss)\n","    \n","    # Plot diagram\n","    def final_plot(self): \n","        ax = plt.axes(projection = '3d')\n","        ax.plot_wireframe(self.w, self.b, self.Z)\n","        
ax.scatter(self.W, self.B, self.LOSS, c = 'r', marker = 'x', s = 200, alpha = 1)\n","        plt.figure()\n","        plt.contour(self.w, self.b, self.Z)\n","        plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n","        plt.xlabel('w')\n","        plt.ylabel('b')\n","        plt.show()\n","    \n","    # Plot diagram\n","    def plot_ps(self):\n","        plt.subplot(121)\n","        plt.ylim()\n","        plt.plot(self.x, self.y, 'ro', label = \"training points\")\n","        plt.plot(self.x, self.W[-1] * self.x + self.B[-1], label = \"estimated line\")\n","        plt.xlabel('x')\n","        plt.ylabel('y')\n","        plt.title('Data Space Iteration: '+ str(self.n))\n","        plt.subplot(122)\n","        plt.contour(self.w, self.b, self.Z)\n","        plt.scatter(self.W, self.B, c = 'r', marker = 'x')\n","        plt.title('Loss Surface Contour')\n","        plt.xlabel('w')\n","        plt.ylabel('b')\n","        plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Makeup_Data\">Make Some Data </h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Import PyTorch and set random seed:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import PyTorch library\n","\n","import torch\n","torch.manual_seed(1)"]},{"cell_type":"markdown","metadata":{},"source":["Generate values from -3 to 3 that create a line with a slope of 1 and a bias of -1. This is the line that you need to estimate. 
Add some noise to the data:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Generate the data with noise and the line\n","\n","X = torch.arange(-3, 3, 0.1).view(-1, 1)\n","f = 1 * X - 1\n","Y = f + 0.1 * torch.randn(X.size())"]},{"cell_type":"markdown","metadata":{},"source":["Plot the results:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot the line and the data\n","\n","plt.plot(X.numpy(), Y.numpy(), 'rx', label = 'y')\n","plt.plot(X.numpy(), f.numpy(), label = 'f')\n","plt.xlabel('x')\n","plt.ylabel('y')\n","plt.legend()\n","plt.show()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Model_Cost\">Create the Model and Cost Function (Total Loss)</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>forward</code> function, which predicts <code>y</code> from the global parameters <code>w</code> and <code>b</code>:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the prediction function (uses the global parameters w and b)\n","\n","def forward(x):\n","    return w * x + b"]},{"cell_type":"markdown","metadata":{},"source":["Define the cost or criterion function (mean squared error):\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the cost function\n","\n","def criterion(yhat, y):\n","    return torch.mean((yhat - y) ** 2)"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"BGD\">Train the Model: Batch Gradient Descent 
(BGD)</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>train_model_BGD</code> function, which updates <code>w</code> and <code>b</code> once per epoch using the loss over the whole dataset:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define the function for training the model with batch gradient descent\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","lr = 0.1\n","LOSS_BGD = []\n","\n","def train_model_BGD(epochs):\n","    for epoch in range(epochs):\n","        Yhat = forward(X)\n","        loss = criterion(Yhat, Y)\n","        # Store the loss as a plain float so it can be plotted later\n","        LOSS_BGD.append(loss.tolist())\n","        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n","        get_surface.plot_ps()\n","        loss.backward()\n","        w.data = w.data - lr * w.grad.data\n","        b.data = b.data - lr * b.grad.data\n","        # Reset the gradients so they do not accumulate across epochs\n","        w.grad.data.zero_()\n","        b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of batch gradient descent. <b>Known issue:</b> the data space plot is one iteration ahead of the parameter space plot. 
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_BGD for 10 epochs\n","\n","train_model_BGD(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"SGD\">Stochastic Gradient Descent (SGD) with Dataset DataLoader</h2>\n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Import the <code>Dataset</code> and <code>DataLoader</code> classes:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Import the Dataset and DataLoader classes\n","\n","from torch.utils.data import Dataset, DataLoader"]},{"cell_type":"markdown","metadata":{},"source":["Create the <code>Data</code> class:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create class Data\n","\n","class Data(Dataset):\n","    \n","    # Constructor\n","    def __init__(self):\n","        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)\n","        # Noise-free targets on the line y = 1 * x - 1\n","        self.y = 1 * self.x - 1\n","        self.len = self.x.shape[0]\n","        \n","    # Getter\n","    def __getitem__(self, index):    \n","        return self.x[index], self.y[index]\n","    \n","    # Get length\n","    def __len__(self):\n","        return self.len"]},{"cell_type":"markdown","metadata":{},"source":["Create a dataset object and a dataloader object with a batch size of 1:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data object and DataLoader object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 
1)"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>train_model_SGD</code> function, which updates <code>w</code> and <code>b</code> after every sample:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_SGD function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_SGD = []\n","lr = 0.1\n","def train_model_SGD(epochs):\n","    for epoch in range(epochs):\n","        # Record the total loss over the whole dataset once per epoch\n","        Yhat = forward(X)\n","        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n","        get_surface.plot_ps()\n","        LOSS_SGD.append(criterion(forward(X), Y).tolist())\n","        # Update the parameters after each batch from the DataLoader\n","        for x, y in trainloader:\n","            yhat = forward(x)\n","            loss = criterion(yhat, y)\n","            get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n","            loss.backward()\n","            w.data = w.data - lr * w.grad.data\n","            b.data = b.data - lr * b.grad.data\n","            w.grad.data.zero_()\n","            b.grad.data.zero_()\n","        get_surface.plot_ps()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of stochastic gradient descent. <b>Known issue:</b> the data space plot is one iteration ahead of the parameter space plot. 
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_SGD for 10 epochs\n","\n","train_model_SGD(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Mini5\">Mini-Batch Gradient Descent: Batch Size Equals 5</h2> \n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>Data</code> object and a <code>DataLoader</code> object with a batch size of 5:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data object and DataLoader object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 5)"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>train_model_Mini5</code> function to train the model:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_Mini5 function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_MINI5 = []\n","lr = 0.1\n","\n","def train_model_Mini5(epochs):\n","    for epoch in range(epochs):\n","        Yhat = forward(X)\n","        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n","        get_surface.plot_ps()\n","        LOSS_MINI5.append(criterion(forward(X), Y).tolist())\n","        for x, y in trainloader:\n","            yhat = forward(x)\n","            loss = criterion(yhat, y)\n","            get_surface.set_para_loss(w.data.tolist(), 
b.data.tolist(), loss.tolist())\n","            loss.backward()\n","            w.data = w.data - lr * w.grad.data\n","            b.data = b.data - lr * b.grad.data\n","            w.grad.data.zero_()\n","            b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of mini-batch gradient descent. <b>Known issue:</b> the data space plot is one iteration ahead of the parameter space plot. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_Mini5 for 10 epochs.\n","\n","train_model_Mini5(10)"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2 id=\"Mini10\">Mini-Batch Gradient Descent: Batch Size Equals 10</h2> \n"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>plot_error_surfaces</code> object to visualize the data space and the parameter space during training:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a plot_error_surfaces object.\n","\n","get_surface = plot_error_surfaces(15, 13, X, Y, 30, go = False)"]},{"cell_type":"markdown","metadata":{},"source":["Create a <code>Data</code> object and a <code>DataLoader</code> object with a batch size of 10:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create Data object and DataLoader object\n","\n","dataset = Data()\n","trainloader = DataLoader(dataset = dataset, batch_size = 10)"]},{"cell_type":"markdown","metadata":{},"source":["Define the <code>train_model_Mini10</code> function for training the model:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Define train_model_Mini10 function\n","\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","LOSS_MINI10 = []\n","lr = 0.1\n","\n","def train_model_Mini10(epochs):\n","    for epoch in range(epochs):\n","        Yhat = 
forward(X)\n","        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n","        get_surface.plot_ps()\n","        LOSS_MINI10.append(criterion(forward(X), Y).tolist())\n","        for x, y in trainloader:\n","            yhat = forward(x)\n","            loss = criterion(yhat, y)\n","            get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n","            loss.backward()\n","            w.data = w.data - lr * w.grad.data\n","            b.data = b.data - lr * b.grad.data\n","            w.grad.data.zero_()\n","            b.grad.data.zero_()"]},{"cell_type":"markdown","metadata":{},"source":["Run 10 epochs of mini-batch gradient descent. <b>Known issue:</b> the data space plot is one iteration ahead of the parameter space plot. \n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Run train_model_Mini10 for 10 epochs.\n","\n","train_model_Mini10(10)"]},{"cell_type":"markdown","metadata":{},"source":["Plot the total loss at the start of each epoch for every method:\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Plot out the LOSS for each method\n","\n","plt.plot(LOSS_BGD, label = \"Batch Gradient Descent\")\n","plt.plot(LOSS_SGD, label = \"Stochastic Gradient Descent\")\n","plt.plot(LOSS_MINI5, label = \"Mini-Batch Gradient Descent, Batch size: 5\")\n","plt.plot(LOSS_MINI10, label = \"Mini-Batch Gradient Descent, Batch size: 10\")\n","plt.xlabel('epoch')\n","plt.ylabel('total loss')\n","plt.legend()"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3>Practice</h3>\n"]},{"cell_type":"markdown","metadata":{},"source":["Perform mini-batch gradient descent with a batch size of 20. Store the total loss for each epoch in the list LOSS_MINI20.  
\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Perform mini batch gradient descent with a batch size of 20.\n","\n","dataset = Data()"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","trainloader = DataLoader(dataset = dataset, batch_size = 20)\n","w = torch.tensor(-15.0, requires_grad = True)\n","b = torch.tensor(-10.0, requires_grad = True)\n","\n","LOSS_MINI20 = []\n","lr = 0.1\n","\n","def my_train_model(epochs):\n","    for epoch in range(epochs):\n","        Yhat = forward(X)\n","        get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), criterion(Yhat, Y).tolist())\n","        get_surface.plot_ps()\n","        LOSS_MINI20.append(criterion(forward(X), Y).tolist())\n","        for x, y in trainloader:\n","            yhat = forward(x)\n","            loss = criterion(yhat, y)\n","            get_surface.set_para_loss(w.data.tolist(), b.data.tolist(), loss.tolist())\n","            loss.backward()\n","            w.data = w.data - lr * w.grad.data\n","            b.data = b.data - lr * b.grad.data\n","            w.grad.data.zero_()\n","            b.grad.data.zero_()\n","\n","my_train_model(10)\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["Plot a graph that shows the LOSS results for all the methods.\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Practice: Plot a graph to show all the LOSS functions\n","\n","# Type your code here"]},{"cell_type":"markdown","metadata":{},"source":["Double-click <b>here</b> for the solution.\n","\n","<!-- \n","plt.plot(LOSS_BGD, label = \"Batch Gradient Descent\")\n","plt.plot(LOSS_SGD, label = \"Stochastic Gradient Descent\")\n","plt.plot(LOSS_MINI5, label = \"Mini-Batch Gradient Descent,Batch size:5\")\n","plt.plot(LOSS_MINI10, label = \"Mini-Batch Gradient Descent,Batch size:10\")\n","plt.plot(LOSS_MINI20, label = \"Mini-Batch Gradient 
Descent,Batch size:20\n","plt.legend()\n","-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<a href=\"https://dataplatform.cloud.ibm.com/registration/stepone?context=cpdaas&apps=data_science_experience,watson_machine_learning\"><img src=\"https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DL0110EN-SkillsNetwork/Template/module%201/images/Watson_Studio.png\"/></a>\n"]},{"cell_type":"markdown","metadata":{},"source":["<!--Empty Space for separating topics-->\n"]},{"cell_type":"markdown","metadata":{},"source":["<h2>About the Authors:</h2> \n","\n","<a href=\"https://www.linkedin.com/in/joseph-s-50398b136/\">Joseph Santarcangelo</a> has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD. \n"]},{"cell_type":"markdown","metadata":{},"source":["Other contributors: <a href=\"https://www.linkedin.com/in/michelleccarey/\">Michelle Carey</a>, <a href=\"https://www.linkedin.com/in/jiahui-mavis-zhou-a4537814a\">Mavis Zhou</a>\n"]},{"cell_type":"markdown","metadata":{},"source":["## Change Log\n","\n","| Date (YYYY-MM-DD) | Version | Changed By | Change Description                                          |\n","| ----------------- | ------- | ---------- | ----------------------------------------------------------- |\n","| 2020-09-23        | 2.0     | Shubham    | Migrated Lab to Markdown and added to course repo in GitLab |\n"]},{"cell_type":"markdown","metadata":{},"source":["<hr>\n"]},{"cell_type":"markdown","metadata":{},"source":["<h3 align=\"center\"> © IBM Corporation 2020. All rights reserved. 
</h3>\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.6"}},"nbformat":4,"nbformat_minor":2}