{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A primer for sports stats scraping\n",
    "\n",
    "Though I'm using an example for a random Green Bay Pheonix game from the school's website here, the principles generally hold true across the board. Some sites are easier to scrape than others, but most require at least minor parsing to get the exact format you want in the end.\n",
    "\n",
    "*Anyway*, this should serve as a decent barebones example to show just how easy it can be to scrape this information into a relatively usable format with little effort. [kenpompy](https://github.com/j-andrews7/kenpompy) uses very similar code and logic, though his site tends to require heavy manual parsing as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we just need to load up our packages. I use `mechanicalsoup` here, but you could do the same with `requests` easily enough. `mechanicalsoup` is useful if you have to login or otherwise interact with a page through a form though."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mechanicalsoup\n",
    "import pandas as pd\n",
    "from bs4 import BeautifulSoup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we snag the actual HTML for the [page containing the boxscore](https://greenbayphoenix.com/sports/womens-basketball/stats/2019-20/mizzou/boxscore/4186). If a page dynamically generates/populates data via javascript, this entire process becomes moot, as you'll need something like [selenium](https://selenium-python.readthedocs.io/) to actually load the page."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Response [200]>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This just initializes a browser.\n",
    "browser = mechanicalsoup.StatefulBrowser()\n",
    "browser.open(\"https://greenbayphoenix.com/sports/womens-basketball/stats/2019-20/mizzou/boxscore/4186\")\n",
    "\n",
    "# This actually scrapes the page contents into a variable, HTML tags, javascript, and all. \n",
    "content = browser.get_current_page()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we grab all tables from the page, storing them in a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "tables = content.find_all('table')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So this site has 7 tables:\n",
    " - Scoring by quarter\n",
    " - Away team box score\n",
    " - Away team summary stats\n",
    " - Away team miscellaneous stats\n",
    " - Home team box score\n",
    " - Home team summary stats\n",
    " - Home team miscellaneous stats\n",
    " \n",
    "Kudos to the Green Bay website designers, their site has a better layout than most major programs. \n",
    "\n",
    "First, we'll grab the scoring by quarter table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Team</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>Total F</th>\n",
       "      <th>Records</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>Mizzou Missouri</td>\n",
       "      <td>9</td>\n",
       "      <td>11</td>\n",
       "      <td>14</td>\n",
       "      <td>30</td>\n",
       "      <td>64</td>\n",
       "      <td>1-3,0-0 SEC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>GB Green Bay</td>\n",
       "      <td>16</td>\n",
       "      <td>15</td>\n",
       "      <td>21</td>\n",
       "      <td>20</td>\n",
       "      <td>72</td>\n",
       "      <td>2-1,0-0 Horizon</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              Team   1   2   3   4  Total F          Records\n",
       "0  Mizzou Missouri   9  11  14  30       64      1-3,0-0 SEC\n",
       "1     GB Green Bay  16  15  21  20       72  2-1,0-0 Horizon"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Parse the first table into a pandas dataframe.\n",
    "quarters_df = pd.read_html(str(tables[0]))\n",
    "\n",
    "# For some reason it always nests the result in a list.\n",
    "quarters_df = quarters_df[0]\n",
    "quarters_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not bad, pretty usable right off the bat depending on what you want. Some folks might want to parse out conferences and conference records from the `Records` column, or split the `Team` column so that the team abbreviations are in their own column. But overall, pretty decent.\n",
    "\n",
    "Now we'll try the actual box score."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>##</th>\n",
       "      <th>Player</th>\n",
       "      <th>GS</th>\n",
       "      <th>MIN</th>\n",
       "      <th>FG</th>\n",
       "      <th>3PT</th>\n",
       "      <th>FT</th>\n",
       "      <th>ORB-DRB</th>\n",
       "      <th>REB</th>\n",
       "      <th>PF</th>\n",
       "      <th>A</th>\n",
       "      <th>TO</th>\n",
       "      <th>BLK</th>\n",
       "      <th>STL</th>\n",
       "      <th>PTS</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>13</td>\n",
       "      <td>13 Schuchts,Hannah</td>\n",
       "      <td>*</td>\n",
       "      <td>30</td>\n",
       "      <td>4-5</td>\n",
       "      <td>3-4</td>\n",
       "      <td>2-2</td>\n",
       "      <td>2-6</td>\n",
       "      <td>8</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>23</td>\n",
       "      <td>23 Smith,Amber</td>\n",
       "      <td>*</td>\n",
       "      <td>29</td>\n",
       "      <td>5-12</td>\n",
       "      <td>2-5</td>\n",
       "      <td>0-0</td>\n",
       "      <td>1-7</td>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>22</td>\n",
       "      <td>22 Roundtree,Jordan</td>\n",
       "      <td>*</td>\n",
       "      <td>35</td>\n",
       "      <td>3-7</td>\n",
       "      <td>1-4</td>\n",
       "      <td>2-2</td>\n",
       "      <td>0-4</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>24</td>\n",
       "      <td>24 Chavis,Jordan</td>\n",
       "      <td>*</td>\n",
       "      <td>18</td>\n",
       "      <td>0-4</td>\n",
       "      <td>0-3</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>12</td>\n",
       "      <td>12 Brown,Elle</td>\n",
       "      <td>*</td>\n",
       "      <td>6</td>\n",
       "      <td>0-1</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>43</td>\n",
       "      <td>43 Frank,Hayley</td>\n",
       "      <td>NaN</td>\n",
       "      <td>26</td>\n",
       "      <td>4-11</td>\n",
       "      <td>1-4</td>\n",
       "      <td>3-3</td>\n",
       "      <td>3-1</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>33</td>\n",
       "      <td>33 Blackwell,Aijha</td>\n",
       "      <td>NaN</td>\n",
       "      <td>16</td>\n",
       "      <td>2-8</td>\n",
       "      <td>0-2</td>\n",
       "      <td>4-8</td>\n",
       "      <td>1-2</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>11</td>\n",
       "      <td>11 Troup,Haley</td>\n",
       "      <td>NaN</td>\n",
       "      <td>23</td>\n",
       "      <td>1-7</td>\n",
       "      <td>1-4</td>\n",
       "      <td>3-4</td>\n",
       "      <td>0-4</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>10</td>\n",
       "      <td>10 Green,Nadia</td>\n",
       "      <td>NaN</td>\n",
       "      <td>14</td>\n",
       "      <td>1-3</td>\n",
       "      <td>0-0</td>\n",
       "      <td>2-2</td>\n",
       "      <td>3-0</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>9</td>\n",
       "      <td>45</td>\n",
       "      <td>45 Garner,Brittany</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>10</td>\n",
       "      <td>TM</td>\n",
       "      <td>TM TEAM</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>0-0</td>\n",
       "      <td>2-3</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>11</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Totals</td>\n",
       "      <td>-</td>\n",
       "      <td>200</td>\n",
       "      <td>20-58</td>\n",
       "      <td>8-26</td>\n",
       "      <td>16-21</td>\n",
       "      <td>12-28</td>\n",
       "      <td>40</td>\n",
       "      <td>30</td>\n",
       "      <td>13</td>\n",
       "      <td>17</td>\n",
       "      <td>2</td>\n",
       "      <td>8</td>\n",
       "      <td>64</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     ##               Player   GS  MIN     FG   3PT     FT ORB-DRB  REB  PF  \\\n",
       "0    13   13 Schuchts,Hannah    *   30    4-5   3-4    2-2     2-6    8   3   \n",
       "1    23       23 Smith,Amber    *   29   5-12   2-5    0-0     1-7    8   5   \n",
       "2    22  22 Roundtree,Jordan    *   35    3-7   1-4    2-2     0-4    4   2   \n",
       "3    24     24 Chavis,Jordan    *   18    0-4   0-3    0-0     0-1    1   3   \n",
       "4    12        12 Brown,Elle    *    6    0-1   0-0    0-0     0-0    0   1   \n",
       "5    43      43 Frank,Hayley  NaN   26   4-11   1-4    3-3     3-1    4   5   \n",
       "6    33   33 Blackwell,Aijha  NaN   16    2-8   0-2    4-8     1-2    3   5   \n",
       "7    11       11 Troup,Haley  NaN   23    1-7   1-4    3-4     0-4    4   1   \n",
       "8    10       10 Green,Nadia  NaN   14    1-3   0-0    2-2     3-0    3   3   \n",
       "9    45   45 Garner,Brittany  NaN    3    0-0   0-0    0-0     0-0    0   2   \n",
       "10   TM              TM TEAM  NaN    0    0-0   0-0    0-0     2-3    5   0   \n",
       "11  NaN               Totals    -  200  20-58  8-26  16-21   12-28   40  30   \n",
       "\n",
       "     A  TO  BLK  STL  PTS  \n",
       "0    1   2    2    0   13  \n",
       "1    2   3    0    0   12  \n",
       "2    0   1    0    1    9  \n",
       "3    3   0    0    1    0  \n",
       "4    1   1    0    0    0  \n",
       "5    2   3    0    3   12  \n",
       "6    2   4    0    0    8  \n",
       "7    1   2    0    2    6  \n",
       "8    1   1    0    1    4  \n",
       "9    0   0    0    0    0  \n",
       "10   0   0    0    0    0  \n",
       "11  13  17    2    8   64  "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Get the next element of our tables list.\n",
    "mizzou_box = pd.read_html(str(tables[1]))\n",
    "mizzou_box = mizzou_box[0]\n",
    "mizzou_box"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, not bad. Would probably definitely tweak some things here, like dropping player numbers from the `Player` column, splitting the `ORB-DRB` column and creating a total rebounds column, splitting up made field goals and attempt, etc. \n",
    "\n",
    "But ya'll can figure that out on your own. `kenpompy` does plenty of that, if you need inspiration. \n",
    "\n",
    "On to the summary stats."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Team Summary</th>\n",
       "      <th>FG</th>\n",
       "      <th>3PT</th>\n",
       "      <th>FT</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1st Quarter</td>\n",
       "      <td>3-16</td>\n",
       "      <td>2-8</td>\n",
       "      <td>1-2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>1st Quarter</td>\n",
       "      <td>18.75 %</td>\n",
       "      <td>25.00 %</td>\n",
       "      <td>50.00 %</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>2nd Quarter</td>\n",
       "      <td>3-10</td>\n",
       "      <td>1-4</td>\n",
       "      <td>4-5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>2nd Quarter</td>\n",
       "      <td>30.00 %</td>\n",
       "      <td>25.00 %</td>\n",
       "      <td>80.00 %</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>3rd Quarter</td>\n",
       "      <td>5-14</td>\n",
       "      <td>1-6</td>\n",
       "      <td>3-6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>3rd Quarter</td>\n",
       "      <td>35.71 %</td>\n",
       "      <td>16.67 %</td>\n",
       "      <td>50.00 %</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>4th Quarter</td>\n",
       "      <td>9-18</td>\n",
       "      <td>4-8</td>\n",
       "      <td>8-8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>4th Quarter</td>\n",
       "      <td>50.00 %</td>\n",
       "      <td>50.00 %</td>\n",
       "      <td>100.00 %</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>Total</td>\n",
       "      <td>20-58</td>\n",
       "      <td>8-26</td>\n",
       "      <td>16-21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>9</td>\n",
       "      <td>NaN</td>\n",
       "      <td>34.5 %</td>\n",
       "      <td>30.8 %</td>\n",
       "      <td>76.2 %</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Team Summary       FG      3PT        FT\n",
       "0  1st Quarter     3-16      2-8       1-2\n",
       "1  1st Quarter  18.75 %  25.00 %   50.00 %\n",
       "2  2nd Quarter     3-10      1-4       4-5\n",
       "3  2nd Quarter  30.00 %  25.00 %   80.00 %\n",
       "4  3rd Quarter     5-14      1-6       3-6\n",
       "5  3rd Quarter  35.71 %  16.67 %   50.00 %\n",
       "6  4th Quarter     9-18      4-8       8-8\n",
       "7  4th Quarter  50.00 %  50.00 %  100.00 %\n",
       "8        Total    20-58     8-26     16-21\n",
       "9          NaN   34.5 %   30.8 %    76.2 %"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mizzou_sum = pd.read_html(str(tables[2]))\n",
    "mizzou_sum = mizzou_sum[0]\n",
    "mizzou_sum"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alright, this one definitely needs a bit of work. Not too much, moving the percentage rows to their own columns would probably be enough to make plotting doable and clean up the row redundancy.\n",
    "\n",
    "Lastly, the miscellaneous table (timeouts, tech fouls, lead changes, etc)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>Technical Fouls: none</td>\n",
       "      <td>Second Chance Points: 10</td>\n",
       "      <td>Scores Tied: 0 time(s)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>Points in the Paint: 20</td>\n",
       "      <td>Fast Break Points: 15</td>\n",
       "      <td>Lead Changed: 0 time(s)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>Points off Turnovers: 10</td>\n",
       "      <td>Bench Points: 30</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                          0                         1                        2\n",
       "0     Technical Fouls: none  Second Chance Points: 10   Scores Tied: 0 time(s)\n",
       "1   Points in the Paint: 20     Fast Break Points: 15  Lead Changed: 0 time(s)\n",
       "2  Points off Turnovers: 10          Bench Points: 30                      NaN"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mizzou_misc = pd.read_html(str(tables[3]))\n",
    "mizzou_misc = mizzou_misc[0]\n",
    "mizzou_misc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Yikes, this one is pretty useless as is. I'd be more inclined to create a class out of this than actually parse it into a usable dataframe format. You can however, parse individual elements easily enough. For instance, if I just wanted to grab the points in the paint:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Points in the Paint: 20'"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# pandas indexing is row by column.\n",
    "mizzou_misc.iloc[1,0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And getting just the number is just another line"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'20'"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "paint_pts = mizzou_misc.iloc[1,0]\n",
    "paint_pts = paint_pts.split(\": \")[1]\n",
    "paint_pts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's all I've got. The last 3 tables in the `tables` list contain the same information for Green Bay.\n",
    "\n",
    "If you have questions, hit me up on [twitter](https://twitter.com/JMA_Data)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}