@cosme12
Forked from bradmontgomery/ShortIntroToScraping.rst
Last active January 5, 2018 03:31
Revisions

  1. cosme12 revised this gist Jan 5, 2018. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion ShortIntroToScraping.rst
@@ -18,8 +18,10 @@ Lets grab the Free Book Samplers from O'Reilly: `http://oreilly.com/store/sample
     ::

     >>> import requests
    +#ADDED
    +>>> headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
     >>>
    ->>> result = requests.get("http://oreilly.com/store/samplers.html")
    +>>> result = requests.get("http://oreilly.com/store/samplers.html", headers=headers)

    Make sure we got a result.

  2. @bradmontgomery revised this gist Feb 21, 2012. 1 changed file with 6 additions and 1 deletion.
    7 changes: 6 additions & 1 deletion ShortIntroToScraping.rst
    @@ -16,13 +16,15 @@ Start Scraping!
    Let's grab the Free Book Samplers from O'Reilly: `http://oreilly.com/store/samplers.html <http://oreilly.com/store/samplers.html>`_.

    ::

    >>> import requests
    >>>
    >>> result = requests.get("http://oreilly.com/store/samplers.html")

    Make sure we got a result.

    ::

    >>> result.status_code
    200
    >>> result.headers
    @@ -31,11 +33,13 @@ Make sure we got a result.
    Store your content in an easy-to-type variable!

    ::

    >>> c = result.content

    Start parsing with Beautiful Soup. NOTE: If you installed with pip, you'll need to import from ``bs4``. If you download the source, you'll need to import from ``BeautifulSoup`` (which is what they do in the `online docs <http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#Quick%20Start>`_).

    ::

    >>> from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup(c)
    >>> samples = soup.find_all("a", "item-title")
    @@ -47,6 +51,7 @@ Start parsing with Beautiful Soup. NOTE: If you installed with pip, you'll need
    Now, pick apart individual links.

    ::

    >>> data = {}
    >>> for a in samples:
    ... title = a.string.strip()
    @@ -55,4 +60,4 @@ Now, pick apart individual links.

    Check out the keys/values in the ``data`` dict. Rejoice!

    Now go scrape some stuff!
  3. @bradmontgomery created this gist Feb 21, 2012.
    58 changes: 58 additions & 0 deletions ShortIntroToScraping.rst
    @@ -0,0 +1,58 @@
    Web Scraping Workshop
    =====================

    Using `Requests <http://python-requests.org>`_ and `Beautiful Soup <http://www.crummy.com/software/BeautifulSoup/>`_, with the most recent `Beautiful Soup 4 docs <http://www.crummy.com/software/BeautifulSoup/bs4/doc/>`_.

    Getting Started
    ---------------
    Install our tools (preferably in a new virtualenv)::

    pip install beautifulsoup4
    pip install requests

    Start Scraping!
    ---------------

    Let's grab the Free Book Samplers from O'Reilly: `http://oreilly.com/store/samplers.html <http://oreilly.com/store/samplers.html>`_.

    ::
    >>> import requests
    >>>
    >>> result = requests.get("http://oreilly.com/store/samplers.html")

    Make sure we got a result.

    ::
    >>> result.status_code
    200
    >>> result.headers
    ...

    Store your content in an easy-to-type variable!

    ::
    >>> c = result.content

    Start parsing with Beautiful Soup. NOTE: If you installed with pip, you'll need to import from ``bs4``. If you download the source, you'll need to import from ``BeautifulSoup`` (which is what they do in the `online docs <http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#Quick%20Start>`_).

    ::
    >>> from bs4 import BeautifulSoup
    >>> soup = BeautifulSoup(c)
    >>> samples = soup.find_all("a", "item-title")
    >>> samples[0]
    <a class="item-title" href="http://cdn.oreilly.com/oreilly/booksamplers/9780596004927_sampler.pdf">
    Programming Perl
    </a>

    Now, pick apart individual links.

    ::
    >>> data = {}
    >>> for a in samples:
    ... title = a.string.strip()
    ... data[title] = a.attrs['href']


    Check out the keys/values in the ``data`` dict. Rejoice!
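The walkthrough above, including the ``User-Agent`` header added in the January 2018 revision, can be condensed into one script. This is a sketch, not the gist author's code: the ``item-title`` class and the sampler URL come from the 2012 original and may no longer exist on the live site, and the network fetch sits behind a ``__main__`` guard so the parsing helper can be exercised on its own.

```python
from bs4 import BeautifulSoup

# User-Agent added in the Jan 5, 2018 revision: some servers refuse requests
# that arrive with the default "python-requests" agent string.
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/39.0.2171.95 Safari/537.36')
}


def parse_samplers(html):
    """Map each sampler title to its PDF link, as the loop above does."""
    # Naming the parser explicitly avoids the "no parser specified" warning
    # that newer Beautiful Soup versions emit for bare BeautifulSoup(c).
    soup = BeautifulSoup(html, "html.parser")
    data = {}
    for a in soup.find_all("a", "item-title"):
        data[a.string.strip()] = a.attrs['href']
    return data


if __name__ == "__main__":
    import requests  # imported here so the parser can be reused without it
    result = requests.get("http://oreilly.com/store/samplers.html",
                          headers=HEADERS)
    result.raise_for_status()  # fail loudly on anything other than 200
    print(parse_samplers(result.content))
```

Splitting the fetch from the parse also makes the scraper testable against a saved HTML snippet, with no network round-trip.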