
Revisions

  1. @4bpb 4bpb revised this gist May 23, 2017. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion All of Adidas U.S. Products
    @@ -7,7 +7,7 @@ import time
    headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
    "Accept-Language" : "en-US,en;q=0.8"}

    proxy = {"http" : "http://127.0.0.1:24068"}


    url = 'http://www.adidas.com/on/demandware.static/-/Sites-adidas-US-Library/en_US/v/sitemap/product/adidas-US-en-us-product.xml'

  2. @4bpb 4bpb created this gist May 23, 2017.
    31 changes: 31 additions & 0 deletions All of Adidas U.S. Products
    @@ -0,0 +1,31 @@
    from bs4 import BeautifulSoup
    import urllib.request
    import re
    import urllib.parse
    import time

    headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
    "Accept-Language" : "en-US,en;q=0.8"}

    proxy = {"http" : "http://127.0.0.1:24068"}

    url = 'http://www.adidas.com/on/demandware.static/-/Sites-adidas-US-Library/en_US/v/sitemap/product/adidas-US-en-us-product.xml'

    values = {'s':'search',
    'submit':'search'}


    data = urllib.parse.urlencode(values)
    data = data.encode('utf-8')



    req = urllib.request.Request(url, data, headers=headers)
    resp = urllib.request.urlopen(req)
    respData = resp.read()
    rawdata = re.findall(r'<loc>(.*?)</loc>', respData.decode('utf-8', errors='replace'))  # decode the bytes; str() would return the b'...' repr

    for Product_list in rawdata:
        print(Product_list)

    time.sleep(500)
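
The script above extracts `<loc>` entries with a regular expression and defines a `proxy` dict that it never passes to `urllib`. As a hedged alternative, the sketch below parses the sitemap with the standard-library XML parser and routes the request through the proxy only when one is supplied. The function names `extract_locs` and `fetch_sitemap` and the `timeout` parameter are hypothetical, introduced here for illustration; the sketch also assumes the sitemap is well-formed XML and that a plain GET (no form data) is enough to retrieve it, neither of which is confirmed by the gist.

```python
import urllib.request
import xml.etree.ElementTree as ET


def extract_locs(xml_bytes):
    """Return the text of every <loc> element in a sitemap document.

    The namespace prefix is stripped from each tag before comparing, so
    this works whether or not the document declares the sitemap namespace.
    """
    root = ET.fromstring(xml_bytes)
    locs = []
    for element in root.iter():
        local_name = element.tag.rsplit('}', 1)[-1]  # drop any '{namespace}' prefix
        if local_name == 'loc' and element.text:
            locs.append(element.text.strip())
    return locs


def fetch_sitemap(url, headers, proxy=None, timeout=30):
    """Download a sitemap with a GET request (hypothetical helper).

    `proxy` takes the same shape as the dict in the script above,
    e.g. {"http": "http://127.0.0.1:24068"}; when omitted, urllib's
    default proxy handling applies.
    """
    handlers = [urllib.request.ProxyHandler(proxy)] if proxy else []
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, headers=headers)  # no data argument, so this is a GET
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

With the `url`, `headers`, and `proxy` values from the script, `extract_locs(fetch_sitemap(url, headers, proxy))` would return the product URLs as a list of strings, which can then be printed or written to a file. Parsing the XML instead of matching a regex also unescapes entities such as `&amp;` in the URLs.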