@Rigel772
Last active September 25, 2020 09:53
[scrapy save html page] #scrapy #python
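# Save the HTML of every crawled page to disk
import scrapy


class Spider(scrapy.Spider):
    name = "posts"
    start_urls = [
        'https://blog.scrapinghub.com/page/1/',
        'https://blog.scrapinghub.com/page/2/',
    ]

    def parse(self, response):
        # The URLs end with '/', so the page number is the second-to-last segment
        page = response.url.split('/')[-2]
        filename = 'posts-%s.html' % page
        # response.body is bytes, so the file is opened in binary mode
        with open(filename, 'wb') as f:
            f.write(response.body)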
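
Deriving the page number by splitting the URL on '/' depends on whether the URL ends with a slash. The following is a minimal sketch of a more tolerant way to build the filename, using only the standard library. The function name `page_filename` and its `prefix` parameter are hypothetical and not part of the original gist, and the `index` fallback for URLs without a path is an assumption.

```python
from urllib.parse import urlparse


def page_filename(url, prefix="posts"):
    """Build a filename such as 'posts-1.html' from a page URL."""
    # Drop empty segments so a trailing slash does not produce an empty name
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Fall back to 'index' when the URL has no path segments (assumption)
    page = segments[-1] if segments else "index"
    return "%s-%s.html" % (prefix, page)
```

Inside `parse`, the filename could then be computed as `filename = page_filename(response.url)`, which gives the same result with or without a trailing slash.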