@Parth-Vader
Created July 10, 2017 13:47

Revisions

  1. Parth-Vader created this gist Jul 10, 2017.
    19 changes: 19 additions & 0 deletions run.py
    @@ -0,0 +1,19 @@
    from scrapy.crawler import CrawlerProcess
    from scrapy.crawler import CrawlerRunner
    from twisted.internet import reactor
    from scrapy.utils.project import get_project_settings
    from scrapy.utils.log import configure_logging
    process = CrawlerProcess(get_project_settings())  # used only by the commented-out CrawlerProcess variant

    # 'followall' is the name of one of the spiders of the project.
    # process.crawl('followall')

    configure_logging()
    runner = CrawlerRunner(get_project_settings())  # project settings are needed to resolve the spider name
    runner.crawl('followall')
    runner.crawl('followall')
    d = runner.join()
    d.addBoth(lambda _: reactor.stop())

    reactor.run()
    # process.start()  # the script will block here until the crawling is finished
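
The script above schedules both `followall` crawls at once and stops the reactor when `runner.join()` fires. If the crawls should instead run one after another, a minimal sketch along the following lines may work. It assumes it is run from inside the Scrapy project so that `get_project_settings()` can find the spiders; the function name `crawl_sequentially` is hypothetical and not part of the gist or of Scrapy.

```python
def crawl_sequentially(spider_names):
    """Run the named spiders one after another inside a single reactor.

    Hypothetical helper for illustration; not part of the original run.py.
    """
    # Imported lazily because importing twisted.internet.reactor installs
    # the default reactor as a side effect.
    from twisted.internet import defer, reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings

    settings = get_project_settings()
    configure_logging(settings)
    runner = CrawlerRunner(settings)

    @defer.inlineCallbacks
    def crawl():
        # Each yield waits for the previous crawl to finish before
        # the next one is started.
        for name in spider_names:
            yield runner.crawl(name)

    def start():
        d = crawl()
        # Stop the reactor whether the crawls succeeded or failed.
        d.addBoth(lambda _: reactor.stop())

    # Start crawling only once the reactor is running, so reactor.stop()
    # is never called on a reactor that has not started yet.
    reactor.callWhenRunning(start)
    reactor.run()  # blocks here until reactor.stop() is called
```

Calling `crawl_sequentially(['followall', 'followall'])` would then mirror the two crawls in the script, but in sequence rather than concurrently.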