@grammakov
Created September 3, 2016 11:02
  1. grammakov created this gist Sep 3, 2016.
    23 changes: 23 additions & 0 deletions phantomjs.rb
    @@ -0,0 +1,23 @@
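    require 'rubygems'
    require 'capybara'
    require 'capybara/dsl'
    require 'capybara/poltergeist'
    require 'nokogiri' # referenced directly in get_page_data

    # Drive a headless PhantomJS browser through Poltergeist.
    Capybara.default_driver = :poltergeist
    # The target is a remote site, so do not boot a local Rack server.
    Capybara.run_server = false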

    module GetTitle
      class WebScraper
        include Capybara::DSL

        # Loads the page in PhantomJS and returns its <title> nodes
        # as a Nokogiri::XML::NodeSet.
        def get_page_data(url)
          visit(url)
          doc = Nokogiri::HTML(page.html)
          doc.css('title')
        end
      end
    end

    scraper = GetTitle::WebScraper.new

    puts scraper.get_page_data('http://eaq.sagepub.com/content/39/4/468.short').map(&:text).inspect