
@pugson
Forked from schmich/psiupuxa.rb
Last active August 29, 2015 14:13

Revisions

  1. Wojtek Witkowski revised this gist Jan 21, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion psiupuxa.rb
    @@ -7,7 +7,7 @@
     require 'uri'

     # Uncomment next line if you don't have the Ruby SSL cert bundle installed.
    -# OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
    +OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE

     page = 'http://psiupuxa.com/'
     images = []
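A note on this revision: reassigning `OpenSSL::SSL::VERIFY_PEER` disables certificate verification for every TLS connection in the process (and Ruby warns about redefining a constant). A more contained sketch, assuming a current Ruby where open-uri accepts an `ssl_verify_mode` option, limits the relaxed check to a single request; `fetch_insecure` and its URL argument are illustrative, not part of the gist:

```ruby
require 'open-uri'
require 'openssl'

# Hypothetical helper: disable certificate verification for one request
# only, instead of reassigning the global VERIFY_PEER constant.
def fetch_insecure(url)
  URI.open(url, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE, &:read)
end

# VERIFY_NONE is the OpenSSL flag value 0; VERIFY_PEER is 1.
MODE = OpenSSL::SSL::VERIFY_NONE
```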
  2. @schmich schmich revised this gist Jan 19, 2015. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions psiupuxa.rb
    @@ -1,4 +1,5 @@
     # Download desktop-resolution wallpapers from http://psiupuxa.com/
    +# into the current directory.

     require 'nokogiri'
     require 'open-uri'
  3. @schmich schmich revised this gist Jan 19, 2015. 1 changed file with 18 additions and 22 deletions.
    40 changes: 18 additions & 22 deletions psiupuxa.rb
    @@ -1,55 +1,51 @@
     # Download desktop-resolution wallpapers from http://psiupuxa.com/

     require 'nokogiri'
     require 'open-uri'
     require 'openssl'
     require 'uri'

     # Uncomment next line if you don't have the Ruby SSL cert bundle installed.
     # OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE

    -pages = ['http://psiupuxa.com/']
    +page = 'http://psiupuxa.com/'
     images = []

     puts 'Scraping pages for image links.'

    -while page = pages.shift
    +while page
       puts "Scraping #{page}."

       doc = Nokogiri::HTML(open(page))
    -  doc.css('.post a[href*="desktop"]').each do |e|
    -    href = e['href'].to_s
    -    images << href
    +  doc.css('.post a[href*="desktop"]').each do |a|
    +    images << a['href']
       end

    -  next_link = doc.css('.pages-link-active + a.pages-link').first
    -  if next_link
    -    next_page = next_link['href'].to_s
    -    pages << URI.join(page, next_page)
    -  end
    +  link = doc.css('.pages-link-active + a.pages-link').first
    +  page = link ? URI.join(page, link['href']) : nil
     end

     puts "Found #{images.length} images."
     puts 'Downloading images.'

     count = 0
     images.each do |url|
       count += 1
       local_file = url.split('/').last

       print "[#{count}/#{images.length}] "
       if File.exist? local_file
         puts "Skipping #{local_file}, file exists."
         next
       else
         puts "Downloading #{local_file}."
       end

       open(url, 'rb') do |image|
         open(local_file, 'wb') do |file|
           file.write(image.read)
         end
       end
     end

     puts 'Fin.'
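The pager refactor in this revision leans on `URI.join` to resolve the next link's (possibly relative) href against the page currently being scraped, and on the ternary to end the `while` loop once no next link exists. A minimal sketch of that resolution, using a hypothetical pager href:

```ruby
require 'uri'

page = 'http://psiupuxa.com/index'
next_href = 'index2'  # hypothetical relative href from the pager link

# URI.join resolves the relative href against the current page, so the
# loop can follow pagination however the site writes its links.
next_page = URI.join(page, next_href).to_s  # => "http://psiupuxa.com/index2"

# With no next link, page becomes nil and the while loop terminates.
link = nil
page = link ? URI.join(page, link) : nil
```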
  4. @schmich schmich revised this gist Jan 19, 2015. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion psiupuxa.rb
    @@ -1,3 +1,5 @@
    +# Download desktop-resolution wallpapers from http://psiupuxa.com/
    +
     require 'nokogiri'
     require 'open-uri'
     require 'openssl'
    @@ -6,7 +8,7 @@
     # Uncomment next line if you don't have the Ruby SSL cert bundle installed.
     # OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE

    -pages = ['http://psiupuxa.com/index']
    +pages = ['http://psiupuxa.com/']
     images = []

     puts 'Scraping pages for image links.'
  5. @schmich schmich created this gist Jan 19, 2015.
    53 changes: 53 additions & 0 deletions psiupuxa.rb
    @@ -0,0 +1,53 @@
    require 'nokogiri'
    require 'open-uri'
    require 'openssl'
    require 'uri'

    # Uncomment next line if you don't have the Ruby SSL cert bundle installed.
    # OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE

    pages = ['http://psiupuxa.com/index']
    images = []

    puts 'Scraping pages for image links.'

    while page = pages.shift
      puts "Scraping #{page}."

      doc = Nokogiri::HTML(open(page))
      doc.css('.post a[href*="desktop"]').each do |e|
        href = e['href'].to_s
        images << href
      end

      next_link = doc.css('.pages-link-active + a.pages-link').first
      if next_link
        next_page = next_link['href'].to_s
        pages << URI.join(page, next_page)
      end
    end

    puts "Found #{images.length} images."
    puts 'Downloading images.'

    count = 0
    images.each do |url|
      count += 1
      local_file = url.split('/').last

      print "[#{count}/#{images.length}] "
      if File.exist? local_file
        puts "Skipping #{local_file}, file exists."
        next
      else
        puts "Downloading #{local_file}."
      end

      open(url, 'rb') do |image|
        open(local_file, 'wb') do |file|
          file.write(image.read)
        end
      end
    end

    puts 'Fin.'
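The download loop's resume behavior comes entirely from the `File.exist?` guard: re-running the script skips any wallpaper already on disk and fetches only what is missing. That guard can be exercised without the network like this; `fetch_once` and the fake bytes are illustrative, not part of the gist:

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the loop body's guard: write the file
# only when it is not already present on disk.
def fetch_once(path, bytes)
  return :skipped if File.exist?(path)
  File.binwrite(path, bytes)
  :downloaded
end

dir   = Dir.mktmpdir
local = File.join(dir, 'wallpaper.jpg')

first  = fetch_once(local, 'fake image bytes')  # writes the file
second = fetch_once(local, 'fake image bytes')  # file exists, skipped
```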