@jeanim
Forked from henrik/google_art_project.rb
Created August 11, 2016 08:40
Revisions

  1. @henrik henrik revised this gist Oct 13, 2012. No changes.
  2. @henrik henrik revised this gist May 18, 2012. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion google_art_project.rb
    Original file line number Diff line number Diff line change
    @@ -2,6 +2,9 @@
    # By Henrik Nyh <http://henrik.nyh.se> 2011-02-05 under the MIT license.
    # Requires Ruby and ImageMagick.
    #
    + # NOTE:
    + # I'm afraid this script no longer works! See the Gist comments.
    + #
    # Usage e.g.:
    # ruby google_art_project.rb http://www.googleartproject.com/museums/tate/portrait-of-william-style-of-langley-174
    #
    @@ -208,4 +211,4 @@ def full_path
    puts "Error: #{e.message}"
    end
    end
    - end
    + end
  3. @henrik henrik revised this gist Jun 5, 2011. 1 changed file with 7 additions and 3 deletions.
    10 changes: 7 additions & 3 deletions google_art_project.rb
    @@ -1,6 +1,6 @@
    # Google Art Project fullsize image downloader.
    # By Henrik Nyh <http://henrik.nyh.se> 2011-02-05 under the MIT license.
    - # Requires Ruby and ImageMagick.
    + # Requires Ruby and ImageMagick.
    #
    # Usage e.g.:
    # ruby google_art_project.rb http://www.googleartproject.com/museums/tate/portrait-of-william-style-of-langley-174
    @@ -27,6 +27,10 @@ def windows?

    class GAPDownloader

    + # Set this to "jpg" or "tif".
    + # jpg is a lot smaller but destructively compressed.
    + OUTPUT_EXTENSION = "jpg"
    +
    if windows?
    # Case-sensitive. Use forward slashes, or double-escape backslashes.
    TEMP_DIRECTORY = "C:/WINDOWS/Temp"
    @@ -184,11 +188,11 @@ def tile_path(x, y)
    end

    def row_path(y)
    - File.join(TEMP_DIRECTORY, "gap-#{@perma_id}-row-#{@max_zoom}-#{y}.jpg")
    + File.join(TEMP_DIRECTORY, "gap-#{@perma_id}-row-#{@max_zoom}-#{y}.#{OUTPUT_EXTENSION}")
    end

    def full_path
    - File.join(OUTPUT_DIRECTORY, "#{@perma_id}.jpg")
    + File.join(OUTPUT_DIRECTORY, "#{@perma_id}.#{OUTPUT_EXTENSION}")
    end

    end
  4. @henrik henrik revised this gist May 7, 2011. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion google_art_project.rb
    @@ -21,7 +21,7 @@

    module Kernel
    def windows?
    - Config::CONFIG['host_os'].match(/mswin|windows/i)
    + Config::CONFIG['host_os'].match(/mswin|windows|mingw/i)
    end
    end
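
The `Config` constant used in this revision has since been removed from Ruby in favor of `RbConfig`. A minimal sketch of the same check on a current Ruby follows. The helper name `windows_host?` is hypothetical (the gist defines `Kernel#windows?` instead), and it takes the host OS string as an argument so the pattern can be tried on any machine.

```ruby
require "rbconfig"

# Hypothetical helper, not part of the gist. Defaults to the running
# interpreter's host OS, but accepts any string for experimentation.
def windows_host?(host_os = RbConfig::CONFIG["host_os"])
  # Same pattern as the revision above: MSVC builds, "windows", and MinGW builds.
  !!(host_os =~ /mswin|windows|mingw/i)
end

puts windows_host?("mingw32")       # a MinGW build of Ruby
puts windows_host?("darwin10.7.0")  # OS X; contains "win" but matches none of the alternatives
```

Passing the string in, rather than reading the constant inside the method, is only a convenience for testing; the behavior with no argument should match the gist's intent.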

  5. @henrik henrik revised this gist May 6, 2011. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions google_art_project.rb
    @@ -152,6 +152,8 @@ def trim
    end

    def set_metadata
    + # 300 DPI instead of 72 DPI; more sane for printing.
    + `convert #{full_path} -density 300 #{full_path}`
    if !windows? && !`which xattr`.empty?
    # Set "Downloaded from" Finder metadata, like Safari does.
    system('xattr', '-w', 'com.apple.metadata:kMDItemWhereFroms', @url, full_path)
  6. @henrik henrik revised this gist May 6, 2011. 1 changed file with 2 additions and 1 deletion.
    3 changes: 2 additions & 1 deletion google_art_project.rb
    @@ -17,10 +17,11 @@

    require "open-uri"
    require "fileutils"
    + require "rbconfig"

    module Kernel
    def windows?
    - RUBY_PLATFORM.include?("mswin")
    + Config::CONFIG['host_os'].match(/mswin|windows/i)
    end
    end

  7. @henrik henrik revised this gist Apr 30, 2011. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion google_art_project.rb
    @@ -13,7 +13,7 @@
    #
    # Can reportedly run on Windows as well, with Ruby from http://www.ruby-lang.org/en/downloads/
    # and ImageMagick from http://www.imagemagick.org/script/binary-releases.php#windows
    - # Note that you may need to edit the DIRECTORY below.
    + # Note that you may need to edit the TEMP_DIRECTORY/OUTPUT_DIRECTORY below.

    require "open-uri"
    require "fileutils"
  8. @henrik henrik revised this gist Apr 30, 2011. 1 changed file with 9 additions and 5 deletions.
    14 changes: 9 additions & 5 deletions google_art_project.rb
    @@ -16,6 +16,7 @@
    # Note that you may need to edit the DIRECTORY below.

    require "open-uri"
    + require "fileutils"

    module Kernel
    def windows?
    @@ -27,9 +28,12 @@ class GAPDownloader

    if windows?
    # Case-sensitive. Use forward slashes, or double-escape backslashes.
    - DIRECTORY = "C:/WINDOWS/Temp"
    + TEMP_DIRECTORY = "C:/WINDOWS/Temp"
    + OUTPUT_DIRECTORY = TEMP_DIRECTORY
    else
    - DIRECTORY = "/tmp"
    + TEMP_DIRECTORY = "/tmp"
    + OUTPUT_DIRECTORY = "#{ENV['HOME']}/Downloads"
    + FileUtils.mkdir_p OUTPUT_DIRECTORY
    end

    # You can lower this if you get ridiculously high-res images otherwise.
    @@ -173,15 +177,15 @@ def tile_url(x, y, zoom)
    end

    def tile_path(x, y)
    - File.join(DIRECTORY, "gap-#{@perma_id}-tile-#{x}-#{y}.jpg")
    + File.join(TEMP_DIRECTORY, "gap-#{@perma_id}-tile-#{x}-#{y}.jpg")
    end

    def row_path(y)
    - File.join(DIRECTORY, "gap-#{@perma_id}-row-#{@max_zoom}-#{y}.jpg")
    + File.join(TEMP_DIRECTORY, "gap-#{@perma_id}-row-#{@max_zoom}-#{y}.jpg")
    end

    def full_path
    - File.join(DIRECTORY, "gap-#{@perma_id}-full.jpg")
    + File.join(OUTPUT_DIRECTORY, "#{@perma_id}.jpg")
    end

    end
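
This revision splits the single directory into a temp directory for tiles and an output directory for the finished image, both hard-coded per platform. As a sketch of an alternative, Ruby's standard `tmpdir` library can report the platform's temp directory. The helper `gap_directories` below is hypothetical and not part of the gist; the demo uses a throwaway directory in place of a real home directory so that nothing is created under `~/Downloads`.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper, not part of the gist: returns [temp_dir, output_dir].
# Dir.tmpdir asks Ruby for the platform's temp directory instead of
# hard-coding "/tmp" or "C:/WINDOWS/Temp".
def gap_directories(home = ENV["HOME"])
  temp_dir = Dir.tmpdir
  # With no known home directory, fall back to the temp directory,
  # which mirrors what the revision above does on Windows.
  output_dir = home ? File.join(home, "Downloads") : temp_dir
  [temp_dir, output_dir]
end

# Demo with a throwaway "home" so the example leaves no files behind.
Dir.mktmpdir do |fake_home|
  temp_dir, output_dir = gap_directories(fake_home)
  FileUtils.mkdir_p(output_dir)
  puts "Tiles go to #{temp_dir}, finished images to #{output_dir}"
end
```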
  9. @henrik henrik revised this gist Apr 30, 2011. 1 changed file with 11 additions and 6 deletions.
    17 changes: 11 additions & 6 deletions google_art_project.rb
    @@ -69,8 +69,13 @@ def verify_url!

    def get_image_id
    @html = open(@url).read
    - @iid = @html[/data-thumbnail="(.+?)"/, 1]
    - unless @iid
    + # Reportedly the data-thumbnail can change in the middle of a long download session, but
    + # the encodedInfospotId will not. So if we key local files by the InfospotId, we can
    + # check for them if download fails and we start over. Also makes for more human names.
    + # If I run into it myself, I may adapt the code to auto-resolve a changed data-thumbnail.
    + @thumb_id = @html[/data-thumbnail="(.+?)"/, 1]
    + @perma_id = @html[/data-encodedInfospotId="(.+?)"/, 1]
    + unless @thumb_id && @perma_id
    error "Couldn't find an image at this URL, sorry!"
    end
    end
    @@ -164,19 +169,19 @@ def error(message)

    def tile_url(x, y, zoom)
    # The subdomain can seemingly be anything from lh3 to lh6.
    - "http://lh5.ggpht.com/#{@iid}=x#{x}-y#{y}-z#{zoom}"
    + "http://lh5.ggpht.com/#{@thumb_id}=x#{x}-y#{y}-z#{zoom}"
    end

    def tile_path(x, y)
    - File.join(DIRECTORY, "gap-#{@iid}-tile-#{x}-#{y}.jpg")
    + File.join(DIRECTORY, "gap-#{@perma_id}-tile-#{x}-#{y}.jpg")
    end

    def row_path(y)
    - File.join(DIRECTORY, "gap-#{@iid}-row-#{@max_zoom}-#{y}.jpg")
    + File.join(DIRECTORY, "gap-#{@perma_id}-row-#{@max_zoom}-#{y}.jpg")
    end

    def full_path
    - File.join(DIRECTORY, "gap-#{@iid}-full.jpg")
    + File.join(DIRECTORY, "gap-#{@perma_id}-full.jpg")
    end

    end
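
The two-ID scheme in this revision (a thumbnail ID for tile URLs, a stable ID for local file names) can be tried in isolation. This is a sketch only: `extract_ids` is a hypothetical helper, and the sample markup is made up for illustration, since the real page structure is not shown in the gist.

```ruby
# Hypothetical helper, not part of the gist. Returns [thumb_id, perma_id],
# or nil when either attribute is missing from the page.
def extract_ids(html)
  # String#[] with a regexp and a capture index returns that capture or nil.
  thumb_id = html[/data-thumbnail="(.+?)"/, 1]
  perma_id = html[/data-encodedInfospotId="(.+?)"/, 1]
  thumb_id && perma_id ? [thumb_id, perma_id] : nil
end

# Made-up markup; only the two attribute names come from the gist.
sample = '<div data-thumbnail="abc123" data-encodedInfospotId="tate-portrait-174"></div>'

thumb_id, perma_id = extract_ids(sample)
puts "tile URLs use #{thumb_id}, local files use #{perma_id}"
```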
  10. @henrik henrik revised this gist Apr 30, 2011. 1 changed file with 5 additions and 1 deletion.
    6 changes: 5 additions & 1 deletion google_art_project.rb
    @@ -91,10 +91,14 @@ def get_tiles
    0.upto(@max_y) do |y|
    0.upto(@max_x) do |x|
    url = tile_url(x, y, @max_zoom)
    + path = tile_path(x, y)
    + if File.exists?(path)
    + puts "Skipping #{url} (already downloaded)..."
    + next
    + end
    begin
    data = open(url) # Raises at 404.
    puts "Getting #{url}..."
    - path = tile_path(x, y)
    File.open(path, "wb") { |f| f.print data.read }
    rescue OpenURI::HTTPError => e
    raise unless e.message == "404 Not Found"
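
The skip-if-present check in this revision is what makes an interrupted download resumable. Below is a minimal sketch of the same idea with the network call replaced by a block, so it runs offline. `fetch_unless_cached` is a hypothetical helper, and it uses `File.exist?` because the `File.exists?` spelling was later removed from Ruby.

```ruby
require "tmpdir"

# Hypothetical helper, not part of the gist. Calls the block only when the
# file is missing and writes whatever the block returns.
# Returns :skipped or :fetched.
def fetch_unless_cached(path)
  return :skipped if File.exist?(path)
  data = yield
  File.open(path, "wb") { |f| f.print data }
  :fetched
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "gap-demo-tile-0-0.jpg")
  2.times do
    # The block stands in for open(url).read in the gist.
    result = fetch_unless_cached(path) { "fake tile bytes" }
    puts "#{File.basename(path)}: #{result}"
  end
end
```

As in the gist, a file that exists but was only partially written would also be skipped; guarding against that would need an extra check that this sketch does not attempt.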
  11. @henrik henrik revised this gist Apr 24, 2011. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions google_art_project.rb
    @@ -32,11 +32,11 @@ class GAPDownloader
    DIRECTORY = "/tmp"
    end

    - class RuntimeError < StandardError; end
    -
    # You can lower this if you get ridiculously high-res images otherwise.
    MAX_ZOOM_ALLOWED = 10

    + class RuntimeError < StandardError; end
    +
    def initialize(url)
    ensure_image_magick!
    @url = url
  12. @henrik henrik revised this gist Apr 24, 2011. 1 changed file with 6 additions and 4 deletions.
    10 changes: 6 additions & 4 deletions google_art_project.rb
    @@ -17,6 +17,12 @@

    require "open-uri"

    + module Kernel
    + def windows?
    + RUBY_PLATFORM.include?("mswin")
    + end
    + end
    +
    class GAPDownloader

    if windows?
    @@ -49,10 +55,6 @@ def download

    private

    - def windows?
    - RUBY_PLATFORM.include?("mswin")
    - end
    -
    def ensure_image_magick!
    if !windows? && `which montage`.empty?
    error "You must have ImageMagick installed. Could not find 'montage' in your PATH."
  13. @henrik henrik revised this gist Apr 24, 2011. 1 changed file with 27 additions and 8 deletions.
    35 changes: 27 additions & 8 deletions google_art_project.rb
    @@ -1,18 +1,31 @@
    # Google Art Project fullsize image downloader.
    # By Henrik Nyh <http://henrik.nyh.se> 2011-02-05 under the MIT license.
    - # Requires Ruby and ImageMagick. On OS X, it sets "Downloaded from" metadata and reveals in Finder.
    + # Requires Ruby and ImageMagick.
    #
    # Usage e.g.:
    # ruby google_art_project.rb http://www.googleartproject.com/museums/tate/portrait-of-william-style-of-langley-174
    #
    # You can specify multiple URLs on the command line, separated by space.
    # Or you can specify no URLs on the command line and instead list them at the end of this file, one on each line,
    # with "__END__" before the list.
    + #
    + # On OS X, it sets "Downloaded from" metadata and reveals in Finder.
    + #
    + # Can reportedly run on Windows as well, with Ruby from http://www.ruby-lang.org/en/downloads/
    + # and ImageMagick from http://www.imagemagick.org/script/binary-releases.php#windows
    + # Note that you may need to edit the DIRECTORY below.

    require "open-uri"

    class GAPDownloader

    + if windows?
    + # Case-sensitive. Use forward slashes, or double-escape backslashes.
    + DIRECTORY = "C:/WINDOWS/Temp"
    + else
    + DIRECTORY = "/tmp"
    + end
    +
    class RuntimeError < StandardError; end

    # You can lower this if you get ridiculously high-res images otherwise.
    @@ -36,8 +49,12 @@ def download

    private

    + def windows?
    + RUBY_PLATFORM.include?("mswin")
    + end
    +
    def ensure_image_magick!
    - if `which montage`.empty?
    + if !windows? && `which montage`.empty?
    error "You must have ImageMagick installed. Could not find 'montage' in your PATH."
    end
    end
    @@ -119,7 +136,7 @@ def trim
    end

    def set_metadata
    - unless `which xattr`.empty?
    + if !windows? && !`which xattr`.empty?
    # Set "Downloaded from" Finder metadata, like Safari does.
    system('xattr', '-w', 'com.apple.metadata:kMDItemWhereFroms', @url, full_path)
    end
    @@ -128,7 +145,9 @@ def set_metadata
    def done
    puts "Done: #{full_path}"
    # Reveal in Finder if on OS X.
    - `which osascript && osascript -e 'tell app "Finder"' -e 'reveal POSIX file "#{full_path}"' -e 'activate' -e 'end'`
    + unless windows?
    + `which osascript && osascript -e 'tell app "Finder"' -e 'reveal POSIX file "#{full_path}"' -e 'activate' -e 'end'`
    + end
    end


    @@ -143,15 +162,15 @@ def tile_url(x, y, zoom)
    end

    def tile_path(x, y)
    - "/tmp/gap-#{@iid}-tile-#{x}-#{y}.jpg"
    + File.join(DIRECTORY, "gap-#{@iid}-tile-#{x}-#{y}.jpg")
    end

    def row_path(y)
    - "/tmp/gap-#{@iid}-row-#{@max_zoom}-#{y}.jpg"
    + File.join(DIRECTORY, "gap-#{@iid}-row-#{@max_zoom}-#{y}.jpg")
    end

    def full_path
    - "/tmp/gap-#{@iid}-full.jpg"
    + File.join(DIRECTORY, "gap-#{@iid}-full.jpg")
    end

    end
    @@ -167,4 +186,4 @@ def full_path
    puts "Error: #{e.message}"
    end
    end
    - end
    + end
  14. @henrik henrik revised this gist Feb 5, 2011. 1 changed file with 0 additions and 2 deletions.
    2 changes: 0 additions & 2 deletions google_art_project.rb
    @@ -11,8 +11,6 @@

    require "open-uri"

    - url = ARGV.first
    -
    class GAPDownloader

    class RuntimeError < StandardError; end
  15. @henrik henrik created this gist Feb 5, 2011.
    172 changes: 172 additions & 0 deletions google_art_project.rb
    @@ -0,0 +1,172 @@
    # Google Art Project fullsize image downloader.
    # By Henrik Nyh <http://henrik.nyh.se> 2011-02-05 under the MIT license.
    # Requires Ruby and ImageMagick. On OS X, it sets "Downloaded from" metadata and reveals in Finder.
    #
    # Usage e.g.:
    # ruby google_art_project.rb http://www.googleartproject.com/museums/tate/portrait-of-william-style-of-langley-174
    #
    # You can specify multiple URLs on the command line, separated by space.
    # Or you can specify no URLs on the command line and instead list them at the end of this file, one on each line,
    # with "__END__" before the list.

    require "open-uri"

    url = ARGV.first

    class GAPDownloader

      class RuntimeError < StandardError; end

      # You can lower this if you get ridiculously high-res images otherwise.
      MAX_ZOOM_ALLOWED = 10

      def initialize(url)
        ensure_image_magick!
        @url = url
        verify_url!
      end

      def download
        get_image_id
        determine_zoom
        get_tiles
        stitch_tiles
        trim
        set_metadata
        done
      end

      private

      def ensure_image_magick!
        if `which montage`.empty?
          error "You must have ImageMagick installed. Could not find 'montage' in your PATH."
        end
      end

      def verify_url!
        unless @url.to_s.match(%r{\Ahttp://www\.googleartproject\.com/})
          error "Please specify a Google Art Project URL."
        end
      end

      def get_image_id
        @html = open(@url).read
        @iid = @html[/data-thumbnail="(.+?)"/, 1]
        unless @iid
          error "Couldn't find an image at this URL, sorry!"
        end
      end

      def determine_zoom
        0.upto(MAX_ZOOM_ALLOWED) do |zoom|
          open(tile_url(0, 0, zoom))
          @max_zoom = zoom
        end
      rescue OpenURI::HTTPError => e
        raise unless e.message == "404 Not Found"
      end

      def get_tiles
        @max_x = 999
        @max_y = 999

        0.upto(@max_y) do |y|
          0.upto(@max_x) do |x|
            url = tile_url(x, y, @max_zoom)
            begin
              data = open(url) # Raises at 404.
              puts "Getting #{url}..."
              path = tile_path(x, y)
              File.open(path, "wb") { |f| f.print data.read }
            rescue OpenURI::HTTPError => e
              raise unless e.message == "404 Not Found"
              if y.zero?
                # Found max x. Start on next row.
                @max_x = x - 1
                break
              else
                # Found max y. We have all tiles, so bail.
                @max_y = y - 1
                return
              end
            end
          end
        end
      end

      def stitch_tiles
        # `montage` is ImageMagick.
        # We first stitch together the tiles of each row, then stitch all rows.
        # Stitching the full image all at once can get extremely inefficient for large images.
        tiles_wide = @max_x + 1
        tiles_high = @max_y + 1

        puts "Stitching #{tiles_wide} x #{tiles_high} = #{tiles_wide*tiles_high} tiles..."

        0.upto(@max_y) do |y|
          tiles = (0..@max_x).map { |x| tile_path(x, y) }.join(' ')
          `montage #{tiles} -geometry +0+0 -tile #{tiles_wide}x1 #{row_path(y)}`
        end

        tiles = (0..@max_y).map { |y| row_path(y) }.join(' ')
        `montage #{tiles} -geometry +0+0 -tile 1x#{tiles_high} #{full_path}`
      end

      def trim
        # Trim the black blocks that may appear on right and bottom.
        # We first add a black border to ensure no other color is trimmed, as described on
        # http://www.imagemagick.org/Usage/crop/#trim
        `convert #{full_path} -bordercolor black -border 1x1 -trim #{full_path}`
      end

      def set_metadata
        unless `which xattr`.empty?
          # Set "Downloaded from" Finder metadata, like Safari does.
          system('xattr', '-w', 'com.apple.metadata:kMDItemWhereFroms', @url, full_path)
        end
      end

      def done
        puts "Done: #{full_path}"
        # Reveal in Finder if on OS X.
        `which osascript && osascript -e 'tell app "Finder"' -e 'reveal POSIX file "#{full_path}"' -e 'activate' -e 'end'`
      end


      def error(message)
        raise GAPDownloader::RuntimeError, "#{message} (#{@url})"
      end


      def tile_url(x, y, zoom)
        # The subdomain can seemingly be anything from lh3 to lh6.
        "http://lh5.ggpht.com/#{@iid}=x#{x}-y#{y}-z#{zoom}"
      end

      def tile_path(x, y)
        "/tmp/gap-#{@iid}-tile-#{x}-#{y}.jpg"
      end

      def row_path(y)
        "/tmp/gap-#{@iid}-row-#{@max_zoom}-#{y}.jpg"
      end

      def full_path
        "/tmp/gap-#{@iid}-full.jpg"
      end

    end

    if __FILE__ == $0
      urls = ARGV.any? ? ARGV : (defined?(DATA) ? DATA.read.strip.split("\n") : [])
      puts "Error: No URLs given!" if urls.empty?

      urls.each do |url|
        begin
          GAPDownloader.new(url).download
        rescue GAPDownloader::RuntimeError => e
          puts "Error: #{e.message}"
        end
      end
    end
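
The grid discovery in `get_tiles` never asks the server for the image dimensions. It walks the top row until a request fails to learn the width, then walks down until the first tile of a row fails to learn the height. The sketch below isolates that logic with the HTTP request replaced by a block, so it runs offline. `probe_grid` is a hypothetical helper, not part of the gist, and the 3 x 2 grid is made up.

```ruby
# Hypothetical helper, not part of the gist. The block answers "does tile
# (x, y) exist?" and stands in for a request that returns 404 past the edge.
# Returns [max_x, max_y], the largest valid tile indices.
def probe_grid(limit = 999)
  max_x = limit
  max_y = limit
  0.upto(limit) do |y|
    0.upto(max_x) do |x|
      next if yield(x, y)
      if y.zero?
        # First miss on the top row: the previous column was the last one.
        max_x = x - 1
        break
      else
        # First miss on a later row: the previous row was the last one.
        max_y = y - 1
        return [max_x, max_y]
      end
    end
  end
  [max_x, max_y]
end

# A pretend image that is 3 tiles wide and 2 tiles high.
max_x, max_y = probe_grid { |x, y| x < 3 && y < 2 }
puts "#{max_x + 1} x #{max_y + 1} tiles"
```

For this grid the probe makes two deliberate misses, one at the end of the top row and one at the start of the row below the image, which matches the two 404 responses the gist expects per download.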