@edsono
Forked from ttscoff/rtftomarkdown.rb
Created February 24, 2013 15:49
#!/usr/bin/ruby
# Uses textutil, available on Mac only (installed by default)
# Usage: rtftomarkdown.rb FILENAME.rtf
# Outputs to STDOUT
if ARGV.empty?
  # Report usage errors on STDERR so they never mix with converted output
  warn "#{__FILE__} expects an input file (RTF or DOC) as an argument"
  exit 1
end
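# Strips empty elements such as <p class="p1"></p>, repeating until none
# remain so that wrappers emptied by an earlier pass are removed too.
def remove_empty(input)
  pattern = /<(\w+)( class=".*?")?>\s*<\/\1>/
  input = input.gsub(pattern, '') while input =~ pattern
  input.strip
end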
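ARGV.each do |infile|
  file = infile.sub(/\/$/, '')
  path = File.expand_path(file)
  if File.exist?(path)
    # Pass the arguments as an array so the filename never goes through a shell
    # (safe for names containing spaces or shell metacharacters)
    input = IO.popen(['/usr/bin/textutil', '-convert', 'html', '-stdout', path], &:read).strip
    # keep only the contents of <body>
    input.gsub!(/.*?<body>(.*?)<\/body>.*/m, "\\1")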
    # remove span/br tags, unnecessary
    input.gsub!(/<br>/, '')
    input.gsub!(/<\/?span( class=".*?")?>/, '')
    # substitute headers
    input.gsub!(/<p class="p1"><b>(.+?)<\/b><\/p>/, '# \\1')
    input.gsub!(/<p class="p2"><b>(.+?)<\/b><\/p>/, '## \\1')
    input.gsub!(/<p class="p3"><b>(.+?)<\/b><\/p>/, '## \\1')
    input.gsub!(/<p class="p4"><b>(.+?)<\/b><\/p>/, '### \\1')
    input.gsub!(/<p class="p5"><b>(.+?)<\/b><\/p>/, '### \\1')
    input = input.split("\n").map { |line| remove_empty(line) }.join("\n")
    # remove paragraph tags (class p5 only)
    input.gsub!(/<p class="p5">(.*?)<\/p>/, '\\1')
    # emphasis
    input.gsub!(/<\/?b>/, '**')
    input.gsub!(/<\/?i>/, '*')
    input = input.split("\n").map(&:strip).join("\n")
    puts input
  else
    # STDERR keeps the error out of any redirected Markdown output
    warn "File not found: #{file}"
  end
end
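
The substitution rules above can be tried without textutil by feeding them an HTML fragment directly. The following is a minimal sketch, not part of the original script: `html_fragment_to_markdown` and `HEADER_LEVELS` are hypothetical names, and the `p1` to `p5` class names are assumed to match whatever textutil emitted for a given document (they vary with the styles in the source file).

```ruby
# Hypothetical helper gathering the script's tag-to-Markdown rules in one place.
# Assumption: paragraph classes p1..p5 map to header levels as in the script above.
HEADER_LEVELS = {
  'p1' => '#', 'p2' => '##', 'p3' => '##', 'p4' => '###', 'p5' => '###'
}.freeze

def html_fragment_to_markdown(html)
  # Drop <br> and <span> tags, which carry no Markdown meaning here
  out = html.gsub(/<br>/, '').gsub(/<\/?span( class=".*?")?>/, '')
  # A paragraph that is entirely bold becomes a header of the mapped level
  out = out.gsub(/<p class="(p[1-5])"><b>(.+?)<\/b><\/p>/) { "#{HEADER_LEVELS[$1]} #{$2}" }
  # Unwrap plain p5 paragraphs, then convert bold and italic to emphasis markers
  out = out.gsub(/<p class="p5">(.*?)<\/p>/, '\\1')
  out = out.gsub(/<\/?b>/, '**').gsub(/<\/?i>/, '*')
  out.split("\n").map(&:strip).join("\n")
end

puts html_fragment_to_markdown(%(<p class="p1"><b>Title</b></p>\n<p class="p5">Some <i>text</i></p>))
# prints "# Title" followed by "Some *text*"
```

Keeping the level mapping in a hash makes it easier to adjust when a document's styles produce different class names than the ones assumed here.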