@emad-elsaid
Created March 23, 2014 13:06
PDF to text converter using Ruby
#!/usr/bin/env ruby
require 'pdf/reader' # gem install pdf-reader

# credits to:
# https://github.com/yob/pdf-reader/blob/master/examples/text.rb
# usage example:
# ruby pdf2txt.rb /path-to-file/file1.pdf [/path-to-file/file2.pdf..]
ARGV.each do |filename|
  PDF::Reader.open(filename) do |reader|
    puts "Converting : #{filename}"
    pageno = 0
    txt = reader.pages.map do |page|
      pageno += 1
      begin
        print "Converting Page #{pageno}/#{reader.page_count}\r"
        page.text
      rescue
        puts "Page #{pageno}/#{reader.page_count} Failed to convert"
        ''
      end
    end # pages map
    puts "\nWriting text to disk"
    File.write filename+'.txt', txt.join("\n")
  end # reader
end # each
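
The per-page error handling in the script can be pulled out into a small helper, which makes it possible to exercise without a real PDF. This is only a sketch under stated assumptions: `extract_text` is a hypothetical helper (it is not part of the pdf-reader gem), and `FakePage` is an illustrative stand-in for a page object that responds to `text`.

```ruby
# Hypothetical helper, not part of the pdf-reader gem: collects text from
# any page-like objects that respond to #text, substituting an empty
# string for pages that raise during extraction.
def extract_text(pages)
  failed = []
  texts = pages.each_with_index.map do |page, index|
    begin
      page.text
    rescue StandardError
      failed << index + 1 # record the 1-based number of the failed page
      ''
    end
  end
  [texts.join("\n"), failed]
end

# Stand-in for a real page object, used only for illustration.
FakePage = Struct.new(:content) do
  def text
    raise ArgumentError, 'unreadable page' if content.nil?
    content
  end
end

pages = [FakePage.new('first'), FakePage.new(nil), FakePage.new('third')]
text, failed = extract_text(pages)
puts text
puts "Failed pages: #{failed.inspect}"
```

Returning the list of failed page numbers alongside the text lets the caller decide how to report them, rather than printing from inside the loop as the script does.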
@rjattrill

Thanks. Nice work.

@Ahmad-Hassan0222

Can you give us complete details about this?