Revision: 11163
at January 26, 2009 23:18 by inkdeep


Initial Code
require 'rubygems'
require 'pdf/reader'

class TotalPages

  def count(dir)
    @conv_directory = dir
    # Uncomment the line below to print the directory argument as a test,
    # mostly to make sure that passing '.' gets the current dir.
    # puts @conv_directory
    recurse_and_count
  end
  
  def directory
    @conv_directory
  end

  def directory_tree
    Dir["#{directory}/**/*"]
  end
  
  def recurse_and_count
    total = 0
    directory_tree.each do |item|
      # Only regular files with a .pdf extension (any case) are counted.
      next unless File.file?(item) && File.extname(item).downcase == ".pdf"
      begin
        receiver = PageReceiver.new
        PDF::Reader.file(item, receiver, :pages => false)
        total += receiver.pages
      rescue StandardError
        # Print the offending path and keep going instead of aborting the run.
        p item
      end
    end
    total
  end
  
end

# receiver = PageReceiver.new
# pdf = PDF::Reader.file("somefile.pdf", receiver, :pages => false)
class PageReceiver
  attr_accessor :pages

  # Called when page parsing ends
  def page_count(arg)
    @pages = arg
  end
end

Initial URL


Initial Description
I had a directory tree with around 4000 PDF files and I needed a total page count, so I semi-rolled this. I swiped the counter code from the gem README:
http://github.com/yob/pdf-reader/tree/master

It could be more self-contained. As is, I run it from irb:

`>> require 'total_pages'`  
`>> pagetotal = TotalPages.new`  
`>> pagetotal.count('/my/pdf/directory')`  
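One way it could be made more self-contained is a single method that finds the PDFs and sums a per-file count in one call. This is only a rough sketch: `total_pdf_pages` is a hypothetical name, not part of the snippet or of pdf-reader, and the block is assumed to return the page count for one path.

```ruby
# Hypothetical helper, not part of the original snippet or of pdf-reader.
# Sums whatever the block returns for each PDF found under dir.
def total_pdf_pages(dir)
  pdfs = Dir[File.join(dir, '**', '*')].select do |path|
    # Regular files only, matching the extension case-insensitively.
    File.file?(path) && File.extname(path).downcase == '.pdf'
  end
  pdfs.inject(0) { |sum, path| sum + yield(path) }
end

# Assumed usage with the PageReceiver class from the snippet:
# total_pdf_pages('/my/pdf/directory') do |path|
#   receiver = PageReceiver.new
#   PDF::Reader.file(path, receiver, :pages => false)
#   receiver.pages
# end
```

Passing the counting step as a block keeps the directory walk separate from the pdf-reader call, so the walk can be checked on its own.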

I added rescue to print info on a file if it fails to open or doesn't conform to the PDF specification and causes pdf-reader to raise an error. Without this the script will quit, and that sucks when you're trying to count pages in thousands of files.
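The rescue-and-keep-going idea can be sketched on its own, without the gem. Here `count_with_failures` is a hypothetical helper, and the block stands in for the pdf-reader call. Instead of only printing a bad file, it collects the failures so they can be reviewed after the run.

```ruby
# Hypothetical helper: the block returns a page count for one path.
# Any StandardError is recorded against that path and counting continues.
def count_with_failures(paths)
  total = 0
  failures = []
  paths.each do |path|
    begin
      total += yield(path)
    rescue StandardError => e
      failures << [path, e.class.name]
    end
  end
  [total, failures]
end

# Stand-in counter: pretends one file is broken and the other has 2 pages.
total, failures = count_with_failures(%w[ok.pdf broken.pdf]) do |path|
  raise ArgumentError, 'not a PDF' if path == 'broken.pdf'
  2
end
puts total
failures.each { |path, error| puts "#{path}: #{error}" }
```

With the stand-in counter above, the good file still contributes its pages and the broken one ends up in `failures` rather than stopping the loop.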

Initial Title
Recurse directory tree and count pages in all pdf files using pdf-reader gem

Initial Tags
ruby

Initial Language
Ruby