I had a directory tree with around 4000 PDF files and needed a total page count, so I semi-rolled this. I swiped the counter code from the pdf-reader gem README: http://github.com/yob/pdf-reader/tree/master
It could be more self-contained. As is, I run it from irb:
>> require 'total_pages'
>> pagetotal = TotalPages.new
>> pagetotal.count('.')
I added a rescue that prints the file name if a file fails to open, or doesn't conform to the PDF specification and causes pdf-reader to raise an error. Without it the script quits at the first bad file, which sucks when you're trying to count pages in thousands of files.
require 'rubygems'
require 'pdf/reader'

class TotalPages
  def count(dir)
    @conv_directory = dir
    ## I output the directory argument as a test with the below line,
    ## mostly to make sure that passing '.' gets the current dir
    # puts @conv_directory
    recurse_and_count # assumed: the rest of this method is missing from the listing
  end

  def directory
    @conv_directory # assumed body: missing from the listing
  end

  def directory_tree
    # assumed body (missing from the listing): every path below the directory
    Dir.glob(File.join(directory, '**', '*'))
  end

  def recurse_and_count
    total = 0
    directory_tree.each do |item|
      case File.stat(item).ftype
      when 'file'
        if File.extname(item).downcase == ".pdf"
          begin
            receiver = PageReceiver.new
            PDF::Reader.file(item, receiver, :pages => false)
            total += receiver.pages
          rescue StandardError
            p item # print the offending file and keep counting
          end
        end
      end
    end
    total
  end
end

# receiver = PageReceiver.new
# pdf = PDF::Reader.file("somefile.pdf", receiver, :pages => false)
class PageReceiver
  attr_accessor :pages

  # Called when page parsing ends
  def page_count(arg)
    @pages = arg
  end
end
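
If you want to try the rescue-and-keep-going idea without any PDFs on hand, here is a minimal sketch using only the standard library. The helper name count_pages_in_tree is made up for illustration and is not part of pdf-reader. The page counter is passed in as a block, so in real use the block would wrap whatever pdf-reader call you rely on. It also collects the failures in a list instead of just printing them, which is my own variation on the script above.

```ruby
require 'find'

# Hypothetical helper (not part of pdf-reader): walks dir, asks the given
# block for each PDF's page count, and records failures instead of quitting.
def count_pages_in_tree(dir)
  total = 0
  failed = []
  Find.find(dir) do |path|
    # Only regular files with a .pdf extension (any case) are counted.
    next unless File.file?(path) && File.extname(path).downcase == '.pdf'
    begin
      total += yield(path)
    rescue StandardError => e
      # Remember the bad file and carry on with the rest of the tree.
      failed << [path, e.class.name]
    end
  end
  [total, failed]
end
```

Call it as total, failed = count_pages_in_tree('.') { |path| ... } with a block that returns the page count for one file. Anything the block raises lands in failed as a [path, error class] pair, so you can go back to the broken files once the run is done.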