Saturday, August 25, 2007

Trials and tribulations of class variables in Ruby.

I guess I missed the discussion earlier this year about Ruby and class variables. There were a lot of posts about the problems caused by using class variables incorrectly, many of which called for the use of class instance variables instead.

I recently fell into the class variable trap described in the posts above. Here is some code simulating my problem.
class Person
  @@count = 0
  def initialize(name)
    @name = name
    @@count += 1
  end
end

class Customer < Person
  @@count = 0
  def self.count
    @@count
  end
end

a = Person.new("Jim Bob")
b = Person.new("Jason Yates")
c = Customer.new("Albert Einstein")

puts Customer.count
3

You would expect to get '1' back from the 'Customer.count' method; that is the more obvious behavior, the one with the least surprise. The issue, as you may have already noticed, is that a class variable is shared between a class and all of its subclasses, so the '@@count = 0' in Customer reassigns Person's variable instead of creating a new one. Speaking from personal experience, this behavior can lead to very confusing and hard-to-diagnose bugs.
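
To make the sharing visible, here is a minimal sketch (the same classes as above, with a reader added on Person purely for illustration) that asks both classes for the count:

```ruby
class Person
  @@count = 0

  def initialize(name)
    @name = name
    @@count += 1
  end

  # Reader added for illustration only.
  def self.count
    @@count
  end
end

class Customer < Person
  # This does not create a second variable; it resets Person's @@count.
  @@count = 0
end

Person.new("Jim Bob")
Customer.new("Albert Einstein")

puts Person.count   # 2
puts Customer.count # 2, the very same variable
```

Both calls report the same number because there is only one @@count for the whole hierarchy.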

The problem above is easily solved by the use of class instance variables.
class Person
  class << self; attr_accessor :count; end
  def initialize(name)
    @name = name
    self.class.count ||= 0
    self.class.count += 1
  end
end

class Customer < Person
end

a = Person.new("Jim Bob")

b = Person.new("Jason Yates")
c = Customer.new("Albert Einstein")

puts Customer.count
1

As you can see, this gives the expected behavior: each class keeps its own count.
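
One wrinkle with the accessor approach is that a class's count stays nil until its first instance is created. A minimal sketch of a variation, assuming a hypothetical increment_count helper that is not part of the code above, moves the default into the reader:

```ruby
class Person
  class << self
    # Class instance variable: each class in the hierarchy has its own @count.
    def count
      @count ||= 0
    end

    # Hypothetical helper name, used here only for illustration.
    def increment_count
      @count = count + 1
    end
  end

  def initialize(name)
    @name = name
    self.class.increment_count
  end
end

class Customer < Person
end

Person.new("Jim Bob")
Person.new("Jason Yates")
Customer.new("Albert Einstein")

puts Person.count   # 2
puts Customer.count # 1
```

With this shape, a subclass that has never been instantiated reports 0 rather than nil.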

This may all become moot fairly soon anyway. In Ruby 1.9, class variables are slated to become read-only for subclasses. If that change ships, even the first example should work correctly.

Monday, March 5, 2007

PDF Templates via Rails

Update:

From the idea in this post I created a RubyGem called pdf-stamper.

gem install pdf-stamper

For more information try Rubyforge or GitHub.

http://rubyforge.org/projects/pdf-stamper/
http://github.com/jaywhy/pdf-stamper/tree/master

Recently, for a project, I needed a PDF generator that took existing PDFs as templates and filled in data. My one simple requirement was that changing the PDF's look and feel wouldn't require any coding. How hard could that be? Famous last words.

The Problem

The majority of PDF generators out there are exactly that: generators. They simply create new PDFs from programming code. They don't edit existing PDF files, let alone fill in data. So I looked into a couple of other solutions. First I checked out image-based solutions such as ImageMagick. I could of course edit existing image files and add data. The problem is that the added data would have to be positioned, so if the template image changed, the positions in my code would have to change too. So that was out.

I tried using PostScript files saved from Adobe Illustrator. I created a template and placed text like <% replace_me %> in it. That should work, right? Wrong. For whatever reason, Adobe Illustrator likes to create HUGE files, 100,000 lines long in some cases. It also splits up text arbitrarily. So "<% replace_me %>" became "<% r", then 20 lines of positioning and font information, then "eplac", then another 20 lines of positioning and font information, and so on, making a search and replace impossible.

iText to the Rescue

iText is an excellent Java PDF library. In fact, I believe that, other than its C# counterpart, it is the only PDF library that could do what I wanted. iText can take existing PDFs and manipulate them. It can also take blank PDF forms and fill in the data, similar to taking an HTML form and setting the value attribute of each form field tag. This is exactly what I wanted. Although it is an ad hoc, roundabout solution, it does work.

Creating PDF forms is pretty easy, too. The downside is that only one product can create them: Adobe LiveCycle Designer, which comes with Adobe Acrobat Professional.

Solution

So first I needed to create a wrapper around iText using the excellent Ruby Java Bridge (Rjb).
require 'rjb'
require 'tmpdir' # provides Dir.tmpdir, used by tmpfile below
Rjb::load('lib/itext-1.4.8.jar')

class PDFStamper
  attr_accessor :writer

  def initialize( template = "proposal_template.pdf" )
    filestream   = Rjb::import('java.io.FileOutputStream')
    acrofields   = Rjb::import('com.lowagie.text.pdf.AcroFields')
    pdfreader    = Rjb::import('com.lowagie.text.pdf.PdfReader')
    pdfstamper   = Rjb::import('com.lowagie.text.pdf.PdfStamper')

    reader = pdfreader.new( template )
    @stamp = pdfstamper.new( reader, filestream.new( tmpfile() ) )
    @form = @stamp.getAcroFields()
  end

  def set( key, value )
    @form.setField( key, value.to_s )
  end

  def fill
    @stamp.setFormFlattening(true)
    @stamp.close
  end

  def tmpfile
    return @tmpfile unless @tmpfile.nil?
    @tmpfile = File.join( Dir::tmpdir, make_tmpname )
  end

  private

  def make_tmpname
    return 'proposal-' + rand(10000).to_s + '.pdf'
  end
end
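
One thing to watch in the wrapper is that rand(10000) can hand two requests the same temporary file name. A minimal sketch of a less collision-prone name, assuming Ruby's standard securerandom library is available (unique_pdf_path is a hypothetical helper, not part of the class above):

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical replacement for make_tmpname/tmpfile: builds a path in the
# system temp directory with 16 random hex characters in the file name.
def unique_pdf_path(prefix = 'proposal')
  File.join(Dir.tmpdir, "#{prefix}-#{SecureRandom.hex(8)}.pdf")
end

puts unique_pdf_path
```

The path is only generated here, not created, so the caller still decides when the file is written and cleaned up.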

Caveats

First off, the Java environment variables need to be set correctly before this will work.
export LD_LIBRARY_PATH=/usr/java/jdk1.6.0/jre/lib/i386/:/usr/java/jdk1.6.0/jre/lib/i386/client/:./
export JAVA_HOME=/usr/java/jdk1.6.0/
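
Since a missing variable tends to show up as an obscure load failure, it can help to check before loading Rjb. This is a minimal sketch, assuming the two variables exported above are the ones that matter on your system (missing_java_env is a hypothetical helper name):

```ruby
# The variable names come from the exports above; adjust for your platform.
REQUIRED_JAVA_ENV = %w[JAVA_HOME LD_LIBRARY_PATH].freeze

# Returns the names of required variables that are unset or blank.
def missing_java_env(env = ENV)
  REQUIRED_JAVA_ENV.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_java_env
warn "Missing Java environment variables: #{missing.join(', ')}" unless missing.empty?
```

Running this at the top of the wrapper file, before Rjb::load, would surface the problem with a readable message.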

You can set these variables on the command line and start mongrel manually with "mongrel_rails start", which works fine. In production, though, this isn't really a good solution.

I ended up using the mongrel_cluster init.d script that comes with mongrel. Documentation is available here. I simply placed the export commands at the top of the script.

Another issue I hit occurs when Java starts. Java checks the total available system memory and then proceeds to steal a good portion of it. This isn't a problem on a dedicated server. A virtual server, on the other hand, is allocated only a portion of the host's memory. So if the machine you are on has 4 GB of memory, Java thinks it has 4 GB to play with, not the 256 MB allocated to your virtual server.

This caused a weird issue where one mongrel process in my cluster would work and another wouldn't. Because each mongrel instance starts its own Java process, the first one would steal all the available memory, and the second couldn't even start because no memory was left.

Making matters worse, Rails or Mongrel (I'm not sure which) would hide this memory error. I didn't figure it out until I created a test script that forked, with each fork loading the iText jar. The test showed the error coming from Java.
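
I no longer have that script, but a minimal sketch of the idea looks something like this. The run_in_forks helper is a hypothetical name, and the demo block simulates a failure instead of loading Java so it can run anywhere that supports fork:

```ruby
# Runs the given block in `count` child processes and returns, for each
# child, whether it exited successfully.
def run_in_forks(count)
  pids = Array.new(count) do |index|
    fork do
      begin
        yield index
        exit!(0)
      rescue StandardError, LoadError => e
        # Print the error from the child so nothing swallows it.
        warn "child #{index} failed: #{e.class}: #{e.message}"
        exit!(1)
      end
    end
  end
  pids.map { |pid| Process.wait2(pid).last.success? }
end

# Demo with a simulated failure in the second child.
results = run_in_forks(2) { |index| raise "simulated failure" if index == 1 }
p results # [true, false]
```

For the real test, the block would instead do the require 'rjb' and Rjb::load calls from the wrapper. A child that crashes outright, rather than raising, still shows up as false because its exit status is not successful.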

To fix this, I set the _JAVA_OPTIONS environment variable. Its options are passed to Java as it loads, and they limit the amount of RAM each Java instance can eat up (-Xms sets the initial heap size, -Xmx the maximum). Just place this next to your other Java environment variables inside your mongrel init.d script.
export _JAVA_OPTIONS='-Xms16m -Xmx32m'

You may have to fudge these numbers a little for your particular environment. Or, if you are using a dedicated server, don't worry about it.

Limitations

For my particular needs, I only needed text placeholders in the template. However, I believe that with LiveCycle Designer you can also place image placeholders and table-based data, then use iText to fill them in. Don't take my word for it, though.

Wednesday, August 16, 2006

Who needs compliance, we have "improvements"?

Microsoft's Chris Wilson, the Group Program Manager for IE, was interviewed by ZDNet today and, with great marketing finesse, managed to completely dodge the whole standards compliance issue. Instead, Chris talks about “standard improvements” or “improvements in our standards support in IE7”. I applaud the IE team for the IE6 bug fixes found in IE7. However, what about actual standards compliance? According to the IEBlog, the IE7 team has added support for:
  • HTML 4.01 ABBR tag
  • Improved (though not yet perfect) <object> fallback
  • CSS 2.1 Selector support (child, adjacent, attribute, first-child etc.)
  • CSS 2.1 Fixed positioning
  • Alpha channel in PNG images
  • Fix :hover on all elements
  • Background-attachment: fixed on all elements not just body
Most of these added features still have issues or caveats. Some slack, of course, should be given, since IE7 is still in beta. Overall, though, even with these additions, standards compliance hasn't changed much in IE7. Web Devout's analysis of both IE6 and IE7 confirms this. First, let me say a semi-legitimate argument could be made that Web Devout's numbers are biased towards IE. However, I still think the analysis is fairly effective, especially when comparing IE with itself.
                IE 6   IE 7   Firefox 1.5   Opera 8.5   Opera 9
HTML / XHTML    73%    73%    90%           85%         85%
CSS 2.1         51%    55%    93%           92%         95%
CSS 3 changes   10%    13%    27%           8%          22%
DOM             50%    51%    79%           78%         84%
ECMAScript      99%    99%    Y             Y           Y

As you can see, there really isn't much difference between IE6 and IE7 with regard to standards compliance. Most importantly, in my opinion, IE7 still has these gaps:
  • No DOM Level 2 Events (Netscape Communicator had this in 2000!)
  • DOM attributes are still broken
  • No JavaScript 1.6 support
  • No CSS :focus
  • No SVG support
However, my list is quite short. There are many lists out there better than mine.

I wouldn't go so far as to say IE7 is “just a bug release”. It is fairly close, however. IE7 is still years behind most modern web browsers, and it hasn't even been released.