Background Processing in Rails

by Erik Andrejko

Issue: Winter Jam

published in December 2009

Erik Andrejko is a web developer currently living in Madison, WI where he enjoys sailing keelboats. He writes about Ruby on Rails, web design and user experience at http://railsillustrated.com and can be reached at erik /at/ railsillustrated /dot/ com or on Twitter as 'eandrejko'.

You probably want your Rails application to be fast, and usually the faster the better. Optimization is, in general, pretty hard. Often it is enough for your application to appear fast, even if it is only a trick. One such trick is background processing.

When a client issues a request, the correct controller action is invoked and the view is rendered. Until this process is complete, the user has no choice but to wait. In many cases the time taken to respond to a request is dominated by one particular long running process that is invoked in the controller action or the view.

Figure: Long Running Processes

If the long running process needs to be run, but doesn't need to be run in order to send the response to the client, you can use a standard trick: run the long running process after, or in parallel with, sending the response back to the client.  By moving the long running process to the background you can improve the user experience of your application by creating a faster response time without actually speeding up any of the code.

Blocking the Rails application to perform a long running action presents a few problems. While a Rails application process is blocked by a long running request, it cannot do anything else. For example, if a request takes 500ms to answer instead of 50ms, the 9 other requests that this Rails application instance could have handled in that time must be handled by another process. From the user's perspective the problem is different but just as bad. The application feels slow, and users may exacerbate the problem by issuing multiple requests.
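The arithmetic behind that example can be written out as a small sketch. It assumes a single process that handles one request at a time; the helper names `requests_per_second` and `displaced_requests` are made up for illustration.

```ruby
# Back-of-the-envelope figures for one Rails process that handles a single
# request at a time. The 50ms and 500ms values come from the text above.

# How many requests one process can answer per second.
def requests_per_second(response_ms)
  1000 / response_ms
end

# How many fast requests are pushed aside while one slow request runs.
def displaced_requests(slow_ms, fast_ms)
  (slow_ms / fast_ms) - 1
end

puts "At 50ms per request:  #{requests_per_second(50)} requests/second"
puts "At 500ms per request: #{requests_per_second(500)} requests/second"
puts "Displaced by one slow request: #{displaced_requests(500, 50)}"
```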

We'll show how to run code in the background in a moment. First let's take a look at the different architectures available for moving part of your Rails application to the background.

Time for Divorce

Background processing is most applicable when there is a deferrable long running method that is slowing down the request response cycle. Since the method that is slowing down the request response cycle is unavoidable but also deferrable, the only alternative is to divorce this method from the request response cycle and to run this method in a background process.  There are two primary ways to run this method in a different process: spawning and queuing.

Spawning Queues and Workers

The first and simplest method is the spawner method.

Figure: Spawning

Using the spawner method, a duplicate copy of the Rails application is created (or forked) and the duplicate copy runs the long running method. This is fairly straightforward to implement, even without using any plugins, but there are two big disadvantages to this method. First, it is slow, since an entire Rails application has to be started to handle each background process. Second, it uses a lot of memory for exactly the same reason.
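To make the idea concrete, here is a minimal sketch of spawning in plain Ruby, outside of Rails. It assumes a Unix-like system where `fork` is available, and the helper name `spawn_in_background` is hypothetical rather than part of any plugin.

```ruby
# Runs the given block in a forked child process and returns the child's
# pid right away, so the caller does not wait for the work to finish.
def spawn_in_background
  fork do
    yield
    # Leave the child without running at_exit handlers copied from the parent.
    exit!(0)
  end
end

# The parent continues immediately while the child does the slow work.
pid = spawn_in_background { sleep 0.1 } # stand-in for a long running method
puts "Responding to the client while child #{pid} works"

# Wait here only so the example exits cleanly; a web process would more
# likely call Process.detach(pid) and move on.
Process.wait(pid)
```

In a real Rails process the child starts as a full copy of the application, which is where the memory cost described above comes from.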

One approach to avoid the problems with spawning a new Rails process is to replace spawning with threads. This is architecturally identical to the spawner approach: a new thread is created to handle the background process and the original thread sends the response back to the client. There are, however, a few problems with this approach. Not all of Rails is thread-safe yet, and working around the parts that are not introduces some avoidable complexity. In addition, concurrency is considered a 'hard problem' and can introduce complex bugs that are very difficult to fix. Finally, if the Rails process is taken down, all of the threads will be lost with it and there will be no way of knowing which of these threads have completed their task.
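The threaded variant looks like this as a minimal sketch in plain Ruby. The helper `run_in_thread` is a hypothetical name, and the `sleep` stands in for the long running method.

```ruby
# Starts the block on a new thread and returns the Thread object at once.
def run_in_thread(&work)
  Thread.new(&work)
end

results = []
thread = run_in_thread do
  sleep 0.1               # stand-in for a long running method
  results << :email_sent
end

puts "Response sent while the thread is still working"
thread.join               # wait here only so the example can show the result
puts results.inspect
```

Without the `join`, the process could exit before the thread finishes, which is exactly the lost-work problem described above.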

All of the problems with spawning another process and using threads can be avoided by using a work queue.

Figure: Queueing

When using a work queue to perform background processing, the Rails application places each job in a designated queue. A separate worker application takes jobs from the queue and performs them. Using a queue has a few advantages over the spawning/threading method. It is possible to have the worker, or several workers, running on a separate machine. Also, the queue can be maintained separately from the Rails process, so if the Rails process is taken down or restarted the state of each job will not be affected.
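The shape of this architecture can be sketched with Ruby's built-in thread-safe `Queue`. This is only an illustration: the queue and the worker live inside one process here, whereas a real setup uses a separate queue server and worker process. The names `enqueue` and `work_off` and the `:stop` job are assumptions made for the example.

```ruby
# A job is just data describing the work: a name plus options.
def enqueue(queue, name, options = {})
  queue << { :name => name, :options => options }
end

# Worker loop: take jobs off the queue until a :stop job arrives.
def work_off(queue, handlers, log)
  loop do
    job = queue.pop                  # blocks until a job is available
    break if job[:name] == :stop
    log << handlers.fetch(job[:name]).call(job[:options])
  end
  log
end

queue = Queue.new
handlers = { :send_email => lambda { |opts| "emailed user #{opts[:id]}" } }

# The thread plays the role of the separate worker application.
worker = Thread.new { work_off(queue, handlers, []) }
enqueue(queue, :send_email, :id => 42)   # returns immediately
enqueue(queue, :stop)
puts worker.value.inspect
```

Because a job is plain data, it can sit in the queue while the application restarts and be picked up by a worker on another machine.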

Setting up Workling

There is a flexible Rails plugin for background processing called `workling`. Workling supports both the spawner and queuing approaches to background processing. To install the `workling` plugin use

script/plugin install git://github.com/purzelrakete/workling.git

Since we want to use Workling in a queueing architecture, we will need to install a queue for Workling to use. A good choice is the Starling queue. Starling is a persistent queue server that speaks the memcached protocol, and Workling will use it by default if it is installed. To install the Starling queue:

gem sources -a http://gems.github.com/

sudo gem install starling-starling

mkdir /var/spool/starling

Now there are two processes that must be running in order for Workling to work correctly: the Starling queue and the Workling client. The Starling queue can be started with the command:

sudo starling -d -p 22122

This will start `starling` on port 22122. By default Workling will look for Starling on that port in development. However, in production Workling will look for Starling by default on port 15151.  In production either start Starling on port 15151, or change the Workling configuration in the config/workling.yml file. 

We also have to start the Workling client, which is the worker process:

script/workling_client start

It is probably a good idea to ensure that these processes are running all the time by using a tool such as `monit`. If these processes are not running, the Rails application may fail to start and may also throw exceptions when a background process is initiated.

Sending Emails in the Background

As an example let's implement background email delivery using Workling. This is a good example because background delivery of email doesn't need to provide feedback to the user in the web browser about the status of the background process. In general when performing background processing, the current progress of the background process should be communicated in a reasonable way to the client. Exactly how to do this is usually application specific and outside the scope of this article.

First, create a worker class that will be responsible for delivery of the email in the background. The workers are stored in the app/workers directory. To create the EmailWorker, add the file email_worker.rb to the app/workers directory.

class EmailWorker < Workling::Base
  # Runs in the worker process; options is the hash passed by the caller.
  def send_activation_email(options)
    user = User.find(options[:id])
    user.deliver_activation_email
  end
end

In this example, the EmailWorker has a single method that will be called from the Rails application. The EmailWorker has access to all of the models, so it can find the correct User instance and then call deliver_activation_email on that instance.

Calling the Worker

Inside of the controller or model in the Rails application we can call any EmailWorker method asynchronously by prefixing the method name with asynch_. For example, if we keep our business logic inside the models, we may call the send_activation_email method from within a User instance by:

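begin
  # Enqueue the job; this returns as soon as the job is on the queue.
  EmailWorker.asynch_send_activation_email(:id => self.id)
rescue => e
  # The queue is unavailable: log the error and deliver synchronously.
  logger.error(e.message)
  self.deliver_activation_email
end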

The EmailWorker.asynch_send_activation_email method call will return almost immediately, and a job will be placed in the queue. The Workling client will then take the job from the queue, find the correct worker and execute the send_activation_email method of that worker.
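One way such a prefix could be turned into queued jobs is sketched below with `method_missing`. This is not Workling's source code: `SketchWorker`, its `QUEUE` constant and `work_one` are hypothetical, and an in-process `Queue` stands in for Starling.

```ruby
# Hypothetical base class for illustration; NOT Workling's implementation.
class SketchWorker
  QUEUE = Queue.new   # in-process stand-in for a queue server

  # Turns SomeWorker.asynch_foo(options) into a queued job.
  def self.method_missing(name, *args)
    return super unless name.to_s.start_with?("asynch_")
    QUEUE << [self, name.to_s.sub("asynch_", "").to_sym, args.first || {}]
    true
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("asynch_") || super
  end

  # What a worker process would do: pop one job and run it.
  def self.work_one
    klass, method_name, options = QUEUE.pop
    klass.new.send(method_name, options)
  end
end

class EmailWorker < SketchWorker
  def send_activation_email(options)
    "activation email sent to user #{options[:id]}"
  end
end

EmailWorker.asynch_send_activation_email(:id => 7)  # only enqueues the job
puts SketchWorker.work_one                          # the "worker" runs it
```

Note that only the worker class, the method name and a plain options hash travel through the queue, which is why the worker looks the user up by id instead of receiving a User object.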

It is a good idea to surround the EmailWorker.asynch_send_activation_email call with a begin ... rescue block. Since we have added background processing as a progressive enhancement, it should fail gracefully. It is probably a good idea to log the exception and then attempt delivery without using background processing. Otherwise there is a risk that the activation email is never delivered.
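The same fallback pattern can be tried outside of Rails with a small sketch. Everything here is a stand-in: `deliver_with_fallback` is a hypothetical helper, the lambdas replace the queue call and the mailer, and an array replaces the Rails logger.

```ruby
# Tries to hand the work to the queue; if that raises, logs the error and
# does the work inline instead. Returns which path was taken.
def deliver_with_fallback(enqueue, deliver_inline, log)
  enqueue.call
  :queued
rescue => e
  log << e.message
  deliver_inline.call
  :delivered_inline
end

log = []
queue_down = lambda { raise "queue server is not reachable" }  # simulated failure
inline     = lambda { log << "email delivered inline" }

puts deliver_with_fallback(queue_down, inline, log)
puts log.inspect
```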

Queue Options

Starling is the default queue for Workling. The creators of Starling describe it as 'slow' but I find it fast enough to use for things like sending activation emails. Workling supports many other queues that are significantly faster including: RabbitMQ and RudeQueue. If you will be processing many background jobs you might want to investigate some alternative queues and also moving the worker process to a separate machine.

An Alternative Logger

The Rails logger is accessible from the Workling worker.  I have found it helpful to log messages associated with background processing to a separate log file that is associated with a particular model.  In our email example you might want to have a user log file that will log all messages associated with the background processing of the deliver_activation_email method. For instructions on installing a plugin to provide a separate log for Rails models see: http://railsillustrated.com/class-based-logger-in-rails.html
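If you just want to see the idea, a separate log can be sketched with Ruby's standard `Logger`. This is not the plugin linked above; `model_logger` is a hypothetical helper, and a `StringIO` stands in for a file such as log/user.log so that the example runs anywhere.

```ruby
require "logger"
require "stringio"

# Builds a logger that tags every line with the model name.
def model_logger(model_name, io)
  logger = Logger.new(io)
  logger.progname = model_name
  logger.formatter = proc do |severity, time, progname, message|
    "#{time.strftime('%Y-%m-%d %H:%M:%S')} [#{progname}] #{severity}: #{message}\n"
  end
  logger
end

io = StringIO.new   # in a Rails app this could be a file path instead
user_log = model_logger("User", io)
user_log.info("activation email delivered for user 42")
puts io.string
```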