I have a Ruby on Rails Website that makes HTTP calls to an external Web Service.
About once a day I get a SystemExit (stacktrace below) error email where a call to the service has failed. If I then try the exact same query on my site moments later it works fine. It's been happening since the site went live and I've had no luck tracking down what causes it.
Ruby is version 1.8.6 and rails is version 1.2.6.
Anyone else have this problem?
This is the error and stacktrace.
A SystemExit occurred

    /usr/local/lib/ruby/gems/1.8/gems/rails-1.2.6/lib/fcgi_handler.rb:116:in `exit'
    /usr/local/lib/ruby/gems/1.8/gems/rails-1.2.6/lib/fcgi_handler.rb:116:in `exit_now_handler'
    /usr/local/lib/ruby/gems/1.8/gems/activesupport-1.4.4/lib/active_support/inflector.rb:250:in `to_proc'
    /usr/local/lib/ruby/1.8/net/protocol.rb:133:in `call'
    /usr/local/lib/ruby/1.8/net/protocol.rb:133:in `sysread'
    /usr/local/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
    /usr/local/lib/ruby/1.8/timeout.rb:56:in `timeout'
    /usr/local/lib/ruby/1.8/timeout.rb:76:in `timeout'
    /usr/local/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
    /usr/local/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
    /usr/local/lib/ruby/1.8/net/protocol.rb:126:in `readline'
    /usr/local/lib/ruby/1.8/net/http.rb:2017:in `read_status_line'
    /usr/local/lib/ruby/1.8/net/http.rb:2006:in `read_new'
    /usr/local/lib/ruby/1.8/net/http.rb:1047:in `request'
    /usr/local/lib/ruby/1.8/net/http.rb:945:in `request_get'
    /usr/local/lib/ruby/1.8/net/http.rb:380:in `get_response'
    /usr/local/lib/ruby/1.8/net/http.rb:543:in `start'
    /usr/local/lib/ruby/1.8/net/http.rb:379:in `get_response'
Using FCGI with Ruby is known to be very buggy.
Practically everybody has moved to Mongrel for this reason, and I recommend you do the same.
It's been a while since I used FCGI, but I think an FCGI process can throw a SystemExit if its thread takes too long. That could be the web service not responding, or even a slow DNS query. Some Google results show a similar error with Python and FCGI, so moving to Mongrel would be a good idea. This post is the reference I used to set up Mongrel, and I still refer back to it.
I used to get these all the time with Apache 1/FastCGI. I think it's caused by FastCGI hanging up before Ruby is done.
Switching to Mongrel is a good first step, but there's more to do. It's a bad idea to cull from web services on live pages, particularly from Rails. Rails is not thread-safe, so the number of concurrent connections you can support equals the number of Mongrels (or Passenger processes) in your cluster.
If you have one Mongrel and someone accesses a page that calls a web service that takes 10 seconds to time out, every request to your website will time out during that time. Most load balancers just cycle through your Mongrels blindly, so if you have two Mongrels, every other request will time out.
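Even before restructuring anything, you can bound how long a dead service can tie up a process by setting explicit timeouts on Net::HTTP. A minimal sketch (the host, path, and helper name are placeholders, not from the original post):

```ruby
require 'net/http'

# Hypothetical helper: bounds both the connect and the read phase,
# so a dead service blocks a Mongrel for a few seconds at most
# instead of hanging until the OS-level socket timeout fires.
def fetch_with_timeouts(host, path, port = 80)
  http = Net::HTTP.new(host, port)
  http.open_timeout = 2   # seconds to wait for the TCP connect
  http.read_timeout = 5   # seconds to wait for each read
  begin
    http.get(path).body
  rescue Timeout::Error, SystemCallError
    nil  # treat a slow or dead service as "no data" rather than an error page
  end
end
```

This doesn't fix the underlying problem, but it caps the worst case per request.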
Anything that can be unpredictably slow needs to happen in a job queue. The first hit to /slow/action adds the job to the queue, and /slow/action keeps polling, via page refreshes or Ajax, until the job is finished; then you get your results from the job queue. There are a few job queues for Rails nowadays, but the oldest and probably most widely used one is BackgroundRB.
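The enqueue-and-poll pattern can be sketched with a plain Thread and Queue as a stand-in for BackgroundRB (the class and method names here are illustrative, not BackgroundRB's API):

```ruby
require 'thread'  # needed on Ruby 1.8; Queue is built in on modern Rubies

# Minimal stand-in for a job queue: the web request only enqueues
# work and polls for the result; the slow call runs in a worker,
# off the request path.
class JobQueue
  def initialize
    @results = {}
    @queue   = Queue.new
    @worker  = Thread.new do
      loop do
        id, work = @queue.pop
        @results[id] = work.call   # the slow web-service call happens here
      end
    end
  end

  # Called by the controller on the first hit to /slow/action.
  def enqueue(id, &work)
    @queue << [id, work]
  end

  # Called by subsequent page refreshes / Ajax polls;
  # returns nil until the job has finished.
  def result(id)
    @results[id]
  end
end
```

A real queue adds persistence and worker processes, but the controller-side contract (enqueue once, poll for the result) is the same.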
Another alternative, depending on the nature of your app, is to cull the service every N minutes via cron, cache the data locally, and have your live page read from the cache.
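The cron approach might look like this; the URL, cache path, and method names are placeholders for whatever your app actually uses:

```ruby
require 'net/http'
require 'uri'

CACHE_FILE = "/tmp/service_cache.txt"  # placeholder path

# Run from cron every N minutes, e.g. via a rake task:
#   */10 * * * * cd /app && rake service:refresh
# Fetches the data once and writes it to a local cache file.
def refresh_cache
  body = Net::HTTP.get(URI.parse("http://example.com/data"))  # placeholder URL
  File.open(CACHE_FILE, "w") { |f| f.write(body) }
end

# What the live page reads instead of calling the service:
def cached_data
  File.read(CACHE_FILE) if File.exist?(CACHE_FILE)
end
```

The live request path never touches the network, so a flaky service can make the data stale but can't slow down or kill a page.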
I would also take a look at Passenger. It's a lot easier to get going than the traditional solution of Apache/nginx + Mongrel.