Lets Get Dugg!



I am releasing Typeface today. It is a blogging application written on top of Catalyst. You can go ahead and download the release right here.

Typeface-0.2.tbz2


Catalyst vs Rails vs Django Cook off

Today I began working on a new project and decided to benchmark Catalyst and Rails for fun, to see how my new favorite framework does against Rails. I was a bit shocked at the results, though. I guess this is worth mentioning in the hope that Catalyst can improve its accessor generation code. So here are the results:

Benchmark System
Celeron 1.8 GHz
1 GB of RAM
FreeBSD-6

Interpreters:
Ruby - 1.8.5
Perl - 5.8.8

Frameworks:
Catalyst - 5.7003
Rails - 1.1.6

Run as:
Lighttpd: 1.4.13
FCGI: 3 max proc

Benchmarked as:
ab -n 1000 -c 100 http://siteurl.com/

Some background

I specifically turned off sessions and did not use ActiveRecord/DBIC to keep it as fair as possible between the two frameworks. Both frameworks were run under Lighttpd and FCGI. I tried to keep this as apples to apples as possible.

So let's take a look at the results!

Rails:


Server Software:        lighttpd/1.4.13                                    
Server Hostname:        wansanity.com
Server Port:            9090

Document Path:          /main/index
Document Length:        2142 bytes

Concurrency Level:      100
Time taken for tests:   18.261 seconds
Complete requests:      1000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      2296892 bytes
HTML transferred:       2143288 bytes
Requests per second:    54.76 [#/sec] (mean)
Time per request:       1826.10 [ms] (mean)
Time per request:       18.26 [ms] (mean, across all concurrent requests)
Transfer rate:          125.78 [Kbytes/sec] received

Connnection Times (ms)
            min  mean[+/-sd] median   max
Connect:       74   885 1742.9    138 11785
Processing:   172   661 1216.8    173  8195
Waiting:       84   661 1216.8    173  8194
Total:        172  1547 2123.8    330 11893

Percentage of the requests served within a certain time (ms)
50%    330
66%   1354
75%   2786
80%   3106
90%   4297
95%   6279
98%   8216
99%   9285
100%  11893 (last request)

That's about 54 requests per second, which is great. I have seen it peak at 70 requests per second, which is just awesome!

Catalyst:

	
Server Software:        lighttpd/1.4.13                                    
Server Hostname:        wansanity.com
Server Port:            80

Document Path:          /
Document Length:        2232 bytes

Concurrency Level:      100
Time taken for tests:   43.503 seconds
Complete requests:      1000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      2401300 bytes
HTML transferred:       2238490 bytes
Requests per second:    22.99 [#/sec] (mean)
Time per request:       4350.30 [ms] (mean)
Time per request:       43.50 [ms] (mean, across all concurrent requests)
Transfer rate:          55.20 [Kbytes/sec] received

Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:       75   322  808.5     93  6028
Processing:   269  3804  851.8   3928  6754
Waiting:      192  3804  851.7   3928  6754
Total:        269  4126 1178.5   4186 10293

Percentage of the requests served within a certain time (ms)
  50%   4186
  66%   4384
  75%   4404
  80%   4424
  90%   5025
  95%   6422
  98%   7194
  99%   7709
 100%  10293 (last request)
	

About 22 requests per second is not exactly what I expected from a framework built on top of the fast Perl interpreter.

Being a bit disappointed with the results, I investigated further.

So here are the Perl profiler (Devel::DProf) results.

	
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.00   0.605  4.128   1512   0.0004 0.0027  NEXT::AUTOLOAD
 0.00   0.373  0.373  25794   0.0000 0.0000  Class::Accessor::Fast::__ANON__
 0.00   0.235  0.235   1177   0.0002 0.0002  NEXT::ELSEWHERE::ancestors
 0.00   0.211  0.225      1   0.2107 0.2253  YAML::Type::code::BEGIN
 0.00   0.184  5.182     86   0.0021 0.0603  Catalyst::Engine::HTTP::_handler
 0.00   0.177  0.205   2583   0.0001 0.0001  File::Spec::Unix::canonpath
 0.00   0.164  0.309   1942   0.0001 0.0002  File::Spec::Unix::catdir
 0.00   0.156  2.408   3201   0.0000 0.0008  Catalyst::Action::__ANON__
 0.00   0.134  0.739     73   0.0018 0.0101  base::import
 0.00   0.129  0.136   5904   0.0000 0.0000  Class::Data::Inheritable::__ANON__
 0.00   0.109  0.814      7   0.0155 0.1163  main::BEGIN
 0.00   0.108  0.108   1323   0.0001 0.0001  HTTP::Headers::_header
 0.00   0.101  0.116     10   0.0101 0.0116  Template::Parser::BEGIN
 0.00   0.101  0.334     11   0.0092 0.0304  Catalyst::Engine::BEGIN
 0.00   0.101  0.295   1264   0.0001 0.0002  Path::Class::Dir::stringify
	

It seems like the main bottleneck in Catalyst 5.7003 is NEXT. Thankfully, jrockway was kind enough to post some new code into Catalyst's trunk for me to try: a replacement for NEXT based on C3.

Here are the results with the C3 plugin from trunk:

	
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.00   0.211  0.233      1   0.2106 0.2330  YAML::Type::code::BEGIN
 0.00   0.135  0.135   8035   0.0000 0.0000  Class::Accessor::Fast::__ANON__
 0.00   0.126  0.721     73   0.0017 0.0099  base::import
 0.00   0.109  0.116     10   0.0109 0.0116  Template::Parser::BEGIN
 0.00   0.108  0.805      7   0.0155 0.1150  main::BEGIN
 0.00   0.093  0.106      7   0.0133 0.0152  Catalyst::Engine::HTTP::Restarter:
                                             :Watcher::BEGIN
 0.00   0.090  0.105   1023   0.0001 0.0001  File::Spec::Unix::canonpath
 0.00   0.085  0.326     11   0.0077 0.0296  Catalyst::Engine::BEGIN
 0.00   0.081  0.905    196   0.0004 0.0046  Catalyst::execute
 0.00   0.069  0.120      8   0.0087 0.0150  Catalyst::Plugin::Server::XMLRPC::
                                             Request::BEGIN
 0.00   0.064  1.639    444   0.0001 0.0037  next::method
 0.00   0.061  0.313     32   0.0019 0.0098  Catalyst::BEGIN
 0.00   0.054  0.216      7   0.0077 0.0309  Template::Config::load
 0.00   0.054  0.189      4   0.0135 0.0473  HTTP::Body::OctetStream::BEGIN
 0.00   0.054  0.388      4   0.0135 0.0970  Gambit::BEGIN	
	

So there you have it, the results with the C3 plugin. It only made a slight difference, pushing the Catalyst benchmark score to 25 requests per second.

I hope this benchmark can get some changes put into place for Catalyst's next release.

Conclusion

It seems like Rails is roughly 2.4 times as fast as Catalyst at this time (54.76 versus 22.99 requests per second). Keep in mind that this benchmark does not take ORM performance into account. It tests how quickly the frameworks themselves dispatch methods and render views.
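
The relative figures can be worked out directly from the "Requests per second" lines in the ab reports. Here is a minimal sketch of that arithmetic; the helper names speedup and percent_slower are mine, made up for illustration.

```perl
use strict;
use warnings;

# Requests per second, copied from the ab reports above.
my %rps = (
    rails    => 54.76,
    catalyst => 22.99,
);

# How many times more requests per second the faster framework served.
sub speedup {
    my ($fast, $slow) = @_;
    return $fast / $slow;
}

# How far the slower figure falls short, as a percentage of the faster one.
sub percent_slower {
    my ($fast, $slow) = @_;
    return 100 * ($fast - $slow) / $fast;
}

printf "Rails served %.2f times the requests of Catalyst\n",
    speedup($rps{rails}, $rps{catalyst});
printf "Catalyst was %.1f%% slower than Rails\n",
    percent_slower($rps{rails}, $rps{catalyst});
```

Note that "times as fast" and "percent slower" are different measures of the same gap, so it pays to say which one you mean.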

Also keep in mind that when choosing a framework you need to look at the problem at hand. Catalyst can feed off Perl's vast CPAN library. Catalyst has features that Rails does not have. Catalyst's DBIC ORM supports multi-column primary keys and can do relationship mapping just by reading the schema! You don't even have to bother writing any has_many or belongs_to definitions!

I am going to have to take a look at Django to see how well it fares in this benchmark. Perhaps an update on this?

Update: Django Results


Server Software:        lighttpd/1.4.13                                    
Server Hostname:        fab40
Server Port:            9090

Document Path:          /
Document Length:        2235 bytes

Concurrency Level:      100
Time taken for tests:   13.643 seconds
Complete requests:      1000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      2409769 bytes
HTML transferred:       2253459 bytes
Requests per second:    73.30 [#/sec] (mean)
Time per request:       1364.30 [ms] (mean)
Time per request:       13.64 [ms] (mean, across all concurrent requests)
Transfer rate:          176.63 [Kbytes/sec] received

Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:       76   483 1068.1    101  8666
Processing:   190   744  726.3    571  6088
Waiting:       93   744  726.4    572  6088
Total:        190  1227 1414.2    692  9606

Percentage of the requests served within a certain time (ms)
  50%    692
  66%    972
  75%   1209
  80%   1445
  90%   3282
  95%   4020
  98%   6414
  99%   8113
 100%   9606 (last request)

73 requests per second! Amazing, and the winner!

And anyone that disagrees with this can go ahead and look at the code for all three projects.

For your information, I have the least experience with Django.

mst, please don't kill me.

Many thanks go out to jrockway for helping me pin down the root cause of the bottleneck in Catalyst.


This is my second release of Typeface. Lots of bugs revolving around datetime have been fixed. I have also included three schema files, for MySQL, SQLite, and PostgreSQL, to bootstrap the initial database.

Enjoy

Typeface-0.3


Well, hopefully this release will go more smoothly. I forgot to remove some testing-specific tidbits that got included in 0.3. This should be a more polished release.

Fixed:
IE render width issue.
Removed all traces of lets get dugg.

New Features:
Template support.
Site title configurable in YAML

Typeface-0.4.tbz2


- Fourth release, a bit more user friendly than before.

Fixed:
Fixed all remaining IE render issues.
Cleaned up CSS.
Fixed caching issues with cache::store::fastmmap.
Code clean up.
Fixed miscellaneous metaweblog xmlrpc issues.
Fixed time in words to not be dependent on a single time zone.
Cleaned up documentation.

New Features:
Added install script to bootstrap the initial user.


My first Catalyst screencast.

I used my view helper TTSimple.

Enjoy!


This is my Catalyst Textmate bundle. It features snippet shortcuts that should make you a more productive Catalyst developer. Happy coding.

Catalyst/trunk/misc/textmate_bundle/

Use SVN to grab it.

So what can you do with it?

Snippet command    Output
csub               default Catalyst controller method
body               $c->res->body()
forward            $c->forward()
param              $c->req->params->{}
stash              $c->stash->{}
tmpl               $c->stash->{template} = ''
flash              $c->flash->{}
headers            $c->req->headers->{}
model              $c->model('::')->.


New version of Typeface. Too many things have changed to list here. Just check it out for yourself.


Instead of sitting on this any longer, I decided to release Typeface 0.7. Typeface has been updated to use the recent Catalyst::Controller::FormBuilder module. I have also included two newly ported WordPress themes, Chaotic Soul and Connections. Since I refactored the template view, porting WordPress themes is now trivial.

You can find the release at http://typeface-project.org/

-Victor


Quick blurb concerning Typeface

I have been off in Java Wicket Land for the past month or two. In this time, I have let Typeface slide without putting out much needed maintenance releases for the various quirks currently found in Typeface. Anyway, I hope to get a new release of Typeface in a few days or so that will fix some of the rough edges.


Unlike most people, I came to Catalyst from the Rails camp. I miss a few things, namely file_column for mapping file uploads to the filesystem.
Thankfully, writing DBIC components is fairly trivial. I wrote this in a few minutes, so consider it a hack. This component makes your DBIC class dependent on Catalyst, since it uses a reference from the Catalyst upload context.

I hope to improve upon this and add thumbnail support for images. For now this will do.


Just wanted to wish everyone a happy new year. A new Typeface release is imminent. The new revision has a vastly snappier Dojo backend thanks to optimization. The release will include two extra themes: connections and chaoticsoul.

I have also been working on Catalyst::Controller::FormBuilder::DBIC. This module builds FormBuilder interfaces based off DBIC schemas. It should make developing CRUD applications in Catalyst a snap. The unique thing that makes this stand out from other solutions such as Rails scaffolding or the Django admin backend is that Catalyst::Controller::FormBuilder::DBIC builds a formbuilder object at runtime which can be modified in the controller action before sending it off to the view. This enables you to modify any details on the form before displaying to the user. Since no code is written you don't have to scaffold, rewrite and repeat.

Should look something like this:


Not everyone chooses to go the FCGI way of deploying their web applications. Some people, like me, prefer deploying applications under Catalyst's httpd or Rails' Mongrel. Unfortunately, Lighttpd at this time (version 1.4.13) has an inept mod_proxy module. It does not load balance correctly, nor does it recover from a downed proxy node without a full restart. Obviously this is unacceptable when it comes to a production system.


Pound comes in and saves the day. It is a fast load-balancing proxy that claims it can handle 600 requests/sec. The deployment of choice here is Lighttpd => Pound => web application. However, there is a small snag: Pound appends X-Forwarded-For headers, with no option to disable this. So every request that comes from Pound to your web application carries "X-Forwarded-For: 127.0.0.1". This means you can't tell where the client came from. Here is a solution to this issue, but it requires some hacking on the Pound source.

Open http.c and comment out lines 902 and 903
====

You basically want to comment out the top two lines. With this out of the way, Pound does not append the extra X-Forwarded-For headers. You should now be able to see the originating IP address of the client connected to your web application.

Now to finish up: configure Lighttpd to pass requests along to Pound, and then on to your web application.

Sample Lighttpd configuration

$HTTP["host"] =~ "^letsgetdugg\.com$|^www\.letsgetdugg\.com$" {
    server.document-root        = "/home/victori/servers/letsgetdugg/root"
    dir-listing.activate        = "disable"
    accesslog.filename          = "/var/www/lighttpd/log/letsgetdugg.access.log"
    server.errorlog             = "/var/www/lighttpd/log/letsgetdugg.error.log"
    $HTTP["url"] !~ "static/" {
        proxy.server = ( "" => ( "Letsgetdugg" => ( "host" => "127.0.0.1" , "port" => 7999, "check-local" => "disable" )))	 
    }
}

We make sure that anything in /static does not get sent to your web application but gets processed by Lighttpd.

Sample Pound configuration

ListenHTTP 
  Address 127.0.0.1 
  Port    7999 
  Service
    HeadRequire "Host: .*letsgetdugg.com.*"
    BackEnd
      Address 127.0.0.1
      Port    9010
    End
    BackEnd
      Address 127.0.0.1
      Port    9011
    End
    BackEnd
      Address 127.0.0.1
      Port    9012
    End
  End
End

That's all! This deployment should suffice until Lighttpd 1.5 goes stable.


Try this fun Perl benchmark to test your dual-core, SMP, or hyperthreaded system.

Before running it, make sure you have Perl 5.8 with threading support compiled in.

Perl has native ithreads as of perl 5.8.
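
A minimal sketch of this kind of benchmark, using only core modules, might look like the following. The burn, run_serial and run_threaded subs, the job count and the loop size are all made up for illustration, so tune them for your machine. On a multi-core box the threaded run should finish noticeably faster than the serial one; on a single core the two should be about the same.

```perl
use strict;
use warnings;
use Config;
use Time::HiRes qw(time);

# CPU-bound busy work: the sum of square roots from 1 to $n.
sub burn {
    my ($n) = @_;
    my $sum = 0;
    $sum += sqrt($_) for 1 .. $n;
    return $sum;
}

# Run burn() $jobs times, one after another; returns elapsed seconds.
sub run_serial {
    my ($jobs, $n) = @_;
    my $start = time;
    burn($n) for 1 .. $jobs;
    return time - $start;
}

# Run burn() in $jobs ithreads at once; returns elapsed seconds.
sub run_threaded {
    my ($jobs, $n) = @_;
    require threads;    # loaded here so the script still starts without ithreads
    my $start   = time;
    my @workers = map { threads->create(\&burn, $n) } 1 .. $jobs;
    $_->join for @workers;
    return time - $start;
}

my ($jobs, $n) = (4, 1_000_000);
printf "serial:   %.2fs\n", run_serial($jobs, $n);
if ($Config{useithreads}) {
    printf "threaded: %.2fs\n", run_threaded($jobs, $n);
}
else {
    print "this perl was built without ithreads, skipping the threaded run\n";
}
```

You can check for threading support up front with perl -V:useithreads.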


Dynamically typed languages such as Perl, Ruby, PHP, and Python free you, the developer, from managing memory in your application. However, this is no foolproof guarantee that your application won't leak memory. You should be aware of how the garbage collector of your preferred language works so that you can compensate for the inadequacies of its garbage collection algorithm.

There are two common approaches to garbage collection: mark and sweep, and reference counting. The Perl interpreter uses the latter. Reference counting is a fairly simple technique. Every value carries a count of the references pointing at it. Each new reference to an instance increments the count by one, and each reference that goes away decrements it. When your program reaches the end of a scope, the variables declared in it release their references; an object whose count drops to zero is collected immediately, while an object that something else still points at is kept. The one main drawback of reference counting is that it can't deal with circular references: when two objects point to each other, neither count ever reaches zero, so they never get garbage collected.

Ruby and Java, on the other hand, use mark and sweep garbage collectors. I personally have mixed feelings about this, since I don't know exactly when my objects will be collected. A mark and sweep collector does not collect anything for a period of time; it runs at intervals, when the heap gets full. The downside is that you don't know exactly when this happens, and if there are lots of objects to be collected it leads to "stutters" and unresponsiveness in the application. If you have ever used a Java Swing application you might have noticed these stutters; this is when garbage collection is taking place. However, mark and sweep is not as gloomy as I have made it sound. It can handle cyclic references, unlike reference counting, which is a huge boon. There has also been much work done on mark and sweep collection, specifically on generational collectors that try to fix the unresponsiveness issue. Java currently uses a generational GC, and Ruby hopes to obtain one for the Ruby 2.0 interpreter. Ideally, a generational garbage collector would be the preferred GC for a long-running process.

With that little bit of garbage collection background out of the way, let's look at the life cycle of an instance under a reference counting garbage collector.

Here is an example of how reference counting works ideally:
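
A minimal sketch of the ideal case, assuming a made-up Node class whose DESTROY method simply counts how many objects have been freed:

```perl
use strict;
use warnings;

package Node;    # hypothetical class, only here to observe destruction

our $destroyed = 0;    # number of Node objects freed so far

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Perl calls DESTROY as soon as the last reference to the object goes away.
sub DESTROY { $destroyed++ }

package main;

sub scope_demo {
    my $node  = Node->new(name => 'victor');    # one reference to the object
    my $alias = $node;                          # a second reference
    undef $alias;                               # back to one reference
    return $Node::destroyed;                    # the object is still alive here
}    # $node goes out of scope, the count hits zero, DESTROY runs

my $seen_inside = scope_demo();
print "destroyed while in scope: $seen_inside\n";
print "destroyed after scope:    $Node::destroyed\n";
```

The count reported from inside the sub does not include the object, while the count read right after the call does: the object is freed exactly at the closing brace.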

Unlike with mark and sweep garbage collection, with reference counting you know exactly when your instance gets collected.

Here is a very simple problematic case for reference counting:
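
A minimal sketch of such a case, assuming a made-up Person class and a parent/child pair that point at each other. The name 'junior' is invented for illustration.

```perl
use strict;
use warnings;

package Person;    # hypothetical class, only here to observe destruction

our $destroyed = 0;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub DESTROY { $destroyed++ }

package main;

sub leak_demo {
    my $parent = Person->new(name => 'victor');
    my $child  = Person->new(name => 'junior');

    $parent->{child} = $child;     # the child now has two references
    $child->{parent} = $parent;    # and so does the parent
    return;
}    # the lexicals go away, but each object still holds the other

leak_demo();
print "objects destroyed after leak_demo(): $Person::destroyed\n";
```

Neither DESTROY runs when leak_demo() returns. The pair is only cleaned up when the interpreter exits, which for a long-running process is effectively never.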

This is a fairly simple case where reference counting falls flat on its face. Usually this isn't a problem, since most Perl work revolves around short-lived scripts. However, with frameworks such as Catalyst, which run as long-lived Perl processes, it quickly becomes an issue. Thankfully, with Perl it is extremely easy to nail memory leaks, more so than with Ruby or Java. Enter Devel::Cycle and Devel::Peek. Both modules are available from CPAN, and both can help you track down a memory leak in a relatively short time.
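
A sketch of how the two modules can be pointed at a suspect structure. find_cycle() and Dump() are the functions the modules export; the refcount helper, built on the core B module, and the name 'junior' are my own additions for illustration. Devel::Cycle is loaded only if it is installed, and Dump() writes a report in the same format as the sample output, although the addresses will differ.

```perl
use strict;
use warnings;
use B ();           # core module, used to read a reference count directly
use Devel::Peek;    # exports Dump(), which prints SV internals to STDERR

# Reference count of whatever the given reference points at.
# $_[0] is used directly because copying the argument would add a reference.
sub refcount {
    return B::svref_2object($_[0])->REFCNT;
}

my $parent = { name => 'victor' };
my $child  = { name => 'junior' };
$parent->{child} = $child;
$child->{parent} = $parent;    # this closes the cycle

# Devel::Cycle has to be installed separately, so load it only if present.
if (eval { require Devel::Cycle; 1 }) {
    Devel::Cycle::find_cycle($parent);    # prints each cycle it finds
}

Dump($parent);    # look at the REFCNT line of the hash
printf "parent hash refcount: %d\n", refcount($parent);
```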

# Sample output
# ibook:~/Desktop victori$ perl blah.pl     
# Cycle (1):  <-- find_cycle tells you literally where the cyclic reference leak is.
#   $A->{'child'} => \%B                           
#   $B->{'parent'} => \%A                           
# 
# SV = RV(0x1817898) at 0x1800ec8
#   REFCNT = 1
#   FLAGS = (PADBUSY,PADMY,ROK)
#   RV = 0x18006dc
#   SV = PVHV(0x1830980) at 0x18006dc
#     REFCNT = 2 <-- Notice the reference count of 2 , we know we have a leak
#     FLAGS = (SHAREKEYS)
#     IV = 2
#     NV = 0
#     ARRAY = 0x404e60  (0:6, 1:2)
#     hash quality = 125.0%
#     KEYS = 2
#     FILL = 2
#     MAX = 7
#     RITER = -1
#     EITER = 0x0
#     Elt "name" HASH = 0xe6e17f14
#     SV = PV(0x1801460) at 0x1800ea4
#       REFCNT = 1
#       FLAGS = (POK,pPOK)
#       PV = 0x401730 "victor"\0
#       CUR = 6
#       LEN = 8
#     Elt "child" HASH = 0x33ec6b5
#     SV = RV(0x1817870) at 0x1832ca4
#       REFCNT = 1
#       FLAGS = (ROK)
#       RV = 0x1800484
#       SV = PVHV(0x18309b0) at 0x1800484
#         REFCNT = 2
#         FLAGS = (SHAREKEYS)
#         IV = 2
#         NV = 0
#         ARRAY = 0x404db0  (0:6, 1:2)
#         hash quality = 125.0%
#         KEYS = 2
#         FILL = 2
#         MAX = 7
#         RITER = -1
#         EITER = 0x0
#         Elt "parent" HASH = 0xa99c4651
#         SV = RV(0x18178a0) at 0x1832c44
#           REFCNT = 1
#           FLAGS = (ROK)
#           RV = 0x18006dc


So how do we fix this? Quite simply: all we do is weaken the reference using weaken() from Scalar::Util. Here is a proper way of patching up the memory leak we introduced in our program.
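
A minimal sketch of the fix, assuming a made-up Member class and the same parent/child layout as before. weaken() comes from Scalar::Util, which ships with Perl; the name 'junior' is invented for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

package Member;    # hypothetical class, only here to observe destruction

our $destroyed = 0;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub DESTROY { $destroyed++ }

package main;

sub fixed_demo {
    my $parent = Member->new(name => 'victor');
    my $child  = Member->new(name => 'junior');

    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});    # the back-reference no longer counts
    return;
}    # the parent drops to zero, frees its child slot, then the child goes too

fixed_demo();
print "objects destroyed after fixed_demo(): $Member::destroyed\n";
```

A weakened reference still works as a normal reference while the object is alive. It becomes undef once the object is freed, so code that follows it should check for that.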

We weaken the reference to the parent so that the parent's reference count goes back to 1. When it reaches the end of scope it will be collected, and the memory leak will be no more.

Hopefully this is a good primer for other Perl coders out there who are facing memory leaks in their long-running Perl scripts.


Perl, singletons, DAOs oh my!

Lately I have been toying with Wicket, Hibernate and Spring for potentially writing a large, complex site. Contrary to what people say about Java web development, it isn't so bad, at least with Wicket. After toying with Hibernate and DAOs, I came to the conclusion that this can work really well in Catalyst!

Instead of micromanaging what to page cache, just cache the specific area that is the performance penalty: the database.

By utilizing the singleton data access object (DAO) pattern I can avoid the headache of micromanaging what gets page cached on my site. Furthermore, this pattern gives me the flexibility to still keep the dynamism of Template Toolkit and whatever code gets executed in my controller action.

This also solves my problem of incrementally re-exporting the schema with DBIx::Class::Schema::Loader without overwriting whatever business logic I might have added to my classes. It pretty much keeps my controllers clean of business logic. This pattern forces me to keep my code cleanly separated.

Let's get down to the meat of how to set this all up.

Let's begin by setting up our first DAO class:
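
A rough sketch of the idea in plain Perl, with no Catalyst or DBIx::Class required. The PostDAO class, its method names, and the fetch callback that stands in for the real database query are all made up for illustration; a real DAO would call the DBIC schema inside that callback and would probably use a proper cache backend instead of a hash.

```perl
use strict;
use warnings;

package PostDAO;    # hypothetical class name

my $instance;       # the one shared object per process

# Always hand back the same object, creating it on first use.
sub instance {
    my ($class, %args) = @_;
    $instance ||= bless {
        fetch => $args{fetch},    # code ref that runs the real query
        cache => {},
    }, $class;
    return $instance;
}

# Return the posts for a page, running the query only on a cache miss.
sub recent_posts {
    my ($self, $page) = @_;
    my $key = "recent_posts:$page";
    $self->{cache}{$key} = $self->{fetch}->($page)
        unless exists $self->{cache}{$key};
    return $self->{cache}{$key};
}

# Call this after a write so stale rows are not served.
sub flush {
    my ($self) = @_;
    $self->{cache} = {};
    return;
}

package main;

my $queries = 0;
my $dao = PostDAO->instance(
    fetch => sub { $queries++; return ["post on page $_[0]"] },
);

$dao->recent_posts(1) for 1 .. 3;
print "queries run for three requests: $queries\n";
```

Only the first request runs the query; the other two are served from the cache. The template and the controller action still run on every request.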

Now here is our controller, which uses our simple DAO.

So what do you think? Does it make sense? It keeps the controllers nice and simple, without any business logic.

And now it is time to test the performance of this implementation against the page cache.

PageCache run:

Requests per second: 97.30 [#/sec] (mean)
Time per request: 1027.70 [ms] (mean)
Time per request: 10.28 [ms] (mean, across all concurrent requests)

DAO run:

Requests per second: 66.05 [#/sec] (mean)
Time per request: 1514.10 [ms] (mean)
Time per request: 15.14 [ms] (mean, across all concurrent requests)

So it is a bit slower, about 32% slower to be exact (66.05 versus 97.30 requests per second). This is still more than adequate performance for a relatively high-traffic site.