Benchmarking the Tycho IRC VR interconnect
As part of my VR benchmarking I have tested the latency of sending Tycho VR messages via IRC. I have measured how the number of clients, the number of records being returned (message size) and the number of mediators affect the latency. Here are a few of the problems I have had to overcome, and some results.
The many-clients test (no IRC here, but I’ve included it anyway) worked without much hassle. I ran the test with a fixed number of records in a single mediator and varied the number of clients from 1 to 1000. Latency increases linearly with the number of clients, from 4 msec for 1 client to 2402 msec for 1000 clients; each client adds approximately 2.3 msec to the latency. The main problem with this test was the complexity of the shell script used to set up the components for each iteration.
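A quick sanity check on the per-client figure can be done from the two endpoints quoted above. This is only a sketch: it uses those two points rather than the full data set, so a fit over every measurement may well give a slightly different slope, and the function name is made up for illustration.

```python
def slope_per_client(n_lo, lat_lo, n_hi, lat_hi):
    """Latency added per extra client, from two (clients, msec) points."""
    return (lat_hi - lat_lo) / (n_hi - n_lo)

# Endpoints from the text: 1 client -> 4 msec, 1000 clients -> 2402 msec.
per_client = slope_per_client(1, 4.0, 1000, 2402.0)
print(f"{per_client:.2f} msec per client")
```

The endpoint estimate comes out in the same range as the per-client cost quoted above, which is consistent with the linear trend.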
The second test, involving IRC, measured the effect on latency of increasing the number of records returned by a query. This does two things: message size increases (there are more records in the LDIF) and it takes longer for the registry to create the LDIF. I spent quite a bit of time optimizing the code for both of these. Because of the other benchmarks I have already done, I can compute the overhead of just the IRC part of this test: around 6.3 msec per message. For some perspective, HTTP adds 0.13 msec per message.
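To put the two per-message overheads side by side, here is a small sketch. The constants are the figures quoted above; the helper name and the example of 1000 messages are assumptions for illustration only.

```python
IRC_OVERHEAD_MSEC = 6.3    # per message, figure from the text
HTTP_OVERHEAD_MSEC = 0.13  # per message, figure from the text

def transport_overhead(messages, per_message_msec):
    """Total transport overhead in msec for a number of messages."""
    return messages * per_message_msec

ratio = IRC_OVERHEAD_MSEC / HTTP_OVERHEAD_MSEC
print(f"IRC adds roughly {ratio:.0f}x the HTTP overhead per message")

# Hypothetical workload of 1000 messages over each transport.
print(transport_overhead(1000, IRC_OVERHEAD_MSEC), "msec over IRC")
print(transport_overhead(1000, HTTP_OVERHEAD_MSEC), "msec over HTTP")
```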
The final test (which is still going on) measures what happens to query latency as you increase the number of mediators. There are several permutations of this test, but the current one has each new mediator adding 1000 records to the total size of the VR. I had a problem when I got to 40 mediators, as the IRC server's default configuration only allowed 5 simultaneous connections from a single IP (40 mediators / 8 nodes of Holly). It took a bit of head scratching to realise what was happening, but I have used the NGIRCD config file to increase this limit to -1. Unfortunately I have no way of knowing how big -1 is in this case :) Initial results show that each extra mediator (and its extra 1000 records) adds 0.36 msec to the latency of the query. This looks very good, but I will wait to see how well it scales up to 1000 mediators; currently I am at 140.
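The connection-limit arithmetic, and a rough projection of the mediator trend, can be sketched as follows. Two assumptions are baked in: that mediators are spread evenly over the nodes, and that the 0.36 msec per mediator stays linear all the way up, which the measurements so far (up to 140 mediators) do not yet confirm. The function names are hypothetical.

```python
import math

PER_IP_LIMIT = 5  # default per-IP connection limit reported in the text

def connections_per_node(mediators, nodes):
    """Worst-case connections from one IP, assuming an even spread."""
    return math.ceil(mediators / nodes)

def extra_latency_msec(mediators, per_mediator_msec=0.36):
    """Projected added latency, assuming the linear trend holds."""
    return mediators * per_mediator_msec

per_node = connections_per_node(40, 8)
print(per_node, "connections per node against a limit of", PER_IP_LIMIT)
print(f"{extra_latency_msec(1000):.0f} msec added at 1000 mediators (projection)")
```

With 40 mediators on 8 nodes each node already sits at the default limit, which fits with the trouble showing up at that point.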
I have left out the fourth test, which measures the performance of different back end stores (MySQL, HSQLDB and my simple store) with different numbers of records and different types of query. This test was completed a week ago.