How to get better results when benchmarking
I have been working on the Tycho VR for a few days now and this technique has improved the results more than any other.
Benchmarking 101: make sure you are measuring what you think you are measuring. I added a bit of extra debugging code and saw this:
TEST: Loading 5000 fake producers
TEST: Loaded records
VR size: 5001
TEST: Loading 10000 fake producers
TEST: Loaded records
VR size: 15001
TEST: Loading 15000 fake producers
TEST: Loaded records
VR size: 30001
As you can see, the registry is not being wiped each time we load new records (the extra 1 is the consumer running the tests, BTW). This meant there were more records in the registry than the benchmark was reporting. For what I thought was 30,000 records, the newest code had a latency of 19.8 msecs. It turned out that was actually 105,000 records, since every earlier batch was still in there (oops). Here are the results with the fixed benchmark:
Records   Old time (msecs)   New time (msecs)
1000      4.0772             3.8278
2000      4.2952             3.5502
3000      4.8090             3.1882
4000      6.1277             3.4312
5000      7.8638             4.6287
10000     8.2883             4.5805
15000     12.550             5.5181
20000     18.941             6.5174
25000     26.945             7.4013
30000     37.017             8.4157
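For anyone who wants to avoid the same trap, here is a minimal sketch of the kind of guard that would have caught it: wipe the registry before each load, then check its size before timing anything. FakeRegistry, run_benchmark and the record layout are all made up for illustration. This is not the Tycho API and not the actual fix, just the general idea.

```python
import time


class FakeRegistry:
    """Hypothetical in-memory stand-in for the registry under test."""

    def __init__(self):
        self._records = []

    def register(self, record):
        self._records.append(record)

    def clear(self):
        self._records.clear()

    def size(self):
        return len(self._records)

    def query(self, name):
        # Linear scan, so query cost grows with the number of records.
        return [r for r in self._records if r["name"] == name]


def run_benchmark(registry, batch_sizes, wipe=True, queries=20):
    """Return {batch_size: (registry_size, mean_query_msecs)}."""
    results = {}
    for count in batch_sizes:
        if wipe:
            registry.clear()  # start every batch from an empty registry
        if registry.size() == 0:
            registry.register({"name": "consumer"})  # the "extra 1"
        for i in range(count):
            registry.register({"name": f"producer-{i}"})

        # Measure what you think you are measuring: refuse to time a
        # registry whose size does not match the batch being reported.
        expected = count + 1
        if registry.size() != expected:
            raise RuntimeError(
                f"expected {expected} records, found {registry.size()}"
            )

        start = time.perf_counter()
        for _ in range(queries):
            registry.query("consumer")
        elapsed_msecs = (time.perf_counter() - start) * 1000.0 / queries
        results[count] = (registry.size(), elapsed_msecs)
    return results
```

Calling run_benchmark(FakeRegistry(), [5000, 10000, 15000]) times each batch against a freshly wiped registry. Passing wipe=False reproduces the mistake, and the size check stops the run at the second batch instead of quietly timing 15001 records as if they were 10001.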
So the conclusion is: if you want to improve your results by 400%, tweak your code and then don’t fuck up the benchmarking.
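To put a number on that, the old-to-new ratio can be worked out straight from the results table. A quick sketch follows. The helper name speedups is made up, and reading the 400% as the ratio at 30,000 records is my assumption about where that figure comes from.

```python
# (records, old_msecs, new_msecs), copied from the results table
RESULTS = [
    (1000, 4.0772, 3.8278),
    (2000, 4.2952, 3.5502),
    (3000, 4.8090, 3.1882),
    (4000, 6.1277, 3.4312),
    (5000, 7.8638, 4.6287),
    (10000, 8.2883, 4.5805),
    (15000, 12.550, 5.5181),
    (20000, 18.941, 6.5174),
    (25000, 26.945, 7.4013),
    (30000, 37.017, 8.4157),
]


def speedups(rows):
    """Map each record count to old latency divided by new latency."""
    return {records: old / new for records, old, new in rows}


if __name__ == "__main__":
    for records, ratio in speedups(RESULTS).items():
        print(f"{records:>6} records: {ratio:.2f}x")
```

The ratio is close to 1 at 1,000 records and climbs to roughly 4.4 at 30,000, so the gap between old and new code widens as the registry grows.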