This is part of Early Gnutella
Improving LimeWire's Performance
Christopher Rohrs
November 24, 2002
There are several ways of improving LimeWire's performance in terms of CPU usage, startup time, and memory footprint. It's hard to say which are most important since profiling is sensitive to the user's configuration, the network load, etc. For example, an ultrapeer sharing many files will spend more time in file matching logic than an ultrapeer sharing no files.
Here are some ideas, in no particular order:
- Add support for non-blocking IO, which would greatly reduce the number of threads.
- Reduce startup time by delaying the loading of inessential GUI components, like the options window. I know Adam has experimented with this before.
- Optimize connection initialization/handshaking. There appear to be several reasons for poor performance. First, HostCatcher.add, which is called for every X-Try/X-Try-Ultrapeer address in the connection handshake headers, can be too slow. Second, the act of actually sending and receiving headers (in AuthenticationHandshakeResponder and subclasses) can be slow.
- Reduce memory footprint of query routing (QRP) tables. Currently we allocate a whole byte per entry; we only need a bit. Susheel has started to work on this. A better approach might be to use a single route table for all connections. Each entry of this table would be a list of all connections matching that index. This could greatly optimize broadcast speed, as there is no need to iterate through all connections in the common case. One difficulty is that connections can send tables of different lengths.
- Optimize message writing in ManagedConnection. Writing a message involves queuing a message and notifying its associated output thread. The output thread then removes the message from the queue and writes it to the socket. There are two things to optimize: the overhead of wait/notify and of manipulating the queue. The latter is easier to optimize. Also, it should be possible to reduce the memory footprint of these queues. Leaf connections may not need the full SACHRIFC buffers, for example. It is probably not necessary to preallocate a fixed amount of memory for these buffers when they are typically mostly empty. At the least, we should avoid allocating these buffers until the connection has been initialized and accepted for normal use.
- Reduce the memory footprint needed to display search results.
- Optimize ConnectionManager.hasClientSupernodeConnection. Currently this method can use up to 20% of all CPU time. The problem is that it iterates through all connections, which requires cloning the array. This method is called from allowAnyConnection, which is called when accepting new connections or handling pings. A better approach is simply to augment ConnectionManager with a boolean that is true iff at least one of the connections is a leaf-to-ultrapeer connection. Be sure to maintain the invariant when adding and removing connections.
- Optimize the creation of GGEP extensions in pongs. Currently it can consume 20% of all message handling time. The problem is too many calls to StringBuffer.append from PingReply.write, which is called from PingReply.newGGEP, which in turn is called from the PingReply constructor.
- Optimize route table GUID lookup. Currently we use a TreeMap to back our RouteTable data structure. I can't remember why we don't use Hashtables; I think it has to do with performance problems from resizing the tables. I would be curious to try splay trees, a self-adjusting data structure. Splay trees can outperform standard dictionaries under irregular access patterns, e.g., when handling a stream of query replies with the same GUID.
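
To make the QRP memory idea concrete, here is a minimal sketch of a table that stores one bit per entry instead of one byte. The class name BitRouteTable and its methods are hypothetical and are not LimeWire's actual classes; the modulo mapping in index() is only a placeholder for the real QRP hash function.

```java
import java.util.BitSet;

/** Hypothetical sketch: a query-routing table that stores one bit per slot. */
final class BitRouteTable {
    private final BitSet bits;
    private final int size;

    BitRouteTable(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
        this.bits = new BitSet(size);
    }

    /** Marks the slot for the given keyword hash as present. */
    void add(int hash) {
        bits.set(index(hash));
    }

    /** Returns true if the slot for the hash is set (false positives are possible). */
    boolean mightContain(int hash) {
        return bits.get(index(hash));
    }

    /** Placeholder mapping from hash to slot; the real QRP hash function differs. */
    private int index(int hash) {
        return Math.floorMod(hash, size);
    }
}
```

With 65,536 slots, a table like this needs roughly 8 KB per connection rather than 64 KB with a byte per entry.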
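
The single shared route table suggested in the QRP item could look something like the sketch below, where each slot holds the connections whose tables match that index. SharedRouteTable is a hypothetical name, and the sketch assumes every connection uses the same table length, which sidesteps the difficulty noted above rather than solving it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: one route table shared by all connections. */
final class SharedRouteTable<C> {
    // slots.get(i) lists the connections matching index i, or null if none.
    private final List<List<C>> slots;

    SharedRouteTable(int size) {
        slots = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            slots.add(null); // per-slot lists are allocated lazily
        }
    }

    /** Records that the connection's table has the given index set. */
    void add(int index, C connection) {
        List<C> list = slots.get(index);
        if (list == null) {
            list = new ArrayList<>(2);
            slots.set(index, list);
        }
        if (!list.contains(connection)) {
            list.add(connection);
        }
    }

    /** Removes a closed connection from every slot. */
    void removeConnection(C connection) {
        for (List<C> list : slots) {
            if (list != null) {
                list.remove(connection);
            }
        }
    }

    /** Returns the connections to forward to, without scanning all connections. */
    List<C> matches(int index) {
        List<C> list = slots.get(index);
        if (list == null) {
            return Collections.<C>emptyList();
        }
        return Collections.unmodifiableList(list);
    }
}
```

The trade-off is that a lookup touches one slot, while removing a connection walks the whole table. That seems acceptable if queries are far more frequent than disconnects.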
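
For the ManagedConnection queues, deferring allocation might be sketched as follows. LazyMessageQueue is a hypothetical class, not the SACHRIFC buffer itself. It refuses new messages when full, which is an assumption made for brevity and not necessarily the policy the real buffers use.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch: an output queue whose storage is allocated on first use. */
final class LazyMessageQueue<M> {
    private final int capacity;
    private ArrayDeque<M> buffer; // stays null until the first message is queued

    LazyMessageQueue(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Queues a message; returns false if the queue is full (assumed policy). */
    synchronized boolean offer(M message) {
        if (buffer == null) {
            buffer = new ArrayDeque<>(); // grows on demand, nothing preallocated to capacity
        }
        if (buffer.size() >= capacity) {
            return false;
        }
        buffer.addLast(message);
        return true;
    }

    /** Removes and returns the oldest message, or null if none is queued. */
    synchronized M poll() {
        return buffer == null ? null : buffer.pollFirst();
    }

    /** True once storage has been allocated. */
    synchronized boolean isAllocated() {
        return buffer != null;
    }
}
```

A connection that never finishes its handshake never calls offer, so it never pays for a buffer.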
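
The hasClientSupernodeConnection fix can be sketched as below. The class names are hypothetical stand-ins, not LimeWire's ConnectionManager. The sketch also swaps the proposed boolean for a counter, since a counter keeps the invariant correct when one of several leaf-to-ultrapeer connections is removed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a connection; not LimeWire's actual class. */
final class SketchConnection {
    final boolean leafToUltrapeer;

    SketchConnection(boolean leafToUltrapeer) {
        this.leafToUltrapeer = leafToUltrapeer;
    }
}

/** Hypothetical manager that answers the query in constant time. */
final class SketchConnectionManager {
    private final List<SketchConnection> connections = new ArrayList<>();
    // Invariant: equals the number of leaf-to-ultrapeer entries in connections.
    private int leafToUltrapeerCount = 0;

    synchronized void add(SketchConnection c) {
        connections.add(c);
        if (c.leafToUltrapeer) {
            leafToUltrapeerCount++;
        }
    }

    synchronized void remove(SketchConnection c) {
        // Decrement only if the connection was actually present.
        if (connections.remove(c) && c.leafToUltrapeer) {
            leafToUltrapeerCount--;
        }
    }

    /** No iteration and no array cloning: just a comparison. */
    synchronized boolean hasClientSupernodeConnection() {
        return leafToUltrapeerCount > 0;
    }
}
```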

