NFS Performance, continued

Back in May I wrote about the performance problems we were having with our new NFS based user filestore. It’s been a while since then, and the problems have continued. We have noticed that it appears to be load related – not just the network, but also the machine. This suggests that our theories about IPsec causing the slow down may be correct.

Our original plan was to try a private network which would remove the need for IPsec and also remove any latency added by routing the traffic between our subnets. This still seemed like a good plan, so I asked around and another department kindly lent us a brand new gigabit switch. We’ve connected this to one of our NFS clients and to the cluster node that’s currently running our filestore.

So far we’ve noticed some serious performance boosts. There’s only a few of us using it, so it could just be that it’s a lightly loaded connection – time will tell on that one. The bottom line is that it seems to be quicker than the IPsec connection ever was, so hopefully we’re on to a winner. We’ve also got a few staff testing it out, and their responses have been positive so far.

The next step after this testing period is to look at the costs of doing this properly with our own equipment. One of the key things we’ve been doing recently is increasing the redundancy of our systems, so it’d be fairly daft to do this with just one switch. We’d need at least two, with every cluster node connected to both, and every client that we want optimum performance on connected to both. Obviously there’ll be other clients that are less important and they can continue to use the existing infrastructure.

Of course, I’ve got absolutely no idea where we’ll put these switches, or how we’ll wire them in – things are pretty tight in our racks at the moment. Suppose there’s got to be a challenge somewhere :-)

My only worry with all this is what we’ll do if it doesn’t work. I don’t have any other ideas that’d make it go quicker – to be frank, you can’t really get any quicker than a directly connected switch. Lets hope we don’t have to worry about it.

  • Share/Bookmark

Related posts:

  1. NFS+IPsec Performance We’ve recently moved to having our filestore NFS exported from a cluster. This provides almost complete resilience from hardware failures, and moves us away from depending on individual end-user systems with locally attached filestore. Given the inherent insecurities with NFS we opted to use IPsec authentication (but not encryption) between the hosts involved. The NFS [...]...
  2. NFS Performance, concluded Back in the middle of last year I wrote about our plans to tackle our NFS performance issues by introducing a direct and dedicated network link to carry our NFS traffic between the clients and the servers. We’d done the tests so we just had to implement it. First we waited for the financial year [...]...
  3. Upgrading Debian If you’ve been following my blog you’ll know that I’ve been working on a new filestore project at work for a while now. After getting things working nicely on our Solaris machines, and finally moving my home directory over, I decided to tackle our Debian server. It quickly became apparent that I’d need to upgrade [...]...
  4. Now what? It’s too scary to use… Its been months in the making, but it’s finally done. We have our new filestore ready to go. There’s still plenty to do, like rolling it out for the teaching machines and web filestore, but at least we’ve got the main part done. So why has it taken so long? I spent a long time [...]...
  5. Impending doom (for our filesystems, anyway) Over the past year or so the space usage on our research and web filesystems has pretty much doubled to the point where we’re dangerously close to running out of space. There’s currently about 1TiB of filestore available of which less than 10% remains unused. Teaching filestore, however, has barely grown at all during the [...]...

One Response to “NFS Performance, continued”

  1. Mogamat says:

    Checkout Falt Network Neighbourhoods – http://aggregate.org/FNN/

Leave a Reply