NFS Performance, continued

Back in May I wrote about the performance problems we were having with our new NFS-based user filestore. It’s been a while since then, and the problems have continued. We’ve noticed that they appear to be load-related – not just load on the network, but load on the machine too. This suggests that our theory about IPsec causing the slowdown may be correct.

Our original plan was to try a private network, which would remove the need for IPsec and also remove any latency added by routing the traffic between our subnets. This still seemed like a good plan, so I asked around and another department kindly lent us a brand new gigabit switch. We’ve connected it to one of our NFS clients and to the cluster node that’s currently running our filestore.

So far we’ve noticed some serious performance boosts. There are only a few of us using it, so it could just be that the connection is lightly loaded – time will tell on that one. The bottom line is that it seems quicker than the IPsec connection ever was, so hopefully we’re on to a winner. We’ve also got a few staff testing it out, and their responses have been positive so far.

The next step after this testing period is to look at the cost of doing this properly with our own equipment. One of the key things we’ve been doing recently is increasing the redundancy of our systems, so it’d be fairly daft to do this with just one switch. We’d need at least two, with every cluster node connected to both, and every client that needs optimum performance connected to both. Other, less important clients can continue to use the existing infrastructure.

Of course, I’ve got absolutely no idea where we’ll put these switches, or how we’ll wire them in – things are pretty tight in our racks at the moment. Suppose there’s got to be a challenge somewhere 🙂

My only worry with all this is what we’ll do if it doesn’t work. I don’t have any other ideas that’d make it go quicker – to be frank, you can’t really get any quicker than a directly connected switch. Let’s hope we don’t have to worry about it.


slimp3slave – finally working

In my last post about setting up a slimserver I said that I was having trouble getting slimp3slave working:

Whilst it doesn’t appear to have any problems, I didn’t have much success with the players. mpg123 got confused by the stream, and madplay kept skipping the beginnings of tracks when I hit next on the server. This could be a problem with slimp3slave – I’ll need to investigate.

The problem did turn out to be with slimp3slave. I discovered that when skipping a track the stream is restarted, which causes slimp3slave to start up a new player. The trouble was that it did this before the old player had exited, so the new one died because it couldn’t access the sound device. There’s another bug here too – slimp3slave didn’t notice the new player dying and kept writing to it, which resulted in lots of SIGPIPE messages.

So I looked at the code for shutting down the player and noticed that it wasn’t using the right close function. This change fixed it:

RCS file: /home/pdw/vcvs/repos/slimp3slave/slimp3slave.c,v
retrieving revision 1.10
diff -u -r1.10 slimp3slave.c
--- slimp3slave.c	12 Apr 2004 08:04:52 -0000	1.10
+++ slimp3slave.c	22 Jun 2006 21:21:31 -0000
@@ -394,7 +394,7 @@
 }
 
 void output_pipe_close(FILE * f) {
-    fclose(f);
+    pclose(f);
 }
 
 unsigned long curses2ir(int key) {

I have sent this change to the author, so maybe it’ll get integrated.

Now I have a working streaming system. The only remaining problem seems to be the wireless networking to the client dropping out from time to time – a wire would fix that one 🙂

And in the past couple of days I’ve even got a client (softsqueeze on Windows this time) running at work that’s streaming the music over my ADSL connection. Very handy!
