It’s been months in the making, but it’s finally done. We have our new filestore ready to go. There’s still plenty to do, like rolling it out for the teaching machines and web filestore, but at least we’ve got the main part done.
So why has it taken so long? I spent a long time researching and testing the technologies involved. For example, choosing the file system was tricky. UFS doesn’t work well on large (>2TB) file systems, and VxFS doesn’t work with NFS and quotas. I managed to solve that one by fixing the quota issue with VxFS. There was also the issue of how we back up this quantity of filestore, and working out how we’d make it available from the cluster to the user machines. In the end we opted for a single filesystem, split into chunks on the server side for backups, and used the automounter to make these divisions transparent to the end users.
The other time-consuming factor was the software development stage. We have automated systems for creating users on machines, so I needed to integrate these with the new filestore. This required writing code to create directories, set up quotas, and build the automount maps.
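To give a flavour of the automount map building, here is a minimal Python sketch. It is not our actual code: the server name, export path, `chunkNN` directory naming and the hash-based assignment are all assumptions made up for illustration. It simply shows how each user can be mapped to one server-side chunk while the map hides that split from the client.

```python
import zlib


def chunk_for(username: str, n_chunks: int) -> int:
    """Pick a chunk deterministically, so a user always lands on the same one.

    Assumption: a CRC32 hash stands in for whatever assignment rule is really used.
    """
    return zlib.crc32(username.encode("utf-8")) % n_chunks


def build_automount_map(users, server, export_root, n_chunks):
    """Return indirect-map text: one 'key<TAB>server:/path' entry per user.

    The 'chunkNN' directory naming is hypothetical.
    """
    lines = []
    for user in sorted(users):
        chunk = chunk_for(user, n_chunks)
        lines.append(f"{user}\t{server}:{export_root}/chunk{chunk:02d}/{user}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical users, server name and export path.
    print(build_automount_map(["alice", "bob"], "filestore.example.org",
                              "/export/home", 4), end="")
```

One caveat with hashing: changing the number of chunks would move users between chunks, so a real system would more likely record each user's chunk when the account is created and build the map from that record.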
Anyway, I’ve written about this before. So now it’s done, what do we do next? The logical step is to test it on myself and/or the rest of the systems group. Personally I’m in favour of testing it on everyone else first, but that doesn’t seem fair.
The question is, am I brave enough to actually use it?