Understanding partman-auto/expert_recipe

TL;DR – Subtract the minimum size from the priority and compare this to other partitions to work out how much of the free space will be assigned to a partition.

I’ve been trying to create a simple recipe for partitioning disks on new machines when using the Debian/Ubuntu automated installer. This looks like it should be fairly straightforward, but the documentation makes it more confusing than it needs to be. First, here’s an example:

d-i partman-auto/expert_recipe string \
        root :: \
                8192 8241 16384 linux-swap \
                        $primary{ } \
                        method{ swap } format{ } \
                . \
                16384 16386 -1 ext4 \
                        $primary{ } $bootable{ } \
                        method{ format } format{ } \
                        use_filesystem{ } filesystem{ ext4 } \
                        mountpoint{ / } \
                . \
                8192 8241 16384 ext4 \
                        $primary{ } \
                        method{ format } format{ } \
                        use_filesystem{ } filesystem{ ext4 } \
                        mountpoint{ /var } \
                .

I want at least 8GiB of swap and /var, and at least 16GiB for the root filesystem. I don’t want swap or /var to grow beyond 16GiB, but I’m happy for the root filesystem to grow to fill the disk. The first number for each partition is the minimum size, the second is the priority, and the third is the maximum (-1 being no maximum). The minimum and maximum definitions should be clear, but what does the priority field mean? Let’s consult the documentation:

<priority> is some size usually between <minimal size> and <maximal size>. It determines the priority of this partition in the contest with the other partitions for size. Notice that if <priority> is too small (relative to the priority of the other partitions) then this partition will have size close to <minimal size>. That’s why it is recommended to give small partitions a <priority> larger than their <maximal size>.

So it makes it clear what this number is about – it’s used to decide how much of the free space is assigned to a partition compared to the others – but it doesn’t explain how it does that, how the numbers are compared, or what they actually mean. There’s plenty of misinformation about this online, so I read through the code to work out what was really going on.

The algorithm is actually fairly straightforward. First it works out a percentage weight for each partition. It does this by subtracting the minimum value from the priority. If the priority is less than the minimum the minimum is used instead, which results in a zero value for this calculation. The values for all partitions are then added together and a percentage calculated for each one. So in the above example we have percentages of 49%, 2% and 49% for each partition in turn (yes, it’s no coincidence that the priorities were chosen to make percentage calculations easy).

Next, with a percentage weight for each partition it moves on to looking at the free space. It starts by giving each partition its minimum value (there must be enough disk space for that, otherwise the process fails) and then works out what space is left over. Each partition is then assigned a percentage of that left over space based on the figure from the previous step. That’s it – it’s as simple as that! Assuming none have gone over their maximum value we’re done.

If a partition does get assigned a value over its maximum then the maximum is taken as the new size instead and the priority for that partition becomes zero. Another pass around the loop is done, ignoring that partition completely for the percentage calculations, and the remaining free space assigned to the other partitions. This process repeats until there is no more space, or until all have hit their maximum values.
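The whole process can be sketched in a few lines of Python. This is my own simplified model of what expand_scheme() does, not a translation of the installer's actual shell code (and it ignores alignment and rounding details; integer division can leave a few MiB unassigned):

```python
def allocate(parts, disk_size):
    """parts: list of (minimum, priority, maximum) tuples in MiB,
    where a maximum of -1 means unlimited. Returns the size assigned
    to each partition. A simplified model of partman-auto's
    expand_scheme(), not the real code."""
    sizes = [lo for lo, pri, hi in parts]       # start at the minimums
    free = disk_size - sum(sizes)
    if free < 0:
        raise ValueError("disk too small for the minimum sizes")
    # weight = priority - minimum, clamped at zero
    weights = [max(pri - lo, 0) for lo, pri, hi in parts]
    active = set(range(len(parts)))
    while free > 0:
        total = sum(weights[i] for i in active)
        if total == 0:
            break
        # hand out the free space in proportion to the weights
        for i in list(active):
            sizes[i] += free * weights[i] // total
        free = 0
        # clamp anything over its maximum; the excess goes back in the
        # pot and that partition is ignored on the next pass
        for i in list(active):
            lo, pri, hi = parts[i]
            if hi != -1 and sizes[i] > hi:
                free += sizes[i] - hi
                sizes[i] = hi
                active.discard(i)
    return sizes
```

Running it against the recipe above on a 100000MiB disk gives 16384MiB each for swap and /var (both hit their maximums on the first pass) and the rest to the root filesystem.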

As a side note, when all partitions hit their maximum values the remaining space gets assigned to the last partition when the partitions are created. Personally I’d prefer it was left free on the disk, but it’s not configurable.

So here’s my advice – use percentages yourself. Work out what percentage of the space you’d like assigned to each partition in each pass and then add this amount to the minimum value to work out the priority.
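In other words (a hypothetical helper of my own, not anything in partman; the result only reads as a literal percentage when the weights across all partitions add up to 100):

```python
def priority_for(minimum, percent):
    """Pick a priority so the partition gets roughly `percent` of the
    leftover space on the first pass, assuming the weights of all
    partitions sum to 100."""
    return minimum + percent

# The recipe above: 49% each for swap and /var, 2% for root
swap_priority = priority_for(8192, 49)    # 8241
root_priority = priority_for(16384, 2)    # 16386
```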

The above was all deduced by reading the source code here, specifically the expand_scheme() function. Thank goodness for Open Source!


New OpenPGP Key

I’ve had my old OpenPGP key for around 13 years. That’s a long time, and it’s a tough decision to just throw it away and replace it and the signatures I’ve gained during that time. But it’s no longer doing the job required of it — at 1024 bits it’s possible that with a feasible amount of computing power you could break the encryption it provides. So it’s time to create a shiny new 4096-bit RSA key to replace it.

I’ve followed all the suggested best practice documents that I could find and created my new key. I’ve published it to some public key servers, including pool.sks-keyservers.net, and I’ve written the now common transitional statement (admittedly, “written” is used loosely here — I mostly borrowed the text and layout from other people).

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1,SHA512

From: Tim Bishop <tim@bishnet.net>
Date: 2013-08-10

After 13 years my old 1024-bit DSA key no longer meets the standards
suggested by current best practice, so I've generated a new 4096-bit
RSA key to replace it.

My old key was:

  pub   1024D/0x7DCED6595AE7D984 2000-10-07
        Key fingerprint = 1453 086E 9376 1A50 ECF6  AE05 7DCE D659 5AE7 D984
  uid                  Tim Bishop <tim@bishnet.net>
  uid                  Tim Bishop <T.D.Bishop@kent.ac.uk>
  uid                  Tim Bishop <tdb@FreeBSD.org>
  uid                  Tim Bishop <tdb@i-scream.org>

My new key is:

  pub   4096R/0x6C226B37FDF38D55 2013-08-07 [expires: 2015-08-07]
        Key fingerprint = 4BD9 5F90 8A50 40E8 D26C  D681 6C22 6B37 FDF3 8D55
  uid                  Tim Bishop <tim@bishnet.net>
  uid                  Tim Bishop <T.D.Bishop@kent.ac.uk>
  uid                  Tim Bishop <tdb@FreeBSD.org>
  uid                  Tim Bishop <tdb@i-scream.org>

My old key will continue to be valid, but I would prefer all future
communication to be done using my new key. In addition, any other keys
being distributed on public key servers that use any of the above UIDs
should be considered invalid.

This document has been signed using both the old and the new keys so
that you can certify the transition. In addition, the new key has been
signed with the old one to confirm its validity. If you previously
signed my old key I'd appreciate it if you could sign the new one if
you're happy with the trust that signature gives.

If you'd like any further verification or have any questions about
this transition please contact me directly.

Tim.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (FreeBSD)

iEYEARECAAYFAlIFiS0ACgkQfc7WWVrn2YRjaQCfVwy0ptDK7iB2zuIAYDGo15ED
HlsAniBSw/OIisdEILxXINyipuSBpFLLiQIcBAEBCgAGBQJSBYktAAoJEGwiazf9
841V+BoQAJvnFbGbrPIql4E4BKlG3Bz2MBK2HTgicy29K96fBXZkA6mSAzuGB/Ts
6FW1/LTMQzKeHZj8mbOJQ04iaGe7gljQykrTfr2mwn8Bv7jqlGU3AMWheeRReVvD
C63hd5ogt0tYZ3Zg9VhKLV3RXhSGPSBA+6wfqihX+wVV/E4FA0OyDOKquZksj+Gy
BVajE/nnIluatQB/0xNDq4KdiwdfJUcKlVSxI4mCj6sMOjhYaK1JyPjglLNwcL3p
OZmx08xxakMvH0XawtpRNNDQkRhzdjev75GUCUrOVoXYaM4W8533qdU+LagninIU
X0j4XEqMg6l+KYHB257WDhFXJOM8Ng8YbDVsZV0n/JoRXmsFlHeEikzv1b62jviJ
IxFQGYqWMs+wt9LdAXkSBNEuR6np9h5Gg+Q7Iu9btneoX9g0oHlh1G2HsbqFl4pA
rQ3p9i16APCAEQOqQ9/KjSm7eUSJltqyogTaVpf0gdL7Y7FGsemzdmqILruIDaxq
p5pw3AvXastHkZ6KxTdOFi/Ekh3dC9a9cXm9YuNv4xDxnvwvi0hNWbtsQHtmUXRF
THNpCEJE+g8ZpohGNH3g5fVoIGIJLhCmqR2oHV0MzuuaZo9VUVCgoltrpPd1Z2q3
N2VIM0+SVgGyF8QvEurYoA7o9TSNXfKoYEpgye5SFUyI16pgC1dd
=Plxn
-----END PGP SIGNATURE-----

The statement above can also be downloaded here, or you can just copy and paste it into your PGP client of choice. I use GnuPG.

The following output shows the statement being verified by both my old and new keys. You’ll likely see something slightly different from me because you won’t trust my new key yet. If you trust my old key it should validate correctly and confirm that the statement is genuine and that I have a new key.

% gpg --keyserver pool.sks-keyservers.net --recv-key 0x6C226B37FDF38D55
gpg: requesting key FDF38D55 from hkp server pool.sks-keyservers.net
gpg: key FDF38D55: public key "Tim Bishop <tim@bishnet.net>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
% wget -qO - http://www.bishnet.net/tim/gpg-transition-2013.asc | gpg --verify
gpg: Signature made Sat Aug 10 01:28:29 2013 BST using DSA key ID 5AE7D984
gpg: Good signature from "Tim Bishop <tim@bishnet.net>"
gpg:                 aka "Tim Bishop <T.D.Bishop@kent.ac.uk>"
gpg:                 aka "Tim Bishop <tdb@FreeBSD.org>"
gpg:                 aka "Tim Bishop <tdb@i-scream.org>"
gpg: Signature made Sat Aug 10 01:28:29 2013 BST using RSA key ID FDF38D55
gpg: Good signature from "Tim Bishop <tim@bishnet.net>"
gpg:                 aka "Tim Bishop <tdb@FreeBSD.org>"
gpg:                 aka "Tim Bishop <tdb@i-scream.org>"
gpg:                 aka "Tim Bishop <T.D.Bishop@kent.ac.uk>"

If you’ve signed my old key, and you’re happy that this process genuinely confirms that this is my new key, I’d be pleased if you could sign it too. If you have any questions or want any further confirmation of its validity, please contact me directly.


Using FreeBSD’s Tinderbox as a package builder

Tinderbox setup

The machine I’m using is currently being used to test port updates. It has a bunch of jails for the -STABLE branches and a copy of the ports tree that I make changes to when testing port updates. I decided to use this machine for my package builder but this meant keeping things separated. So for package building I have the following set up in Tinderbox:

  • A jail for 8.2-RELEASE (RELENG_8_2).
  • A jail for 9.0-RELEASE (RELENG_9_0), when that gets branched.
  • A separate ports tree that I can keep pristine and update automatically without affecting my other ports work.

I won’t document how to do that. The Tinderbox README covers it in plenty of detail.

Index generation

If you’re just doing a pristine ports tree, with no OPTIONS or other environment tweaks, and you don’t care that the INDEX file may be slightly newer than your package set, you don’t need to do this. I have some OPTIONS set and I wanted the INDEX to exactly match the versions of the available packages, so I’m building my own INDEX file.

I checked the Tinderbox archives for the best way to do this. Others seem to be doing it using a hook on the ports tree update. The problem with this is that you need to do some extra work to make sure any OPTIONS or environment changes are included, and if you’re doing it for multiple OS versions you’ll need to cover that too (otherwise it’ll build the INDEX according to your Tinderbox host’s OS version).

The solution I came up with was to make a small custom port. It builds INDEX and installs it to /usr/local. I build this inside each build I’m using for my package building and the result is a package containing the INDEX file that fits my criteria (OPTIONS, environment, and matches my ports tree exactly).

Here’s the port’s Makefile. The symlink line is only needed because Puppet, which I use, looks for INDEX.bz2 rather than INDEX-8.bz2.

PORTNAME=       makeindex
PORTVERSION=    0
CATEGORIES=     ports-mgmt
MASTER_SITES=   # none
DISTFILES=      # none

MAINTAINER=     tdb@FreeBSD.org
COMMENT=        Generate INDEX file

USE_PERL5=      yes # make index requires perl

PLIST_FILES=    ${INDEXFILE}.bz2 INDEX.bz2

do-build:
        cd ${PORTSDIR} && make index INDEXDIR=${WRKDIR} -DINDEX_PRISTINE
        bzip2 -9 ${WRKDIR}/${INDEXFILE}

do-install:
        ${INSTALL_DATA} ${WRKDIR}/${INDEXFILE}.bz2 ${PREFIX}
        ln -s ${INDEXFILE}.bz2 ${PREFIX}/INDEX.bz2

.include <bsd.port.mk>

Package builds

The next step is to tie the INDEX generation together with updating the ports tree and building packages. It’s a pretty simple process: update the ports tree, generate and install the new INDEX file, and then build any new packages. Below is the script I use to do this, and here are a few useful notes:

  • $TB/tdb/autobuildports is a list of ports that I want to build, one per line, in the format “category/portname”.
  • $TB/tdb/makeindex is the port discussed in the previous section.
  • I use the -norebuild flag to tinderbuild to ensure I don’t rebuild the leaf ports unless necessary.
  • The last step after the for loop is mostly so I can check what it’s done, and isn’t necessary for things to work.
#!/bin/sh

TB=/u1/tinderbox
PT=FreeBSD_auto
portlist=$TB/tdb/autobuildports

PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:$PATH; export PATH

$TB/scripts/tc updatePortsTree -p $PT

for b in `ls $TB/builds | grep $PT`; do
        rsync -rlpvc --delete --force --exclude=CVS/ \
                $TB/tdb/makeindex/. \
                $TB/portstrees/$PT/ports/ports-mgmt/makeindex
        $TB/scripts/tc addPort \
                -b $b -d ports-mgmt/makeindex
        $TB/scripts/tc tinderbuild \
                -b $b -nullfs ports-mgmt/makeindex

        cd $TB/packages/$b && tar -zxvf All/makeindex-0.tbz INDEX\*

        for p in `cat $portlist`; do
                echo "===> $p on $b"
                $TB/scripts/tc addPort \
                        -b $b -d $p
                $TB/scripts/tc tinderbuild \
                        -b $b \
                        -nullfs -norebuild \
                        $p
        done

        cd $TB/packages/$b/All && ls > $TB/packages/$b/All.new
        echo "New packages:"
        comm -1 -3 $TB/packages/$b/All.last $TB/packages/$b/All.new
        mv $TB/packages/$b/All.new $TB/packages/$b/All.last
done

I run this script on a daily basis from cron.

Portmaster setup

The final step is installing these packages. I could do this by hand using pkg_add, but I prefer to use Portmaster. It’ll handle upgrades too. I use the following config in my portmaster.rc file which sets all the useful options for working with binary packages.

Portmaster will pull the INDEX file automatically as required. I picked /var/db/portmaster as the temporary area to put packages in, but you could use another place if /var is space limited.

# Do not create temporary backup packages before pkg_delete (-B)
NO_BACKUP=Bopt

# Only install packages (-PP or --packages-only)
PM_PACKAGES=only

# Use the INDEX file instead of /usr/ports (--index-only)
PM_INDEX=pm_index
PM_INDEX_ONLY=pm_index_only

# Delete packages after they are installed (--delete-packages)
PM_DELETE_PACKAGES=pm_delete_packages

# Local paths
PACKAGES=/var/db/portmaster
INDEXDIR=/var/db/portmaster
MASTER_SITE_INDEX=http://my.tinderbox.server/packages/8.2-RELEASE-FreeBSD/
PACKAGESITE=http://my.tinderbox.server/packages/8.2-RELEASE-FreeBSD/

So that’s it. I can now run portmaster category/port to install a new port or portmaster -a to upgrade everything and I’ll get the latest packages built using my custom options.

My final point is that this is all still a little fresh. I only just wrote it and I haven’t been using it long. So there’s undoubtedly something I’ve missed. You’ve been warned!


Maildirarc – a Maildir archiving tool

I keep my email in Maildir folders. It works well on the whole for everyday email, but it doesn’t work so well for large email archives (mainly because Unix systems don’t tend to cope well with folders containing a very large number of files). My system of archiving had been to simply copy messages older than a given number of days to a different Maildir folder that I use for my archives.

The problem was mainly backups. The backup tool I use (Tarsnap – which is brilliant by the way!) was taking ages to crawl over the archive folders. In addition, the folders were taking up a lot of space on disk and compressing many small files isn’t easy without making a tar file, or similar.

So I decided the best plan was to archive the messages to Mbox files. They’d compress well (in the end I just used a compressed ZFS filesystem), be backup friendly (because they’d rarely, if ever, change), and be quick to read from disk (it’s easier to read a large file than many little ones).

It can’t be hard, right? Isn’t an Mbox file approximately this?

cat Maildir/cur/* > mboxfile

Well, it turned out to be more effort than that. First you need to create the "From " separator line, which requires the sender and delivery date. These can be found by parsing the headers, but it’s surprising how many broken emails there were in my archives.

Next you need to decide what Mbox format to use. I thought there was only one! You can either escape "From " lines in the body, or you can add a Content-Length header, or do both.
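As a rough sketch of both problems in Python (my own simplification, not how Maildirarc itself does it):

```python
import time

def mbox_from_line(sender, delivery_time):
    """Build the mbox "From " separator line. In practice the sender
    and delivery time have to be dug out of the message headers (e.g.
    Return-Path and the last Received header), which is exactly where
    broken emails bite."""
    return "From %s %s" % (sender, time.asctime(time.gmtime(delivery_time)))

def escape_body(body):
    """mboxo-style quoting: prefix body lines that start with "From "
    with '>' so they can't be mistaken for a message separator."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n"))
```

The Content-Length variant records the body length in a header instead of (or as well as) quoting, and a stricter mboxrd-style writer would also quote lines that already begin with '>' characters followed by "From ".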

After far more effort than I originally intended I came up with Maildirarc. It’s an extended version of my original shell script that just copied messages from one Maildir folder to another. I wrote it in Perl and decided to have a play with Git and Github for version control. You can see the results here:

https://github.com/tdb/maildirarc

The end result turned out to be slightly more than just an archival tool. It can also be used to do Maildir to Mbox conversions, which might be useful to other people.

If you decide to give it a go please feel free to let me know how you get on by posting a comment below. If you have any ideas or changes you can fork it on Github and send me a pull request.


Increasing our storage provision

During the summer we started getting tight on storage availability. It seems that usage on our home directory areas constantly increases – people never delete stuff (me included!). We were running most of our stuff through our Veritas Cluster from a pair of Sun 3511 arrays and a single 3510 array. Between them (taking mirroring into account) we had around 3TB of space.

Now, it’s a well known fact with maintenance contracts that the cost goes up over time (parts get more scarce and more costly). So we did the sums on the cost we were paying for the old arrays and realised that over a sensible lifetime period it was cheaper to replace them. So we got a pair of Sun 2540 arrays with a 12TB capacity each.

Since our data is absolutely precious we mirror these arrays and use RAID 6. This gives us just under 10TB of usable space, which is a fair amount more than we started with.

The next stage was to bring this online. Because we use Veritas Volume Manager and the Veritas File System we were able to do this almost transparently. The new arrays were provisioned and added to the relevant diskgroups. The volumes were then mirrored onto them and the filesystems expanded. Finally the old arrays were disconnected. All of this was done without any downtime or interruption to our users or services.

I said almost transparently though. It seems it’s not possible to change the VCS coordinator disks without taking the diskgroups offline and back online (this might be improved in VCS 5). So I rebooted the whole cluster last weekend and it was all finished.

The problem with all this clever technology? Nobody knows we’ve done it. After weeks of work we grew the filesystems just before they completely filled and without any noticeable downtime. We’d probably get more gratitude if we’d let it fill up first 😉


Getting the indexes right for OpenLDAP when using NSS

I recently deployed a Linux system which used the libnss-ldap module to get its passwd and group information. This all worked fine except group lookups (in particular when logging in) which were extremely slow. We have about 600 groups in our directory, which isn’t massive, but is more than the average system.

Clearly this wasn’t right. Initially I tried nscd, which helped, but only after it had cached the data. Then I realised it was probably the indexes in OpenLDAP. Googling didn’t turn up much of use (hence this post), but I did find this page on the OpenLDAP site.

This fairly quickly pointed me at the problem; I was missing indexes on memberUid and uniqueMember. Adding these fixed the problem completely.

So here are the indexes I’ve ended up with:

index   objectClass     eq
index   cn,uid          eq
index   uidNumber       eq
index   gidNumber       eq
index   memberUid       eq
index   uniqueMember    eq
index   entryCSN        eq
index   entryUUID       eq

(the last two are for replication)

I’m actually quite surprised how much the indexes matter. They make a huge difference, even on a small setup. So if you’re setting up a directory take the time to read the Tuning section of the OpenLDAP Admin Guide first.
