Day 23 – Post a review of an application that you use

This late post is part of my 30 days of geek challenge.

I figured it would be a bit too narcissistic to review my own software and a bit boring to review some of my everyday applications, so instead I’m going to do a post about a rather geeky application – KVM virtualisation.

 

About Virtualisation

For those unfamiliar with virtualisation (hi Lisa <3), it’s a technology that allows one physical computer to run multiple virtual computers – with computers getting more and more powerful compared to relatively stable workloads, virtualisation allows us to make much better use of system resources.

I’ve been using virtualisation on Linux since RHEL 5 first shipped with Xen support – this allowed me to transform a single server into multiple speedy machines and I haven’t looked back since – being able to condense 84U of rackmount servers down into a big black tower in my bedroom is a pretty awesome ability. :-)

 

Background – Xen, KVM

I’ve been using Xen in production for a couple of years now. Whilst it’s been pretty good, there have also been a large number of quite serious bugs at times – combined with the lack of upstream kernel support, it’s given Xen a bit of a bad taste.

Recently I built a new KVM server at home running RHEL 6 to replace my data center, which was costing me too much in power and space. I chose to dump Xen and switch to KVM, which is included in the upstream Linux kernel and has a much smaller, simpler code base, since KVM relies on the hardware virtualisation capabilities of the CPU rather than software emulation or paravirtualisation.

In short, KVM is pretty speedy since it’s not emulating much, instead giving the CPU the hard work. You can then combine it with paravirtualisation for things like network and storage to boost performance even further.

 

My Platform

I ended up building my KVM server on RHEL 6 Beta 2 (before it was released) and am currently running around 25 virtual machines on it with stable experiences.

Neither the server nor the guests have needed restarts after running for a couple of months without interruption and on the whole, KVM seems a lot more stable and bug free than Xen on RHEL 5 ever was for me. **

(** I say Xen on RHEL 5, since I believe that Xen has advanced a lot since XenSource was snapshotted for RHEL 5, so it may be unfair to compare RHEL 5 Xen against KVM; a more accurate test would be current Xen releases against KVM.)

 

VM Suspend to Disk

VM suspend to disk is somewhat impressive. I had to take the host down to install a secondary NIC (curse you lack of PCI hotswap!) and KVM suspended all the virtual machines to disk and resumed them on reboot.

This saves you from needing to reboot all your virtual systems, although there are some limitations:

  • If your I/O system isn’t great, it may actually take longer to write the RAM of each VM to disk than it would take to simply reboot the VMs. Make sure you’re using the fastest disks possible for this.
  • If you have a lot of RAM (eg 16GB like me) and forget to make your filesystem on the host OS big enough to cope…..
  • You can’t apply kernel updates to all your VMs in one go by simply rebooting the host OS, you need to restart each VM that requires the update.
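To put the first two points in rough numbers, here is a back-of-the-envelope sketch in Python. Only the guest count comes from this post; the 256MB per guest, the 60MB/s write speed and the 4GB of free space are assumptions for illustration, not measurements, and both helper names are made up.

```python
def estimate_save_seconds(guest_ram_mb, write_mb_per_sec):
    """Rough time to write every guest's RAM image to disk, one after another.

    Ignores compression, caching and seek overhead, so it is only a ballpark.
    """
    if write_mb_per_sec <= 0:
        raise ValueError("write throughput must be positive")
    return sum(guest_ram_mb) / write_mb_per_sec


def fits_on_host(guest_ram_mb, free_space_mb):
    """True if the host filesystem has room for all the saved RAM images."""
    return sum(guest_ram_mb) <= free_space_mb


# Assumed figures: 25 guests at 256MB each, 60MB/s sustained writes,
# 4GB free on the host filesystem that holds the saved images.
guests = [256] * 25
print(round(estimate_save_seconds(guests, 60.0), 1))  # seconds for all guests
print(fits_on_host(guests, 4096))
```

If the estimate comes out longer than the time your guests need to boot, a plain reboot is probably the quicker option.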

In my tests it performed nicely, out of 25 running VMs, only one experienced an issue, which was a crashed NTP process, quickly identified by Nagios and restarted manually.

 

I/O Performance

I/O performance is always interesting with virtualised systems. Some products, typically desktop end user focused virtualisation solutions, will just store the virtual servers as files on the local filesystem.

This isn’t quite so ideal for a server where performance and low overhead are key – by storing a file system on top of another filesystem, you are adding much more overhead to the block layer which will translate into decreased performance, not so much around raw read/write, but around seek performance (in my tests anyway).

Secondly, if you are running a fully emulated guest, KVM has to emulate virtual IDE disks, which really impacts performance, since doing I/O consumes much more CPU. If your guest OS supports it, paravirtualised drivers will make a huge improvement to performance.

I’m running KVM guests inside Linux logical volumes, on top of an encrypted block device underneath (which does impact performance a lot), however I did manage to obtain some interesting statistics showing the performance of paravirtualisation vs IDE emulation.

View KVM IDE Emulation vs Paravirtualisation Results

They show noticeable improvement in the paravirtualised disk, especially around seek times… of interest, at the time of the tests, the other server workloads were idle, so the CPU was mostly free for I/O.

I suspect if I were to run the tests again on a CPU occupied server, paravirtualisation’s advantages would become even more apparent, since IDE emulation will be very susceptible to CPU load.

 

The above tests were run on a host server running RHEL 6 kernel 2.6.32-71.14.1.el6.x86_64 on top of an encrypted RAID 6 LVM volume, with 16GB RAM, Phenom II Quad Core and SATA disks.

In both tests, the guest was a KVM virtual machine running CentOS 5.5 with kernel 2.6.18-194.32.1.el5.x86_64 and 256MB RAM – so not much memory for disk caching – to a 30GB ext3 partition that was cleanly formatted between tests.

Bonnie++ 1.03e was used with CLI options of -n 512 and -s 1024.
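For anyone wanting to repeat the run, here is a minimal Python sketch that builds the same command line. The -s and -n values are the ones quoted above; the target directory is an assumption (the post doesn’t say where the test partition was mounted) and `bonnie_command` is just a made-up helper name.

```python
import shlex


def bonnie_command(target_dir, size_mb=1024, file_count=512):
    """Build the bonnie++ argument list without running it."""
    return [
        "bonnie++",
        "-d", target_dir,       # directory to test in (assumed mount point)
        "-s", str(size_mb),     # -s 1024, as used for the tests above
        "-n", str(file_count),  # -n 512, as used for the tests above
    ]


cmd = bonnie_command("/mnt/benchtest")
print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on a machine with bonnie++ installed. Keeping it as a list rather than a shell string avoids quoting problems if the path contains spaces.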

Note that I don’t have perfect guest to host I/O comparison test results, but similar tests run against a RAID 5 array on the same server suggest that there may be around a 10% performance impact with KVM paravirtualisation, which is pretty hard to notice.


Problems

I’ve had some issues with stability which I believe I traced to one of the earlier beta kernels with RHEL 6. Since upgrading to 2.6.32-71.14.1.el6.x86_64 the server has been solid, even with large virtual network transfers.

In the past when I/O was struggling (mostly before I had upgraded to paravirtualised disk) I experienced some strange networking issues, as per the post here and identified KVM limitations around the I/O resource allocation space.

Other than the above, I haven’t experienced many other issues with the host and future testing and configuration is ongoing – I should be blogging a lot of Xen to KVM migration notes in the near future and will be testing CentOS 6 more thoroughly once released, maybe some other distributions as well.

Now is the winter of my content

It’s suddenly gotten much colder in Wellington (about 13°C currently), looks like summer is over and winter is on its way.

As weird as it sounds, winter is my favorite time of the year in Wellington – instead of mediocre days, we have crisp, cold evenings, dark nights, snuggling on the couch and in bed, tasty winter foods and more. :-D

The last few days I’ve actually been feeling far more bouncy and happy than normal –  something about me is clearly wired wrong…

I wonder if part of it is that instead of having dull gray evenings when I leave work, it’s now completely dark – dull gray weather always messes with my moods, whilst I always feel much more active and energetic at night and I don’t feel bad about spending heaps of time inside on my computer.

Anyway, here’s to winter – looking forwards to several months of chilly, dark bliss :-D

 

PS: anyone know any Antarctic based computer programmer jobs going? 6 months of darkness sounds like a dream :-D

Day 22 – Release some software under an open source license that you haven’t released before.

This late post is part of my 30 days of geek challenge.

I’ve released a bit of software before under open source licenses – originally mostly scripts and various utilities, before moving on to starting my own open source company (Amberdms Ltd) which resulted in various large applications, such as the Amberdms Billing System and centralised authentication components like LDAPAuthManager.

The other day I released my o4send application, which is a utility for sending bluetooth messages to any phones supporting OPP and today I pushed a new release of LDAPAuthManager (version 1.2.0) out to the project tracker.

 

I haven’t talked about LDAPAuthManager much before – it’s a useful web-based application that I developed for several customers that makes LDAP user and group management easy for anyone to use without needing to understand the pain that is LDAP.

It’s been extended to provide optional RADIUS attribute support, for setting additional values on a per-user or per-group basis, making LDAPAuthManager part of a wider centralised authentication solution.

 

For other open source goodness, all my current open source components developed by Amberdms can be found on our Indefero project tracker at www.amberdms.com/projects/.

There’s a lot that I have yet to release – releasing means I need to validate the documentation, package, test and then upload so I can be sure that everyone gets the desired experience with the source, so it can be tricky to find the time sometimes :-/

Introducing o4send

A while ago, Amberdms was contracted to develop an application for sending messages to bluetooth enabled mobile phones for the NZ world expo.

Essentially the idea was that people would visit the expo, receive a file on their mobiles and receive some awesome content about New Zealand. The cool thing about this was that you didn’t need to be paired, any phone with bluetooth active would get this message.

Apparently this worked quite nicely, although I’m not convinced that OPP will be much use for the future, with the two major smartphone platforms (Android and iPhone/iOS) not providing support for it – we found that it worked best with Nokia Symbian phones.

To make this work, I wrote a Perl script and coupled it with a CSV or MySQL database backend to track the connections and file distributions – I bundled this into a little application called “o4send”, whose source I’ve now released publicly.

You can check out the source and download the application at the Amberdms project tracker at: https://www.amberdms.com/projects/p/oss-o4send/

Take care with this application, it can talk to a lot of mobile phones and I’m not sure of the legality of sending unsolicited messages to bluetooth devices – but I figured this source might be useful to somebody one day for a project – or at the very least, a “hey that’s cool” moment.

30 days of geek takes off?

Readers who have been around for a little while may recall my 30 days of geek blogging challenge, where I sadly ran out of time to complete the last few questions.

Recently @CyrisXD has taken up the idea and has been promoting it to get a whole bunch of other geeks blogging and talking about it, which is pretty awesome. He has a list of people doing the challenge, starting up on the 1st of April on his website at eguru.co.nz and there seems to be a lot of buzz around it.

It’s pretty awesome to see it take off and it would be a shame if I don’t complete it myself, so I’m going to start making a post a day to complete the 30 days of geek challenge myself. :-)

As a side note, I’m also making some effort to go back and tag all the articles on this blog better – I have a few categories, but there’s lots more content that tends to get hidden and hopefully tagging it will make it more accessible to casual readers, so I’ll be doing this over the next week or so.

DHCP, I/O and other virtualisation fun with KVM

I recently shifted from having two huge server racks down to having a single speedy home server running KVM virtual machines, with the intent of packaging all my servers – experimental, development, staging, etc, into a single reliable system which will reduce power and maintenance costs.

As part of this change, I went from having dedicated DHCP & DNS servers to having everything located on the KVM host.

The design I’ve used has the host OS running with minimal services – the host just runs KVM, OpenVPN, DHCP and a DNS caching nameserver – all other services run as guest VMs, with a virtual network for the guests and host to communicate over.

Guests run as DHCP clients and get their addressing information from the host OS – this makes it easy to assign or adjust addressing if needed.

However this does mean you can’t get away with hammering the host too badly – for example, running an I/O and network intensive backup can cause some interesting problems when you also need the host for services, such as DHCP.

Take a look at the following log messages from a mostly idle VM – these were taken whilst another VM on the server was running a bonnie++ process to test performance:

Mar  6 10:18:06 virtguest dhclient: 5 bad udp checksums in 5 packets
Mar  6 10:18:27 virtguest dhclient: DHCPREQUEST on eth0 to 10.8.12.1 port 67
Mar  6 10:18:45 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:00 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:07 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:15 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:15 virtguest dhclient: 5 bad udp checksums in 5 packets

That’s some messed up stuff – what you’re seeing is that the guest VM is trying to renew the DHCP address with the host server – but the host is so sluggish with having to run the I/O intensive virtual machine that it is actually corrupting or dropping the UDP packets, preventing the guest VM from renewing its address.
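If you want to spot this pattern without eyeballing syslog, a small Python sketch like the following can summarise it. The sample text is the log excerpt from this post; the two regular expressions are my own guess at a reasonable match for these particular dhclient lines and may need adjusting for other dhclient versions.

```python
import re
from collections import Counter

LOG = """\
Mar  6 10:18:06 virtguest dhclient: 5 bad udp checksums in 5 packets
Mar  6 10:18:27 virtguest dhclient: DHCPREQUEST on eth0 to 10.8.12.1 port 67
Mar  6 10:18:45 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:00 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:07 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:15 virtguest dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar  6 10:19:15 virtguest dhclient: 5 bad udp checksums in 5 packets
"""

REQUEST = re.compile(r"dhclient: DHCPREQUEST on (\S+) to (\S+) port \d+")
BAD_SUM = re.compile(r"dhclient: \d+ bad udp checksums in \d+ packets")


def summarise(log_text):
    """Count DHCPREQUESTs per destination and bad-checksum warning lines."""
    requests = Counter()
    checksum_warnings = 0
    for line in log_text.splitlines():
        match = REQUEST.search(line)
        if match:
            requests[match.group(2)] += 1  # group 2 is the destination address
        elif BAD_SUM.search(line):
            checksum_warnings += 1
    return requests, checksum_warnings


requests, checksum_warnings = summarise(LOG)
print(dict(requests), checksum_warnings)
```

A run of requests to 255.255.255.255 following a request to the server’s own address is the interesting signal: the guest has given up on asking the server directly and is broadcasting instead.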

This of course raises the most important question: What happens if the guest can’t renew its IP address?


In this case, the Linux/CentOS 5 guest VM actually completely lost its IP address after a long period of DHCPREQUEST attempts, fell off the network entirely and caused my phone to go nuts with Nagios alerts.
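For context, the DHCP specification (RFC 2131) has a client ask its own server directly once half the lease has passed (T1), fall back to broadcasting at 87.5% of the lease (T2), and stop using the address when the lease expires. That lines up with the log above, where the request to 10.8.12.1 is followed by broadcasts. Here is a minimal Python sketch of that timeline; the one hour lease is an assumption, since I haven’t listed the lease time configured on my host, and a DHCP server can override the default fractions.

```python
def lease_timeline(lease_seconds, t1_fraction=0.5, t2_fraction=0.875):
    """Return (renew_at, rebind_at, expire_at) in seconds after the lease starts.

    The default fractions are the RFC 2131 defaults for T1 and T2.
    """
    renew_at = lease_seconds * t1_fraction    # unicast DHCPREQUEST to the server
    rebind_at = lease_seconds * t2_fraction   # broadcast DHCPREQUEST to anyone
    return renew_at, rebind_at, float(lease_seconds)


# Assumed lease length of one hour, purely for illustration.
print(lease_timeline(3600))
```

With a short lease, the window between the first failed renewal and losing the address entirely is correspondingly short, so a busy host doesn’t get long to recover.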

Now of course in any sane production environment, nobody would be running a bonnie++ process on a VM on an active server – however there are some pretty key points still made here:

  • The isolation is a lie: Guests are only *somewhat* isolated from one another – one guest can still mess with another and effectively denial-of-service attack the other VMs by utilising all the available resources.
  • Guests can be jerks: Organisations running KVM (or some other systems) with untrusted guest VMs should carefully consider how they are going to monitor and protect the service from users running crazily resource intensive processes. (after all, there will be someone who wants to bonnie++ test their new VM simply for the lols).
  • cgroups to the rescue? Linux cgroups does have an I/O controller (blkio-cgroup) although whilst this controls read/write flow, it won’t restrict seeks which can also badly impact spinning rust based servers.
  • WTF DHCP? The approach of the guests simply dropping their DHCP address after losing contact with the DHCP server is a pretty bad design limitation – if the DHCP server is unreachable, it should keep the original address (of course if the “physical” ethernet connection dropped, that would be a different situation, and it should drop its address to match).
  • Also: I wonder what OSes/distributions have the above behavior?
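As a starting point for the cgroups idea, here is a hedged Python sketch of what a bandwidth cap looks like with the cgroup v1 blkio throttle files, which take a line of the form "major:minor bytes_per_second". Everything specific here is an assumption: the mount point, the "guest-vm1" group name, the 8:0 device numbers and the 20MB/s limit are made up for illustration, and I haven’t checked whether the RHEL 6 kernel on my host ships the throttle policy. The sketch only builds the path and the rule; it doesn’t write anything.

```python
def throttle_rule(major, minor, bytes_per_sec):
    """Format a rule for blkio.throttle.read_bps_device / write_bps_device."""
    if bytes_per_sec < 0:
        raise ValueError("limit must not be negative")
    return f"{major}:{minor} {bytes_per_sec}"


def throttle_file(cgroup_root, group, direction):
    """Path of the throttle file for one group; direction is 'read' or 'write'."""
    if direction not in ("read", "write"):
        raise ValueError("direction must be 'read' or 'write'")
    return f"{cgroup_root}/{group}/blkio.throttle.{direction}_bps_device"


# Assumed values: block device 8:0, a 20MB/s cap, a cgroup named "guest-vm1".
rule = throttle_rule(8, 0, 20 * 1024 * 1024)
path = throttle_file("/sys/fs/cgroup/blkio", "guest-vm1", "write")
print(path, "<-", rule)
```

Applying it would mean writing the rule into that file as root and placing the guest’s process in the group. As noted above, this caps throughput only, so a seek-heavy guest can still hurt spinning disks.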

I’m currently running a number of bonnie++ tests on my KVM server and will have a blog post in the near future covering these findings in more detail. I’m also planning to look into cgroups and other resource control and limiting functions and will report back on how these fare when you have guest VMs running heavy processes.

Overall it made my weekend of geekery that bit more exciting. :-D

Kickstarting Christchurch

Now that the immediate quake issues are being addressed, attention is being turned to how New Zealand is going to fund the repair to Christchurch.

There are discussions about introducing new taxes, re-introducing student loan taxes and of course there’s likely to be an increase to our existing earthquake levy and building insurance.

I have a few immediate thoughts on this:

  • Student Loan Interest: Just about the worst possible idea I’ve ever heard, introducing tax on student loans will mean that more students will want to leave NZ for better wages overseas – we want to RETAIN talented youth in NZ.
  • Earthquake Levy Increases: What’s happened to the fees that we’ve been paying out for the past 50 or so years? Surely there should be a nice pile of money available for a disaster such as this? Or has the government been spending it all and not retaining it like it should?
  • New Taxes: I’m wary of new taxes being introduced for specific events, we should have set our taxes to factor in a disaster like this once every 100 years or so – if we need to adjust, so be it, but it should be transparent.

There’s also the tricky question of how do we encourage and help Christchurch rebuild? It’s going to be very interesting to see how much of the population leaves the city for other areas, there have already been stories of families planning to move out permanently.

In some ways there could be a good opportunity to build Christchurch up again, by trying to encourage younger people and businesses to the city to replace those who are leaving – I’m expecting land and house values to drop, which will make buying there more attractive for younger people with less money – assuming there is good work available.

And how do we go about encouraging business and employers to get started? My suggestion is that the NZ government cuts taxes almost entirely for any business of less than 25 people in Christchurch region for an entire year and make low-interest loans available to businesses.

This is a big step, it would cost a lot of money, but consider this:

  • Large wealthy companies that can afford to keep on going, will do so, the government won’t be propping up the large international corporations or anything.
  • Young and small hard hit existing companies will lose a lot of pressure and be able to focus on rebuilding business and getting started again without worrying about how bills are going to get paid.
  • The loans will allow them to get a quick cash injection to rebuild and get running – every week that a company can’t bill is a huge impact and many small businesses will go broke without a bit of cash investment to help them out, which would be a terrible loss.
  • It will encourage people starting new business to consider Christchurch – for example, anyone doing an IT startup will consider “hey, we can locate in Christchurch and get some big savings, why not?”. This is less of an immediate benefit, but long term could pay off hugely for Christchurch region, by encouraging growth in a city that people might otherwise decide to avoid.

I don’t claim to be an economist, this is my opinion based on what I’d like to see if I was running a small business in Christchurch and what would encourage me to keep at it after a disaster like this.

It’s not going to be cheap, but nothing about this disaster is going to be cheap – best if we can do something that ends up kick starting and enabling Christchurch to rebuild and grow itself, rather than losing lots of businesses and suffering more economic impact in future.

CentOS, RHEL and future possibilities?

Those who know me will know that I’m a long term CentOS user – this actually started from my love of RHEL,  back in my early Linux using days when I was running Red Hat 8.0.

Whilst it made financial sense for Red Hat to switch to making their product only available in binary form for their customers, at the same time I can’t help but feel this has damaged the appeal of Red Hat for geeks like myself – I’m no longer able to set up friends, family or customers without the funds for RHEL with a quality, enterprise-grade free (as in beer + freedom) distribution.

I do wonder if this contributes to reduced market awareness in the small business space and also whether it reduces the likelihood of geeks like myself promoting the software – after all, if I can’t run RHEL myself, I’m likely to look at other distributions and options and end up promoting those.

With the lack of a free Red Hat enterprise-grade distribution, there are only a couple options for wanting a Red Hat-style experience:

  1. Fedora – the community developed distribution that forms the future base of RHEL, a fantastic distribution in its own right, but with only 12 months support per release, not suitable for server deployments.
  2. CentOS – the community free re-spin of RHEL with their trademarks removed to make it legally redistributable.

I’ve been using CentOS heavily on my servers and Fedora on my workstations, however there are a number of security delays that are concerning me about CentOS which have been recently highlighted in an LWN article.

Essentially, the core problem is that the latest version of CentOS is still only 5.5, whilst Red Hat have had 5.6 out for some time, with numerous security updates in it that have yet to be released for CentOS…..

Having systems vulnerable to known exploits with no upstream patches is always a pretty serious concern to any system administrator…. this is leading me to re-think my usage of CentOS and to consider whether I should consider other platforms.

I’ve never been a huge fan of Debian in the past, but I’m considering giving it a more detailed look and try – Debian has the advantages of a strong community (like Fedora has) but without the limitation of a short support life – although then again, Debian’s releases and support spans are a little less rigid than Red Hat, which is somewhat annoying.

There are a few server platforms that come to mind – Ubuntu LTS, Linux Mint, Debian, openSUSE/SUSE or of course, Fedora.

The other option is that I could spin my own distribution – based on the number of custom RPMs I already build, rebuilding Red Hat’s update packages for my own needs wouldn’t be too hard, but I really don’t want to get caught up in distribution maintenance for the next 5 years plus it’s not suitable for customer deployments – so even if I decide that a custom built system is best for me, it still doesn’t solve the “what do I install for others?” question.

Maybe I need Fedora LTS – long term support for specific versions of Fedora – 3 or 5 years would be wonderful and meet the needs of server administrators.

This was tried once before, with the Fedora Legacy project, but it didn’t last long – possibly the goal of supporting *all* the releases was too much to reasonably handle, so an approach of selecting only even or odd numbered releases might make it more feasible – I know that I’d certainly be willing to contribute.

Anyway, this is a late night concerned system administrator brain dump about the problem, interested in thoughts and comments from others here about distributions they use/would consider in the server environment.

Hastings Roadtrip!

At the request of @splatdevil, I’ve headed up to Hastings for the weekend to be with her during a difficult personal time.

I always like an excuse for a roadtrip, the Wellington to Hastings drive is pretty nice and isn’t too long at only 4 hours.

Interesting statistics from the trip:

  • ~$30 fuel consumption in Toyota Starlet 1.3l petrol car
  • 800ml coke consumed.
  • 1 chip in wind screen. Going to be a hassle to go and have that fixed now :-(
  • 2 wrong turns.
  • 1 fuel stop.
  • 0 toilet/snack stops.
  • >9000 angry swear words at traffic queues whilst trying to depart Wellington on a Friday afternoon.
  • 6 uses of the overtaking lane
  • 4 sets of roadworks.
  • 1 police car on traffic duties.
  • 3 ambulances, only 1 active.

Auckland Visits

I’m heading up to Auckland on business a couple times in the next few months.

  • 22nd & 23rd February
  • 7th, 8th & 9th of March

I’m expecting to be too busy to do anything on the evening of the 22nd, however I’m keen to try and meet up with some Aucklanders on the evening of the 7th or 8th – most likely the 8th, somewhere in the CBD.
Planning to meet up at the Northern Steamship (Macs Pub) at 19:00, as per a suggestion by @pikelet.

I’ve created a twtvite here for those of you using twitter to RSVP to – nice to know if people actually care enough to come along ;-)