Tag Archives: geek

Anything IT related (which is most things I say) :-)

Holy Relic of the Server Farm

At work we’ve been using New Relic, a popular software-as-a-service monitoring platform, to monitor a number of our servers and applications.

Whilst I’m always hesitant about relying on external providers and prefer an open source solution where possible, the advantages provided by New Relic have been hard to ignore – good enough to drag me away from the trusty old realm of Munin.

Like many conventional monitoring tools (eg Munin), New Relic provides good coverage and monitoring of servers, including useful reports on I/O, networking and processes.

Bro, I’m relaxed as bro. (server monitoring with New Relic)

However, where New Relic really provides value is with its monitoring of applications, thanks to a number of agents for various platforms including PHP, Ruby, Python, Java and .NET.

These agents hook into your applications and profile their performance in detail, showing information such as a breakdown of latency by layer (DB, language, external, etc), slow DB queries and other detailed traces.

For example, I found that my blog was taking around 1,000ms of processing time in PHP when serving up page content. The VM itself had little load, but WordPress is just not a particularly well oiled application.

Before and after installing W3 Total Cache on my blog. Next up is to add Varnish and drop server times even further.

Toss out your crufty DBA, we have a new best friend! (just kidding DBAs, I still love ya)

New Relic will even slip an addition into the client-side content which measures the browser-side performance and experience for users visiting your website or application, allowing you to determine the cause of slow page loads.

Generally my issue is too much large content + slow links

There’s plenty more on offer – I haven’t even looked at all the options and features yet myself. The best approach is to sign up for a free account and trial it for a while to see if it suits.

New Relic recently added a Mobile Application agent for iOS and Android developers, so it’s also attractive if you’re writing mobile applications and want to check how they’re performing on real user devices in the wild.

Installation of the server agent is simply a case of dropping a daemon onto the host (with numerous distribution packages available). The application agents vary depending on language, but are either a case of loading the agent with the application, or bundling a module into your application.
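
As a concrete example, installing the server agent on a RHEL/CentOS host looked roughly like the following at the time of writing – treat this as a sketch based on New Relic’s own repository instructions and check their current docs, as the exact package and command names may have changed:

# Install the server monitoring daemon from New Relic's yum repository
yum install newrelic-sysmond
# Tell the daemon which account to report to (the key here is a placeholder)
nrsysmond-config --set license_key=YOUR_LICENSE_KEY
# Start the daemon
/etc/init.d/newrelic-sysmond start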

It scales well performance-wise – we’ve installed the agent on some of AU’s largest websites with very little performance impact in most cases, and the New Relic interface remains fast and responsive.

The only warning I’d make is that the agent uses HTTP by default, rather than HTTPS. Whilst the security impact is somewhat limited as the data sent isn’t too confidential, I would really prefer the application use HTTPS-only. (There does appear to be an “enterprise security” mode which forces HTTPS agents only and adds other security options, so do some research if it’s a concern.)
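
From memory the server agent can also be flipped to HTTPS with a one-line config change – the file and setting name below are my recollection rather than gospel, so verify them against the official docs:

# /etc/newrelic/nrsysmond.cfg (setting name assumed, check the docs)
ssl=true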

Pricing is on the expensive side, particularly for the professional account package with the most profiling. Having said that, for a web company where performance is vital, New Relic can quickly pay for itself with reduced developer time spent on issues and fast alerting to performance-related problems. Both operations and developers have found it valuable at work, and I’ve personally found this a much more useful tool than our Splunk account.

If you’re only interested in server monitoring you will probably find better value in a traditional Munin setup, unless you value the increased simplicity of configuration and maintenance.

Note that New Relic is also not a replacement for alert-monitoring such as Nagios – whilst New Relic can generate alerts for performance issues and other thresholds, my advice is to rely on Nagios for service and resource overload/failure and rely on New Relic monitoring for alerting to abnormal or negative performance trends.

I also found that Awstats remains very useful – whilst New Relic has some nice browser and geography stats, Awstats is better for the “how much traffic and data has my website/application done this month” type questions.

It’s not for everyone’s requirements and budget, but I do highly recommend having an initial trial of it, whether you’re running a couple of servers or a massive enterprise.

KVM instances dying at boot

I recently encountered a crashing KVM instance, where my VM would die at boot once the bootloader tried to unpack initrd.

A check of the log in /var/log/libvirt/qemu/vmname.log showed the following unhelpful line:

Guest moved used index from 6 to 22938
2013-04-21 06:10:36.029+0000: shutting down

The actual cause of this weird error and crash is the host OS lacking disk space on the host server’s filesystems. In my particular case, my filesystem was at 96% full – ext filesystems reserve a percentage of blocks for root, so whilst the root user could still write to disk, the non-root processes including Libvirt/KVM were refused writes.

I’m not totally sure why the error happens – all my VM disks are based on LVM volumes rather than the host root filesystem, so I suspect the host OS disk is being used for a temporary file such as the unpacked initrd, and this need for a small amount of disk leads to the failure.
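
If you want to check whether you’re in the same situation, the usual suspects are below – the device name is just an example, adjust for your host:

# Check filesystem usage on the host
df -h /
# ext filesystems reserve ~5% of blocks for root by default, which is why
# root could still write whilst non-root processes like Libvirt/KVM couldn't
tune2fs -l /dev/sda1 | grep -i "reserved block"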

If you’re having this problem, check your disk space and add some Nagios alerting to avoid a repeat issue!
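
For the Nagios side, the stock check_disk plugin does the job – the NRPE snippet below is a sketch, with thresholds and a plugin path you should tune for your own hosts:

# NRPE command using the standard check_disk plugin:
# warn at 10% free space, go critical at 5% free on the root filesystem
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /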

Age of Empires 2 HD on VirtualBox

It’s been a few years since I’d last played it, but Age of Empires is still one of my all time favourite games. I started playing it back as a young noobling on Windows 98 with a Celeron 433MHz machine and loved the perfect balance the game achieved between simplicity and flexibility.

AOE 2 offered a large number of different civilisations with different research options, yet once you learnt the basics, it was quick and easy to pick up the rest and run with it – generally it made for a very fun game, both against the AI and against friends in multiplayer over a 10Mbit LAN. ;-)

Whilst AOE 2 still ran on modern Windows, the game was showing its age with issues like hard coded resolutions (800×600 or 1024×768 anyone?), the assumption of an optical media drive for the music to be loaded from, and dated multiplayer functions.

Recently Microsoft re-released AOE 2 as “Age of Empires 2 HD” on Steam, taking the opportunity to fix up the above issues, port the game to newer DirectX versions and add Steam integration for better multiplayer support.

AOE 2 HD on Steam

Now as a Linux geek I used to have a dedicated Windows PC hanging around for gaming – however with my move to AU, I now only have my Linux laptop and I wasn’t very keen to go back to a dual-booting world, having last done dual boot over 5 years ago.

Instead I have a Windows 7 VM inside VirtualBox, which is a very good virtualisation product for desktop users, offering easy management of VMs, good guest OS integration (eg desktop resizing) and also basic 3D acceleration.

I had tested the old AOE 2 original game in my VM which ran OK, but after AOE 2 HD was installed, I found the newer game would refuse to start with:

Error on start, subcode=1

The forums discuss a range of issues for this, but the general consensus is that with the move to a newer DirectX, the game would fail if some of the newer features were absent (eg shaders). I was worried about my Intel GPU being too poor as some posts suggested, but thankfully even the Intel GPUs from a few years ago are enough to play this game.

The issue was that the level of 3D acceleration being provided by VirtualBox was too low – this was easily verified by running the “dxdiag” utility.

By default, VirtualBox delivers DirectDraw acceleration but not Direct3D.

By default, if you enable 3D acceleration and install the Guest Additions, the acceleration is limited to just the stable basic features. For more advanced 3D features, a module is available, but it needs to be specifically selected during the installation as it’s not considered “stable” yet.

The stability of this feature is also rapidly changing – after turning everything on and running the game, the VM was crashing entirely on the Linux OS side.

ShCrOpenGL[23043]: segfault at 7fd40d7c3f90 ip 00007fd3f47c2f9d sp
00007fd3f4a8ba90 error 4 in VBoxSharedCrOpenGL.so[7fd3f4732000+cb000]

Generally I’m not a fan of segfaults on my system. In order to avoid any nasty surprises, you must upgrade to the absolute latest upstream version (at time of writing, this was 4.2.12-84980). Make sure you grab the latest version from Oracle/VirtualBox themselves, as distribution repositories may not be as up-to-date.

After installing/upgrading VirtualBox, adjust your virtual machine to be using 3D Acceleration and give it the full 256MB video memory.
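
If you prefer the CLI to the GUI, the same change can be made with VBoxManage whilst the VM is powered off – the VM name here is just an example:

# Enable 3D acceleration and assign the full 256MB of video memory
VBoxManage modifyvm "Windows 7" --accelerate3d on --vram 256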

You kind of want to enable 3D acceleration for 3D to work ;-)

Once booted, install the latest version of Guest Additions (“Devices->Install Guest Additions” from VirtualBox VM menu). Older VirtualBox drivers still work for most tasks, but you must do this installation/upgrade in order to get the absolute latest 3D fixes for the game to run.

When installing, make sure you select “Direct3D Support (Experimental)” from the component selector.

3D support is an option, not a standard feature.

Note that VirtualBox will try and talk you into installing the stable basic 3D drivers. Make sure you select No and get the fully featured (but less stable) drivers.

No is Yes in our case.

Once installed, you’ll need to reboot and your Windows system should now feature the much better 3D support.

To verify, run “dxdiag” again to check the report:

Ready to party!

Once done, you should be good to play! :-D

I recommend going full screen in VirtualBox before launching the game, as the directional edge-of-screen scrolling works a bit weirdly otherwise. There’s also a quirk with the launch videos, where they play but display a blank screen – just click to skip through them and get to the game launch screen.

And we’re live! Back to the mines peons!

Performance and stability seem to be OK – I haven’t been able to take advantage of the higher resolution support much thanks to my laptop having a wonderfully crap display of 1280×768, so I can’t be sure whether it works well at higher resolutions or if the 3D acceleration can’t handle them.

My test was done on a x86_64 Debian Wheezy system running kernel 3.6.7 using the Oracle provided VirtualBox package (4.2.12-84980~Debian~wheezy) and a fully patched 64-bit Windows 7 Home Premium guest with latest Guest Addition drivers.

I have not tested on MacOS, but I would assume the process and support to be the same, since VirtualBox uses OpenGL on the host OS side for the 3D acceleration pass-through.

Launch Paywall!

So far my time working at Fairfax Media AU has been pretty much non-stop from day one. Whilst media companies are often thought of as dull and slow moving, the reality is that companies like Fairfax are huge, include a massive range of digital properties and are not afraid to invest in new technologies where there are clear business advantages.

It’s great stuff for me, since I’m working with a really skilled group of people and being given some awesome and challenging projects to keep me occupied.

Since January I’ve been working with the development team to build up the infrastructure needed to run the servers for our new paywall initiative. It’s been a really fun experience working with a skilled and forward-thinking group of developers, and being able to build infrastructure that is visible to millions of people at such a key stage in the media business is a very rare opportunity.

Our new paywall launched a few days ago on the Sydney Morning Herald and The Age websites for select international countries (including the US and UK), before we roll it out world wide later in the year.

Paywalls on SMH and The Age, as seen for select international countries.

Mobile hasn’t been neglected, a well polished interface has been provided there too.

Obviously paywalls are a pretty controversial topic – there’s already heaps of debate online, ranging from acceptance to outright rage at the idea of having to pay for daily news content, and plenty of reflection over the long term sustainability of Fairfax itself.

I won’t go into too much detail and I have somewhat mixed personal views on the idea, but generally I think Fairfax has created a pretty good trade off between making content casually available and sharable. Rather than a complete block, the paywall is porous, allowing a select number of article reads a month, along with liberal social media sharing and reading of links shared by others.

Changing a business model is always hard and there will be those who are happy and those who are unhappy with the change – time will tell how well it works. Thankfully my job is about designing and managing the best infrastructure to run the vision that comes down from management, rather than trying to see into the future of the consumption and payment of media content, which has got to be the hardest job around right now…

Whatever your format of choice, we have your fix!

It’s been an interesting project working with a mix of technologies, including both conventional and cloud based virtualisation methods, and using tools such as Puppet for fast deployment and expansion of the environment, along with documenting and ensuring it’s an easy environment for future engineers to support.

I’ve only been here 6 months and have already picked up a huge range of new skills, I’m sure that the next 6 months will be even more exciting. :-)

Android 4.2.2 Issues

Having just flown from Sydney AU to Christchurch NZ, my Galaxy Nexus suddenly decided to finally offer me the Android 4.2.2 upgrade.

Since I got the phone in 2012, it’s been running Android 4.1 – I had expected to receive Android 4.2 in November 2012 when it was released by Google, since the Galaxy Nexus is one of Google’s special developer phones which are loved and blessed with official updates and source code.

However the phone steadily refused to update, and whilst I was tempted to build it from source again, 4.2 lacks any particular features I wanted (see release changes), so there was little incentive to do so. However after 4.2.2 was magically revealed to me following the change of countries, I was nagged to death into updating and ended up doing so… sadly I wish I hadn’t…

Google have messed with the camera application yet again, completely changing the UI – the menu now appears wherever you touch the screen, which does make it easier to select options quickly in some respects, but they’ve removed the feature I use the most – the ability to jump to the gallery and view the picture you just took – so it’s not really an improvement.

Secondly, the Android clock and alarm clock interface has been changed yet again – in some respects it’s an improvement, as they’ve added some new features like a stopwatch, but at the same time it really does feel like they change the UI every release (and not always in good ways) and it would be nice to get some consistency, especially between minor OS revisions.

However these issues pale in comparison to the crimes that Google has committed against the lock screen… Lock screens are fundamentally simple – after all, they only have one job: to lock my phone (somewhat) securely and prevent any random from using my device. As such, they tend to be pretty consistent and don’t change much between releases.

Sadly Google has decided that the best requirement for their engineering time is to add more features to the lock screen, turning it into some horrible borg screen with widgets, fancy clocks, camera and all sorts of other crap.

Go home lockscreen, you’re drunk. So, so, drunk.

Crime 1 – Widgets

The lock screen now features widgets, which allow one to stick programs on the outside of the lock screen for easy access (defeating much of the point of having a lock screen to begin with) whilst offering very limited real benefit.

Generally widgets serve very limited value – I use about 3 widgets in total: options for turning hardware features on/off, NZ weather and AU weather. Anything else is generally better done within an actual application.

Widgets really do seem to be the feature that every cool desktop “must have”, and at the same time they have to be one of the least useful features that any system can have.

Crime 2 – Horribly deforming the pattern unlock screen

With the addition of the widgets, the UI has been shuffled around and resized. Previously I could unlock by starting my swipe pattern from the edge of the device’s physical screen and drawing my pattern – very easy to do and quick to pick up with muscle memory.

However doing this same unlock action following the Android 4.2 upgrade leads to me accidentally selecting the edge of the unlock “widget” – instead of unlocking, I end up selecting a popup widget box (as per my screenshot) and then have to mess around and watch what I’m doing.

This has to be the single most annoying feature I’ve seen in a long time, purely because it impacts me every single time I pick up the phone, and as a creature of habit, it’s highly frustrating.

And to top this off, Android now vibrates and makes a tone for each unlock point selected. I have yet to figure out what turns this highly irritating option off – I suspect it’s tied into the keyboard vibration/tone settings, which I do want…

Crime 3 – Bold Clocks

We’ve had digital clocks for over 57 years, during which time I don’t believe anyone has ever woken up and said “wow, I sure wish the hours were bolder than the minutes”.

Yet somehow this was deemed a good idea, and my nicely balanced 4-digit 24-hour clock is now unbalanced by the jarring harsh realisation that the clock is going to keep looking like a <b> tag experience gone wrong.

I’m not a graphic designer, but this change is really messing with my OCD and driving me nuts… I’d be interested to see what graphic designers and UX designers think of it.

So in general, I’m annoyed. Fucked off actually. It’s annoying enough that if I was working at Google, I’d be banging on the project manager’s door asking for an explanation of this release.

Generally I like Android – it’s more open than the competing iOS and Windows Mobile platforms (although it has its faults) and the fact it’s Linux based is pretty awesome… but with this release I really have to ask: what the fuck is Google doing currently?

Google has some of the smartest minds on the planet working for them, and the best they can come up with for a new OS release is fucking lock screen widgets? How about something useful like:

  • Getting Google Wallet to work in more locations around the world. What’s the point of this fancy NFC-enabled hardware if I can’t do anything with it?
  • Improve phone security with better storage encryption and better unlock methods (NFC rings anyone?).
  • Improve backups and phone replacement/migration processes – backups should be easy to do and include all data and applications, something like a Time Machine style system.
  • Free messaging between Android devices using an iMessage style transparent system?
  • Fixing the MTP clusterfuck – how about getting some good OS drivers released?
  • Fix the bloody Android release process! I’m using an official Google branded phone and it takes 5 months to get the new OS release??

The changes made in the 4.2 series are shockingly bad – I’m at the stage where I’m tempted to hack the code and revert the lock screen back to the 4.1 version just to get my workflow back. Really it comes down to whether or not the pain this system causes me ends up outweighing the cost/hassle of patching and maintaining a branch of the source.

SSH via SOCKS proxies

Non-transparent proxies are generally a complete nuisance at the best of times and huge consumers of time and IT resources at their worst. Sadly proxies are a popular feature in corporate IT networks, so it’s not always possible to avoid them entirely.

Ideally the administrators will have the HTTP/S proxy running transparently, so that users never need to know or configure proxy settings for browsers or other HTTP using applications.

Unfortunately some networks also make use of SOCKS proxies, to block all outgoing TCP and UDP connections unless otherwise authorised. Whilst the feature set of SOCKS is very similar to a firewall, unlike a firewall it’s not network transparent and your applications need to be aware of it and configured to use it.

There’s a lot of information on the web about configuring SSH to *create* a SOCKS proxy, but not a lot about how to use SSH *via* a SOCKS proxy. Because I don’t want to waste any more minutes of my life on the mind-numbing pain that is proxies, the following is the easy command to open an SSH connection through a proxy server:

ssh -o ProxyCommand='nc -x myproxyserver.example.com:1080 %h %p' \
 targetsshserver.example.com
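
If it’s a connection you make often, the same thing can live in ~/.ssh/config so you never have to remember the incantation – note that the -x proxy option requires the OpenBSD variant of netcat, and the hostnames are examples as before:

# ~/.ssh/config
Host targetsshserver.example.com
    ProxyCommand nc -x myproxyserver.example.com:1080 %h %p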

NamedManager 1.5.1

I’ve pushed a new release of NamedManager, version 1.5.1 – this is a minor bug fix release providing:

  1. Bug fix for handling of TXT records, where extra slashes would be entered into the record due to an input validator bug.
  2. The Bind configuration writer now runs the Bind-supplied validators for configuration and DNS zone files and refuses to reload Bind unless they pass.

The first change is naturally important if you’re using TXT records as it does fix a serious issue with the handling of TXT records (no security problems, but corrupted zonefiles would result at times).

Even if you’re not using TXT records, the second change is worth upgrading to as it makes the Bind configuration generator much more robust and prevents any potential future bugs from ever feeding Bind a bad zonefile.

Pre-1.5.1, we relied on Bind’s reload process to validate the files, however this suffered from an issue where the error might not be reported back to the user, who would only discover the issue the next time Bind restarted. This change prevents a new zonefile from being loaded into place until the validator passes it, so the worst case is your DNS just refuses to accept changes, whilst logging loudly in the web interface back to you. :-)

If you upgrade, take advantage of this feature by adding the following to /etc/namedmanager/config-bind.php, or wherever you have installed your Bind component configuration file:

$config["bind"]["verify_zone"]    = "/usr/sbin/named-checkzone";
$config["bind"]["verify_config"]  = "/usr/sbin/named-checkconf";
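
These are the same validators you can run by hand if you want to see what NamedManager will accept – the zone name and file path below are examples:

# Validate the main Bind configuration
/usr/sbin/named-checkconf /etc/named.conf
# Validate a single zone (zone name first, then the path to the zonefile)
/usr/sbin/named-checkzone example.com /var/named/example.com.zone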

NamedManager 1.5.1 can be found at the project page or in my packaged repositories.

Updated Repositories

I’ve gone and updated my GNU/Linux repositories with a new home page – some of you may have been using this under my previous Amberdms branding, but it’s more appropriate that it be done under my own name these days and have its own special subdomain.

I want to unify the branding of a bit more of the stuff I have out there on the internet and also make sure I’m exposing it in a way that makes it easy for people to find and use, so I’m going through a process of improving site templates, linking between places and improving documentation/wording from the perspective of an outside user.

CSS3 shinyness! And it even mostly works in IE.

Been playing with new HTML5/CSS3 functionality for this site, have to say, it’s pretty awesome.

You can check out the new page at repos.jethrocarr.com – I’ve tried to make it as easy as possible to add my repositories to your servers. I’ll be refining this a little more in coming weeks, such as adding a decent package search function to the site to make it easier to grab some of the goodies hidden away in the distribution directories.

I’m currently providing packages for RHEL & clones, Debian and Ubuntu. Whilst my RHEL repos are quite sizable now, the Debian & Ubuntu repositories are much sparser, so I’m going to make an effort to bring them to a level where they at least have all my public software (see projects.jethrocarr.com) available as well-tested packages for the current Debian Stable and Ubuntu LTS releases.

There’s some older stuff archived on the server if you go hunting as well, such as Fedora and ancient RHEL version packages, but I’m keeping them in the background for archival purposes only.

And yes, all packages are signed with my Amberdms/Jethro Carr GPG signing key. You should never use repositories without GPG-signed packages – unsigned repositories are ideal vectors for installing malicious content via a man-in-the-middle attack.
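
On RHEL-style systems that means keeping GPG checking turned on in the repository definition – a minimal sketch, with an illustrative key path:

# Excerpt from a yum .repo file – refuse packages without a valid signature
gpgcheck=1
# Illustrative key location; import the key published on the repository site
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-jethrocarr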

ip6tables: ipv6-icmp vs icmp

I run a fully dual stacked IPv6+IPv4 network on my servers, VPNs and home network – part of this is that I get to discover interesting new first-adopter pains with living in the future (like Networkmanager/Kernel bugs, Munin being stupid, CIFS being failtastic and providers still stuck in the IPv4 only 1980s).

My laptop was experiencing frustrating issues where it was unable to load content from some IPv6 enabled website providers. In my specific case, I was having lots of issues with page loads from WordPress and Gravatar timing out when connecting to them via IPv6, but no issues when using IPv4.

I noticed that I was still able to ping6 the domains in question and telnet to port 80 successfully, which eliminates basic connectivity issues from being the cause. Issues like this where connectivity tests succeed, but actual connections fail, can be a symptom of MTU discovery issues which are a particularly annoying networking glitch to experience.

If you’re behind a WAN link such as ADSL, you’re particularly likely to be affected since ADSL and PPP overheads drop the size of the packets which can be used – in my case, I can only send a maximum of 1460 byte packets, whereas the ethernet default that my laptop will use is 1500 bytes.

In a properly functioning network, your computer will try and send 1500 byte packets to the internet, but the router which has the 1460 byte uplink to your ISP will refuse the packet and advise your computer that this packet is too large and that it needs to break it into smaller ones and try again. This happens transparently and is a standard feature of networking.

In a fucked up improperly functioning network, your computer will try and send the 1500 byte packet to the internet, but no notification advising the correct MTU size is returned or received. In this case your computer keeps trying to re-send the packet until a timeout occurs – from your computer’s perspective, the remote host is unreachable.
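
An easy way to test whether path MTU discovery is working is to send pings with the don’t-fragment flag set and a specific payload size – the sizes below assume my 1460 byte uplink and a generic example host:

# 1472 bytes of payload + 28 bytes of IPv4/ICMP headers = a 1500 byte packet
ping -M do -s 1472 example.com
# 1432 bytes + 28 bytes of headers = a 1460 byte packet; if this works whilst
# the 1500 byte packet times out, PMTU discovery is broken somewhere
ping -M do -s 1432 example.com
# IPv6 equivalent: 1412 bytes + 40 byte IPv6 + 8 byte ICMPv6 headers = 1460
ping6 -M do -s 1412 example.com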

This MTU notification is performed by the ICMP protocol, which is more commonly but incorrectly known as being “ping” (whilst ping is one of the functions performed by ICMP, there are many others it’s responsible for, including MTU discovery and connection refused messages).

It’s not uncommon for MTU to be broken – I’ve seen too many system and network administrators block ICMP entirely in their firewalls “for security”, not realising that there’s a lot in ICMP that’s needed for proper operation of a network. What makes the problem particularly bad, is that it’s inconsistent and won’t necessarily impact all users, which leads to those administrators disregarding it as not being an issue with their infrastructure and even blaming the user.

Sometimes the breakage might not even be in a network you or the remote endpoint control – if there’s a router somewhere between you and the website you’re trying to access which has a smaller MTU size and blocks ICMP, you may never receive an MTU notification and you lose the ability to connect to the remote site.

At other times, the issue might be more embarrassing – is your computer itself refusing the helpful MTU notifications being supplied to it by the routers/systems it’s attempting to talk with?

I’m pretty comfortable with iptables and ip6tables, Linux’s IPv4 and IPv6 firewall implementations and use them for locking down servers, laptops as well as conducting all sorts of funky hacks that would horrify even the most bitter drugged up sysadmin.

However even I still make mistakes from time to time – and in my case, I had made a big mistake with the ICMP firewalling configuration that made me the architect of my own misfortune.

On my laptop, my IPv4 firewall looks something like below:

iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT
iptables -A INPUT -j REJECT --reject-with icmp-host-prohibited
  • We want to trust anything from ourselves (duh) with -i lo -j ACCEPT.
  • We allow any established/related packets being sent in response to whatever connections have been established by the laptop, such as returned traffic for an HTTP connection – failure to define that will lead to a very unhappy internet experience.
  • We trust all ICMP traffic – if you want to be pedantic you can block select traffic, or limit the rate you receive it to avoid flood attacks, but a flood attack on Ethernet against my laptop isn’t going to be particularly effective for anyone.
  • Finally refuse any unknown incoming traffic and send an ICMP response so the sender knows it’s being refused, rather than just dropped.

My IPv6 firewall looked very similar:

ip6tables -A INPUT -i lo -j ACCEPT
ip6tables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
ip6tables -A INPUT -p icmp -j ACCEPT
ip6tables -A INPUT -j REJECT --reject-with icmp6-adm-prohibited

It’s effectively exactly the same as the IPv4 one, with some changes to reflect the differences in nature between IPv4 and IPv6, such as the ICMP reject options. But there’s one horrible, horrible error with this ruleset…

ip6tables -A INPUT -p icmp -j ACCEPT
ip6tables -A INPUT -p ipv6-icmp -j ACCEPT

Both of these are valid, accepted ip6tables commands. However only -p ipv6-icmp correctly accepts IPv6 ICMP traffic. Whilst ip6tables happily accepts -p icmp, that rule matches IPv4’s ICMP protocol number, which never appears in IPv6 traffic, so it is in effect a dud statement.

By having this dud statement in my firewall, from the OS perspective my firewall looked more like:

ip6tables -A INPUT -i lo -j ACCEPT
ip6tables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
ip6tables -A INPUT -j REJECT --reject-with icmp6-adm-prohibited

And all of a sudden there’s a horrible realisation that the firewall will refuse ALL inbound ICMP, leaving my laptop unable to receive many important messages such as MTU and rejected connection notifications.

By correcting my ICMP rule to use -p ipv6-icmp, I instantly fixed my MTU issues, since my laptop was no longer ignoring the MTU notifications. :-)
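
For reference, the corrected ruleset in full:

ip6tables -A INPUT -i lo -j ACCEPT
ip6tables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
ip6tables -A INPUT -p ipv6-icmp -j ACCEPT
ip6tables -A INPUT -j REJECT --reject-with icmp6-adm-prohibited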

My initial thought was that this was a horrible bug in ip6tables – surely it should raise some warning/error if an administrator tries to use icmp instead of ipv6-icmp. The man page states:

 -p, --protocol [!] protocol
    The protocol of the rule or of the packet to check. The specified
    protocol can be one of tcp, udp, ipv6-icmp|icmpv6, or all, or it
    can be a numeric value, representing one of these protocols or a
    different one.

So why is it accepting -p icmp then? Clearly that’s a mistake – it’s not in the list of accepted protocols… but further reading of the man page also states that:

A protocol name from /etc/protocols is also allowed.

Hmmmmmmm…..

$ cat /etc/protocols  | grep icmp
icmp       1    ICMP         # internet control message protocol
ipv6-icmp 58    IPv6-ICMP    # ICMP for IPv6

Since /etc/protocols defines both icmp and ipv6-icmp as being known protocols by the Linux OS, ip6tables accepts the protocol argument of icmp without complaint, even though the kernel effectively will never be able to do anything useful with it.

In some respects it’s still a bug – ip6tables shouldn’t be letting users select protocols that it knows are wrong – but at the same time it’s not a bug, since icmp is a valid protocol that the kernel understands; it’s just that it will simply never encounter it on IPv6.

It’s a total newbie mistake on my part; what makes it more embarrassing is that I managed to avoid making this mistake on my server firewall configurations, yet ended up doing it on my own laptop. It’s a very easy mistake to make, hence this blog post in the hope that someone else doesn’t get caught out by it in future.

linux.conf.au: day 5

Final day of linux.conf.au – I’m about a week behind schedule in posting, but that’s about how long it takes to catch up on life following a week at LCA. ;-)

uuuurgggh need more sleep

I like that guy’s idea!

Friday’s conference keynote was delivered by Tim Berners-Lee, who is widely known as “the inventor of the world wide web”, but is more accurately described as the developer of HTML, the markup language behind all websites. Certainly TBL was an influential player in the internet’s creation and evolution, but the networking and IP layer of the internet was already being developed by others and is arguably more important than HTML itself – calling anyone the inventor of the internet is wrong for such a collaborative effort.

His talk was enjoyable, although very much a case of preaching to the choir – there wasn’t a lot that would really surprise any linux.conf.au attendee. What *was* more interesting than his talk content is the aftermath…

TBL was in Australia and New Zealand for just over a week, during which he gave several talks at different venues, including linux.conf.au, as part of the “TBL Down Under Tour”. It turns out that the 1 week tour cost the organisers/sponsors around $200,000 in charges for TBL to speak at these events, a figure I personally consider outrageous to charge non-profits for a speaking event.

I can understand high demand speakers charging to ensure that they have comfortable travel arrangements and even to compensate for lost earnings, but even at an expensive consultant’s charge rate of $1,500 per day, that’s no more than $30,000 for a 1 week trip.

I could understand charging a little more for an expensive commercial conference, such as the $2k-per-ticket-per-day corporate affairs, but I would rather have a passionate technologist who comes for the chance to impart ideas and knowledge at a geeky conference than someone there to make a profit, any day – the $20-40k that Linux Australia contributed would have paid several airfares for some well deserving hackers to come to AU to present.

So whilst I applaud the organisers, and particularly Pia Waugh, for the effort spent making this happen, I have to state that I don’t think it was worth it, and seeing the amount TBL charged a non-profit entity for this visit actually really sours my opinion of the man.

I just hope that seeing a well known figure talking about open data and internet freedom at some of the more public events leads to more positive work in that space in NZ and AU and goes towards making up for this cost.

Outside the conference hall.

Friday had its share of interesting talks:

  • Stewart Smith spoke a bit about SQL databases, with focus around MySQL & varieties being used in cloud and hosted environments. Read his latest blog post for some amusing hacks to execute on databases.
  • I ended up frequenting a few Linux graphical environment related talks, including David Airlie talking about improvements coming up in the X.org server, as well as Daniel Stone explaining the Wayland project and architecture.
  • Whilst I missed Keith Packard’s talk due to a scheduling clash, he was there heckling during both of the above talks. (Top tip – when presenting at LCAs, if one of the main developers of the software being discussed is in the audience, expect LOTS of heckles). ;-)
  • Francois Marier presented on Persona (developed by Mozilla), a single sign on system for the internet with a federated, decentralised design. Whilst I do have some issues with parts of its design, overall it’s pretty awesome and it fixes a lot of problems that plagued other attempts like OpenID. I expect I’ll cover Persona more in a future blog post, since I want to set up a Persona server myself and test it out more, and I’ll detail more about the good and the bad of this proposed solution.

Sadly it turns out Friday is the last day of the conference, so I had to finish it up with the obligatory beer and chat with friends, before we all headed off for another year. ;-)

They're taking the hobbits to Isengard! Or maybe just back to the dorms via the stream.

A dodgy looking character with a wire running into a large duffle bag…

Hopefully not a road-side bomber.

The fuel that powers IT

Incoming!