!@#@!#@!- still learning what ‘long term support’ means

Things that are not good:

  • put all your class notes in something X-based
  • see an X update from Ubuntu before you go to class
  • decide not to install the X update, because, hey, you wouldn’t want a broken X right before class
  • read some email, have breakfast
  • remember you’re running not just any distro, but hey, the ‘Long Term Support’ distro- the one that presumably has, you know, a QA process. And no one would put out a package that breaks X in their Long Term Support, enterprise-ready distro, right?
  • install the upgrade
  • turn off the computer
  • go to class
  • turn on the computer
  • discover that you have no X, and class started two minutes ago.

Furious would not begin to describe how I felt. The internal dialog in my head was ‘!@!@#@!#. What actually well-supported distro can I switch to?’ because let’s be clear- I’m running a stable distro for the first time in ages specifically to avoid shit like this. If the ‘stable’ distro still breaks my fucking X, it isn’t stable. Period. End of discussion. So I need another distro.

To ubuntu’s credit, there was an update in apt within a few minutes of when I got to class, so I was able to fix it by apt-get’ing again. But if your QA process for the Long Term Support distro let through an X update that broke X, well, your QA process still needs some work. (I understand that given the vast diversity of hardware X runs on, it isn’t possible to do perfect QA, but if it breaks a lot of machines, which it did, something went deeply wrong in your process.)
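(For anyone hitting the same thing, a rough sketch of the recovery path from a text console; the display manager and the exact package names are my assumptions for a stock Dapper desktop, so check what apt actually reports on your machine:)

```
# Switch to a virtual terminal with Ctrl+Alt+F1 and log in, then:
$ sudo apt-get update
$ sudo apt-get upgrade            # pulls the fixed X packages once your mirror has them
$ sudo /etc/init.d/gdm restart    # restart the display manager (gdm on a stock Ubuntu desktop)
```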

Side note: Abi’s XML is pretty noisy when you’re using outline mode. Turns out emacs + abi xml was not quite the savior I would have hoped it would be in my initial, panicked moments.

39 thoughts on “!@#@!#@!- still learning what ‘long term support’ means”

  1. Luis Villa has spawned a series of blogs starting off of the recent much publicised Xorg update breakage in Ubuntu (http://osnews.com/comment.php?news_id=15587, http://tieguy.org/blog/2006/08/22/still-learning-what-long-term-support-means/) that ended up being a very interesting discussion (http://tieguy.org/blog/2006/08/23/notes-about-distros-qa-etc/) touching various aspects of how you focus on QA (http://tieguy.org/blog/2006/08/24/more-on-qa-ubuntu-trust-etc/).

  2. Jumpy mouse pointer problem on the older Thinkpad laptops, when the battery status is read? Turn off APM, use ACPI, and pass i8042.nomux=1 to the kernel cmdline (in grub.conf/lilo.conf); you’ll obviously also pass apm=off.

    Luis: I read your blog about the X problem you mentioned yesterday. I myself am a Fedora user, but recommend either Fedora or Ubuntu to newbies. But I have yet one more recommendation if you want that darn stable platform. It’s not Ubuntu or Fedora, but Red Hat Enterprise Linux. If you can’t
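    (A sketch of what that kernel line might look like in grub’s menu.lst; the kernel image, version, and root device here are placeholders, not taken from any real machine:)

    ```
    # /boot/grub/menu.lst (illustrative entry; paths and devices are placeholders)
    title  Ubuntu 6.06, kernel 2.6.15
    kernel /boot/vmlinuz-2.6.15-26-386 root=/dev/hda1 ro apm=off i8042.nomux=1
    ```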

  3. Right on. Nail Ubuntu to the wall on this. That’s just unacceptable. I run Ubuntu on servers and desktops, and am always prepared for the possibility, but I shouldn’t have to be. There needs to be an opt-in pre-update patch-testing program with a GUI & text-based method of doing package reverts and submitting feedback so that results can be acted on QUICKLY.


  4. You could always run CentOS. If you want a proper QA process, there are really no two ways about it: you need an enterprise-supported platform.

    There’s also Fedora with the newly announced Fedora Test project; it should provide a much improved QA situation while also providing you with current software.

  5. Because I’m stubborn and stupid, and because the fix was made available quickly, this is not a one-strike-and-you’re-out situation. So I’m not seriously going to switch distros. The unfortunate thing is that it doesn’t feel like there are any terribly good options if I did want to switch. I still feel I need reasonably up-to-date software (modern browsers and document compatibility are very important to me), so I’d lean against CentOS or Debian sta(b)le, and Fedora’s stability situation (they seriously considered a complete X.org update in FC5, which is blatantly insane, and require OS re-installs instead of upgrades when new versions are released) is vastly worse than Ubuntu’s, no matter how pissed this morning’s fuckup made me. I haven’t looked at opensuse much, but I have to assume it is in a similar boat to Fedora- re-installs instead of upgrades, and less demanding QA. Gentoo is right out (esp. right now I have no desire to fuck around with a system that much, or learn a new one). So where does that leave me?

  6. To be fair, updates offered for commercial software like Mac OS X and Windows XP don’t always leave one’s system in a bootable state, either.

  7. A lot more people run Debian testing and unstable than stable, whereas the opposite is true for ubuntu: http://www.lucas-nussbaum.net/blog/?p=206

    Rob gets the problem exactly – there’s no widespread testing of ubuntu stable updates before they get pushed. One solution would be for you to disable dapper-updates and only use dapper-security, but even better would be a dapper-proposed-updates like Debian has with stable-proposed-updates. Debian testing doesn’t suffer the problem either, due to the 10 days and no RC bugs in unstable policy.
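    (A sketch of the security-only setup described here, as /etc/apt/sources.list lines; the mirror URLs are the stock ones, and the idea is just to comment out dapper-updates:)

    ```
    # /etc/apt/sources.list (sketch): take security fixes but skip dapper-updates
    deb http://security.ubuntu.com/ubuntu dapper-security main restricted
    # deb http://archive.ubuntu.com/ubuntu dapper-updates main restricted
    ```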

  8. ‘We suck no more than commercial software’ is not really a winning argument in my book :)

  9. ha! that update just showed up in my update manager – think i’ll wait a little bit ;) but you’re absolutely right. like i just said in my del.icio.us tag, a supported version should *never* break X. if this happens to a regular user, they’re gone – permanently. i’m hoping for a response from some of the Ubuntu guys on this.

  10. sog, others: note that 1:1.0.2-0ubuntu10.*4* works fine here; .*3* was broken. (It appears from the bug that the patch was reverted completely between .3 and .4.) If you’ve only got .3 showing up, maybe it is time to try a faster mirror :)
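    (To check which revision you actually have, something like this should work; the package name is my guess at the one carrying the broken patch:)

    ```
    $ apt-cache policy xserver-xorg-core    # shows installed vs. candidate versions
    $ dpkg -l | grep xserver-xorg-core      # 1:1.0.2-0ubuntu10.4 is the fixed revision
    ```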

  11. If CentOS isn’t an option, maybe forking over the $50 USD for a year’s support for SLED10 is an option; Novell, I’m told, has very good QA for that product. It’s a fine deal and you get pretty recent tools with it (GNOME 2.12, OpenOffice 2, etc.).

    I’d wager you are wrong about Fedora, though; they are not doing an update of X for FC5. It was proposed but never executed. Also, Fedora has excellent testing of updates via the updates-testing repo, which many users such as myself run for your benefit. Please refrain from calling our updating and QA efforts insane. We do attempt to not introduce regressions and have a fairly good record of not doing so. There’s also upcoming automated regression testing via 108, and Red Hat is releasing every test they perform on RHEL to the Fedora community, including tools to build more.

    In addition, Fedora has basically the same support life for every release that Ubuntu provides for Dapper (5 years on the server, 3 years on the desktop): 2 years of support from Red Hat plus whatever Fedora Legacy supplies for a given release; generally this should be another 2 years or more.

  12. Fedora doesn’t require reinstalls on new releases.

    The FC5 7.1 discussion was mostly flaming anyway. I’m not likely to do that kind of move in a stable series, ever.

    If you can pinpoint what part of the upgrade broke your X I’d like to know so we can make sure X (upstream) gets fixed.

  13. I can’t speak directly to the QA of SLED 10. Suffice to say that it is very difficult for an enterprise distro to test (for example) X on many machines, so I have greater faith in the QA of Ubuntu, Fedora, or OpenSUSE, particularly on something hardware dependent, than I do of SLED or RHEL.

    As far as Fedora goes- that discussion (apparently) went all the way to the board, which sounds like more than ‘just a flamefest’. If they are serious about normal people using Fedora, whoever suggested upgrading X in Fedora should have been laughed at. Laughed at loud, hard, and immediately. Laughed at enough that the next time someone proposes something so colossally braindamaged, everyone says ‘just like X in FC5!’ and breaks out in laughter yet again. That the suggestion was taken seriously, at all, indicates a serious brokenness in their attitude towards a hypothetically stable release. If they are serious about getting rid of the impression that they are just a testbed for RHEL, this attitude is one of the first things they must fix.

    Now, I know a lot of people at RH/Fedora, and I’m sure they do a damned good job preparing something that six months ago I would have been thrilled to use. If I had to guess, they probably do at least as good a job as Ubuntu, if not a better job, of creating an initially stable release. (This is not to slight Ubuntu substantially; both have quite good initial releases.) But I don’t think anyone can look at Fedora’s position with regards to updates and claim with a straight face that they are intended to be a stable, general-use distribution in the way that Ubuntu is claiming to be. (Which I would have said Ubuntu was substantially being until this morning, I might add.)

    James: that ad-hoc survey is fascinating to me, though my gut sense is that it says more about Debian and Edgy than about Ubuntu in general. Thanks for the link. I will likely post about it later.

    Ajax: there is a bug link in the post (malone 36461) which appears to pinpoint the patch that caused the problem. Looks like not an upstream problem, as far as I can see (except perhaps inasmuch as a backport is involved).

    As far as reinstalls on new releases- this is news to me; I’d be very glad to be wrong and it would remove a big obstacle to using Fedora for me. But RH’s policy forever (and I thought Fedora’s as well?) was that upgrades were not supported. Did I miss the change, or was it just a gradual thing that went from winkwink to supported at some point?

  14. Regarding there quickly being a fix in apt:

    The fixed package still isn’t on all the apt servers. Which means that there are still people installing the bad package. I’m not kidding.

  15. While that really, really sucks for them, I’m inclined to think that’s a fairly unavoidable part of a distributed mirror system. The right fix is to make sure the broken packages never go out onto the mirror system in the first place.

  16. This is certainly unfortunate and signals that improved QA is required, but let’s keep some perspective here. Disappointed, you can be. Annoyed even, but “furious” seems to me to be a somewhat disproportionate response.
    I suppose my main point is that this should not affect anyone running a server on Dapper – if you have X running on your server, a broken X server is, by a very wide margin, the least of your problems. Anyone using Dapper in any kind of large, production environment should be testing updates locally before deploying them, so really this only affects home users.
    Certainly this is not a group of people you want to disappoint significantly, as they are largely responsible for the growing mindshare of Ubuntu, but provided some processes can be put in place to reduce the chances of similar recurrences (the suggestion that any regressions are unacceptable is sheer lunacy and betrays a lack of experience with large, diverse software packages), I don’t see that this is, or will be, a significant problem for Ubuntu.
    I certainly wouldn’t change my distro because of one mistake ;)

  17. To specifically address one of luis’s points, there simply can’t be a distro that meets your requirements – it’s simply not possible to provide up-to-date software *and* seriously test and validate it. This is why the few serious workstation distros (e.g. RedHat’s Enterprise desktop products) lag seriously behind bleeding edge distros (e.g. Gentoo).
    The range of software in a typical Linux distro is simply too huge. Drastically slim that down (e.g. by only supporting a given set of software) and you just hurt your users because they either have almost no useful software installed, or they have to go outside the supported profile to get the software they need (e.g. I have to keep many servers at work on RHEL3 because RHEL4 has no appletalk module, yet that is even unsupported in 3, so basically I’m on my own, even though we pay RH a bunch of cash a year).

  18. I could go into great detail about the reasons why I was furious, but I’ll spare everyone the rant. Suffice to say that the degree of my anger (justified or not) is really moot- breaking the X server is pretty much indefensible in anything calling itself a stable release; it is vastly indefensible in something claiming to be enterprise ready and for ‘human beings’. That your servers are stable is nice for you, but really completely irrelevant. If I wanted a server distro I would not have installed X, and we would not be having this discussion.

    As far as the second claim… let’s stipulate that I know a little bit about open source QA, and what is or isn’t possible. Let’s further stipulate that I’m not an idiot. You might, in fact, google for my name and ‘QA’ to establish those things. Perhaps when I’m less angry I’ll post a detailed dissection of all the things that Ubuntu is (apparently) doing wrong, but suffice it to say that a reasonably up-to-date (packages released within, say, the last six months) and stable system is perfectly attainable, and if a reasonably popular distro is not attaining that, they are doing something wrong.

    (By and large, it is worth noting, Ubuntu does things very, very right- more right than most- which is what makes this apparent glaring incompetence all the more surprising and bothersome to me. If you get the easy things right, I’ll cut you a lot of slack on the hard things- and keeping X functional definitely qualifies as one of the easier things, given the amount of testing it gets.)

  19. Well, if you use VMware or software like Netbackup with a Java client on your servers, as some of us do, then X is a requirement. Many people have GUI servers.

  20. Segedunum: That is simply not true, you can easily get away with running the bare minimum of stuff on the machines by forwarding the display output of the applications to another server running X. This only calls for the X libraries to be installed. Another obvious advantage of this is that it means you can have all of the graphical server applications on a limited number of isolated machines, reducing exposure and admin overhead.
    Even in the dire situation where you have to run some kind of X server locally, something like Xvnc is vastly preferable to a “real” X server.
    I still contend that a server should not run X. Additionally, I would add that any vendor who thinks otherwise is probably not worth using, but obviously that decision is not always up for debate.
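    (The forwarding setup described above is just ssh with X11 forwarding; a sketch, assuming sshd on the server has X11Forwarding enabled, and with hypothetical user/host/application names:)

    ```
    # On the admin workstation (the only machine running a full X server):
    $ ssh -X admin@appserver some-java-gui
    # The application runs on the server; only its display is drawn locally.
    ```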

  21. I continue to not understand why anyone is discussing servers here. Ubuntu is ‘linux for human beings’, and has always explicitly focused on a consumer-friendly desktop. It isn’t like X servers are some afterthought most of their customers don’t have installed. Sure, it’s possible to run without an X server, but the possibility of running without one on a server doesn’t excuse failure to keep it running in a stable release on a desktop. So, please, no talking about servers anymore until someone shows me why server boxes without X somehow make it excusable to have a broken X server in a desktop-focused distribution. (Hint: it’s impossible, so STFU :)

  22. Luis: my posts were written with full knowledge of what you’ve been doing in the open source world and your QA pedigree.

    I would totally agree that breaking the X server is one of the most unfortunate things you can do, given that the inexperienced users it affects will be, effectively, unable to repair the damage themselves. I happen to think this is illustrative of a problem Linux has in general when trying to address users without a strong grounding in the underlying concepts. It’s wonderful while it works, but when it breaks, it does so in a way that they have no ability to recover from. Some may get by with google, others may be ok with forums/lists/IRC, but ultimately when it breaks you get to keep both pieces. Clearly this is a problem with pretty much all software, but I think the contrast is more glaring on Linux because expectations are generally so high of it.
    I don’t think servers are worth discounting since many, if not most of the users truly interested in LTS will be server admins. Desktop users will, by and large, move on to later releases long before Dapper’s long term stability is fully demonstrated.
    Regarding your refuting my assertion that what you want isn’t possible, exactly who is doing it? Novell and Red Hat seem to be doing more QA than anyone else. In their most recent desktop products, things like GNOME are at least one major version behind the current releases. How could anyone put out a distro with the current release and claim it to be thoroughly tested? Especially a .0 release?

  23. The full answer to that question is unfortunately longer than I have time to write right now; lots of it is locked up in a half-written white paper on my hard drive. But I have strong reason to believe it can be done. (And I might add that I don’t really need ‘enterprise’ level desktops; I don’t think most people do, though I think it can be done reasonably well. My primary demand, especially today, is that it not regress unpredictably.)

  24. Argh, forums, worst method ever for finding useful information. I read for the first ten pages without finding any mention of what caused the bug, and due to general Debian ignorance (mea culpa!) I don’t know how to unpack debs so I can’t check the changelog in the fixed tarballs.

    Also the bug you link to claims to be for a different bug than what you hit.

    Also, yes, X is mandatory, even on servers.

    The reason the Board made a statement about the 7.1 FC5 idea was because people were interpreting hesitancy to update as caving in to closed drivers, and the Core policy wasn’t actually clear about that. It wasn’t, of course; 7.1 for FC5 would have been a huge user-visible impact for not a lot of gain. The board never asked me whether they should make a statement or not, and I’d have been just as happy if they didn’t, but they had to clarify their policy and I can respect that. But as Fedora’s X maintainer I feel pretty comfortable saying we’re not that crazy. (mharris wanted to do it, and I kinda did too, but we know better.)

    The official RHEL policy is that upgrades between major releases are not supported, correct. Fedora is not RHEL and definitely gets tested for upgrades from previous to current; FC5 to FC6 will work, etc. Starting with FC6 we’ve actually moved the build system to be much closer to how (as I understand it) Debian does it, so upgrades are much more likely to keep working from now on, due to the whole requirements-must-be-exactly-perfect approach.

    Sadly the biggest problem with Fedora is that Core is still closed-access, which makes it really hard for us to keep up. We’ve been doing as much as we can to reduce the scope of Core as far as possible to enable more community participation, and we’re working to make it completely open, hopefully before FC7 in my dream world.
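    (Re the ‘how to unpack debs’ aside above: a sketch with dpkg-deb; the filename is a placeholder for whatever package you actually pulled down:)

    ```
    $ dpkg-deb -x xserver-xorg-core_1.0.2-0ubuntu10.4_i386.deb unpacked/          # extract file contents
    $ dpkg-deb -e xserver-xorg-core_1.0.2-0ubuntu10.4_i386.deb unpacked/DEBIAN    # extract control files
    $ zless unpacked/usr/share/doc/xserver-xorg-core/changelog.Debian.gz          # read the changelog
    ```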

  25. Thanks, Michael- in my haste to vent, I misread the changelog and pointed at the wrong bug. Hopefully Ajax can take a peek :)

    Ajax: Very interesting about Fedora upgrades. That, combined with the growing Extras, is a really big deal- I’m surprised Fedora hasn’t talked about it more. Understood about Core- not a pretty picture; we mulled it over when planning opensuse as well, and there aren’t good/easy answers there. Good luck with it.

    With regards to ‘mharris wanted to do it, and I kinda did too’… I hope your new QA dude beats you over the head thoroughly. I’m all in favor of thought police in this area, it turns out :) Good to get more clarity on the rest- thanks for setting me straight.

  26. Yikes. Why in the world would you want this patch in Ubuntu? It’s pretty much only for huge IBM pSeries and UltraSPARC machines.

    Anyway – part of what you need here is good software management. In those cases where the update results in a broken system, you have to be able to roll back. Use a Conary-based distro and “conary rollback 1” would undo the damage.

  27. Ouch, PCI domain scanning. Total pain and suffering, very easy to get wrong, the way it’s implemented in X is completely non-obvious. You actually do want it in Ubuntu, since there is hardware with multiple domains, even on x86 and amd64, and there will be more in the future, but every time someone touches it it breaks in new and fascinating ways. I suspect this didn’t even get a “boots on my machine” check before committing though.

    Michael, thanks for the link!

    I’m looking at getting something similar fixed for FC6, and if I succeed I’ll make sure it gets into 7.2 when I go back to doing upstream release wrangling. Imagine Bullwinkle telling Rocky “this time for sure…”
