The main issue is the handling of security updates within the Nixpkgs ecosystem, which relies on Nix’s CI system, Hydra, to test and build packages. Due to the extensive number of packages in the Nixpkgs repository, the process can be slow, causing delays in the release of updates. As an example, the updated xz 5.4.6 package took nearly 5 days to become available in the unstable branch!
Fundamentally, there needs to be a change in how security fixes are handled in Hydra. As stated in the article, Nix was lucky to be unaffected, but taking multiple days to push out a security patch of this severity is concerning, even if there turned out to be no reason for concern this time.
Kinda tired of the constant flow of endless “analysis” of xz at this point.
There’s no real good solution to “upstream gets owned by evil nation state maintainer” - especially when they run it as a multi-year op. It simply doesn’t matter what downstream does if the upstream build systems get owned without anyone noticing. We’re fucked.
Debian’s build chroots were running Sid - so they stopped it all. They analyzed and there was some work done with reproducible builds (which is a good idea for distro maintainers). Pushing out security updates when you don’t trust your build system is silly. Yeah, fast security updates are nice, but it took multiple days to reverse-engineer the exploit. This wasn’t easy.
Bottom line, don’t run bleeding edge distros in prod.
We got very lucky with xz. We might not be as lucky with the next one (or the ones in the past).
I think the post was more about pointing out how long it takes to put out a security patch. Security patches can also occur on stable.
Yeah, I can get that. The xz situation probably wasn’t the best of examples though?
We might not be as lucky with the next one (or the ones in the past).
Or the ones in the present, for what that’s worth
Maybe you should actually have read OP’s post.
I’m not sure why you think I didn’t? Sorry if it was unclear.
From the blog:
This incident has really made me wonder if running the unstable branch is a great idea or not.
My comment:
Bottom line, don’t run bleeding edge distros in prod.
Hope this clarified my opinion! Have a good day!
Bottom line, don’t run bleeding edge distros in prod.
This. My company’s servers are all Debian stable. Not even sweating the issue.
This blog post misses entirely that this has nothing to do with the unstable channel. It just happened to only affect unstable this time because it gets updates first. If we had found out about the xz backdoor two months later (totally possible; we were really lucky this time), this would have affected a stable channel in exactly the same way. (It’d be slightly worse actually because that’d be a potentially breaking change too but I digress.)
I see two ways to “fix” this:
- Throw a shitton of money at builders. I could see this getting staging-next rebuild times down to just 1-2 days, which I’d say is almost acceptable. This could even be a temporary thing to reduce cost: quickly renting an extremely large on-demand fleet from some cloud provider for a day whenever a critical world rebuild needs to be done, which shouldn’t be too often.
- Implement pure grafting for important security patches through a second overlay-like mechanism.
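To make the grafting idea concrete, here is a minimal toy sketch in Python. It is not how Nixpkgs implements anything and not necessarily what the commenter has in mind; it only illustrates the general trick of rewriting references inside already-built outputs instead of rebuilding them. The store paths and the `graft` helper are made up for illustration.

```python
# Hypothetical store paths, invented for this example; real hashes differ.
OLD_XZ = b"/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-xz-5.6.1"
NEW_XZ = b"/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-xz-5.4.6"


def graft(output: bytes, old_ref: bytes, new_ref: bytes) -> bytes:
    """Rewrite references to old_ref inside an already-built output."""
    # The replacement must have the same length so that byte offsets
    # inside binaries stay valid after the rewrite.
    if len(old_ref) != len(new_ref):
        raise ValueError("replacement path must have the same length")
    return output.replace(old_ref, new_ref)


# A stand-in for a built binary that embeds a reference to the old xz.
built = b"\x7fELF...RPATH=" + OLD_XZ + b"/lib\x00"
patched = graft(built, OLD_XZ, NEW_XZ)
print(NEW_XZ in patched, OLD_XZ in patched, len(patched) == len(built))
```

The trade-off is that the dependents are never rebuilt or re-tested against the patched library, which is why this would only make sense for small, urgent security patches.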
Were systems in the stable branch at risk of compromise? Were there delays in releasing security fixes in the stable branch?
I don’t even think unstable was susceptible to it. I don’t think Nix ties ssh to systemd. Debian and Red Hat do.
It was not vulnerable to this particular attack because the attack didn’t specifically target Nixpkgs. It could have very well done so if they had wanted to.
Anyway the xz backdoor was enabled only in rpm and deb packages.
AFAIK it was enabled in anything that used the official source tarball. The exploit binaries were added during the tarball build process.
Nope. There were checks of build environment.
Then why did all distros issue a fix for the package?
Because nobody can be sure there are no other backdoors. And, I guess, they wanted to stop distribution of affected source code.
Shouldn’t the lesson here be “don’t introduce more complexity and dependencies to critical software”?
But then again that’s systemd in a nutshell…
AFAIK, affected versions never made it to stable as there was no reason to backport it.
Isn’t that the risk of running an unstable build of anything?
This has nothing to do with “unstable” or the specific channel. It could have happened on the stable channel too; depending on the timing.
deleted by creator
First of all, I’m not the author of the article, so you’re barking up the wrong tree.
You’re using the unstable channel.
That doesn’t matter in the big scheme of things - it doesn’t solve the fundamental issue of slow security updates.
You could literally build it on your own, or patch your own change without having to wait - all you have to do is update the SHA256 hash and the tag/commit hash.
Do you seriously expect people to do that every time there’s a security update? Especially considering how large the ecosystem is? And what if someone wasn’t aware of the issue, do you really expect people to be across every single vulnerability across the hundreds or thousands of OSS projects that may be tied to the packages you’ve got on your machine?
The rest of your points also assume that the older packages don’t have a vulnerability. The point of this post isn’t really about the xz backdoor, but to highlight the issue of slow security updates.
If you’re not using Nix the way it is intended to be used, it is on you. Your over-reliance on Hydra is not the fault of Nix in any way.
Citation needed. I’ve never seen the Nix developers state that in any official capacity.
deleted by creator
You could hold off on the latest updates
This means users such as myself who use the unstable branch for all of their packages will still be pulling the (potentially) infected xz tarballs onto their machines!
Yeah, don’t do that. On any OS that’s asking for problems.
Exactly. If you want to live on the bleeding edge, you have to accept that there will be risks.
Nobody should be running their main/only/mission critical machine on an unstable branch of any software.
It’s literally in the name unstable.
Nix & Hydra’s scheduling is super basic. There is room to optimize the builds in many ways. In this case, the fact that xz is in libarchive as well as an input for Nix makes the rebuilds particularly bad.
xz is necessarily in the stdenv. Patching it means rebuilding the world, no matter what you optimise.
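A toy Python model of why that is: if each package's identity is a hash over its own source plus the hashes of all its inputs, then changing xz changes the hash of everything that depends on it, directly or indirectly. The graph below is invented for illustration and is far smaller than the real one; `store_hash` and `needs_rebuild` are hypothetical helpers, not Nix APIs.

```python
import hashlib

# Toy dependency graph; package names and edges are illustrative only.
DEPS = {
    "xz": [],
    "stdenv": ["xz"],
    "libarchive": ["stdenv", "xz"],
    "nix": ["stdenv", "libarchive"],
    "hello": ["stdenv"],
}


def store_hash(pkg, sources, cache=None):
    """Hash a package from its own source plus the hashes of its inputs."""
    if cache is None:
        cache = {}
    if pkg not in cache:
        h = hashlib.sha256(sources[pkg].encode())
        for dep in sorted(DEPS[pkg]):
            h.update(store_hash(dep, sources, cache).encode())
        cache[pkg] = h.hexdigest()
    return cache[pkg]


def needs_rebuild(old_sources, new_sources):
    """Return the packages whose hash differs between two source sets."""
    return sorted(
        pkg for pkg in DEPS
        if store_hash(pkg, old_sources) != store_hash(pkg, new_sources)
    )


old = {pkg: f"{pkg}-src-v1" for pkg in DEPS}
new = dict(old, xz="xz-src-patched")
print(needs_rebuild(old, new))
```

In this model, patching a leaf such as `hello` invalidates only `hello`, while patching `xz` invalidates every package, because everything reaches it through `stdenv`.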
As of today, NixOS (like most distros) has reverted to a version slightly prior to the release with the Debian-or-Redhat-specific sshd backdoor which was inserted into xz just two months ago. However, the saboteur had hundreds of commits prior to the insertion of that backdoor, and it is very likely that some of those contain subtle intentional vulnerabilities (aka “bugdoors”) which have not yet been discovered.
As (retired) Debian developer Joey Hess explains here, the safest course is probably to switch to something based on the last version (5.3.1) released prior to Jia Tan getting push access.
Unfortunately, as explained in this Debian issue, that is not entirely trivial, for two reasons: dependents of many recent pre-backdoor, potentially sabotaged versions require symbol(s) which are not present in older versions, and those older versions contain at least two known vulnerabilities which were fixed during the multi-year period where the saboteur was contributing.
After reading Xz format inadequate for long-term archiving (first published eight years ago…) I’m convinced that migrating the many projects which use XZ today (including DPKG, RPM, and Linux itself) to an entirely different compression format is probably the best long-term plan. (Though we’ll always still need tools to read XZ archives for historical purposes…)
Could SUSE’s open build system be used as an alternative to Hydra if it’s a bottleneck for updates?
No.