XZ Utils: Infiltrating Open Source Through Social Engineering

Imagine waking up to discover that a seemingly harmless open-source tool, deeply integrated within your systems, has covertly turned into a threat.

May 08, 2025

I'm currently writing a series of posts about iconic hacks. Simultaneously, I'm actively looking to connect with CISOs who have experienced cyber incidents and are willing to share their stories. The goal is not to point fingers but rather to exchange knowledge and enhance understanding.

XZ Utils: The Quiet Pillar

XZ Utils might not be well known, but it’s omnipresent: SSH sessions, Linux distributions (96.3% of the top 1,000,000 web servers use Linux), package managers, and system libraries rely heavily on this ultra-efficient compression tool, akin to zip.

This widespread integration made XZ Utils an ideal target. Attackers recognized that compromising XZ Utils could potentially mean compromising thousands of systems, applications, and third-party vendors.

You can easily test this tool yourself in your terminal. For example, compressing or decompressing any file:

xz file.txt

xz -d file.txt.xz
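
The same round trip can be sketched in Python, whose standard-library lzma module wraps liblzma, the library that ships with XZ Utils. This is only an illustration: the sample data and variable names are made up for the example.

```python
import lzma

# Illustrative payload; any bytes would do.
original = b"hello, xz\n" * 1000

# Compress into the .xz container format, then decompress again.
compressed = lzma.compress(original, format=lzma.FORMAT_XZ)
restored = lzma.decompress(compressed, format=lzma.FORMAT_XZ)

print(len(original), "bytes ->", len(compressed), "bytes")
print("round trip intact:", restored == original)
```

Repetitive input like this shrinks dramatically, which is exactly why the format ended up in so many packaging and distribution pipelines.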

Andres Freund

In March 2024, Microsoft engineer Andres Freund noticed an unusual latency (500 milliseconds) during SSH connections on a Debian machine.

This seemingly minor detail led him to discover malicious code. Identifying such subtle anomalies isn't within everyone's skill set.

Understanding the Hack

The attack traces back to a contributor named "Jia Tan," who gradually became involved with the XZ Utils project starting in 2021. Through regular, valuable contributions, Jia gained the community’s trust.

By 2023, he had earned commit rights. In February and March 2024, he released versions 5.6.0 and 5.6.1 of XZ Utils, each hiding a meticulously concealed backdoor.

Publishing these versions meant that the many tools and distributions depending on XZ Utils were alerted to a new "stable" release and encouraged to upgrade from previous versions.

In simple terms, here’s what’s going on: typically, when writing code, we create tests to check if the application behaves as expected. In this case (XZ), it often involves using intentionally corrupted test files to see how the library reacts.
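
To make the idea of "intentionally corrupted test files" concrete, here is a minimal sketch of that kind of test, written with Python's standard lzma module rather than the project's real test suite. The helper name decoder_accepts and the two ways of damaging the data are assumptions chosen for illustration.

```python
import lzma

def decoder_accepts(blob: bytes) -> bool:
    """Return True if blob decodes as a complete .xz stream, False if it is rejected."""
    try:
        lzma.decompress(blob, format=lzma.FORMAT_XZ)
        return True
    except lzma.LZMAError:
        return False

# A valid stream, plus two deliberately damaged variants (hypothetical fixtures).
good = lzma.compress(b"hello, xz\n" * 100, format=lzma.FORMAT_XZ)
truncated = good[: len(good) // 2]   # stream cut off halfway
bad_magic = b"\x00" + good[1:]       # first header byte overwritten

for name, blob in [("good", good), ("truncated", truncated), ("bad_magic", bad_magic)]:
    print(name, "accepted" if decoder_accepts(blob) else "rejected")
```

A folder full of files like truncated and bad_magic is normal in a compression project, and nobody reads them by hand. That is what made a binary test file such a convenient hiding place.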

To avoid detection, given the open-source nature of the code, the attacker devised a multi-stage approach, which can be simplified as follows:

1. Malicious code was hidden inside a seemingly corrupted test file named bad-3-corrupt_lzma2.xz.
2. During compilation, a macro (build-to-host) would "repair" this corrupted file.
3. The script then extracted and decrypted another file (good-large_compressed.lzma) and embedded it into the final compiled binary, activating the backdoor.
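
On the defensive side, a first triage step is simply checking whether a machine reports one of the two releases named above. The sketch below is a rough illustration, not a detection tool: the function names are hypothetical, and the sample strings assume the usual two-line shape of `xz --version` output.

```python
import re

# Releases named in this post as carrying the backdoor.
BACKDOORED_RELEASES = {"5.6.0", "5.6.1"}

def extract_versions(version_output: str) -> set[str]:
    """Collect every x.y.z version number found in the text."""
    return set(re.findall(r"\b\d+\.\d+\.\d+\b", version_output))

def is_backdoored_release(version_output: str) -> bool:
    """True if any reported version is one of the affected releases."""
    return bool(extract_versions(version_output) & BACKDOORED_RELEASES)

# Assumed sample outputs, for illustration only.
sample_affected = "xz (XZ Utils) 5.6.1\nliblzma 5.6.1"
sample_older = "xz (XZ Utils) 5.4.5\nliblzma 5.4.5"

print(is_backdoored_release(sample_affected))  # True
print(is_backdoored_release(sample_older))     # False
```

In practice you would feed it the text printed by `xz --version` on each host. A match only means the version number is one of the affected ones. It does not prove the backdoor is present or active, and a clean result is no substitute for your distribution's security advisory.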

Everything was conducted in plain sight, in open source.

The vulnerability, tracked as CVE-2024-3094, was rated with the maximum severity score of 10/10.

Social Engineering

Unlike quick and flashy exploits, this attack unfolded with chilling patience. Attackers built genuine trust, participated actively in community discussions, submitted helpful patches, and gradually established credibility. Once trusted, they stealthily inserted malicious code into a seemingly innocent update, effectively passing all routine verifications and code reviews.

An often-overlooked factor significantly impacted this breach: the primary maintainer of XZ Utils, Lasse Collin, managed this critical project alone and publicly acknowledged personal difficulties during this period. Attackers may have exploited this personal vulnerability, easing the insertion of malicious code without raising immediate suspicion.

Email sent by Lasse Collin to defend himself from an aggressive message

Again, all this information remains open-source and publicly accessible:

Mail archive discussing maintainer’s situation

The subtlety of this strategy is both ingenious and alarming: attackers leveraged human trust and psychological fragility rather than evident technical flaws.

An Exemplary Supply Chain Contamination

What makes this attack exceptionally sophisticated and dangerous is its exploitation of the inherent trust developers have in open-source software. Unlike traditional malware detectable by antivirus or endpoint security, this attack leveraged legitimate software distribution channels.

The exact identity behind the attack remains unknown, but it is likely state-sponsored due to the level of sophistication, resources, and detailed planning involved over multiple years. Fortunately, cybersecurity researchers identified and mitigated the breach promptly.

Kudos to Andres Freund 👏

Enjoyed this post?

Share it with your network to help spread awareness. It is important!

If you have ideas, insights, or personal experiences related to cybersecurity incidents, I’d love to hear from you—feel free to reach out!

ThreatLink
